Testset generator not working using AzureOpenAI key. #636
Comments
Any updates or workarounds for the above problem?
Hey @rahul1-995, sorry for the late reply. Are you able to make the evaluation work with Azure OpenAI?
Hi @shahules786, I am not facing any problem while evaluating with Azure OpenAI; the problem is with testset generation using Azure. I have given the code snippet above, please refer to the error below:
Hi Rahul, can you explain how you ran the evaluation using Azure OpenAI if you haven't got the test data generated? I am facing the same problem. Did you generate the test data with some other method? If so, please share, since I also need to create synthetic test data.
@Pranshul200, I am currently using an OpenAI API key for testset generation.
Hey @rahul1-995, did you try updating langchain-core as requested?
Yes @shahules786, I have tried updating langchain-core and I am still not able to run the testset generator.
@rahul1-995 Can you try using the version from #670 (not merged yet)?

```bash
git clone https://github.com/mspronesti/ragas/
cd ragas
pip install .
```

The usage with Azure OpenAI would be:

```python
import os

from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
from ragas.testset import TestsetGenerator

os.environ["AZURE_OPENAI_API_KEY"] = "..."
os.environ["AZURE_OPENAI_ENDPOINT"] = "..."
os.environ["OPENAI_API_VERSION"] = "2023-12-01-preview"

generator_llm = AzureChatOpenAI(deployment_name="...")
critic_llm = AzureChatOpenAI(deployment_name="...")
embeddings = AzureOpenAIEmbeddings(deployment="...")

generator = TestsetGenerator.from_langchain(
    generator_llm,
    critic_llm,
    embeddings,
)
```
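For completeness, a minimal sketch of how the generator built above could then be used. The document path and distribution values mirror the reproduction code later in this issue, and `to_pandas()` is assumed to be the usual ragas helper for inspecting the returned testset:

```python
# Sketch only: load the same lecture PDF with llama-index and generate a small
# testset with the `generator` constructed above. Values are illustrative.
from llama_index import SimpleDirectoryReader  # use llama_index.core on llama-index >= 0.10
from ragas.testset.evolutions import simple, reasoning, multi_context

documents = SimpleDirectoryReader(
    input_files=["machinelearning-lecture01.pdf"]
).load_data()

testset = generator.generate_with_llamaindex_docs(
    documents,
    test_size=3,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
print(testset.to_pandas().head())
```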
I'm not the original poster, but I had the same problem and it disappeared in this version. Thanks :)
@wikp Thanks for the confirmation!
…ngs (#670)

**User description**

The current version of `with_openai` contains a hardcoded instantiation of `langchain_openai.chat_models.ChatOpenAI`, which makes `TestsetGenerator` very limited and not compatible with completion models, Azure OpenAI models, and open-source models. This PR extends `TestsetGenerator` to any `BaseLanguageModel` and `Embeddings` from langchain for versatility, addressing #230, #342, #635, and #636. Lastly, I've removed all the occurrences of mutable default arguments (a bad antipattern, read about it [here](https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments)).

Co-authored-by: Shahules786 <Shahules786@gmail.com>
Co-authored-by: jjmachan <jamesjithin97@gmail.com>
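As a side note on that last point, the mutable-default-argument gotcha linked above can be shown with a tiny, generic example (the function names here are made up and are not from the ragas codebase):

```python
def append_bad(item, items=[]):
    # The default list is created once, at function definition time,
    # so every call without `items` shares the same list object.
    items.append(item)
    return items

def append_good(item, items=None):
    # Create a fresh list per call instead.
    if items is None:
        items = []
    items.append(item)
    return items

print(append_bad(1))   # [1]
print(append_bad(2))   # [1, 2]  <- state leaked from the previous call
print(append_good(1))  # [1]
print(append_good(2))  # [2]
```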
Which version?
I am trying to generate synthetic data using the Azure OpenAI API; it takes a long time to run and then fails with an error.
Ragas version: 0.1.1
Python version: 3.10
Code to Reproduce
```python
import os

from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
from langchain.text_splitter import TokenTextSplitter
from llama_index import SimpleDirectoryReader  # or llama_index.core on llama-index >= 0.10

from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.testset.extractor import KeyphraseExtractor
from ragas.testset.docstore import InMemoryDocumentStore

os.environ["AZURE_OPENAI_API_KEY"] = "AZURE_OPENAI_API_KEY"

azure_configs_gen = {
    "base_url": "",
    "model_deployment": "gpt-35-turbo-16k",
    "model_name": "gpt-35-turbo-16k",
    "embedding_deployment": "text-embedding-ada-002",
    "embedding_name": "text-embedding-ada-002",
}
azure_configs_critic = {
    "base_url": "",
    "model_deployment": "gpt-4",
    "model_name": "gpt-4",
    "embedding_deployment": "text-embedding-ada-002",
    "embedding_name": "text-embedding-ada-002",
}

generator_llm = AzureChatOpenAI(
    openai_api_version="2023-05-15",
    azure_endpoint=azure_configs_gen["base_url"],
    azure_deployment=azure_configs_gen["model_deployment"],
    model=azure_configs_gen["model_name"],
    validate_base_url=False,
)
generator_llm = LangchainLLMWrapper(generator_llm)

critic_llm = AzureChatOpenAI(
    openai_api_version="2023-05-15",
    azure_endpoint=azure_configs_critic["base_url"],
    azure_deployment=azure_configs_critic["model_deployment"],
    model=azure_configs_critic["model_name"],
    validate_base_url=False,
)
critic_llm = LangchainLLMWrapper(critic_llm)

embed_model = AzureOpenAIEmbeddings(
    openai_api_version="2023-05-15",
    azure_endpoint=azure_configs_gen["base_url"],
    azure_deployment=azure_configs_gen["embedding_deployment"],
    model=azure_configs_gen["embedding_name"],
)
embed_model = LangchainEmbeddingsWrapper(embed_model)

# Load the source PDF, chunk it, and build the document store.
pdf_path = r"machinelearning-lecture01.pdf"
documents = SimpleDirectoryReader(input_files=[pdf_path]).load_data()

splitter = TokenTextSplitter(chunk_size=2000, chunk_overlap=100)
keyphrase_extractor = KeyphraseExtractor(llm=generator_llm)
docstore = InMemoryDocumentStore(
    splitter=splitter,
    embeddings=embed_model,
    extractor=keyphrase_extractor,
)

from ragas.testset import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

test_generator = TestsetGenerator(
    generator_llm=generator_llm,
    critic_llm=critic_llm,
    embeddings=embed_model,
    docstore=docstore,
)

testset = test_generator.generate_with_llamaindex_docs(
    documents=documents[:5],
    test_size=3,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
```
Error trace
```text
Exception in thread Thread-7:
Traceback (most recent call last):
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\ragas\executor.py", line 75, in run
    results = self.loop.run_until_complete(self._aresults())
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\asyncio\base_events.py", line 653, in run_until_complete
    return future.result()
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\ragas\executor.py", line 63, in _aresults
    raise e
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\ragas\executor.py", line 58, in _aresults
    r = await future
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\asyncio\tasks.py", line 615, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\ragas\executor.py", line 91, in wrapped_callable_async
    return counter, await callable(*args, **kwargs)
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\ragas\testset\evolutions.py", line 150, in evolve
    ) = await self.aevolve(current_tries, current_nodes)
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\ragas\testset\evolutions.py", line 253, in aevolve
    passed = await self.node_filter.filter(current_nodes.root_node)
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\ragas\testset\filters.py", line 54, in filter
    results = await self.llm.generate(prompt=prompt)
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\ragas\llms\base.py", line 92, in generate
    return await agenerate_text_with_retry(
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\tenacity\_asyncio.py", line 88, in async_wrapped
    return await fn(*args, **kwargs)
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\tenacity\_asyncio.py", line 47, in __call__
    do = self.iter(retry_state=retry_state)
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\tenacity\__init__.py", line 325, in iter
    raise retry_exc.reraise()
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\tenacity\__init__.py", line 158, in reraise
    raise self.last_attempt.result()
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\concurrent\futures\_base.py", line 449, in result
    return self.__get_result()
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\tenacity\_asyncio.py", line 50, in __call__
    result = await fn(*args, **kwargs)
  File "C:\Users\rabalasa\Anaconda3\envs\genai\Lib\site-packages\ragas\llms\base.py", line 177, in agenerate_text
    result = await self.langchain_llm.agenerate_prompt(
AttributeError: 'LangchainLLMWrapper' object has no attribute 'agenerate_prompt'. Did you mean: 'agenerate_text'?
```
```text
ExceptionInRunner                         Traceback (most recent call last)
Cell In[4], line 18
      9 from ragas.testset.evolutions import simple, reasoning, multi_context
     11 test_generator = TestsetGenerator(
     12     generator_llm=generator_llm,
     13     critic_llm=critic_llm,
     14     embeddings=embed_model,
     15     docstore=docstore,
     16 )
---> 18 testset = test_generator.generate_with_llamaindex_docs(documents=documents[:5],
     19     test_size=3, distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25})

File ~\Anaconda3\envs\genai\Lib\site-packages\ragas\testset\generator.py:128, in TestsetGenerator.generate_with_llamaindex_docs(self, documents, test_size, distributions, with_debugging_logs, is_async, raise_exceptions, run_config)
    113 def generate_with_llamaindex_docs(
    114     self,
    115     documents: t.Sequence[LlamaindexDocument],
        (...)
    122 ):
    123     # chunk documents and add to docstore
    124     self.docstore.add_documents(
    125         [Document.from_llamaindex_document(doc) for doc in documents]
    126     )
--> 128     return self.generate(
    129         test_size=test_size,
    130         distributions=distributions,
    131         with_debugging_logs=with_debugging_logs,
    132         is_async=is_async,
    133         run_config=run_config,
    134         raise_exceptions=raise_exceptions,
    135     )

File ~\Anaconda3\envs\genai\Lib\site-packages\ragas\testset\generator.py:246, in TestsetGenerator.generate(self, test_size, distributions, with_debugging_logs, is_async, raise_exceptions, run_config)
    244     test_data_rows = exec.results()
    245     if test_data_rows == []:
--> 246         raise ExceptionInRunner()
    248 except ValueError as e:
    249     raise e

ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass raise_exception=False incase you want to show only a warning message instead.
```
Expected behavior
It should generate the test dataset from the input PDF.
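For reference, a minimal sketch of what inspecting a successful run could look like, assuming the returned testset exposes `to_pandas()` as in the ragas 0.1.x line (the column names mentioned in the comment are illustrative):

```python
# Sketch only: convert the generated testset to a DataFrame for inspection.
df = testset.to_pandas()
print(df.columns)  # expected to include question, contexts, and ground-truth style columns
print(df.head())
```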
Additional context
The same error sometimes occurs when using an OpenAI key instead of an Azure OpenAI key.