Clarification on Azure OpenAI usage, CSV file support, and troubleshooting a clustering error in graphrag/graphrag-accelerator #78
Hello, I'm new to graphrag and graphrag-accelerator, and I have a few questions after going through the advanced_getting_started.ipynb notebook and attempting to use graphrag with my own data:
Thank you in advance for any clarification you can provide!
Replies: 1 comment
Hi @shipaleks - first and foremost, thanks for checking out graphrag and kicking the tires! I'll try to answer what I can here or steer you in the right direction.

(1) Think of the graphrag-accelerator as the reference implementation for how you could deploy an end-to-end graphrag solution in Azure. It is a set of Infra as Code (IaC) and other scripts that deploy additional cloud services that help flesh out a full solution and host GraphRAG. It serves as an example of how you could do it, but many of the pieces could be swapped or even removed (e.g., CosmosDB is used to store metadata and status info, among other things, but another datastore could be used). The Accelerator puts an APIM service (and API) in front of the GraphRAG library, and this APIM API key is what the Accelerator notebooks use to access GraphRAG. You can absolutely use Azure OpenAI and …

(2) The accelerator is hardcoded to support TXT files but, as you noted, the library itself can support CSV. I'm not sure of all the places where you'd need to adjust things, but I would take a look at the backend container configuration in pipeline-settings.yaml and the logic in the data.py file. Then the notebooks would need to be adjusted. I haven't attempted any of this yet, so if you do, please report back!

(3) This question is probably best suited to be a Discussion (possibly an issue) in the graphrag library repo.

Hope this helps put you on a useful path!
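To make point (1) a little more concrete, here is a minimal sketch of how a client might attach an APIM key to a request bound for the Accelerator's API. It assumes the APIM instance uses the default subscription-key header name, `Ocp-Apim-Subscription-Key`. The gateway URL, the `/index` path, and the `build_apim_request` helper are hypothetical placeholders, not values taken from the Accelerator itself. The request is only constructed, never sent.

```python
import urllib.request


def build_apim_request(apim_url: str, path: str, subscription_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying an APIM subscription key.

    Assumption: the APIM instance expects the key in the default
    'Ocp-Apim-Subscription-Key' header. A deployment can rename this header.
    """
    # Join base URL and path without doubling or dropping the slash.
    url = apim_url.rstrip("/") + "/" + path.lstrip("/")
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    return urllib.request.Request(url, headers=headers, method="GET")


# Hypothetical values for illustration only.
request = build_apim_request(
    "https://example-apim.azure-api.net/",  # placeholder APIM gateway URL
    "/index",                               # placeholder API path
    "<your-apim-subscription-key>",
)
print(request.full_url)
```

Passing the result to `urllib.request.urlopen(request)` would send it. If you call the graphrag library directly instead of going through the Accelerator, no APIM key is involved.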