How to read the knowledge base in an MD document and use OpenAPI Q&A #5
Labels: enhancement (New feature or request)
Comments
Can you provide a detailed description of the requirements? I'm a bit unclear about what needs to be done.
HamaWhiteGG added a commit that referenced this issue on Jul 13, 2023
@ltc17681102655 Supported. You can view the example code in RetrievalMarkdownExample:

public class RetrievalMarkdownExample {

    public static final String NAMESPACE = "markdown";

    public static void main(String[] args) {
        // Load Notion page as a markdown file
        String path = "docs/extras/use_cases/question_answering/notion_db/";
        var loader = new NotionDirectoryLoader(path);
        var docs = loader.load();
        var mdFile = docs.get(0).getPageContent();

        // Let's create groups based on the section headers in our page
        List<Pair<String, String>> headersToSplitOn = List.of(Pair.of("###", "Section"));
        MarkdownHeaderTextSplitter markdownSplitter = new MarkdownHeaderTextSplitter(headersToSplitOn);
        List<Document> mdHeaderSplits = markdownSplitter.splitText(mdFile);

        // Define our text splitter
        var textSplitter = RecursiveCharacterTextSplitter.builder()
                .chunkSize(500)
                .chunkOverlap(0)
                .keepSeparator(true)
                .build();
        var allSplits = textSplitter.splitDocuments(mdHeaderSplits);

        // Build the Pinecone index and keep the metadata
        var vectorStore = initializePineconeIndex(NAMESPACE, allSplits);

        // Define our metadata
        var metadataFieldInfo = List.of(
                new AttributeInfo("Section", "Part of the document that the text comes from",
                        "string or list[string]"));
        var documentContentDescription = "Major sections of the document";

        // Define the self-query retriever
        var llm = OpenAI.builder().temperature(0).requestTimeout(30).build().init();
        var retriever = SelfQueryRetriever.fromLLM(llm, vectorStore, documentContentDescription, metadataFieldInfo);

        // Create chat or Q&A apps that are aware of the explicit document structure
        var chat = ChatOpenAI.builder().temperature(0).build().init();
        var qaChain = RetrievalQa.fromChainType(chat, retriever);
        var result = qaChain.run("Summarize the Testing section of the document");
        println(result);
    }
}
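To make the header-splitting step in the example above more concrete, here is a minimal plain-Java sketch of the idea: start a new section at each `###` line and keep the header text alongside the content, which is roughly what ends up as the "Section" metadata. The class `HeaderSplitSketch`, the record `Section`, and the method `splitByHeader` are hypothetical names made up for illustration. This is not the implementation of `MarkdownHeaderTextSplitter`, which may well behave differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a header-aware splitter; NOT the library's implementation.
class HeaderSplitSketch {

    // One chunk of text plus the header it was found under ("" if none).
    record Section(String header, String content) {}

    // Starts a new section at every line that begins with the marker followed by a space.
    static List<Section> splitByHeader(String markdown, String marker) {
        List<Section> sections = new ArrayList<>();
        String currentHeader = "";
        StringBuilder body = new StringBuilder();
        for (String line : markdown.split("\n")) {
            if (line.startsWith(marker + " ")) {
                flush(sections, currentHeader, body);
                currentHeader = line.substring(marker.length()).trim();
            } else {
                body.append(line).append('\n');
            }
        }
        flush(sections, currentHeader, body);
        return sections;
    }

    // Stores the accumulated body (if it is not blank) and resets the buffer.
    private static void flush(List<Section> out, String header, StringBuilder body) {
        String text = body.toString().trim();
        if (!text.isEmpty()) {
            out.add(new Section(header, text));
        }
        body.setLength(0);
    }

    public static void main(String[] args) {
        String md = "Intro text\n### Testing\nRun the unit tests.\n### Deployment\nShip it.";
        for (Section s : splitByHeader(md, "###")) {
            System.out.println("[" + s.header() + "] " + s.content());
        }
    }
}
```

With sections labeled this way, a question such as "Summarize the Testing section of the document" can be narrowed to the chunks whose header is "Testing" before anything is sent to the model.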