How to read the knowledge base in an MD document and use OpenAPI Q&A #5

ltc17681102655 · 2023-06-14T08:22:40Z

How can I read a knowledge base from an MD (Markdown) document and use it for OpenAPI Q&A?

HamaWhiteGG · 2023-06-15T16:55:23Z

Can you provide a detailed description of the requirements? I'm a bit unclear about what needs to be done.

HamaWhiteGG · 2023-07-13T12:01:55Z

@ltc17681102655 This is now supported. You can view the example code in RetrievalMarkdownExample:

public class RetrievalMarkdownExample {

    public static final String NAMESPACE = "markdown";

    public static void main(String[] args) {
        // Load Notion page as a markdown file
        String path = "docs/extras/use_cases/question_answering/notion_db/";
        var loader = new NotionDirectoryLoader(path);
        var docs = loader.load();
        var mdFile = docs.get(0).getPageContent();

        // Let's create groups based on the section headers in our page
        List<Pair<String, String>> headersToSplitOn = List.of(Pair.of("###", "Section"));
        MarkdownHeaderTextSplitter markdownSplitter = new MarkdownHeaderTextSplitter(headersToSplitOn);
        List<Document> mdHeaderSplits = markdownSplitter.splitText(mdFile);

        // Define our text splitter
        var textSplitter = RecursiveCharacterTextSplitter.builder()
                .chunkSize(500)
                .chunkOverlap(0)
                .keepSeparator(true)
                .build();
        var allSplits = textSplitter.splitDocuments(mdHeaderSplits);

        // Build the Pinecone index and keep the metadata
        // (initializePineconeIndex is a helper that is not shown in this snippet)
        var vectorStore = initializePineconeIndex(NAMESPACE, allSplits);

        // Define our metadata
        var metadataFieldInfo = List.of(
                new AttributeInfo("Section", "Part of the document that the text comes from",
                        "string or list[string]"));
        var documentContentDescription = "Major sections of the document";

        // Define self query retriever
        var llm = OpenAI.builder().temperature(0).requestTimeout(30).build().init();
        var retriever = SelfQueryRetriever.fromLLM(llm, vectorStore, documentContentDescription, metadataFieldInfo);

        // Create a chat or Q&A app that is aware of the explicit document structure
        var chat = ChatOpenAI.builder().temperature(0).build().init();
        var qaChain = RetrievalQa.fromChainType(chat, retriever);
        var result = qaChain.run("Summarize the Testing section of the document");
        System.out.println(result);
    }
}
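
To make the splitting and filtering steps in the example above more concrete, here is a minimal plain-Java sketch of the idea. It does not use langchain-java, Pinecone, or OpenAI, and every name in it (`MarkdownSectionSketch`, `Chunk`, `splitByHeader`, `chunk`, `filterBySection`) is hypothetical and invented for illustration. It assumes headers are lines starting with `###` followed by a space, and it swaps the recursive character splitter for simple fixed-size slicing, so treat it as a rough approximation of what the real classes do rather than their actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; these names are hypothetical and are not langchain-java APIs.
class MarkdownSectionSketch {

    // A piece of text tagged with the header it appeared under.
    record Chunk(String section, String text) {}

    // Group lines under each header line that starts with headerPrefix followed by a space.
    static List<Chunk> splitByHeader(String markdown, String headerPrefix) {
        List<Chunk> sections = new ArrayList<>();
        String current = ""; // text before the first header gets an empty section label
        StringBuilder body = new StringBuilder();
        for (String line : markdown.split("\n")) {
            if (line.startsWith(headerPrefix + " ")) {
                addIfNotBlank(sections, current, body);
                current = line.substring(headerPrefix.length()).trim();
                body.setLength(0);
            } else {
                body.append(line).append('\n');
            }
        }
        addIfNotBlank(sections, current, body);
        return sections;
    }

    private static void addIfNotBlank(List<Chunk> out, String section, StringBuilder body) {
        String text = body.toString().trim();
        if (!text.isEmpty()) {
            out.add(new Chunk(section, text));
        }
    }

    // Cut each section into fixed-size slices with no overlap, copying the section label.
    // This is a simplification, not the recursive splitting used in the example above.
    static List<Chunk> chunk(List<Chunk> sections, int chunkSize) {
        List<Chunk> chunks = new ArrayList<>();
        for (Chunk s : sections) {
            for (int i = 0; i < s.text().length(); i += chunkSize) {
                int end = Math.min(i + chunkSize, s.text().length());
                chunks.add(new Chunk(s.section(), s.text().substring(i, end)));
            }
        }
        return chunks;
    }

    // Keep only the chunks whose section label matches exactly.
    static List<Chunk> filterBySection(List<Chunk> chunks, String section) {
        return chunks.stream().filter(c -> c.section().equals(section)).toList();
    }

    public static void main(String[] args) {
        String md = "### Setup\nInstall the tool.\n### Testing\nRun the unit tests.\n"
                + "Then run the integration tests.";
        List<Chunk> chunks = chunk(splitByHeader(md, "###"), 500);
        filterBySection(chunks, "Testing").forEach(c -> System.out.println(c.text()));
    }
}
```

Keeping the section label on every chunk is what lets a question such as "Summarize the Testing section" be narrowed to the relevant chunks before anything is sent to the model. In the real example that narrowing is done by the self-query retriever against the vector store's metadata, not by an exact string match.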

HamaWhiteGG mentioned this issue Jul 7, 2023

Support MarkdownHeaderTextSplitter #32

Merged

HamaWhiteGG added a commit that referenced this issue Jul 13, 2023

refs #5: How to read the knowledge base in an MD document and use OpenAPI Q&A

44d3cbc

HamaWhiteGG added the enhancement New feature or request label Jul 13, 2023

HamaWhiteGG self-assigned this Jul 13, 2023

HamaWhiteGG closed this as completed Jul 15, 2023
