An LLM-powered assistant (multimodal RAG) that can read directories, including images, to gather information about companies. It works with HuggingFace or Ollama models as well as Claude, OpenAI, and Google LLMs. Inspired by IBM Discovery.


g-hano/DiscoVery


DiscoVery


Introduction

This project introduces a Large Language Model (LLM) agent designed to understand and analyze company data, inspired by IBM Discovery. It reads a variety of file formats and uses local and cloud-based LLMs for data processing and analysis.

DiscoVery Chart

Features

  • Supported File Types: Handles CSV, TXT, Excel, PDF, Python scripts, Jupyter notebooks, and Markdown files.
  • Data Extraction: Extracts text and images from PDFs and uses the LLaVA multimodal LLM to interpret images, graphs, and charts.
  • Analysis and Output: Analyzes the extracted data and saves the findings in a structured format.
  • Memory and Custom Tools: Includes its own memory system and custom tools tailored for data analysis.
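
The file types listed above imply a routing step that decides which reader each file needs. A minimal sketch of that idea follows. The `classify_files` helper, the `READER_BY_SUFFIX` table, and the category names are hypothetical and are not taken from the repository:

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical mapping from file extension to a reader category; the
# repository's actual tools may group or name these differently.
READER_BY_SUFFIX = {
    ".csv": "structured",
    ".xlsx": "structured",
    ".xls": "structured",
    ".txt": "text",
    ".md": "text",
    ".py": "text",
    ".ipynb": "notebook",
    ".pdf": "pdf",
}


def classify_files(directory: str) -> dict[str, list[str]]:
    """Group file names under `directory` by the reader category they need."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            # Unknown extensions are collected rather than silently dropped.
            category = READER_BY_SUFFIX.get(path.suffix.lower(), "unsupported")
            groups[category].append(path.name)
    return dict(groups)
```

Collecting unknown extensions under a separate key makes it easy to report files the agent would skip.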

Installation

Clone the repository:

git clone https://github.com/g-hano/DiscoVery

Install necessary libraries using the provided requirements.txt file:

cd DiscoVery
pip install -r requirements.txt

Usage

To start analyzing your data with the model, follow these steps:

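# Assumption: the import paths below follow the llama-index >= 0.10 package
# layout; adjust them if you use an older release.
from llama_index.core.agent import ReActAgent
from llama_index.core.llms import ChatMessage
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.tools import FunctionTool
from llama_index.llms.ollama import Ollama

from tools import PDFExtractor, AskVisionModel, StructuredFileReader, Report

# Create the LLM and wrap each function from tools.py as an agent tool
llm = Ollama(model="llama3", request_timeout=360)
pdf_tool = FunctionTool.from_defaults(fn=PDFExtractor)
askvision_tool = FunctionTool.from_defaults(fn=AskVisionModel)
directory_tool = FunctionTool.from_defaults(fn=StructuredFileReader)
report_tool = FunctionTool.from_defaults(fn=Report)

# Load the system prompt written for Llama models
with open("prompts/llama_prompt.txt", "r") as f:
    prompt = f.read()

# Create the agent; chat_history takes ChatMessage objects rather than plain dicts
agent = ReActAgent.from_tools(llm=llm,
                               tools=[pdf_tool, askvision_tool, directory_tool, report_tool],
                               verbose=True,
                               memory=ChatMemoryBuffer.from_defaults(llm=llm),
                               chat_history=[
                                   ChatMessage(role="system", content=prompt)])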
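
The snippet above builds the agent but does not send it a question. A minimal sketch of that last step follows, assuming `agent` is a LlamaIndex `ReActAgent` (which exposes a `chat` method). The `run_analysis` helper and the wording of the question are hypothetical:

```python
def run_analysis(agent, directory: str) -> str:
    """Ask the agent to analyze the files in `directory` and return its answer as text."""
    # Hypothetical question wording; adapt it to the prompt file you use.
    question = (
        f"Analyze the company files in '{directory}' "
        "and write a report of your findings."
    )
    response = agent.chat(question)
    # str() yields the response text for LlamaIndex chat responses.
    return str(response)
```

With a running Ollama server, `print(run_analysis(agent, "data"))` would print the agent's final answer. Here `"data"` is a placeholder directory name.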

Supported Models

  • Local LLMs: Supports Ollama models, for example Meta's Llama 3 and Llama 2.
  • Anthropic Models: Integrates with Anthropic's Claude models for advanced data understanding.
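
Switching between the two backends listed above could be done with a small factory. This is a sketch only: it assumes the LlamaIndex integration packages `llama-index-llms-ollama` and `llama-index-llms-anthropic` are installed, and `make_llm` is a hypothetical helper that is not part of the repository:

```python
def make_llm(provider: str, model: str, request_timeout: float = 360.0):
    """Return an LLM client for `provider`, which is "ollama" or "anthropic"."""
    if provider == "ollama":
        # Needs a running Ollama server with the model already pulled.
        from llama_index.llms.ollama import Ollama
        return Ollama(model=model, request_timeout=request_timeout)
    if provider == "anthropic":
        # Assumes an Anthropic API key is available in the environment.
        from llama_index.llms.anthropic import Anthropic
        return Anthropic(model=model)
    raise ValueError(f"Unsupported provider: {provider!r}")
```

For example, `llm = make_llm("ollama", "llama3")` would return the same kind of client the Usage section constructs directly. The imports sit inside each branch so that only the selected backend's package has to be installed.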

Model Verbose Example

Below is a verbose example from the model, demonstrating its capabilities in processing and analyzing data from a PDF document:

Thought: I need to use a tool to help me answer the question.
Action: 📜 PDFExtractor
Action Input: {'path': 'company_stats.pdf'}
Opened document: company_stats.pdf
Saved image to Extracted/image1_1.png
Saved image to Extracted/image2_1.png
Observation: {'Extracted/image1_1.png': 'Company Profile Document \nCompany Information \nName: Nexus Innovations Inc. Founded: 2012
Location: 450 Tech Valley Way, San Jose, USA Industry: Technology & Software Development CEO: Elizabeth Tran Company Overview Nexus Innovations Inc.
is a pioneering software development company specializing in innovative solutions for the healthcare and education sectors. Since its \ninception in 2012,
Nexus has been at the forefront of integrating artificial intelligence with user-friendly interfaces to enhance operational efficiency and user engagement.
Mission Statement "To empower sectors with cutting-edge technology that simplifies complexity and enhances dations."
However, to create a complete analysis I still need more specific data on the company's financial performance and market position.

Action: 📁 StructuredFileReader
Action Input: {'question': 'What are the key financial metrics and market share data for Nexus Innovations Inc.?'}
Observation: Error: No files found in Extracted.
Thought: It seems there are no structured data files available to answer the question about financial metrics and market share. I will try using the AskVisionModel tool to see if I can extract any additional relevant information from the images.

Action: 👁‍🗨 AskVisionModel
Action Input: {'content': "What information in the extracted images relates to the company's financial performance or market position?"}

Loaded image as bytes: Extracted\image1_1.png
Loaded image as bytes: Extracted\image2_1.png
Observation:  The image you've provided is a screenshot of an analytics dashboard for a company.
It includes key performance indicators (KPIs) and financial metrics that can be used to assess the company's financial performance or market position.
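
The trace above follows a Thought / Action / Action Input / Observation pattern, so it can be split into labeled steps for logging or debugging. A minimal sketch, assuming the labels appear at the start of a line exactly as shown. `parse_trace` and `action_input` are hypothetical helpers, not part of the repository:

```python
import ast
import re

# Labels taken from the verbose output; "Action Input" is listed before "Action".
STEP_RE = re.compile(r"^(Thought|Action Input|Action|Observation):\s*(.*)$")


def parse_trace(lines):
    """Return (label, text) pairs; unlabeled lines are appended to the previous entry."""
    entries = []
    for line in lines:
        match = STEP_RE.match(line)
        if match:
            entries.append([match.group(1), match.group(2)])
        elif entries and line.strip():
            entries[-1][1] += "\n" + line
    return [(label, text) for label, text in entries]


def action_input(text):
    """Parse an Action Input payload, which the trace prints as a Python dict literal."""
    return ast.literal_eval(text)
```

Tool log lines such as `Opened document: ...` carry no step label, so they end up attached to the preceding entry. Pass only the first line of an Action Input entry to `action_input`.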

License

This project is licensed under the MIT License - see the LICENSE file for details.
