docs: clarify the training docs
gventuri committed Apr 19, 2024
1 parent b735b0a commit 0dd9c4a
Showing 2 changed files with 41 additions and 1 deletion.
40 changes: 40 additions & 0 deletions docs/train.md
@@ -32,6 +32,10 @@ For example, you might want the LLM to be aware that your company's fiscal year

To train PandasAI with instructions, you can use the `train` method on the `Agent`, `SmartDataframe` or `SmartDatalake`, as follows:

By default, training uses the `BambooVectorStore` to store the training data, which is accessible with your API key.

Alternatively, if you want to use a local vector store (enterprise only for production use cases), you can use the `ChromaDB` or `Qdrant` vector stores (see the "Training with local vector stores" section).

```python
from pandasai import Agent

```

@@ -79,6 +83,42 @@ print(response)

In this case too, your training data is persisted, so you only need to train the model once.

## Training with local vector stores

To train the model with a local vector store, use the `ChromaDB` or `Qdrant` vector store. Here's how to do it:

```python
from pandasai import Agent
# An enterprise license might be required for using the vector stores locally
from pandasai.ee.vectorstores import ChromaDB
from pandasai.ee.vectorstores import Qdrant

# Instantiate the vector store
vector_store = ChromaDB()
# or with Qdrant
# vector_store = Qdrant()

# Instantiate the agent with the custom vector store
agent = Agent("data.csv", vectorstore=vector_store)

# Train the model
query = "What is the total sales for the current fiscal year?"
response = """
import pandas as pd
df = dfs[0]
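# Find the start of the current fiscal year (April 1), rolling back a year before April
today = pd.Timestamp.today().normalize()
fiscal_start = today.replace(month=4, day=1)
if today < fiscal_start:
    fiscal_start = fiscal_start.replace(year=today.year - 1)
# Calculate the total sales for the current fiscal year
total_sales = df[df['date'] >= fiscal_start]['sales'].sum()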
result = { "type": "number", "value": total_sales }
"""
agent.train(queries=[query], codes=[response])

response = agent.chat("What is the total sales for the last fiscal year?")
print(response)
# The model will use the information provided in the training to generate a response
```
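
The date filter in the training code depends on where the fiscal year starts. Here is a minimal sketch of that calculation using only the standard library. It assumes, as in the example, a fiscal year that begins on April 1. `fiscal_year_start` is a hypothetical helper for illustration, not part of PandasAI.

```python
from datetime import date


def fiscal_year_start(today: date, start_month: int = 4) -> date:
    """Return the first day of the fiscal year that contains `today`."""
    # Dates before the start month belong to the fiscal year
    # that began in the previous calendar year.
    year = today.year if today.month >= start_month else today.year - 1
    return date(year, start_month, 1)


print(fiscal_year_start(date(2024, 4, 19)))  # 2024-04-01
print(fiscal_year_start(date(2024, 2, 10)))  # 2023-04-01
```

Keeping the boundary logic in one small function makes it easy to check the edge cases (January to March) before putting similar logic into training code.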

## Troubleshooting

In some cases, you might get an error like this: `No vector store provided. Please provide a vector store to train the agent`. This means that no API key has been generated to use the `BambooVectorStore`.
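
One way to catch this early is to check for the key before calling `train`. Here is a minimal sketch, assuming the key is read from the `PANDASAI_API_KEY` environment variable. `has_api_key` is a hypothetical helper for illustration, not part of PandasAI.

```python
import os


def has_api_key(env=None, name="PANDASAI_API_KEY"):
    """Return True if a non-empty API key is set in the given environment mapping."""
    # Assumption: the key is provided through this environment variable.
    env = os.environ if env is None else env
    return bool(env.get(name, "").strip())


if not has_api_key():
    print("Set PANDASAI_API_KEY or pass a local vector store before calling train().")
```

If the check fails, either set the key or instantiate the agent with a local vector store.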
2 changes: 1 addition & 1 deletion examples/from_googlebigquery.py
@@ -4,7 +4,7 @@

from pandasai import SmartDataframe

-# A license might be required for using Snowflake with PandasAI
+# A license might be required for using BigQuery with PandasAI
from pandasai.ee.connectors import GoogleBigQueryConnector

# ENV's