Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: fix all docs snippets instantiations #825

Merged
merged 1 commit into from
Dec 16, 2023

Conversation

mspronesti
Copy link
Contributor

@mspronesti mspronesti commented Dec 16, 2023

Hi @gventuri,
this PR fixes all the code snippets in the documentation. Most were either outdated or incorrect. Mostly:

  • config needs to be passed as a named parameter (it's positionally not the second)
  • the examples under docs/examples were not properly instantiated and contained some typos.

Summary by CodeRabbit

  • Documentation

    • Updated guides to reflect new configuration parameter for SmartDataframe class instantiation.
    • Renamed DataBricksConnector to DatabricksConnector in import statements for clarity.
    • Modified examples to showcase the integration of the OpenAI module with SmartDataframe.
    • Revised function parameter names from name and salary to names and salaries in plot_salaries function for consistency.
  • New Features

    • Introduced a new configuration pattern for SmartDataframe objects, enhancing flexibility in data processing.
  • Refactor

    • Standardized the use of a config parameter across various SmartDataframe instantiations for improved code maintainability.
  • Style

    • Adjusted naming conventions for better readability and alignment with industry standards.

Copy link
Contributor

coderabbitai bot commented Dec 16, 2023

Walkthrough

The recent updates primarily involve standardizing the configuration of SmartDataframe objects across various documentation files by introducing a config parameter. This shift enhances uniformity and potentially streamlines the handling of configurations. Additionally, a minor connector renaming and parameter updates in a function reflect a focus on clarity and consistency.

Changes

Files Change Summary
docs/LLMs/llms.md, docs/custom-... (multiple), docs/examples.md, docs/getting-started.md Standardized SmartDataframe instantiation with config parameter.
docs/connectors.md Renamed DataBricksConnector to DatabricksConnector.
docs/skills.md Renamed function parameters from name and salary to names and salaries.

🐇✨📜✨🐇

In codes and docs, we hop and leap,
With config params, the changes seep.
A connector renamed, clarity's creed,
A rabbit's touch in every deed.
🎉📊🐾🎉

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on X ?


Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • You can reply to a review comment made by CodeRabbit.
  • You can tag CodeRabbit on specific lines of code or files in the PR by tagging @coderabbitai in a comment.
  • You can tag @coderabbitai in a PR comment and ask one-off questions about the PR and the codebase. Use quoted replies to pass the context for follow-up questions.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

@codecov-commenter
Copy link

codecov-commenter commented Dec 16, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (3e7d975) 85.11% compared to head (2e490e8) 85.11%.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #825   +/-   ##
=======================================
  Coverage   85.11%   85.11%           
=======================================
  Files          88       88           
  Lines        3809     3809           
=======================================
  Hits         3242     3242           
  Misses        567      567           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 2

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 3e7d975 and 8ba331d.
Files selected for processing (9)
  • docs/LLMs/llms.md (2 hunks)
  • docs/connectors.md (6 hunks)
  • docs/custom-head.md (1 hunks)
  • docs/custom-instructions.md (1 hunks)
  • docs/custom-prompts.md (3 hunks)
  • docs/custom-whitelisted-dependencies.md (1 hunks)
  • docs/examples.md (14 hunks)
  • docs/getting-started.md (5 hunks)
  • docs/skills.md (1 hunks)
Additional comments: 34
docs/LLMs/llms.md (2)
  • 58-64: The changes to the instantiation of SmartDataframe using the config parameter as a named argument are correctly applied and consistent with the PR objectives to fix outdated or incorrect code snippets in the documentation.

  • 175-179: The update to the instantiation of SmartDataframe with the config parameter as a named argument is correctly applied here as well, ensuring that the llm configuration is passed correctly.

docs/connectors.md (7)
  • 41-41: The import statement correctly reflects the renaming of DataBricksConnector to DatabricksConnector as per the PR objectives and the AI-generated overview.

  • 41-44: The config parameter is correctly passed as a named argument to the PostgreSQLConnector, which aligns with the PR's objective to ensure that the config parameter is passed as a named argument.

  • 71-74: The config parameter is correctly passed as a named argument to the MySQLConnector, which aligns with the PR's objective to ensure that the config parameter is passed as a named argument.

  • 101-104: The config parameter is correctly passed as a named argument to the SqliteConnector, which aligns with the PR's objective to ensure that the config parameter is passed as a named argument.

  • 152-155: The import statement correctly reflects the renaming of DataBricksConnector to DatabricksConnector as per the PR objectives and the AI-generated overview. However, the class name SnowFlakeConnector should be checked for consistency with the rest of the codebase, as it does not match the expected SnowflakeConnector naming convention.

Verification successful

The use of SnowFlakeConnector in the docs/connectors.md file is consistent with its usage across the codebase, as verified by the search results. There is no need to change the class name in the documentation based on the current naming convention in the code.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Verify that the class name `SnowFlakeConnector` is consistent across the codebase.
rg 'SnowFlakeConnector' --files-with-matches

Length of output: 218

* 183-183: The import statement correctly reflects the renaming of `DataBricksConnector` to `DatabricksConnector` as per the PR objectives and the AI-generated overview.
  • 209-212: The config parameter is not used with the YahooFinanceConnector, which is inconsistent with the PR's objective to ensure that the config parameter is passed as a named argument. However, this may be due to the nature of the YahooFinanceConnector which might not require a config dictionary. It's important to verify if this is the expected usage or if it's an oversight.
Verification successful

To verify if the YahooFinanceConnector typically requires a config parameter, we should look at the constructor signature in the source code. The file pandasai/connectors/yahoo_finance.py is likely to contain the class definition, so we'll check there first. We can use ast-grep to search for the class definition and its constructor to see if a config parameter is expected.


The previous ast-grep command failed due to a syntax error in the pattern. It seems that the pattern was not correctly formatted for ast-grep. Let's correct the syntax and try again to find the constructor of the YahooFinanceConnector class.


The ast-grep command failed again due to a syntax error. It appears that the pattern used is still not compatible with ast-grep. Since we are having difficulty with ast-grep, let's switch to a different approach to inspect the constructor of the YahooFinanceConnector class. We can use rg (ripgrep) to search for the __init__ method within the file that defines the YahooFinanceConnector class. We know from the previous rg output that the class is defined in pandasai/connectors/yahoo_finance.py.


The output from rg shows that the __init__ method of the YahooFinanceConnector class is calling super().__init__ with a parameter yahoo_finance_config. However, this output does not show the full signature of the __init__ method, which means we still don't have the complete information about whether a config parameter is expected or not.

To get the full signature, we should look at the lines around the __init__ method call. We can adjust the rg command to include some lines of context around the match.


The new output from rg provides the full signature of the __init__ method for the YahooFinanceConnector class. It shows that the constructor accepts a stock_ticker parameter and a config parameter, which can be either an instance of YahooFinanceConnectorConfig or a dictionary. The config parameter is optional, as indicated by the default value None.

Given this information, the usage of YahooFinanceConnector in the docs/connectors.md file without a config parameter is correct, as the config parameter is not mandatory. This aligns with the PR's objective to ensure that the config parameter is passed as a named argument where required, and in this case, it is not required.

Therefore, the usage of YahooFinanceConnector without the config parameter in the documentation is consistent with the class definition and does not require any changes.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Verify if the `YahooFinanceConnector` typically requires a `config` parameter or if this usage is correct.
rg 'YahooFinanceConnector' --files-with-matches

Length of output: 233



Script:

#!/bin/bash
# Search for the constructor of the YahooFinanceConnector class to see if it includes a `config` parameter.
ast-grep --lang python --pattern $'class YahooFinanceConnector($_) {
  def __init__($$$) {
    $$$
  }
}'

Length of output: 598



Script:

#!/bin/bash
# Correct the syntax and search for the constructor of the YahooFinanceConnector class to see if it includes a `config` parameter.
ast-grep --lang python --pattern $'class YahooFinanceConnector {
  def __init__($$$) {
    $$$
  }
}'

Length of output: 590



Script:

#!/bin/bash
# Use ripgrep to search for the __init__ method of the YahooFinanceConnector class.
rg "__init__" pandasai/connectors/yahoo_finance.py

Length of output: 115



Script:

#!/bin/bash
# Use ripgrep to search for the __init__ method of the YahooFinanceConnector class with some lines of context.
rg -C 3 "__init__" pandasai/connectors/yahoo_finance.py

Length of output: 471

docs/custom-head.md (1)
  • 16-18: The update to instantiate SmartDataframe with the config parameter as a named argument is correct and aligns with the PR objective to standardize the instantiation across documentation.
docs/custom-instructions.md (1)
  • 12-14: The change to the SmartDataframe instantiation correctly uses the config parameter as a named argument, which aligns with the PR's objective to standardize instantiation patterns in the documentation.
docs/custom-prompts.md (3)
  • 34-37: The update to pass the config parameter as a named argument is correct and aligns with the PR's objective to standardize the instantiation of SmartDataframe.

  • 61-64: The change to pass the config parameter as a named argument in the instantiation of SmartDataframe using MyCustomFileBasedPrompt is also correct and consistent with the PR's objective.

  • 88-90: The instantiation of SmartDataframe without explicitly setting the my_custom_value attribute in the config parameter is consistent with the PR's objective to correct the documentation. However, it's important to ensure that MyCustomPrompt can handle cases where my_custom_value is not provided.

docs/custom-whitelisted-dependencies.md (1)
  • 7-9: The updated code snippet correctly demonstrates how to instantiate SmartDataframe with the config parameter, including custom_whitelisted_dependencies. This is in line with the PR's objective to ensure that the config parameter is passed as a named argument.
docs/examples.md (14)
  • 13-13: The import statement for OpenAI is correctly added according to the PR description and AI-generated summaries. This change is consistent across all examples.

  • 22-22: The instantiation of OpenAI with an API token is correct and follows the pattern described in the PR description. However, it's important to ensure that YOUR_API_TOKEN is a placeholder and that actual API tokens are not exposed in the documentation.

  • 25-25: The SmartDataframe instantiation now includes the config parameter with the llm key set to the OpenAI instance, which aligns with the PR's objective to pass the config parameter as a named argument.

  • 27-27: The chat method is used correctly to demonstrate the conversational capabilities of the SmartDataframe. The output comment is informative and provides an example of the expected result.

  • 322-322: The plot_salaries function is correctly renamed and updated to accept a DataFrame parameter. The comment within the function provides a clear description of its purpose and usage.

  • 326-327: The plotting code within the plot_salaries function is correct, but it's important to ensure that the path temp_chart.png is intended to be a temporary file or if it should be saved to a user-defined path as demonstrated in other examples.

  • 215-215: The instantiation of SmartDataframe with the config parameter is repeated in multiple examples. While this repetition is necessary to show different use cases, it's important to ensure that the documentation clearly explains the purpose of the config parameter to avoid confusion.

  • 216-216: The example demonstrates chaining commands using the chat method. It's important to verify that the output comment accurately reflects the capabilities of the SmartDataframe and that the example is clear and understandable for users.

  • 256-258: The Agent class is instantiated with a config parameter that includes the llm key set to an instance of OpenAI, which is consistent with the changes made to SmartDataframe and SmartDatalake instantiations.

  • 160-165: The SmartDataframe instantiation includes multiple configuration options, demonstrating the flexibility of the config parameter. It's important to ensure that the save_charts_path is explained clearly in the documentation and that the example uses a placeholder or a generic path that users can easily adapt.

  • 198-198: The instantiation of SmartDatalake with the config parameter is consistent with the changes made to SmartDataframe. This example helps to demonstrate the use of the library with multiple dataframes.

  • 299-299: The addition of custom functions to the Agent class is an advanced feature that should be explained clearly in the documentation. The example should demonstrate how users can define their own skills and add them to the agent.

  • 322-322: The plot_salaries skill function is correctly annotated with the @skill decorator, indicating that it's a custom skill that can be added to the Agent. The function's docstring provides a clear explanation of its purpose.

  • 326-327: The plotting code within the plot_salaries function uses matplotlib to generate a bar chart. It's important to ensure that the code follows best practices for plotting and that the saved chart is handled securely and appropriately.

docs/getting-started.md (4)
  • 47-47: The addition of the OpenAI import statement is consistent with the PR objectives to ensure proper instantiation and usage of the OpenAI module in the examples.

  • 58-60: The instantiation of the SmartDataframe with the config parameter as a named argument is a good practice and aligns with the PR objectives to correct parameter passing in the documentation.

  • 128-128: The import of the OpenAI module here is consistent with the PR objectives and ensures that the Agent class is instantiated with the correct configuration.

  • 138-140: The instantiation of the Agent class with the config parameter including an llm key set to an instance of OpenAI is in line with the PR objectives and the AI-generated overview. This change ensures that the Agent is configured correctly.

docs/skills.md (1)
  • 86-98: The parameter names in the plot_salaries function have been updated from name and salary to names and salaries. This change is consistent with the PR objectives and the AI-generated overview.

docs/skills.md Outdated Show resolved Hide resolved
docs/custom-whitelisted-dependencies.md Show resolved Hide resolved
Most of the snippets were either outdated or didn't work at all
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 10

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 3e7d975 and 2e490e8.
Files selected for processing (10)
  • docs/LLMs/llms.md (2 hunks)
  • docs/connectors.md (6 hunks)
  • docs/custom-head.md (1 hunks)
  • docs/custom-instructions.md (1 hunks)
  • docs/custom-prompts.md (3 hunks)
  • docs/custom-whitelisted-dependencies.md (1 hunks)
  • docs/examples.md (14 hunks)
  • docs/getting-started.md (5 hunks)
  • docs/skills.md (3 hunks)
  • examples/skills_example.py (1 hunks)
Files skipped from review due to trivial changes (1)
  • docs/LLMs/llms.md
Files skipped from review as they are similar to previous changes (7)
  • docs/connectors.md
  • docs/custom-head.md
  • docs/custom-instructions.md
  • docs/custom-prompts.md
  • docs/custom-whitelisted-dependencies.md
  • docs/getting-started.md
  • docs/skills.md
Additional comments: 5
docs/examples.md (3)
  • 13-13: The addition of the OpenAI import from the pandasai.llm package is consistent with the PR's objective to integrate the OpenAI module in examples.

  • 22-25: The SmartDataframe instantiation now correctly includes the config parameter as a named argument, aligning with the PR's objective to fix instantiation patterns.

  • 27-27: The chat method is correctly used with the SmartDataframe object, demonstrating the interactive capabilities as intended in the PR.

examples/skills_example.py (2)
  • 24-30: The function signature and docstring for plot_salaries have been updated to use plural parameter names, which is consistent with the PR objective to correct parameter naming in the documentation.

  • 24-36: Ensure that the plot_salaries function is tested with the new parameter names to confirm that the agent's skill addition and subsequent chat interaction work as expected with the updated function signature.

Args:
name (list[str]): Employee name
salaries (list[int]): Salaries
Displays the bar chart having name on x-axis and salaries on y-axis using streamlit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a typo in the docstring of the plot_salaries function; "gpd" should be corrected to "gdp".

-    Displays the bar chart having name on x-axis and salaries on y-axis using streamlit
+    Displays the bar chart having name on x-axis and salaries on y-axis using matplotlib

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
Displays the bar chart having name on x-axis and salaries on y-axis using streamlit
Displays the bar chart having name on x-axis and salaries on y-axis using matplotlib

@@ -222,7 +253,7 @@ With the chat agent, you can engage in dynamic conversations where the agent ret

Feel free to initiate conversations, seek clarifications, and explore explanations to enhance your interactions with the chat agent!

```
```python
import pandas as pd
from pandasai import Agent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import statement for the Agent class is missing, which is necessary for the code snippet to work correctly.

+ from pandasai import Agent

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
from pandasai import Agent
from pandasai import Agent

@@ -288,12 +319,9 @@ salaries_df = pd.DataFrame(salaries_data)


@skill
def plot_salaries(merged_df: pd.DataFrame) -> str:
def plot_salaries(merged_df: pd.DataFrame):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plot_salaries function is missing the merged_df parameter in the agent.add_skills method call. This will cause an error when the skill is invoked.

- agent.add_skills(plot_salaries)
+ agent.add_skills(plot_salaries(merged_df))

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
def plot_salaries(merged_df: pd.DataFrame):
def plot_salaries(merged_df: pd.DataFrame):

The plot_salaries function does not return any value. If the function is intended to be used as a skill for the Agent, it should return the path to the saved plot or the plot object itself.

+    return "temp_chart.png"

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
def plot_salaries(merged_df: pd.DataFrame):
def plot_salaries(merged_df: pd.DataFrame):
return "temp_chart.png"

@@ -222,7 +253,7 @@ With the chat agent, you can engage in dynamic conversations where the agent ret

Feel free to initiate conversations, seek clarifications, and explore explanations to enhance your interactions with the chat agent!

```
```python
import pandas as pd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import statement for OpenAI is duplicated in the "Working with Agent" section. It should be removed to avoid redundancy.

- from pandasai.llm.openai import OpenAI

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
import pandas as pd
import pandas as pd

@@ -265,7 +296,7 @@ print(response)

You can add customs functions for the agent to use, allowing the agent to expand its capabilities. These custom functions can be seamlessly integrated with the agent's skills, enabling a wide range of user-defined operations.

```
```python
import pandas as pd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import statement for OpenAI is duplicated in the "Add Skills to the Agent" section. It should be removed to avoid redundancy.

- from pandasai.llm.openai import OpenAI

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
import pandas as pd
import pandas as pd

Comment on lines 326 to 327
import matplotlib.pyplot as plt

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plot_salaries function uses a hardcoded filename for the saved plot. Consider using a dynamic filename or returning the plot object for better flexibility.

- plt.savefig("temp_chart.png")
+ plt.savefig("salaries_chart_{}.png".format(datetime.now().strftime("%Y%m%d%H%M%S")))

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
import matplotlib.pyplot as plt


df = SmartDataframe("data/Loan payments data.csv")
llm = OpenAI(api_token="YOUR_API_TOKEN")
df = SmartDataframe("data/Loan payments data.csv", config={"llm": llm})

# We filter by males only
from_males_df = df.chat("Filter the dataframe by women")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable from_males_df is assigned the result of a chat command that filters by women, which is contradictory. The variable name should reflect the filter applied.

- from_males_df = df.chat("Filter the dataframe by women")
+ from_women_df = df.chat("Filter the dataframe by women")

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
from_males_df = df.chat("Filter the dataframe by women")
from_women_df = df.chat("Filter the dataframe by women")

"""
Displays the bar chart having name on x axis and salaries on y axis using streamlit
Args:
name (list[str]): Employee name
names (list[str]): Employees' names
salaries (list[int]): Salaries
"""
import matplotlib.pyplot as plt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider moving the matplotlib.pyplot import to the top of the module to avoid repeated imports if plot_salaries is called multiple times, which can negatively impact performance.

salaries (list[int]): Salaries
"""
import matplotlib.pyplot as plt

plt.bar(name, salary)
plt.bar(names, salaries)
plt.xlabel("Employee Name")
plt.ylabel("Salary")
plt.title("Employee Salaries")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plt.xticks(rotation=45) call appears to be outside the scope of the plot_salaries function, which might be an error. If this line is intended to be part of the function, it should be indented accordingly.

@gventuri
Copy link
Collaborator

Thanks a lot @mspronesti, super helpful! Documentation is becoming more and more a priority. Some many outdated parts!

@gventuri gventuri merged commit 57e5b4b into Sinaptik-AI:main Dec 16, 2023
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants