Move docs to docstrings and generate documentation with sphinx-autodoc #367

Merged · 5 commits · Oct 15, 2024
Changes from 1 commit
Use autodoc in api.rst
Priyansh121096 committed Oct 10, 2024
commit beee6c88144eeb4db0b8fd1b6d22a966b9868f57
54 changes: 2 additions & 52 deletions doc/source/api.rst
@@ -1,55 +1,5 @@
API Reference
=============

.. py:function:: frame_to_hyper(df: pd.DataFrame, database: Union[str, pathlib.Path], *, table: Union[str, tableauhyperapi.Name, tableauhyperapi.TableName], table_mode: str = "w", not_null_columns: Optional[Iterable[str]] = None, json_columns: Optional[Iterable[str]] = None, geo_columns: Optional[Iterable[str]] = None) -> None:

Convert a DataFrame to a .hyper extract.

:param df: Data to be written out.
:param database: Name / location of the Hyper file to write to.
:param table: Table to write to.
:param table_mode: The mode to open the table with. Default is "w" for write, which truncates the file before writing. Another option is "a", which will append data to the file if it already contains information.
:param not_null_columns: Columns which should be considered "NOT NULL" in the target Hyper database. By default, all columns are considered nullable
:param json_columns: Columns to be written as a JSON data type
:param geo_columns: Columns to be written as a GEOGRAPHY data type
:param process_params: Parameters to pass to the Hyper Process constructor.

.. py:function:: frame_from_hyper(source: Union[str, pathlib.Path, tab_api.Connection], *, table: Union[str, tableauhyperapi.Name, tableauhyperapi.TableName], return_type: Literal["pandas", "pyarrow", "polars"] = "pandas")

Extracts a DataFrame from a .hyper extract.

:param source: Name / location of the Hyper file to be read or Hyper-API connection.
:param table: Table to read.
:param return_type: The type of DataFrame to be returned
:param process_params: Parameters to pass to the Hyper Process constructor.


.. py:function:: frames_to_hyper(dict_of_frames: Dict[Union[str, tableauhyperapi.Name, tableauhyperapi.TableName], pd.DataFrame], database: Union[str, pathlib.Path], *, table_mode: str = "w", not_null_columns: Optional[Iterable[str]] = None, json_columns: Optional[Iterable[str]] = None, geo_columns: Optional[Iterable[str]] = None,) -> None:

Writes multiple DataFrames to a .hyper extract.

:param dict_of_frames: A dictionary whose keys are valid table identifiers and values are dataframes
:param database: Name / location of the Hyper file to write to.
:param table_mode: The mode to open the table with. Default is "w" for write, which truncates the file before writing. Another option is "a", which will append data to the file if it already contains information.
:param not_null_columns: Columns which should be considered "NOT NULL" in the target Hyper database. By default, all columns are considered nullable
:param json_columns: Columns to be written as a JSON data type
:param geo_columns: Columns to be written as a GEOGRAPHY data type
:param process_params: Parameters to pass to the Hyper Process constructor.

.. py:function:: frames_from_hyper(source: Union[str, pathlib.Path, tab_api.Connection], *, return_type: Literal["pandas", "pyarrow", "polars"] = "pandas") -> dict:

Extracts tables from a .hyper extract.

:param source: Name / location of the Hyper file to be read or Hyper-API connection.
:param return_type: The type of DataFrame to be returned
:param process_params: Parameters to pass to the Hyper Process constructor.


.. py:function:: frame_from_hyper_query(source: Union[str, pathlib.Path, tab_api.Connection], query: str, *, return_type: Literal["pandas", "polars", "pyarrow"] = "pandas",)

Executes a SQL query and returns the result as a pandas dataframe

:param source: Name / location of the Hyper file to be read or Hyper-API connection.
:param query: SQL query to execute.
:param return_type: The type of DataFrame to be returned
:param process_params: Parameters to pass to the Hyper Process constructor.
.. automodule:: pantab
   :members:
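
For readers skimming the diff: the automodule directive pulls the documentation for every member of pantab from docstrings in the source code at build time. A minimal sketch of what one migrated docstring might look like (illustrative only; the wording comes from the api.rst text removed above, and the signature is abbreviated):

def frame_to_hyper(df, database, *, table, table_mode="w",
                   not_null_columns=None, json_columns=None,
                   geo_columns=None, process_params=None):
    """Convert a DataFrame to a .hyper extract.

    :param df: Data to be written out.
    :param database: Name / location of the Hyper file to write to.
    :param table: Table to write to.
    :param table_mode: The mode to open the table with. Default is "w" for
        write, which truncates the file before writing; "a" appends data to
        the file if it already contains information.
    :param not_null_columns: Columns which should be considered "NOT NULL" in
        the target Hyper database. By default, all columns are nullable.
    :param json_columns: Columns to be written as a JSON data type.
    :param geo_columns: Columns to be written as a GEOGRAPHY data type.
    :param process_params: Parameters to pass to the Hyper Process constructor.
    """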
13 changes: 13 additions & 0 deletions doc/source/conf.py
@@ -1,5 +1,9 @@
import os
import sys
from typing import List

sys.path.insert(0, os.path.abspath(os.path.join("..", "..", "src")))
Collaborator:

Is there a way to do this without having to change sys.path? Also, I think it would be better if you used a path relative to this file instead of the current working directory of the shell, so something like:

import pathlib

srcdir = pathlib.Path(__file__).resolve().parent.parent / "src"
# do whatever needs to be done with srcdir

Contributor Author:

@WillAyd there are two standard ways to work with autodoc, per the official docs:

  1. Install the package in your env and then run Sphinx.
  2. Add the sources to the path and then run Sphinx.

The latter just seemed lower-friction to me as an outside contributor, so I went with it. I'm not sure whether there's a CI workflow that publishes the docs or whether they're generated automatically. If it's the former and I can assume the package is installed in the environment before the docs are built, I can remove the sys.path.insert. Both workflows are perfectly standard, however.
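
For context, a minimal sketch of what option 2 looks like in conf.py, using pathlib as suggested above (this assumes the layout in this repo, where conf.py sits at doc/source/conf.py and the sources live in src/ at the repository root):

# conf.py -- option 2: put the package sources on sys.path before autodoc runs.
# parents[2] walks up from doc/source/conf.py to the repository root.
import pathlib
import sys

srcdir = pathlib.Path(__file__).resolve().parents[2] / "src"
sys.path.insert(0, str(srcdir))

Option 1 amounts to installing the package (e.g. pip install .) in the environment that builds the docs, after which no path manipulation is needed.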

Collaborator:

> I'm not sure if there's a CI workflow to publish the docs or if they're automatically generated

This is all handled by readthedocs.com, which watches for changes to this repo and republishes the docs when pushes are made to main. If you sign up for an account there, I'd be happy to add you to the project; it's probably helpful for what you're trying to do.

Contributor Author:

Sounds good. "Priyansh121096" is my readthedocs username.


# -- Project information -----------------------------------------------------

project = "pantab"
@@ -13,6 +17,8 @@
extensions = [
"sphinx_rtd_theme",
"sphinxext.opengraph",
"sphinx.ext.autodoc",
"sphinx_autodoc_typehints",
]

templates_path = ["_templates"]
@@ -35,3 +41,10 @@
ogp_site_url = "https://pantab.readthedocs.io/"
ogp_use_first_image = False
ogp_image = "https://pantab.readthedocs.io/en/latest/_static/pantab_logo.png"

# -- Options for autodoc -----------------------------------------------------

autodoc_mock_imports = ["pantab.libpantab"]
autodoc_typehints = "none"
typehints_use_signature = "true"
typehints_use_signature_return = "true"
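
A note on these options (a hedged reading of the Sphinx and sphinx-autodoc-typehints documentation): autodoc_mock_imports stubs out the compiled pantab.libpantab extension so the docs can build without it, and the typehints settings keep type hints on the signature line rather than repeating them in the parameter descriptions. Roughly, the mocking mechanism behaves like this sketch (assumes Sphinx is installed and the pantab sources are importable, e.g. via the sys.path insert above):

# Inside the context manager, any import of "pantab.libpantab" resolves to a
# Sphinx-provided mock module, so importing pantab (which pulls in the
# compiled module) succeeds even when the extension has not been built.
from sphinx.ext.autodoc.mock import mock

with mock(["pantab.libpantab"]):
    import pantab  # importable without the compiled extension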
2 changes: 1 addition & 1 deletion environment.yml
@@ -15,9 +15,9 @@ dependencies:
- pyarrow
- python
- pytest
- pytest_xdist
- scikit-build-core
- sphinx
- sphinx-autodoc-typehints
- pre-commit
- sphinx_rtd_theme
- sphinxext-opengraph