Pandas-Wizard (pandaswizard) is a simple Python module that provides utility functions and wrappers for the pandas module. The module is kept simple, and external dependencies are avoided unless they are needed to improve performance.
This is a relatively new repository. If you find scope for performance gains or other improvements, please check the contributing guidelines for the organization. All help and criticism are appreciated. If you find any additional use cases, please create a pull request or submit a feature request.
The source code is currently hosted on GitHub: sharkutilities/pandas-wizard. Binary installers for the latest release are available from the Python Package Index (PyPI).
pip install -U pandas-wizard
The list of changes between each release is available here.
The guide below illustrates the main features of pandas-wizard and assumes a working knowledge of the pandas module and its use cases. The example below calculates a percentile for each group of a pandas DataFrameGroupBy object using np.percentile.
import pandaswizard as pdw # an attempt at a ubiquitous alias, like `pd` for pandas
# `df` is an existing pandas DataFrame with a "group" column and a numeric column "A"
# let's calculate the 50th percentile, i.e. the median, for each group
percentiles = df.groupby("group").agg({"A" : pdw.percentile(50)})
percentiles.head()
# or, the preferred usage is in conjunction with other aggregation functions, like
# (the string "sum" is used instead of the builtin `sum`, which recent pandas versions warn about)
statistics = df.groupby("group").agg({"A" : ["sum", pdw.percentile(50), pdw.quantile(0.95)]})
statistics.head()
The first call above calculates the 50th percentile, i.e., the median, of the feature "A" for each value of the grouping column "group" in the data frame. The second call reports the sum and the 0.95 quantile alongside it.
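To make the pattern above concrete without relying on the package's internals, here is a minimal, self-contained sketch of how a percentile wrapper of this kind could be built on np.percentile. The local percentile function below is a hypothetical stand-in written for illustration; it is not the pandaswizard implementation, and the sample data frame is made up.

```python
import numpy as np
import pandas as pd


def percentile(n):
    """Hypothetical stand-in: build an aggregation callable for the n-th percentile."""

    def percentile_(values):
        # `values` is the Series holding one group's data
        return np.percentile(values, n)

    # pandas labels the output column with the callable's __name__,
    # so give each generated function a distinct, readable name
    percentile_.__name__ = f"percentile_{n}"
    return percentile_


# Made-up sample data: two groups with three observations each
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "A": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

# Combine a built-in aggregation with two generated percentile callables
statistics = df.groupby("group").agg({"A": ["sum", percentile(50), percentile(95)]})
print(statistics)
```

Returning a named closure, rather than a lambda, is what lets several percentiles sit side by side in one agg call with distinguishable column labels.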
Footnote: The favicon is designed from the original pandas logo and no copyright infringement is intended. Since the main objective is to provide utility functions for pandas, the logo is re-used and was developed using Canva.