Public utility python package (hip-data-ml-utils) for data/ML use cases

hipagesgroup/data-ml-utils

data-ml-utils

A utility Python package that wraps the common libraries we use for data and ML work.

Installation

Because this package is hosted privately on GitHub, you need to be connected to the VPN before running:

pip install git+ssh://git@github.com/hipagesgroup/data-ml-utils@v0.2.2

Features

Pyathena client initialisation

Almost a one-liner:

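import os
from data_ml_utils.pyathena_client.client import PyAthenaClient

# Replace the "xxx" placeholders with your own AWS credentials.
os.environ["AWS_ACCESS_KEY_ID"] = "xxx"
os.environ["AWS_SECRET_ACCESS_KEY"] = "xxx"

# Create the client after the credentials are set.
pyathena_client = PyAthenaClient()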
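
Since the snippet above sets the credentials through environment variables, it can help to check that they are present before creating the client. The following is a minimal sketch, not part of data-ml-utils: the helper names `missing_aws_env_vars` and `require_aws_env_vars` are hypothetical, and it assumes the two variables shown above are the ones that matter (the client may read others as well).

```python
import os

# The two variables set in the example above. Assumption: these are the ones
# the client needs; it may also read other settings not covered here.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env_vars(environ=None):
    """Return the names of required AWS variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


def require_aws_env_vars(environ=None):
    """Raise a clear error if any required AWS variable is missing."""
    missing = missing_aws_env_vars(environ)
    if missing:
        raise RuntimeError(
            "Missing AWS environment variables: " + ", ".join(missing)
        )
```

Calling `require_aws_env_vars()` just before `PyAthenaClient()` would surface a missing credential as an explicit error message.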

Pyathena query

Almost a one-liner:

query = """
    SELECT
        *
    FROM
        dev.test_tutorial_table
    LIMIT 10
"""

df_raw = pyathena_client.query_as_pandas(final_query=query)
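
If you preview several tables this way, the query string can be built by a small helper. This is only a sketch under stated assumptions: `build_preview_query` is a hypothetical function that is not part of data-ml-utils, and the `schema.table` naming pattern it enforces is an illustrative choice, not a rule taken from Athena.

```python
import re

# Hypothetical rule: "schema.table", each part made of letters, digits and
# underscores. This is stricter than what Athena itself accepts.
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\.[A-Za-z_][A-Za-z0-9_]*")


def build_preview_query(table, limit=10):
    """Build a SELECT * ... LIMIT query for a schema-qualified table name."""
    if not _TABLE_NAME.fullmatch(table):
        raise ValueError(f"Unexpected table name: {table!r}")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
        raise ValueError(f"limit must be a positive integer, got {limit!r}")
    return f"SELECT *\nFROM {table}\nLIMIT {limit}"
```

The resulting string can then be passed as `final_query`, for example `pyathena_client.query_as_pandas(final_query=build_preview_query("dev.test_tutorial_table"))`.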

Boto3 client

Visit link

More to Come

  • Have a suggestion? Raise a feature request issue and we will review it!

Tutorials

Pyathena

There is a Jupyter notebook that shows how to use the package's utilities for PyAthena: notebook

Boto3 EMR

There is a Jupyter notebook that shows how to use the package's utilities for EMR: notebook

Boto3 Sagemaker

There is a Jupyter notebook that shows how to use the package's utilities for SageMaker: notebook