by tsdocode from CoE
Use your KMS email to access resources.
.
├── README.md
├── Training
│ ├── Example.ipynb # Example notebook for Colab training
│ ├── README.md
│ ├── __init__.py
│ ├── build-dataset.py # Apply preprocessing and build txt dataset from JSON
│ ├── cloud.py # Using OwnCloud services
│ ├── model.py # GPT model
│ ├── preprocess.py # Apply preprocessing
│ ├── requirements.txt # Required Python packages
│ └── train.py # Train GPT model
├── WebApp
│ ├── API
│ │ ├── main.py
│ │ ├── requirements.txt
│ │ └── server
│ │ ├── copy-saved-model-here
│ │ ├── app.py
│ │ ├── text2sql
│ │ │ ├── __init__.py
│ │ │ ├── gpt_model.py # GPT model
│ │ │ ├── postprocess.py # Postprocess for inference
│ │ │ ├── preprocess.py # Preprocess for inference
│ │ │ └── text2sql.py # Text2SQL model
│ │ ├── models # API model
│ │ │ ├── GPT.py # GPT request format
│ │ │ └── reponse.py # Response format for API
│ │ └── routes
│ │ ├── GPT.py # GPT routes
│ │ └── __init__.py
│ └── UI
│ ├── app.py # UI with Streamlit
│ ├── log.txt # log file
│ └── utils.py # Helper functions for the UI
└── requirements.txt
1.1 Install Miniconda
Follow these instructions to install Miniconda. Skip this step if you are using Google Colab or another hosted environment.
If you are using Google Colab, add the ! symbol before each command below.
Example:
!pip install -r requirements.txt
1.2 Install Libraries
git clone
cd
pip install -r requirements.txt
Sample data format
{
  "data": [
    {
      "schema": "",
      "question": "",
      "sql": ""
    },
    {
      "schema": "",
      "question": "",
      "sql": ""
    }
  ]
}
Collect and annotate your data with this format.
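As a sanity check before building the dataset, the annotated file can be validated with a short script. This is an illustrative helper, not part of the repo:

```python
import json

REQUIRED_KEYS = {"schema", "question", "sql"}

def validate_dataset(path):
    """Check that a JSON file follows the {"data": [...]} format above."""
    with open(path, encoding="utf-8") as f:
        payload = json.load(f)
    assert isinstance(payload.get("data"), list), "top-level 'data' must be a list"
    for i, record in enumerate(payload["data"]):
        missing = REQUIRED_KEYS - record.keys()
        assert not missing, f"record {i} is missing keys: {missing}"
    return len(payload["data"])
```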
cd Training
python build-dataset.py -i <input_file> -o <output_file>
If you don't provide --output-file, the output file name will be the same as --input-file.
You can also customize the txt dataset by modifying the make_prompt() function in dataset.py.
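For reference, a make_prompt() along these lines turns one annotated record into a single training example. The tag-delimited layout below is an assumption for illustration; the actual format in dataset.py may differ:

```python
def make_prompt(record):
    """Flatten one {"schema", "question", "sql"} record into a text prompt.

    The <|schema|>/<|question|>/<|sql|> tags are illustrative separators,
    not necessarily the ones used by the repository's dataset.py.
    """
    return (
        f"<|schema|> {record['schema']}\n"
        f"<|question|> {record['question']}\n"
        f"<|sql|> {record['sql']}"
    )
```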
python train.py -i <input_data> -o <output_model_folder> -m <model_name> -p <pretrained_model_path> -e <epochs> -l <learning_rate>
default:
- model_name: 125M
- epochs: 1
- learning_rate: 5e-5
If you want to continue training from a previous model, use -p to specify the path of the previous model.
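Assuming the flags behave as listed above, the CLI and its defaults can be modelled with argparse like this. This is a sketch of the assumed interface, not the repo's actual train.py:

```python
import argparse

def build_arg_parser():
    """Mirror the train.py flags described above, with their stated defaults."""
    parser = argparse.ArgumentParser(description="Train a GPT text2sql model")
    parser.add_argument("-i", "--input-data", required=True)
    parser.add_argument("-o", "--output-model-folder", required=True)
    parser.add_argument("-m", "--model-name", default="125M")
    parser.add_argument("-p", "--pretrained-model-path", default=None)
    parser.add_argument("-e", "--epochs", type=int, default=1)
    parser.add_argument("-l", "--learning-rate", type=float, default=5e-5)
    return parser
```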
cd WebApp
- Copy your model into
WebApp/API/server/copy-saved-model-here
- Set your model path in WebApp/API/server/text2sql/text2sql.py
model = GPTModel(model_path="./API/server/copy-saved-model-here")
- In a terminal, run:
python API/main.py & streamlit run UI/app.py
API port: 8001, UI port: 8000
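Once both processes are up, the API can be exercised from Python. The /gpt endpoint path and payload field names below are guesses based on the routes and models folders in the tree above; check WebApp/API/server/routes/GPT.py and models/GPT.py for the real ones:

```python
import json
from urllib import request

def build_payload(question, schema):
    """Assemble the JSON request body; field names are assumptions."""
    return {"question": question, "schema": schema}

def query_api(question, schema, base_url="http://localhost:8001"):
    """POST a question to the running API and return the parsed JSON response."""
    req = request.Request(
        f"{base_url}/gpt",  # hypothetical route; verify against routes/GPT.py
        data=json.dumps(build_payload(question, schema)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())
```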