A voice2voice chatgpt assistant
Constructed by using OpenAI Whisper + OpenAI ChatGPT API + Google Text2Speech Service
- Speech2Text through OpenAI's Whisper Model (currently using local CPU)
- Chat with ChatGPT through its API
- Text2Speech through Google's Text2Speech Service
- Related tools
- sox: play the .mp3 files
- arecord: record your voices through microphone (ubuntu default toolset)
- lame: transform arecord's raw data to .mp3 file
- We can now ask ChatGPT to reset the session for us. Therefore it will clear out the current session, preventing spend the quota on unrelated history messages.
- Whisper would automatically download model for the first time
- Make sure use a python virtual env before start
- Currently, only 1 background session available at any time
Run the following command manually or using scripts/install.sh
$ pip3 insntall -r requirements.txt
$ apt install sox libsox-fmt-all lame
$ mkdir record private audio
Get your api key here: https://platform.openai.com/account/api-keys
$ echo "{CHATGPT_ACCESS_KEY}" > private/api_keys
You can input text and send to ChatGPT through API
Then, you can hear the response
$ ./scripts/run_simple.sh
Start/Restart a ChatGPT session (wait for your voice audio file in the background)
$ ./scripts/start_background_session.sh
Stop the previous ChatGPT session if there is one
$ ./scripts/stop_background_session.sh
Start to record voice after it runs, ctrl+c when finished
$ ./scripts/record_audio.sh
Under Construction ...
- keyboard shortcut to record the user's voice
- keyboard shortcut to restart the ChatGPT session
- be able to load previous session from history
- ...
- OpenAI ChatGPT API Keys
- OpenAI ChatGPT Python Chat Completions
- Google Translate
- OpenAI Whisper