a server and related tools for generating text to speech and associated timestamped visemes

kkoch986/ai-skeletons-speech-generation

Text to Speech and Viseme Assignment

This is a module of my AI-powered animatronics pipeline. It takes input text and generates an audio file along with timestamped markers for each viseme change.

The visemes can be translated directly into jaw movements on the animatronics.

ElevenLabs

Currently, speech generation is powered only by ElevenLabs. You will need an ElevenLabs account and an API token, set in your environment as ELEVENLABS_TOKEN.

Starting the Server

You can run the server with ./main.py api and make a request like this:

curl -L http://127.0.0.1:5000/generate \
  -X POST \
  -d voiceName="[ElevenVoices] American Female Teen" \
  -d text="this is a test" \
  -d name="test 1"

Response:

{
  "audio": "<hex-encoded base64 string of the mp3 audio>",
  "audioLength": 1.1493877551020408,
  "emitRatio": 0.7,
  "mp3File": "/home/ken/projects/ai-skeletons/phone-generation/test 1.mp3",
  "outputName": "test 1",
  "prompt": "this is a test",
  "results": [
    ["0.060", "T"],
    ["0.100", "i"],
    ["0.190", "s"],
    ["0.250", "i"],
    ["0.340", "s"],
    ["0.380", "@"],
    ["0.480", "t"],
    ["0.530", "e"],
    ["0.700", "s"],
    ["0.810", "t"]
  ],
  "voiceID": "FxXx1SvSMrk96HmqFCUS",
  "voiceName": "[ElevenVoices] American Female Teen"
}
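
The same request can also be made from Python. This is a minimal sketch using only the standard library. The helper names `build_generate_request` and `generate` are hypothetical and not part of this repository. It assumes the server is running locally on port 5000 and accepts the same form-encoded fields as the curl example.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the server started with `./main.py api` listens here, as in the curl example.
GENERATE_URL = "http://127.0.0.1:5000/generate"


def build_generate_request(voice_name, text, name, url=GENERATE_URL):
    """Build (but do not send) a form-encoded POST with the fields from the curl example."""
    form = {"voiceName": voice_name, "text": text, "name": name}
    body = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def generate(voice_name, text, name, url=GENERATE_URL):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_generate_request(voice_name, text, name, url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `generate("[ElevenVoices] American Female Teen", "this is a test", "test 1")` should return a dict with the fields shown in the response above.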

The results field is an array of pairs, each containing a timestamp (in seconds) and a viseme symbol based on this table.
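
To drive a jaw from these markers, a consumer needs to know which viseme is active at a given playback time. The sketch below is one way to do that, assuming each marker stays in effect until the next one. The function names `parse_results` and `viseme_at` are hypothetical and not part of this repository.

```python
from bisect import bisect_right


def parse_results(results):
    """Convert [["0.060", "T"], ...] into a sorted list of (seconds, viseme) tuples."""
    return sorted((float(timestamp), viseme) for timestamp, viseme in results)


def viseme_at(timeline, t):
    """Return the viseme active at time t (seconds), or None before the first marker.

    Assumption: a viseme holds from its timestamp until the next marker.
    """
    times = [timestamp for timestamp, _ in timeline]
    index = bisect_right(times, t) - 1
    return timeline[index][1] if index >= 0 else None


# The first few markers from the sample response above.
timeline = parse_results([["0.060", "T"], ["0.100", "i"], ["0.190", "s"]])
print(viseme_at(timeline, 0.15))  # prints "i"
```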

The audio can be decoded by reversing the encoding: hex-decode it first, then base64-decode the result. With the response saved as test.json:

jq -r '.audio' test.json \
  | xxd -r -p \
  | base64 -d \
> test-1-decoded.mp3
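
The same two-step decode can be done in Python. This sketch mirrors the shell pipeline above (hex-decode, then base64-decode) using only the standard library. The function names `decode_audio` and `save_audio` are hypothetical and not part of this repository.

```python
import base64
import json


def decode_audio(audio_field):
    """Reverse the encoding of the "audio" field: hex-decode, then base64-decode."""
    base64_bytes = bytes.fromhex(audio_field)  # same step as `xxd -r -p`
    return base64.b64decode(base64_bytes)      # same step as `base64 -d`


def save_audio(response_path, output_path):
    """Read a saved JSON response and write its decoded mp3 bytes to output_path."""
    with open(response_path) as response_file:
        payload = json.load(response_file)
    with open(output_path, "wb") as audio_file:
        audio_file.write(decode_audio(payload["audio"]))
```

For example, `save_audio("test.json", "test-1-decoded.mp3")` would produce the same file as the shell pipeline.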

Running Manually

You can also generate output without standing up the API by running ./main.py generateFull.

See the options here.
