zakaton/Pink-Trombone

Sound is generated in the glottis (at the bottom left), then filtered by the shape of the vocal tract. The voicebox controls the pitch and intensity of the initial sound - Neil Thapen

🗣️ Pink Trombone - Bare-handed Speech Synthesis

A programmable version of Neil Thapen's famous and wonderful Pink Trombone

📚 Table of Contents

📦 Setting Up

👄 Producing Sound

👀 Enabling and Disabling the UI

🎛️ Audio Parameters

🎺 Manipulating Vocal Tract Constrictions

👅 Common Phonemes

🏆 Developer Showcase

🙏 Developer Wishlist

📖 Bibliography

📦 Setting Up

  1. Save a local copy of pink-trombone.min.js and pink-trombone-worklet-processor.min.js, and make sure they're both in the same relative location (the first imports the second as an Audio Worklet Processor).

  2. In your HTML <head></head> element, insert the file in a script element as a module:

<script src="pink-trombone.min.js" type="module"></script>
  3. In your HTML <body></body> element, insert the following custom element:
<pink-trombone></pink-trombone>
  4. In your JavaScript code, grab the <pink-trombone></pink-trombone> element:
var pinkTromboneElement = document.querySelector("pink-trombone");
  5. Add a "load" event listener to the pinkTromboneElement element:
pinkTromboneElement.addEventListener("load", myCallback);
  6. In the "load" callback, assign an Audio Context using .setAudioContext(myAudioContext) (if none is specified, an Audio Context instance is created for you):
function myCallback(event) {
  pinkTromboneElement.setAudioContext(myAudioContext)
}

This method returns a Promise that resolves once the AudioWorkletProcessor module is loaded.

  7. When the Promise resolves, a Pink Trombone audio node has been created, which you can connect to other audio nodes from the scope of the <pink-trombone></pink-trombone> element:
function myCallback(event) {
  pinkTromboneElement.setAudioContext(myAudioContext)
    .then(() => {
      const audioContext = pinkTromboneElement.audioContext
      pinkTromboneElement.connect(audioContext.destination);
    });
}
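
The setup steps above can be collapsed into a single Promise-returning helper. The sketch below is one way to do that, not the library's own API: `setupPinkTrombone` is a hypothetical name, and it assumes the "load" event has not yet fired when the helper is called.

```javascript
// Hypothetical helper (not part of the library): waits for the element's
// "load" event, assigns the Audio Context, then connects to the destination.
function setupPinkTrombone(pinkTromboneElement, myAudioContext) {
  return new Promise((resolve, reject) => {
    pinkTromboneElement.addEventListener("load", () => {
      pinkTromboneElement
        .setAudioContext(myAudioContext)
        .then(() => {
          // The audio node is only available once setAudioContext() resolves.
          const audioContext = pinkTromboneElement.audioContext;
          pinkTromboneElement.connect(audioContext.destination);
          resolve(pinkTromboneElement);
        })
        .catch(reject);
    });
  });
}
```

In a page this might be called as setupPinkTrombone(document.querySelector("pink-trombone"), new AudioContext()), with the rest of the application chained onto the returned Promise.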

👄 Producing Sound

😃 To start generating sound, run the .start() method:

pinkTromboneElement.start();

🤐 To stop generating sound, run the .stop() method:

pinkTromboneElement.stop();
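
Browsers generally keep an Audio Context suspended until the user interacts with the page, so .start() is usually called from a click or key handler. A minimal sketch, assuming pinkTromboneElement.audioContext is a standard AudioContext; `createSoundToggle` is a hypothetical helper name, not part of the library:

```javascript
// Hypothetical helper: returns a function that alternates between
// .start() and .stop() each time it is called.
function createSoundToggle(pinkTromboneElement) {
  let isPlaying = false;
  return function toggleSound() {
    const audioContext = pinkTromboneElement.audioContext;
    // resume() is the standard Web Audio call for waking a suspended context.
    if (audioContext && audioContext.state === "suspended") {
      audioContext.resume();
    }
    if (isPlaying) {
      pinkTromboneElement.stop();
    } else {
      pinkTromboneElement.start();
    }
    isPlaying = !isPlaying;
    return isPlaying; // true when sound is now being generated
  };
}
```

The returned function can be handed straight to a button's "click" listener.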

👀 Enabling and Disabling the UI

🙂 To show the interactive visualization:

pinkTromboneElement.enableUI();

✍️ To start animating the visualization:

pinkTromboneElement.startUI();

🛑 To stop animating the visualization:

pinkTromboneElement.stopUI();

😊 To hide the interactive visualization:

pinkTromboneElement.disableUI();
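
The four calls pair up naturally: show then animate, stop animating then hide. A small sketch of that pairing follows; `setUIVisible` is a hypothetical name, and the call order is an assumption rather than a documented requirement.

```javascript
// Hypothetical helper: shows and animates the UI, or stops and hides it.
function setUIVisible(pinkTromboneElement, visible) {
  if (visible) {
    pinkTromboneElement.enableUI();
    pinkTromboneElement.startUI();
  } else {
    pinkTromboneElement.stopUI();
    pinkTromboneElement.disableUI();
  }
}
```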

🎛️ Audio Parameters

The audio parameters of the Pink Trombone audio node can be accessed from the <pink-trombone></pink-trombone> element's scope:

🎚️ Intensity

pinkTromboneElement.intensity;

🎵 Frequency

pinkTromboneElement.frequency;

👄 Tenseness

pinkTromboneElement.tenseness;

📢 Loudness

pinkTromboneElement.loudness;

〰️ Vibrato

pinkTromboneElement.vibrato.frequency;
pinkTromboneElement.vibrato.gain;
pinkTromboneElement.vibrato.wobble;

👅 Tongue

// 'index' and 'diameter' refer to the tongue's location in the mouth
pinkTromboneElement.tongue.index;
pinkTromboneElement.tongue.diameter;
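
The examples in this README read and write these parameters through .value, which suggests they behave like standard Web Audio AudioParam objects. Assuming that holds, the usual AudioParam scheduling methods can automate them. The sketch below glides the pitch to a target over a given time; `glideFrequency` is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper: ramps the frequency parameter linearly to targetHz.
// Assumes pinkTromboneElement.frequency is a standard AudioParam.
function glideFrequency(pinkTromboneElement, targetHz, durationSeconds) {
  const frequency = pinkTromboneElement.frequency;
  const now = pinkTromboneElement.audioContext.currentTime;

  // Drop earlier automation and pin the current value as the ramp's start.
  frequency.cancelScheduledValues(now);
  frequency.setValueAtTime(frequency.value, now);
  frequency.linearRampToValueAtTime(targetHz, now + durationSeconds);
}
```

The same pattern should apply to the other parameters listed above, for example the tongue's index and diameter, under the same assumption.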

To change the voiceness between voiced and voiceless, change the .tenseness and .loudness audio parameters as follows:

function setVoiceness(voiceness) {
  const tenseness = 1 - Math.cos((voiceness) * Math.PI * 0.5);
  const loudness = Math.pow(tenseness, 0.25);
  
  pinkTromboneElement.tenseness.value = tenseness;
  pinkTromboneElement.loudness.value = loudness;
}

// voiced
setVoiceness(1);

// voiceless
setVoiceness(0);

Later on I may add a .voiceness audio parameter that automates this. For now I'm just adopting the approach from the original version.
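
To see what the mapping does between the two extremes, the same formula can be written as a pure function. `voicenessToParams` is a hypothetical name used only for illustration:

```javascript
// Same formula as setVoiceness(), without touching the audio node.
function voicenessToParams(voiceness) {
  const tenseness = 1 - Math.cos(voiceness * Math.PI * 0.5);
  const loudness = Math.pow(tenseness, 0.25);
  return { tenseness, loudness };
}

// voicenessToParams(0)   -> tenseness 0,      loudness 0
// voicenessToParams(0.5) -> tenseness ~0.293, loudness ~0.736
// voicenessToParams(1)   -> tenseness ~1,     loudness ~1
```

Because of the fourth root, loudness is already about 0.74 at a voiceness of 0.5 while tenseness is still below 0.3.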

🎺 Manipulating Vocal Tract Constrictions

A vocal tract constriction is an object containing .index and .diameter Audio Parameter properties that are implicitly connected to the Pink Trombone audio node.

To add a vocal tract constriction:

var myConstriction = pinkTromboneElement.newConstriction(indexValue, diameterValue);

To set a vocal tract constriction:

myConstriction.index.value = newIndexValue;
myConstriction.diameter.value = newDiameterValue;

To remove a vocal tract constriction:

pinkTromboneElement.removeConstriction(myConstriction);
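
These three calls form an add, adjust, remove lifecycle. A minimal sketch of wrapping that lifecycle follows; `holdConstriction` is a hypothetical helper, and the values in the usage comment are the (d, t) preset from the phoneme tables in this README.

```javascript
// Hypothetical helper: adds a constriction and returns handles to
// move it and to remove it again.
function holdConstriction(pinkTromboneElement, indexValue, diameterValue) {
  const constriction = pinkTromboneElement.newConstriction(indexValue, diameterValue);
  return {
    constriction,
    moveTo(newIndexValue, newDiameterValue) {
      constriction.index.value = newIndexValue;
      constriction.diameter.value = newDiameterValue;
    },
    release() {
      pinkTromboneElement.removeConstriction(constriction);
    },
  };
}

// Example: close the tract at the (d, t) position, then release it.
// const held = holdConstriction(pinkTromboneElement, 36, 0);
// held.release();
```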

👅 Common Phonemes

For reference, here are preset index and diameter values for some phonemes:

👅 Tongue phonemes:

æ [pat]

  • index : 14.93
  • diameter : 2.78

ɑ [part]

  • index : 12.75
  • diameter : 2.3

ɒ [pot]

  • index : 12
  • diameter : 2.05

ɔ [port (rounded)]

  • index : 17.7
  • diameter : 2.05

ɪ [pit]

  • index : 26.11
  • diameter : 2.87

i [peat]

  • index : 27.2
  • diameter : 2.2

e [pet]

  • index : 19.4
  • diameter : 3.43

ʌ [put]

  • index : 17.8
  • diameter : 2.46

u [poot (rounded)]

  • index : 22.8
  • diameter : 2.05

ə [pert]

  • index : 20.7
  • diameter : 2.8
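
One way to use these tongue presets is a lookup table keyed by the example word. The sketch below copies a few of the values listed above and writes them to the tongue parameters; `TONGUE_PRESETS` and `setTongueVowel` are hypothetical names, not part of the library.

```javascript
// A subset of the tongue presets from this README, keyed by example word.
const TONGUE_PRESETS = {
  pat:  { index: 14.93, diameter: 2.78 },
  pot:  { index: 12,    diameter: 2.05 },
  pit:  { index: 26.11, diameter: 2.87 },
  peat: { index: 27.2,  diameter: 2.2 },
  pet:  { index: 19.4,  diameter: 3.43 },
};

// Hypothetical helper: moves the tongue to the preset for the given word.
function setTongueVowel(pinkTromboneElement, word) {
  const preset = TONGUE_PRESETS[word];
  if (!preset) {
    throw new Error(`No tongue preset for "${word}"`);
  }
  pinkTromboneElement.tongue.index.value = preset.index;
  pinkTromboneElement.tongue.diameter.value = preset.diameter;
}
```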

🎺 Vocal Tract Constriction phonemes: voiced and voiceless consonants share the same values, differing in voiceness

  • Fricatives

    • (ʒ, ʃ) ["s" in "pleasure"]

      • index : 31
      • diameter : 0.6
    • (z, s) ["z" in "zoo"]

      • index : 36
      • diameter : 0.6
    • (v, f) ["v" in "very"]

      • index : 41
      • diameter : 0.5
  • Stops

    • (g, k) ["g" in "go"]

      • index : 20
      • diameter : 0
    • (d, t) ["d" in "den"]

      • index : 36
      • diameter : 0
    • (b, p) ["b" in "bad"]

      • index : 41
      • diameter : 0
  • Nasals

    • (ŋ) ["ng" in "hang"]

      • index : 20
      • diameter : -1
    • (n) ["n" in "not"]

      • index : 36
      • diameter : -1
    • (m) ["m" in "woman"]

      • index : 41
      • diameter : -1
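
Combining these presets with the voiceness mapping from the Audio Parameters section gives a rough consonant helper. This is a sketch under assumptions: `CONSONANT_PRESETS` and `articulateConsonant` are hypothetical names, only the consonants with plain ASCII symbols are included, and each voiced/voiceless pair is distinguished by a voiceness of 1 or 0 (the nasals are treated as voiced).

```javascript
// [index, diameter, voiceness] per consonant, from the tables above.
const CONSONANT_PRESETS = {
  z: [36, 0.6, 1], s: [36, 0.6, 0],
  v: [41, 0.5, 1], f: [41, 0.5, 0],
  g: [20, 0, 1],   k: [20, 0, 0],
  d: [36, 0, 1],   t: [36, 0, 0],
  b: [41, 0, 1],   p: [41, 0, 0],
  n: [36, -1, 1],  m: [41, -1, 1],
};

// Hypothetical helper: sets voiceness and adds the matching constriction.
function articulateConsonant(pinkTromboneElement, symbol) {
  const preset = CONSONANT_PRESETS[symbol];
  if (!preset) {
    throw new Error(`No consonant preset for "${symbol}"`);
  }
  const [index, diameter, voiceness] = preset;

  // Same tenseness/loudness mapping as setVoiceness() in this README.
  const tenseness = 1 - Math.cos(voiceness * Math.PI * 0.5);
  pinkTromboneElement.tenseness.value = tenseness;
  pinkTromboneElement.loudness.value = Math.pow(tenseness, 0.25);

  // The caller removes this with removeConstriction() when the sound ends.
  return pinkTromboneElement.newConstriction(index, diameter);
}
```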

🏆 Developer Showcase

Send us an email at zack@ukaton.com if you have a cool application made with our API!

🙏 Developer Wishlist

Our time is limited, so we'd greatly appreciate it if you guys could implement some of these ideas:

  • IPA Speak n' See 🗣️💬 - Take input speech from the user using the Media Recording API and approximate their articulation using the Pink Trombone, allowing speakers to visualize how they speak.
  • Phonetic Voice Editor 🎹👄⌨️ - Create a cross between a Text Editor and a Digital Audio Workstation, where the user can type in phonemes instead of characters, with automation to programmatically adjust the cadence, pitch, and other features over time.
  • SSML Simulator 📝💬 - Implement a Speech Synthesis Markup Language emulator that can take an utterance and process the speech request using Pink Trombone's audio processing.

📖 Bibliography