- Break down a standard MIDI composition into pitch/time-shift sequences
- We use only pitch values and delta start-times to encode all of the music info
- Once the model is created and an output sequence is generated, the music info is restored using fuzzy music-triangle ratio calculations:
- [delta start-time] dt == t - pt (current start-time minus previous start-time) | range(0-127)
- [MIDI pitch] P == P | range(0-127)
- [duration] D == sqrt(dt^2 + P^2) | range(0-127)
- [velocity] V == P | range(0-127), or any other mapping you like (e.g. inversion or cosine)
- Variable sequence length
- Pitch/time-shift data is encoded as (0-127) for note/chord pitches and (128-255) for time-shift/EOS tokens (see the encoding sketch after this list)
- E.g. [21, 64, 60, 57, 215, 33, 64, 60, 55, 215, 36, 72, 183, 64, 161]
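To make the encoding direction concrete, here is a minimal sketch (plain Python, not the repository's code) of how (start-time, pitch) note pairs could be turned into such INTs, and how the music-triangle ratio restores duration and velocity from the two encoded values. The helper names, the note format, and the closing 255 token are assumptions; the time-shift offset of 127 mirrors the decoder shown further below.

```python
import math

# Sketch only: the helper names and the (start_time, pitch) note format are assumptions.
# Delta start-times are expected to be pre-quantized so they fit the 0-127 range.

def encode_notes_to_ints(notes):
    """Encode (start_time, pitch) pairs into pitch/time-shift INTs:
    pitches stay 0-127, the delta to the next note becomes a 128-255 token."""
    notes = sorted(notes)                        # sort by start time
    ints = []
    for i, (start, pitch) in enumerate(notes):
        ints.append(pitch)                       # note/chord pitch token (0-127)
        if i + 1 < len(notes):
            dt = notes[i + 1][0] - start         # delta start-time to the next note
            if dt > 0:                           # dt == 0 -> same chord, no time-shift
                ints.append(127 + min(dt, 128))  # time-shift token (128-255); decoder below subtracts 127
    ints.append(255)                             # closing/EOS-style token (assumption)
    return ints

def restore_triangle(dt, pitch):
    """Fuzzy music-triangle restoration of duration and velocity
    from the two encoded values (clipped to the 0-127 range)."""
    duration = min(127, round(math.sqrt(dt ** 2 + pitch ** 2)))  # D = sqrt(dt^2 + P^2)
    velocity = pitch                                             # V = P (or any other mapping)
    return duration, velocity

print(encode_notes_to_ints([(0, 60), (0, 64), (0, 67), (8, 72)]))  # [60, 64, 67, 135, 72, 255]
print(restore_triangle(12, 72))                                    # (73, 72)
```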
# 'out' is assumed to be the list of generated INTs from the model
import tqdm
import TMIDIX

if len(out) != 0:

    # Split the generated INTs into [pitch, ..., pitch, delta-time] groups:
    # tokens 0-127 are pitches, a token 128-255 closes the group as its time-shift
    song = []
    sng = []
    for o in tqdm.tqdm(out):
        if o < 128:
            sng.append(o)
        else:
            if len(sng) > 0:
                sng.append(o - 127)
                song.append(sng)
                sng = []

    # Restore full note events: each pitch in the group becomes a note starting
    # at the current time, with duration taken from the time-shift (* coefficient)
    # and velocity taken from the pitch itself
    song_f = []
    time = 0
    for s in tqdm.tqdm(song):
        for ss in s[:-1]:
            song_f.append(['note', time, s[-1] * 15, 0, ss, ss])
        time += (s[-1] * 5)

    # Write the restored composition out as a MIDI file
    detailed_stats = TMIDIX.Tegridy_SONG_to_MIDI_Converter(song_f,
                                                           output_signature='Project Los Angeles',
                                                           output_file_name='Music-Triangle-Composition',
                                                           track_name='Tegridy Code 2021',
                                                           number_of_ticks_per_quarter=500)
    print('Done!')

    detailed_stats
- Multiplication coefficients are dataset-dependent
- Coefficients from a large dataset can be transferred to a smaller subset with good results
- Coefficients, in this case, are the dt and D averages over the entire dataset (a rough averaging sketch follows this list)
- For example, the coefficients (k) for the provided sample INTs dataset are dtk == 12 and Dk == 72
- Or, reduced to the simplest ratio, dtk == 1 and Dk == 6
- Since the sample INTs dataset timings were divided by 5, the final coefficients are dtk == 5 and Dk == 30, which produce nicely sustained durations
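As a rough illustration of how such average coefficients could be computed, here is a sketch under a few assumptions: the dataset is a flat list of INTs in the format above, the time-shift offset of 127 matches the decoder, and the helper name is made up; the repository may derive its averages differently.

```python
import math

def average_coefficients(dataset_ints):
    """Sketch: average delta start-time (dtk) and average restored duration (Dk)
    over a flat list of INTs in the pitch/time-shift format shown above."""
    dts, durations = [], []
    chord_pitches = []
    for token in dataset_ints:
        if token < 128:                    # note/chord pitch
            chord_pitches.append(token)
        else:                              # time-shift token closes the group
            dt = token - 127               # same offset as the decoder above
            for p in chord_pitches:
                durations.append(math.sqrt(dt ** 2 + p ** 2))  # D = sqrt(dt^2 + P^2)
            dts.append(dt)
            chord_pitches = []
    dtk = sum(dts) / len(dts)
    Dk = sum(durations) / len(durations)
    return dtk, Dk

# The README reports dtk == 12 and Dk == 72 for the sample INTs dataset
# (roughly a 1:6 ratio).
```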
@inproceedings{lev2021musictriangle,
title = {Music Triangle},
author = {Aleksandr Lev},
booktitle = {GitHub},
year = {2021},
}