frechet_audio_distance

Fréchet Audio Distance

This repository provides supporting code used to compute the Fréchet Audio Distance (FAD), a reference-free evaluation metric for audio generation algorithms, in particular music enhancement.

For more details about Fréchet Audio Distance and how we verified it, please see our paper:

K. Kilgour et al., Fréchet Audio Distance: A Metric for Evaluating Music Enhancement Algorithms.

Usage

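FAD depends on apache-beam, numpy, scipy, tensorflow and the AudioSet code from the TensorFlow models repository (all set up in the steps below). It also requires downloading a VGGish model checkpoint file: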

VGGish model checkpoint

Example installation and use

Get the FAD code

$ git clone https://github.com/google-research/google-research.git
$ cd google-research

Install dependencies

First create and activate a virtualenv to isolate this installation from everything else.

# Python 2
$ virtualenv fad
# or Python 3
$ python3 -m venv fad # (apache-beam does not yet support Python 3)
# activate the virtualenv
$ source fad/bin/activate
# Upgrade pip
$ python -m pip install --upgrade pip
# Install dependencies
$ pip install apache-beam numpy scipy tensorflow
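To confirm that the packages installed above are importable from the active virtualenv, a quick check along these lines may help. This is a minimal sketch, not part of the FAD code: the helper name `missing_modules` is hypothetical, and the check actually imports each module, so it can take a few seconds for tensorflow.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Import names for the packages installed above. Note that the
# apache-beam package is imported as apache_beam.
REQUIRED = ["apache_beam", "numpy", "scipy", "tensorflow"]

if __name__ == "__main__":
    print("missing: %s" % missing_modules(REQUIRED))
```

An empty list means all four packages were found.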

Export the AudioSet code from the TensorFlow models repo into a 'tensorflow_models' directory.

$ mkdir tensorflow_models
$ touch tensorflow_models/__init__.py
$ svn export https://github.com/tensorflow/models/trunk/research/audioset tensorflow_models/audioset/
$ touch tensorflow_models/audioset/__init__.py

Download data files into a data directory

$ mkdir -p data
$ curl -o data/vggish_model.ckpt https://storage.googleapis.com/audioset/vggish_model.ckpt

Create test files and file lists

This generates a set of background test files (sine waves at different frequencies) and two test sets of sine waves with distortions.

$ python -m frechet_audio_distance.gen_test_files --test_files "test_audio"

# Add them to file lists:
$ ls --color=never test_audio/background/*  > test_audio/test_files_background.cvs
$ ls --color=never test_audio/test1/*  > test_audio/test_files_test1.cvs
$ ls --color=never test_audio/test2/*  > test_audio/test_files_test2.cvs
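On systems where `ls --color=never` is not available, the same file lists could be written from Python. The following is a minimal sketch that assumes the `test_audio/<subset>/` layout created above; the helper name `write_file_list` is hypothetical and not part of the FAD code.

```python
import glob
import os


def write_file_list(audio_dir, list_path):
    """Write the paths of all files in audio_dir to list_path, one per line.

    Mirrors `ls audio_dir/* > list_path`: hidden files are skipped and
    the paths are written in sorted order.
    """
    paths = sorted(glob.glob(os.path.join(audio_dir, "*")))
    with open(list_path, "w") as f:
        for path in paths:
            f.write(path + "\n")
    return paths


# Example usage, matching the three lists created above:
# for subset in ("background", "test1", "test2"):
#     write_file_list(os.path.join("test_audio", subset),
#                     os.path.join("test_audio", "test_files_%s.cvs" % subset))
```

The sort order here is by character code and may differ slightly from the locale-dependent order of `ls`.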

Compute embeddings and estimate multivariate Gaussians

$ mkdir -p stats
$ python -m frechet_audio_distance.create_embeddings_main --input_files test_audio/test_files_background.cvs --stats stats/background_stats
$ python -m frechet_audio_distance.create_embeddings_main --input_files test_audio/test_files_test1.cvs --stats stats/test1_stats
$ python -m frechet_audio_distance.create_embeddings_main --input_files test_audio/test_files_test2.cvs --stats stats/test2_stats
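Conceptually, estimating a multivariate Gaussian means fitting a mean vector and a covariance matrix to a set of embeddings. The sketch below illustrates this with NumPy, assuming the embeddings are already available as an array of shape (num_embeddings, embedding_dim). It is not the repository's Beam pipeline: the function name `estimate_gaussian` is hypothetical, and the unbiased (n - 1) covariance normalization is an assumption.

```python
import numpy as np


def estimate_gaussian(embeddings):
    """Return (mu, sigma) fitted to an array of shape (n, d)."""
    embeddings = np.asarray(embeddings, dtype=np.float64)
    if embeddings.ndim != 2 or embeddings.shape[0] < 2:
        raise ValueError("expected a 2-D array with at least two rows")
    # Mean over the n embeddings, one value per dimension.
    mu = embeddings.mean(axis=0)
    # rowvar=False treats each column as a variable; np.cov divides
    # by n - 1 by default. atleast_2d keeps the d == 1 case a matrix.
    sigma = np.atleast_2d(np.cov(embeddings, rowvar=False))
    return mu, sigma
```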

Compute the FAD from the stats

$ python -m frechet_audio_distance.compute_fad --background_stats stats/background_stats --test_stats stats/test1_stats
$ python -m frechet_audio_distance.compute_fad --background_stats stats/background_stats --test_stats stats/test2_stats
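The Fréchet distance between two multivariate Gaussians is the squared distance between their means plus tr(Σ_b + Σ_t - 2 (Σ_b Σ_t)^(1/2)). The sketch below illustrates that formula with NumPy, assuming the means and covariances are already loaded as arrays. The function name `frechet_distance` is hypothetical and `compute_fad` may implement the calculation differently. In particular, this sketch obtains the trace of the matrix square root from the eigenvalues of the product rather than forming the square root itself.

```python
import numpy as np


def frechet_distance(mu_b, sigma_b, mu_t, sigma_t):
    """Frechet distance between N(mu_b, sigma_b) and N(mu_t, sigma_t)."""
    mu_b = np.asarray(mu_b, dtype=np.float64)
    mu_t = np.asarray(mu_t, dtype=np.float64)
    sigma_b = np.atleast_2d(np.asarray(sigma_b, dtype=np.float64))
    sigma_t = np.atleast_2d(np.asarray(sigma_t, dtype=np.float64))

    # Squared Euclidean distance between the two means.
    mean_term = np.sum((mu_b - mu_t) ** 2)

    # tr(sqrt(sigma_b sigma_t)) is the sum of the square roots of the
    # eigenvalues of the product. For covariance matrices these are
    # real and non-negative up to rounding, so the imaginary parts are
    # dropped and small negative values are clipped to zero.
    eigvals = np.linalg.eigvals(np.dot(sigma_b, sigma_t))
    sqrt_trace = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))

    return float(mean_term + np.trace(sigma_b) + np.trace(sigma_t)
                 - 2.0 * sqrt_trace)
```

Identical statistics give a distance of zero, and the value grows as the means or covariances of the test set move away from those of the background set.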

