- Overview
- Supported operating systems
- System requirements
- Known issues
- How to use
- Troubleshooting
- Contribution
- Working with the source
This is a fork of w-okada voice changer that performs real-time voice conversion using various voice conversion algorithms.
Important
This version works only with Retrieval-based Voice Conversion (RVC).
The fork aims to improve overall performance on any backend while introducing new features and improving the user experience.
The following videos demonstrate how the voice changer works and performs with AMD graphics cards (including integrated GPU!):
Amd.iGPU.webm
Amd.Dgpu.Rx6600m.webm
And this one demonstrates how the voice changer works and performs with Nvidia GeForce GTX 1650 laptop:
Nvidia.Dgpu.Gtx.1650.webm
- Windows 10 or later.
- Linux.
- macOS 12 Monterey or later, with an Apple Silicon or Intel CPU.
Important
Minimum requirements mean that you will be able to run ONLY the voice changer. In most cases, running voice conversion and gaming at the same time will not provide a satisfying experience on hardware that only meets the minimum requirements.
RAM: at least 6GB.
Disk space: at least 6GB of free disk space. For fast model loading, SSD is recommended.
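The disk space requirement above can be checked before downloading anything. The following is a minimal sketch, not part of the voice changer: the function name is hypothetical, and "6GB" is read here as 6 GiB, which is an assumption.

```python
import shutil

# Assumption: the "6GB" requirement is interpreted as 6 GiB.
REQUIRED_FREE_BYTES = 6 * 1024**3


def has_enough_disk_space(path: str = ".", required: int = REQUIRED_FREE_BYTES) -> bool:
    """Return True if the drive that holds `path` has at least `required` free bytes."""
    return shutil.disk_usage(path).free >= required


if __name__ == "__main__":
    free_gib = shutil.disk_usage(".").free / 1024**3
    print(f"Free space: {free_gib:.1f} GiB, enough: {has_enough_disk_space()}")
```

Pass the folder where you plan to extract the voice changer as `path`, since the check applies to the drive that contains it.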
Minimum CPU requirement: Intel Core i5-4690K or AMD FX-6300.
Recommended CPU requirement: Intel Core i5-10400F or AMD Ryzen 5 1600X.
Minimum VRAM required: 2GB (in FP32 mode), ~1GB (in FP16 mode, if supported).
Minimum GPU requirement:
- An integrated graphics card: AMD Radeon Vega 7 (with AMD Ryzen 5 5600G) or later.
- A dedicated graphics card: Nvidia GeForce GTX 900 Series or later, or AMD Radeon RX 400 series or later, or Intel Arc A300 series or later.
Note
It is also possible to use Nvidia GeForce GTX 700 series GPUs. However, they can be used only with the DirectML version.
Warning
The voice changer does not perform well with integrated Intel GPUs. This is a known issue that may be addressed in the future. You may proceed at your own risk and report issues or successful usage.
Recommended GPU requirement:
A dedicated graphics card: Nvidia GeForce RTX 20 Series or later, or AMD Radeon RX 6000 series or later, or Intel Arc A500 series or later.
- Mozilla Firefox ESR may not display audio devices.
- When changing the Chunk, Extra or Crossfade size settings, you must switch the device to CPU and then back to your GPU. Otherwise, performance issues may be observed.
- Only `rmvpe_onnx`, `fcpe_onnx`, `crepe_tiny_onnx` and `crepe_full_onnx` are available in the F0 Det. list.
- When using a laptop with both an integrated GPU and a dedicated GPU, severely degraded performance (up to 50% reduction) can be observed when running the voice changer on the built-in display.
- Slightly degraded performance (up to 25% reduction) can be observed with multi-GPU setups.
- AMD Radeon RX 7000 series may be unable to achieve low latency (below 256ms).
- When starting voice conversion for the first time, it may take up to 5-7 seconds to start outputting the converted voice.
- Only the "perf" metric is reported in server audio mode with the `rest` protocol.
- [If not installed] Download and install VAC Lite by Muzychenko.
- Navigate to the releases section.
- Open Task Manager > Performance.
- Click CPU, check and note the processor model on the right. An example: AMD Ryzen 7 5800H with Radeon Graphics.
- Check and note graphics card models under GPU. An example:
  - GPU 0: AMD Radeon RX 6600M.
  - GPU 1: AMD Radeon(TM) Graphics.

Tip
For AMD users, the recommended driver version is `24.6.1` or later.
- Download the `voice-changer-windows-amd64-dml.zip` ZIP file.
- Right-click the ZIP file. In the opened action menu select 7-Zip > Extract to "voice-changer-windows-amd64-dml\".
- Make sure your Nvidia driver version is `528.33` or later. Click here to learn how to check your driver version.
- Download the `voice-changer-windows-amd64-cuda.zip.001` and `voice-changer-windows-amd64-cuda.zip.002` ZIP files and place them in the same folder.
- Right-click the `voice-changer-windows-amd64-cuda.zip.001` ZIP file. In the opened action menu select 7-Zip > Extract to "voice-changer-windows-amd64-cuda\". This will unpack both files; there is no need to unpack them separately.
The following examples demonstrate the unpacking process:
- Open the extracted folder (`voice-changer-windows-amd64-dml` or `voice-changer-windows-amd64-cuda`) > `MMVCServerSIO`.
- Run `MMVCServerSIO.exe`.
When running the voice changer for the first time, it will start downloading necessary files. Do not close the window until the download finishes.
Once the download is finished, the voice changer will open the user interface using your default web browser.
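The choice between the DirectML and CUDA archives, and the Nvidia driver check, can be sketched as follows. This is an illustrative helper, not part of the voice changer: both function names are hypothetical, and the rule "any Nvidia GPU means the CUDA build" is an assumption drawn from the steps above.

```python
def pick_windows_archives(gpu_names: list[str]) -> list[str]:
    """Return the release file names to download for the GPUs noted in Task Manager.

    Assumption: the CUDA build is chosen whenever an Nvidia GPU is present,
    and the DirectML build otherwise.
    """
    has_nvidia = any("nvidia" in name.lower() for name in gpu_names)
    if has_nvidia:
        return [
            "voice-changer-windows-amd64-cuda.zip.001",
            "voice-changer-windows-amd64-cuda.zip.002",
        ]
    return ["voice-changer-windows-amd64-dml.zip"]


def nvidia_driver_ok(version: str, minimum: str = "528.33") -> bool:
    """Compare dotted driver versions numerically, component by component."""
    def parse(value: str) -> tuple[int, ...]:
        return tuple(int(part) for part in value.split("."))

    return parse(version) >= parse(minimum)


if __name__ == "__main__":
    print(pick_windows_archives(["AMD Radeon RX 6600M", "AMD Radeon(TM) Graphics"]))
    print(nvidia_driver_ok("551.23"))
```

The sketch does not cover Nvidia GeForce GTX 700 series cards, which work only with the DirectML version; pick the DirectML archive manually in that case.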
Important
macOS support is experimental.
- Download the `voice-changer-macos-arm64-cpu.tar.gz` file.
- Double-click the file. The voice changer will unpack and the `MMVCServerSIO` folder will appear.
Note
The voice changer works best if your Intel-based machine has AMD graphics. If your machine has only Intel integrated graphics, only the CPU will be utilized.
- Download the `voice-changer-macos-amd64-cpu.tar.gz` file.
- Double-click the file. The voice changer will unpack and the `MMVCServerSIO` folder will appear.
Warning
Currently, this step is mandatory. Otherwise, the voice changer will fail to start with an error related to Python.framework being damaged. This may be improved in the future.
- Open Terminal.
- Run the following command:
xattr -dr com.apple.quarantine <Path to extracted MMVCServerSIO folder>
For example, if you extracted the voice changer to your desktop, the command may look as follows:
xattr -dr com.apple.quarantine ~/Desktop/MMVCServerSIO
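If you prefer to script this step, the same command can be invoked from Python. This is a minimal sketch with hypothetical function names; it only wraps the `xattr` command shown above and refuses to run on anything other than macOS.

```python
import subprocess
import sys
from pathlib import Path


def quarantine_removal_command(folder: str) -> list[str]:
    """Build the xattr command from the step above for the given folder."""
    path = Path(folder).expanduser()  # expands a leading "~" to the home directory
    return ["xattr", "-dr", "com.apple.quarantine", str(path)]


def remove_quarantine(folder: str) -> None:
    """Run the command; raises CalledProcessError if xattr reports a failure."""
    if sys.platform != "darwin":
        raise RuntimeError("Quarantine removal only applies to macOS")
    subprocess.run(quarantine_removal_command(folder), check=True)


if __name__ == "__main__":
    # Prints the command without running it.
    print(quarantine_removal_command("~/Desktop/MMVCServerSIO"))
```

Passing the arguments as a list avoids shell quoting problems when the path contains spaces.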
- Open the extracted `MMVCServerSIO` folder.
- Double-click `MMVCServerSIO` to run the voice changer.
Refer to corresponding Colab or Kaggle notebooks in this repository and follow their instructions.
Tip
When any issue with the voice changer occurs, check the command line window (the one that opens during the start) for errors.
A hash verification error means that either the remote files have changed or your local files are corrupted. The log lines above the error show which files are affected:
[WeightDownloader] 'pretrain/content_vec_500.onnx failed to pass hash verification check. Got 1931e237626b80d65ae44cbacd4a5197, expected ab288ca5b540a4a15909a40edf875d1e'
[WeightDownloader] 'pretrain/rmvpe.onnx failed to pass hash verification check. Got 65030149d579a65f15aa7e85769c32f1, expected b6979bf69503f8ec48c135000028a7b0'
Find and delete the mentioned files from the voice changer folder and restart the voice changer. Deleted files will be re-downloaded.
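To see which files fail the check without restarting the voice changer, the hashes can be recomputed locally. The sketch below assumes the hashes in the log are MD5 digests (they are 32 hexadecimal characters long, but the log does not name the algorithm); the function names are hypothetical and this is not the voice changer's own downloader code.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large model files are not loaded into memory at once."""
    digest = hashlib.md5()  # assumption: the logged hashes are MD5
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def files_failing_check(expected: dict[str, str], root) -> list[str]:
    """Return relative paths that are missing or whose hash differs from the expected one."""
    failing = []
    for relative_path, expected_hash in expected.items():
        file_path = Path(root) / relative_path
        if not file_path.is_file() or md5_of_file(file_path) != expected_hash:
            failing.append(relative_path)
    return failing


if __name__ == "__main__":
    # Expected hash copied from the example log line above.
    expected = {"pretrain/rmvpe.onnx": "b6979bf69503f8ec48c135000028a7b0"}
    print(files_failing_check(expected, "."))
```

Run it from the voice changer folder; every path it prints is a candidate for deletion and re-download.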
- Make sure that you have granted permission to access the microphone.
- If you are using Mozilla Firefox ESR, there may be an issue with audio devices. Use another web browser (preferably Chrome or Chromium-based).
- Make sure you have selected the correct input and output audio devices.
- Make sure your input device is not muted. Check the microphone volume in the system settings or the hardware switch on your headset (usually a button, if present).
In the voice changer, make sure passthru is not on (indicated by blinking red color). Click it to switch it off (indicated by solid green color).
- Make sure you are using VAC by Muzychenko (indicated by the Line 1 audio device name).
- In Windows Sound Control Panel, make sure that the sample rate of your microphone matches the sample rate of the virtual cable.
The following example shows the configuration of the virtual cable and the microphone:
- If nothing helped, in Task Manager > Details, find the "audiodg.exe" process and do the following:
  - Right-click "audiodg.exe" > Set priority > High.
  - Right-click "audiodg.exe" > Set affinity. Uncheck every option, then select only CPU 2.
- If you changed Chunk while voice conversion was on, click Stop, then Start again.
- Make sure the perf time is smaller than Chunk. If it is not, increase Chunk or reduce Extra and Crossfade size.
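The rule about perf time and Chunk can be written as a simple comparison. The sketch below assumes that both values are expressed in milliseconds, which the text above does not state explicitly; the function names are hypothetical.

```python
def conversion_keeps_up(perf_ms: float, chunk_ms: float) -> bool:
    """Return True when one chunk is processed in less time than the chunk lasts.

    Assumption: perf time and Chunk are both given in milliseconds.
    """
    return perf_ms < chunk_ms


def headroom_ms(perf_ms: float, chunk_ms: float) -> float:
    """Time left over per chunk; a negative value means processing is falling behind."""
    return chunk_ms - perf_ms


if __name__ == "__main__":
    for perf, chunk in [(80.0, 256.0), (300.0, 256.0)]:
        print(perf, chunk, conversion_keeps_up(perf, chunk), headroom_ms(perf, chunk))
```

A small positive headroom may still be fragile, because perf time can fluctuate while other applications use the GPU.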
At the moment, the fork does not accept any code contributions. However, feel free to report any issues you encounter during usage.
- [If not installed] Download and install Python 3.10.
- [If not installed] Download and install git.
- Open a command line.
- Verify your Python version by running the following command:
  python --version
  Example output: Python 3.10.8
- Clone the repository.
- Navigate to the `server` folder.
- [If not set up] Set up a virtual environment with the following command:
  python -m venv venv
- Activate the virtual environment using one of the following commands:
  - For Windows:
    .\venv\Scripts\activate.ps1
  - For Linux/macOS:
    source ./venv/bin/activate
- Install the requirements using one of the following commands:
  - For AMD/Intel/CPU (Windows only):
    pip install -r requirements-common.txt -r requirements-dml.txt
  - For Nvidia (any OS):
    pip install -r requirements-common.txt -r requirements-cuda.txt
  - For AMD ROCm (Linux only):
    pip install -r requirements-common.txt -r requirements-rocm.txt
  - For CPU (Linux/macOS only):
    pip install -r requirements-common.txt -r requirements-cpu.txt
- Run the server by executing `main.py`:
  python ./main.py
  This will run the server with default settings. Note that it will not open the web browser by default; copy the address from the command line.
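The backend and OS combinations from the setup steps above can be captured in a small lookup. This is an illustrative sketch only: the backend keys (`dml`, `cuda`, `rocm`, `cpu`), the OS names and both function names are hypothetical labels, while the requirements file names come from the commands above.

```python
import sys

# Requirements file for each backend, as listed in the install step.
REQUIREMENT_FILES = {
    "dml": "requirements-dml.txt",
    "cuda": "requirements-cuda.txt",
    "rocm": "requirements-rocm.txt",
    "cpu": "requirements-cpu.txt",
}

# Operating systems on which each backend is listed as available.
SUPPORTED_OS = {
    "dml": {"windows"},
    "cuda": {"windows", "linux", "macos"},
    "rocm": {"linux"},
    "cpu": {"linux", "macos"},
}


def pip_install_command(backend: str, os_name: str) -> list[str]:
    """Return the pip command for a backend, rejecting unsupported combinations."""
    if backend not in REQUIREMENT_FILES:
        raise ValueError(f"Unknown backend: {backend}")
    if os_name not in SUPPORTED_OS[backend]:
        raise ValueError(f"Backend {backend} is not listed for {os_name}")
    return ["pip", "install", "-r", "requirements-common.txt", "-r", REQUIREMENT_FILES[backend]]


def is_python_310(version_info=sys.version_info) -> bool:
    """Check that the interpreter is Python 3.10, as the setup steps require."""
    return tuple(version_info[:2]) == (3, 10)


if __name__ == "__main__":
    print(" ".join(pip_install_command("rocm", "linux")))
    print("Python 3.10:", is_python_310())
```

Rejecting unlisted combinations early, for example `dml` on Linux, gives a clearer error than a failed `pip install` halfway through.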
- [If not installed] Install `pyinstaller` with the following command:
  pip install --upgrade pip wheel setuptools pyinstaller
- Run the following command to build an executable:
  pyinstaller --clean -y --dist ./dist --workpath /tmp MMVCServerSIO.spec
  This will output the resulting executable in the `dist` folder.