RAVE Google Colab Server usage advice

For those who want to use RAVE (Realtime Audio Variational autoEncoder) from IRCAM, you need a suitable GPU for training times to be acceptable.
This can be expensive if you don't have one, in money but more importantly in time.
I was trying to take part in the RAVE Model Challenge for fun and for my DrilX experiments.

Thanks to Antoine Caillon we have the encoder and thanks to Moisés Horta we have a Google Colab implementation which lets you use free resources that are probably way faster than your hardware if you don't have the right Nvidia chips:
https://colab.research.google.com/drive/13qIV7txhkfkj3VPa-hrPPimO9HIiO-rE#scrollTo=HOxU6HKzQ3UM

But you can also try this Colab: https://colab.research.google.com/drive/1aK8K186QegnWVMAhfnFRofk_Jf7BBUxl?usp=sharing

But even with the nice guides on YouTube and elsewhere, there were a few tricks that I will write down here, hoping they help you get it to work too (because it did take me a while to finally kind of get it).

I hope this document might serve you as a static note to remember what is what if you, like me, tend to find the web or terminal interfaces a bit rough.. ;)

First, you might want to check the most understandable video from IRCAM, which is here on YouTube. Then here is what I had to write down as notes to get it to work on Google Colab:

1 - You need the audio files you want to use for training in a folder (I will refer to it as 'theNameOfTheFolderWhereTheAudioFilesAre'). WAV and AIFF files work, seemingly independently of the sampling frequency in my experience.
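
Before going further, it can help to confirm what is really in that folder. Here is a minimal sketch in plain Python that you can paste in a Colab code block. The function name list_audio_files and the extension list are my own assumptions, and the fact that it also looks into subfolders is just my choice here, not a statement about what RAVE itself does.

```python
from pathlib import Path

# Extensions mentioned in these notes; extend the set if your files differ (assumption).
AUDIO_EXTENSIONS = {".wav", ".aif", ".aiff"}


def list_audio_files(folder):
    """Return the sorted audio files found under `folder`, subfolders included."""
    root = Path(folder)
    if not root.is_dir():
        # A missing folder simply means nothing to train on.
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
```

For example, len(list_audio_files('/content/drive/MyDrive/theNameOfTheFolderWhereTheAudioFilesAre')) gives the number of files found, and 0 usually means a typo in the folder name.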

2 - Install the necessary software locally, on a server, on Google Colab, or on all three. The previous video is a good guide. The install lines for Colab are (you can type them and run them in a code block):

!curl -L https://repo.anaconda.com/miniconda/Miniconda3-py39_4.12.0-Linux-x86_64.sh -o miniconda.sh
!chmod +x miniconda.sh
!sh miniconda.sh -b -p /content/miniconda
!/content/miniconda/bin/pip install --quiet acids-rave
!/content/miniconda/bin/pip install --quiet --upgrade ipython ipykernel
!/content/miniconda/bin/conda install ffmpeg

Beware: there might be a prompt asking you to answer 'y' (yes, continue the installation). Adding -y to the conda install line should skip it.

2b - You should connect your Google Colab to your Google Drive now so that you don't lose your data when a session ends (which is not always under your control or by your choice). You can then resume a training later. To do so, click the small icon at the top of the files section, a file image with a small Google Drive icon in its top right corner. It will add a pre-filled code section to the main page that shows:

from google.colab import drive
drive.mount('/content/drive')

Just run this section and follow the instructions to give access to your Google Drive (which will usually appear under /content/drive/MyDrive/).
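
Since everything after this point writes to the Drive, I find it reassuring to check that the mount really worked before launching anything long. A small sketch, assuming the usual /content/drive mount point (the function name drive_is_mounted is hypothetical):

```python
from pathlib import Path


def drive_is_mounted(mount_point="/content/drive"):
    """Return True if a MyDrive folder is visible under the mount point."""
    return (Path(mount_point) / "MyDrive").is_dir()
```

If drive_is_mounted() returns False, run the mount section again before going on.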

3 - Preprocess the collection of audio files on your local machine, on a server or on Colab (this step is not very CPU/GPU consuming). You will get three files in a separate folder: data.mdb, lock.mdb and metadata.yaml. The training reads these files to build the model, so they have to be accessible from wherever you run the training (e.g. a terminal window or a Google Colab page). The Google Colab code block should be the following, on one single line (no line break):
!/content/miniconda/bin/rave preprocess --input_path /content/drive/MyDrive/theNameOfTheFolderWhereTheAudioFilesAre --output_path /content/drive/MyDrive/theNameOfTheFolderWhereYouWantToHavePreparedTrainingDataWrittenIn --channels 1
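
Once the preprocessing has finished, a quick check that the three files are really there can save a failed training run later. This is only a sketch: the function name is hypothetical, and I am assuming the database file carries the default LMDB name data.mdb.

```python
from pathlib import Path

# File names expected after preprocessing (data.mdb is the LMDB default, assumed here).
EXPECTED_FILES = ("data.mdb", "lock.mdb", "metadata.yaml")


def missing_preprocessed_files(folder):
    """Return the expected file names that are NOT present in `folder`."""
    root = Path(folder)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means the folder looks ready to be passed to the training.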

3 (optional, if the training fails with the error below) - I had to install this for the training to run afterwards; it raised an error otherwise:

!apt-get update && apt-get install -y sox libsox-dev libsox-fmt-all

This was the error I got at the first training run before this install:
OSError: libsox.so: cannot open shared object file: No such file or directory
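
If you want to know whether you are affected before starting a training, you can try to load the library yourself. A minimal sketch using only the Python standard library (the function name libsox_is_loadable is mine; it just tries the same library name as in the error message):

```python
import ctypes


def libsox_is_loadable(name="libsox.so"):
    """Return True if the shared library can be opened, False otherwise."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Same kind of failure as in the error message above.
        return False
    return True
```

If it returns False, the apt-get line above is the fix that worked for me.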

4 - Start the training process. It can be stopped and resumed later if the training files are stored on your Drive, so pay attention to the saving parameters you ask for. The Google Colab code block should be:

!/content/miniconda/bin/rave train --name aNameYouWantToGiveItThatWillGenerateAFolderWithItAndACodeAfter --db_path /content/drive/MyDrive/theNameOfTheFolderWhereYouWantToHavePreparedTrainingDataWrittenIn/ --out_path /content/drive/MyDrive/theNameOfAFolderWhereYouWantToSaveTheDataCreated --config v2 --augment mute --augment compress --augment gain --save_every 10000 --channels 1

The --save_every argument (a number) is the number of iterations after which a temporary checkpoint file is created (named epoch_theNumber.ckpt). Independently of this, other ckpt files may be created with the name epoch-epoch=theEpochNumberWhenItWasCreated. An epoch is one complete pass through your dataset and thus corresponds to a number of iterations that depends on the dataset.
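
To get a feeling for how iterations and epochs relate, here is a small worked calculation. It is a rough sketch under my own assumptions: n_examples is the number of training chunks produced by the preprocessing (not the number of audio files), an incomplete last batch is dropped, and the numbers in the example are made up.

```python
def steps_per_epoch(n_examples, batch_size):
    """Iterations needed for one pass over the data (incomplete last batch dropped)."""
    return n_examples // batch_size


def epochs_between_saves(save_every, n_examples, batch_size):
    """How many epochs go by between two --save_every checkpoints."""
    steps = steps_per_epoch(n_examples, batch_size)
    if steps == 0:
        raise ValueError("dataset smaller than one batch")
    return save_every / steps
```

With a made-up dataset of 4000 chunks and a batch of 8, one epoch is 500 iterations, so --save_every 10000 would write a checkpoint every 20 epochs.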

5 - Stop the process by stopping the code block. You can resume only if the files are stored somewhere you can access again, so don't forget that, and note the names of your folders (it can get messy).
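
One way to keep the folder names from getting messy is to write them down once as Python variables and build the command lines from them. The sketch below only uses flags that already appear in these notes (I left the --augment ones out to keep it short); the function names are hypothetical.

```python
import shlex

# Path used by the install lines earlier in these notes.
RAVE = "/content/miniconda/bin/rave"


def preprocess_command(audio_dir, data_dir, channels=1):
    """Build the one-line 'rave preprocess' command as a string."""
    return shlex.join([
        RAVE, "preprocess",
        "--input_path", audio_dir,
        "--output_path", data_dir,
        "--channels", str(channels),
    ])


def train_command(name, data_dir, out_dir, config="v2", save_every=10000, channels=1):
    """Build the one-line 'rave train' command as a string."""
    return shlex.join([
        RAVE, "train",
        "--name", name,
        "--db_path", data_dir,
        "--out_path", out_dir,
        "--config", config,
        "--save_every", str(save_every),
        "--channels", str(channels),
    ])
```

In a Colab code block you can store the result in a variable, for instance cmd = train_command(...), and then run it with !{cmd}. shlex.join also puts quotes around folder names that contain spaces.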

6 - Resume the training process if it stopped for whatever reason. Your preprocessed data should already be there, so you shouldn't need to reprocess the original audio files. Be careful with --out_path: if you repeat the name of the autogenerated folder, it will create a subfolder inside the original one with a duplicate of the config.gin file (and I have no idea of the impact on your training). The Google Colab code block should be (again on one single line):

!/content/miniconda/bin/rave train --config v2 --db_path /content/drive/MyDrive/theNameOfTheFolderWhereYouWantToHavePreparedTrainingDataWrittenIn --out_path /content/drive/MyDrive/ --name aNameYouWantToGiveItThatYouGaveBeforeAsANameForTraining --ckpt /content/drive/MyDrive/aNameYouWantToGiveItThatWillGenerateAFolderWithItAndACodeAfter/version_theNumberOfTheLatestVersionThatWasRunningUsuallyAddsAfterEachResumeAndIs0TheFirstTime/checkpoints/theLatestCheckpointFileNamedEpochWith.ckpt --val_every 1000 --channels 1 --batch 2 --save_every 3000
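
Finding the latest checkpoint by hand in the files panel is tedious, so here is a sketch that looks for it. It assumes the folder layout described above (version_N/checkpoints/something.ckpt inside the run folder) and takes the most recently modified file as "the latest", which is my own simplification. The function name is hypothetical.

```python
from pathlib import Path


def latest_checkpoint(run_dir):
    """Return the most recently modified .ckpt under version_*/checkpoints, or None."""
    candidates = list(Path(run_dir).glob("version_*/checkpoints/*.ckpt"))
    if not candidates:
        return None
    # Newest modification time wins.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The returned path is what I would pass to --ckpt; if it returns None, there is nothing to resume from in that folder.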

7 - Create the file for your RAVE decoder (VST), which has the .ts extension. The Google Colab code block should be:

!/content/miniconda/bin/rave export --run /content/drive/MyDrive/aNameYouWantToGiveItThatWillGenerateAFolderWithItAndACodeAfter/ --streaming TRUE --fidelity 0.98

I will add the lines for TensorBoard monitoring of the model being trained as soon as possible.

@ 1995-2025 zzz.ch - 30+ years of Ze Zee Zone