The DNR dataset is built from three well-established audio datasets: LibriSpeech, Free Music Archive (FMA), and Freesound Dataset 50k (FSD50K). We offer the dataset at both 16kHz and 44.1kHz sampling rates, along with time-stamped annotations for each class (genre for ‘music’, audio tags for ‘sound-effects’, and transcription for ‘speech’). Below, we give more information on how the dataset is built and what exactly it consists of. We also walk through building the dataset from scratch, for cases where that is needed.
- Dataset Overview
- Get the DNR Dataset
- Dataset Analysis
- Audio Examples and Samples
- Resources and Support
Dataset Overview
The Divide and Remix (DNR) dataset aims to support research on a relatively unexplored case of source separation: mixtures whose sources are music, speech, and sound effects (SFX). The dataset is built from three well-established datasets. Consequently, building DNR from scratch requires downloading those datasets first. Alternatively, DNR is also available on Zenodo.
Get the DNR Dataset
In order to obtain DNR, several options are available depending on the task at hand:
Download
- DNR-HQ (44.1kHz) is available on Zenodo. To download it, simply run:
    wget link_to_zenodo_dnr
- Alternatively, if DNR-16kHz is needed, please first download DNR-HQ locally. You can then downsample the dataset (either in-place or not) by cloning the dnr-utils repository and running:
    python dnr_utils.py --task=downsample --inplace=True
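For reference, going from 44.1kHz to 16kHz is a rational rate change. The sketch below only works out the reduced up/down factors and the resulting sample count; it says nothing about how dnr_utils.py performs the conversion, which is not described here. The function names are hypothetical.

```python
from fractions import Fraction

SOURCE_RATE = 44_100  # DNR-HQ sampling rate in Hz
TARGET_RATE = 16_000  # DNR-16kHz sampling rate in Hz


def resampling_ratio(source_rate=SOURCE_RATE, target_rate=TARGET_RATE):
    """Reduced (up, down) factors: upsample by `up`, then decimate by `down`."""
    ratio = Fraction(target_rate, source_rate)
    return ratio.numerator, ratio.denominator


def resampled_length(num_samples, source_rate=SOURCE_RATE, target_rate=TARGET_RATE):
    """Number of output samples, rounded up, for `num_samples` input samples."""
    up, down = resampling_ratio(source_rate, target_rate)
    return -(-num_samples * up // down)  # ceiling division on integers
```

With the default rates, the factors reduce to 160/441, so one second of DNR-HQ audio (44,100 samples) maps to 16,000 samples.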
Building DNR From Scratch
Since DNR is directly drawn from FSD50K, LibriSpeech/LibriVox, and FMA, we first need to download these datasets. Please head to the following links for more details on how to get them:
Datasets Downloads
- FSD50K
- FMA-Medium Set
- LibriSpeech/LibriVox
Please note that for FMA, only the medium set is required. In addition to the audio files, the metadata should also be downloaded. For LibriSpeech, DNR uses dev-clean, test-clean, and train-clean-100. DNR uses the folder structure and metadata from LibriSpeech, but ultimately builds the LibriSpeech-HQ dataset from the original LibriVox mp3s, which is why both are needed to build DNR.
After download, all four datasets are expected to be found in the same root directory. The root tree may look something like the one below. Because the standardization script looks for specific file names, please make sure that all directory names conform to the ones shown:
root
├── fma-medium
│   ├── fma_metadata
│   │   ├── genres.csv
│   │   └── tracks.csv
│   ├── 008
│   ├── 009
│   ├── 010
│   └── ...
├── fsd50k
│   ├── FSD50K.dev_audio
│   ├── FSD50K.eval_audio
│   └── FSD50K.ground_truth
│       ├── dev.csv
│       ├── eval.csv
│       └── vocabulary.csv
├── librispeech
│   ├── dev-clean
│   ├── test-clean
│   └── train-clean-100
└── librivox
    ├── 14
    ├── 16
    ├── 17
    └── ...
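Since the standardization script depends on these names, it can help to check the layout before running anything. The following is a minimal sketch, not part of dnr-utils: the function name and the list of checked entries are illustrative, taken from the tree above, and the numbered audio sub-folders are deliberately left unchecked.

```python
from pathlib import Path

# Entries copied from the tree above. Numbered audio sub-folders (008, 14, ...)
# are not checked because their exact set depends on the download.
EXPECTED_ENTRIES = [
    "fma-medium/fma_metadata/genres.csv",
    "fma-medium/fma_metadata/tracks.csv",
    "fsd50k/FSD50K.dev_audio",
    "fsd50k/FSD50K.eval_audio",
    "fsd50k/FSD50K.ground_truth/dev.csv",
    "fsd50k/FSD50K.ground_truth/eval.csv",
    "fsd50k/FSD50K.ground_truth/vocabulary.csv",
    "librispeech/dev-clean",
    "librispeech/test-clean",
    "librispeech/train-clean-100",
    "librivox",
]


def missing_entries(root):
    """Return the expected paths (relative to root) that do not exist."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    # Check the current directory; replace "." with your own root.
    missing = missing_entries(".")
    for entry in missing:
        print(f"missing: {entry}")
    print("layout OK" if not missing else f"{len(missing)} entries missing")
```

An empty list from missing_entries means every file and folder named in the tree was found under the given root.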
Datasets Standardization
Once all four datasets are downloaded, some standardization work needs to be taken care of. The standardization process can be executed by running standardization.py, which can be found in the dnr-utils repository. Before running the script, you may want to install the necessary dependencies listed in requirements.txt with pip install -r requirements.txt.
Note: pydub uses ffmpeg under the hood, so a system install of ffmpeg is required.
The standardization command may look something like:
    python standardization.py --fsd50k-path=./FSD50K --fma-path=./FMA --librivox-path=./LibriVox --librispeech-path=./LibriSpeech --dest-dir=./dest --validate-audio=True
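The same call can be assembled from a single root directory. This is only a sketch: the helper names are hypothetical, the flags are the ones shown in the command above, and the sub-directory names are assumed to follow the root tree shown earlier rather than the ./FSD50K-style paths in the example command.

```python
import shutil
from pathlib import Path


def build_standardization_command(root, dest_dir, validate_audio=True):
    """Assemble the standardization.py call as an argument list for subprocess.run."""
    root = Path(root)
    return [
        "python",
        "standardization.py",
        # Assumption: sub-directory names follow the root tree shown earlier.
        f"--fsd50k-path={root / 'fsd50k'}",
        f"--fma-path={root / 'fma-medium'}",
        f"--librivox-path={root / 'librivox'}",
        f"--librispeech-path={root / 'librispeech'}",
        f"--dest-dir={dest_dir}",
        f"--validate-audio={validate_audio}",
    ]


def ffmpeg_available():
    """True if an ffmpeg executable is on PATH (pydub needs one)."""
    return shutil.which("ffmpeg") is not None
```

From a checkout of dnr-utils, the returned list can be passed to subprocess.run(command, check=True) once ffmpeg_available() returns True.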
DNR Dataset Compilation
Once the three resulting datasets are standardized, we are ready to finally compile DNR. At this point you should already have cloned the dnr-utils repository, which contains two key files:
- config.py contains some configuration entries needed by the main script builder. You want to set all the appropriate paths pointing to your local datasets and ground truth files in there.
- The compilation for a given set (here, train, val, and eval) can be executed with compile_dataset.py, for example by running the following commands for each set:
    python compile_dataset.py with cfg.train
    python compile_dataset.py with cfg.val
    python compile_dataset.py with cfg.eval
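The three per-set commands can also be issued from a short loop. The sketch below is a hypothetical wrapper, not part of dnr-utils; it assumes it is run from the repository directory, and by default it only prints the commands.

```python
import subprocess

SPLITS = ("train", "val", "eval")  # the three sets named above


def compile_commands(splits=SPLITS):
    """Return one compile_dataset.py command per split, as argument lists."""
    return [["python", "compile_dataset.py", "with", f"cfg.{split}"] for split in splits]


def compile_all(dry_run=True):
    """Print each command; also run it when dry_run is False."""
    for command in compile_commands():
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first split that fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    compile_all(dry_run=True)
```

Calling compile_all(dry_run=False) runs the sets one after another and raises as soon as one of them exits with an error.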