Data Pre-processing
Convert from MusicXML
Navigate to the data folder
cd data/
Modify the ORI_FOLDER and DES_FOLDER in 1_batch_xml2abc.py, then run this script:
python 1_batch_xml2abc.py
This will convert the MusicXML files into standard ABC notation files.
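After a batch conversion it can be useful to confirm that every source file produced an output. The helper below is a minimal sketch and not part of the repository: `find_unconverted` is a hypothetical name, and it assumes that inputs use the `.xml`, `.mxl`, or `.musicxml` extensions and that each output keeps the input's file stem with an `.abc` extension. Adjust both assumptions to match what the script actually writes.

```python
from pathlib import Path

# Assumed input extensions; adjust to match your dataset.
MUSICXML_SUFFIXES = {".xml", ".mxl", ".musicxml"}


def find_unconverted(ori_folder, des_folder):
    """Return sorted stems of MusicXML files in ori_folder that have
    no .abc file with the same stem in des_folder.

    Assumes (not confirmed by the script) that outputs reuse the input stem.
    """
    sources = {
        p.stem
        for p in Path(ori_folder).iterdir()
        if p.is_file() and p.suffix.lower() in MUSICXML_SUFFIXES
    }
    converted = {p.stem for p in Path(des_folder).glob("*.abc")}
    return sorted(sources - converted)
```

Calling `find_unconverted(ORI_FOLDER, DES_FOLDER)` with the same two folders set in the script returns the files worth inspecting by hand.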
Modify the ORI_FOLDER, INTERLEAVED_FOLDER, AUGMENTED_FOLDER, and EVAL_SPLIT in 2_data_preprocess.py:
ORI_FOLDER = ''  # Folder containing standard ABC notation files
INTERLEAVED_FOLDER = ''  # Output interleaved ABC notation files that are compatible with CLaMP 2 to this folder
AUGMENTED_FOLDER = ''  # On the basis of interleaved ABC, output key-augmented and rest-omitted files that are compatible with NotaGen to this folder
EVAL_SPLIT = 0.1  # Evaluation data ratio
Then run this script:
python 2_data_preprocess.py
This script does three things:
- Converts the standard ABC files to interleaved ABC, which is compatible with CLaMP 2. The output files are written to INTERLEAVED_FOLDER.
- Creates 15 key signature folders under AUGMENTED_FOLDER and writes interleaved ABC notation files with rest bars omitted. This is the data representation that NotaGen adopts.
- Generates data index files for training NotaGen, randomly splitting the data into train and eval sets according to the proportion that EVAL_SPLIT defines. The index files are named {AUGMENTED_FOLDER}_train.jsonl and {AUGMENTED_FOLDER}_eval.jsonl.
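The random train/eval split can be pictured with the short sketch below. It is an illustration under stated assumptions, not the code in 2_data_preprocess.py: the function names `split_index` and `write_jsonl`, the fixed seed, and the `"path"` field in each JSON record are all hypothetical, and the real index files may store different fields.

```python
import json
import random


def split_index(file_paths, eval_split=0.1, seed=0):
    """Shuffle file_paths and return (train, eval) lists.

    eval_split is the fraction of files assigned to the eval set.
    """
    paths = sorted(file_paths)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(paths)
    n_eval = round(len(paths) * eval_split)
    return paths[n_eval:], paths[:n_eval]


def write_jsonl(paths, out_path):
    """Write one JSON object per line; the "path" key is an assumed schema."""
    with open(out_path, "w", encoding="utf-8") as f:
        for path in paths:
            f.write(json.dumps({"path": path}) + "\n")
```

With `EVAL_SPLIT = 0.1` and 20 files, this sketch puts 2 files in the eval set and 18 in the train set. Each list would then be written to its own file, for example with `write_jsonl(train, AUGMENTED_FOLDER + "_train.jsonl")`.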
Data Post-processing
Preview Sheets in ABC Notation
We recommend EasyABC, a handy application for previewing, composing, and editing ABC notation.
To display the score image in EasyABC, add a line "X:1" before each piece :D
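Adding the "X:1" line by hand gets tedious for many files, so a small helper can do it. This is a minimal sketch, assuming each file holds a single piece; `add_reference_number` is a hypothetical name, not a function from this repository.

```python
def add_reference_number(abc_text, number=1):
    """Prepend an "X:<number>" line to an ABC piece.

    The text is returned unchanged if it already starts with an X: field.
    Assumes abc_text contains exactly one piece.
    """
    if abc_text.lstrip().startswith("X:"):
        return abc_text
    return f"X:{number}\n{abc_text}"
```

To use it, read a file's contents, pass them through `add_reference_number`, and write the result back before opening the file in EasyABC.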
Convert to MusicXML
- Go to the data folder
cd data/
- Modify the ORI_FOLDER and DES_FOLDER in 3_batch_abc2xml.py, then run this script:
python 3_batch_abc2xml.py
This will convert the standard/interleaved ABC notation files into MusicXML files.