This is a conditional U-Net model for music generation that operates on mel spectrogram images. The model was trained on the alppo/music dataset, which covers 5 genres. It accepts 512x512 images and 1x64 condition embeddings, which can be generated with my own variational autoencoder implementation.
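As a rough illustration of the input shapes described above, the following sketch builds random placeholder arrays and validates them. It does not load or run the model. The channel count (1), the `(batch, channels, height, width)` layout, and the helper names `make_dummy_inputs` and `check_inputs` are assumptions made for this example; only the 512x512 image size and the 1x64 embedding size come from the description.

```python
import numpy as np

IMAGE_SIZE = 512     # from the description: 512x512 spectrogram images
EMBEDDING_DIM = 64   # from the description: 1x64 condition embeddings


def make_dummy_inputs(batch_size=1, channels=1, seed=0):
    """Build random placeholder inputs with the documented sizes.

    ASSUMPTION: a single image channel and a (batch, channels, height, width)
    layout. Neither is stated in the model description.
    """
    rng = np.random.default_rng(seed)
    image = rng.standard_normal(
        (batch_size, channels, IMAGE_SIZE, IMAGE_SIZE)
    ).astype(np.float32)
    # One 1x64 condition embedding per sample in the batch.
    condition = rng.standard_normal(
        (batch_size, 1, EMBEDDING_DIM)
    ).astype(np.float32)
    return image, condition


def check_inputs(image, condition):
    """Raise ValueError if the arrays do not match the expected shapes."""
    if image.ndim != 4 or image.shape[-2:] != (IMAGE_SIZE, IMAGE_SIZE):
        raise ValueError(f"expected (N, C, 512, 512) image, got {image.shape}")
    expected = (image.shape[0], 1, EMBEDDING_DIM)
    if condition.shape != expected:
        raise ValueError(f"expected condition {expected}, got {condition.shape}")


image, condition = make_dummy_inputs(batch_size=2)
check_inputs(image, condition)
print(image.shape, condition.shape)
```

In practice the placeholder image would be replaced by a real mel spectrogram and the placeholder embedding by the output of the variational autoencoder mentioned above.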