Variational Autoencoder Conditioned Diffusion Model

This model generates music tracks conditioned on input playlists: a Variational Autoencoder (VAE) extracts a "taste" representation from each playlist, which then conditions a diffusion model that synthesizes new audio.

Model Details

  • VAE: Learns a compressed latent space representation of the input data, specifically mel spectrogram images of audio samples.
  • Diffusion Model: Generates new data points by progressively refining random noise into meaningful data, conditioned on the VAE's latent space.
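The two components above can be sketched as follows. This is an illustrative NumPy sketch only: the card does not document the architecture, so all dimensions, the linear encoder, the noise schedule, and the concatenation-based conditioning are assumptions, not the model's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes -- the real model's dimensions are not documented.
SPEC_SHAPE = (128, 64)          # mel spectrogram: 128 mel bins x 64 frames
SPEC_DIM = 128 * 64             # flattened spectrogram
LATENT_DIM = 32                 # VAE latent ("taste") vector

# --- VAE encoder (sketch): map a spectrogram to mu and log-variance ---
W_mu = rng.normal(0, 0.01, (SPEC_DIM, LATENT_DIM))
W_logvar = rng.normal(0, 0.01, (SPEC_DIM, LATENT_DIM))

def encode(spec):
    """Return (mu, logvar) parameterizing q(z | spectrogram)."""
    x = spec.reshape(-1)
    return x @ W_mu, x @ W_logvar

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps (the reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# --- Forward diffusion: progressively corrupt data with Gaussian noise ---
T = 1000
betas = np.linspace(1e-4, 0.02, T)          # assumed linear schedule
alphas_cum = np.cumprod(1.0 - betas)

def q_sample(x0, t):
    """Draw a noisy sample x_t from clean data x0 at timestep t."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_cum[t]) * x0 + np.sqrt(1.0 - alphas_cum[t]) * noise

# --- Conditioning: the denoiser sees (x_t, timestep, taste vector) ---
def denoiser_input(x_t, t, z):
    """Concatenate noisy spectrogram, normalized timestep, and latent z."""
    return np.concatenate([x_t.reshape(-1), [t / T], z])

# Demo: encode one track's spectrogram, then build a denoiser input.
spec = rng.standard_normal(SPEC_SHAPE)
mu, logvar = encode(spec)
z = reparameterize(mu, logvar)
x_t = q_sample(spec, t=500)
inp = denoiser_input(x_t, 500, z)
print(inp.shape)  # (8225,) = 8192 spectrogram values + 1 timestep + 32 latent dims
```

At generation time the same conditioning path would run in reverse: sample z from playlist spectrograms, start from pure noise, and iteratively denoise with z held fixed.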

Dataset used to train alppo/vae-conditioned-diffusion-model_v2