A different audio tokenizer?

by yukiarimo - opened 11 days ago

11 days ago

Hello! I just tried the model. Seems good. Training works.

But I would love to see a version with a more straightforward audio tokenizer, such as WavTokenizer or Mimi (Kyutai's codec, which Sesame uses), instead of SNAC.

Thanks!

kadirnar

Vyvo org 10 days ago

I will release models trained with different codecs in a month. WavTokenizer is nice, but its quality isn't great. Mimi seems good. However, I prefer better and faster codec architectures.

yukiarimo

10 days ago

Got it, thanks for your response! I would like to see codecs that produce a single stream of tokens instead of three, as SNAC does :)
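For context, a three-stream codec can still feed a single-sequence language model if the streams are interleaved into one list. The sketch below is only an illustration, not the model's actual preprocessing: it assumes three code levels whose lengths are in a 1:2:4 ratio, a hypothetical codebook size of 4096, and an arbitrary per-frame ordering. The function names are made up for this example.

```python
# Minimal sketch: interleave three hierarchical code streams into one token list.
# Assumptions (not from the thread): lengths are in a 1:2:4 ratio, each level
# has `codebook_size` entries, and the per-frame ordering below is arbitrary.

def flatten_codes(coarse, mid, fine, codebook_size=4096):
    """Return one flat token list with 7 tokens per coarse frame."""
    if len(mid) != 2 * len(coarse) or len(fine) != 4 * len(coarse):
        raise ValueError("expected stream lengths in a 1:2:4 ratio")
    flat = []
    for i, c in enumerate(coarse):
        # Offset each level so ids from different codebooks never collide.
        flat.append(c)
        flat.extend(m + codebook_size for m in mid[2 * i:2 * i + 2])
        flat.extend(f + 2 * codebook_size for f in fine[4 * i:4 * i + 4])
    return flat


def unflatten_codes(flat, codebook_size=4096):
    """Invert flatten_codes, recovering the three original streams."""
    if len(flat) % 7 != 0:
        raise ValueError("flat length must be a multiple of 7")
    coarse, mid, fine = [], [], []
    for start in range(0, len(flat), 7):
        frame = flat[start:start + 7]
        coarse.append(frame[0])
        mid.extend(t - codebook_size for t in frame[1:3])
        fine.extend(t - 2 * codebook_size for t in frame[3:7])
    return coarse, mid, fine


coarse = [1, 2]
mid = [10, 11, 12, 13]
fine = [100, 101, 102, 103, 104, 105, 106, 107]
flat = flatten_codes(coarse, mid, fine)
print(len(flat))  # 14 tokens: 7 per coarse frame
```

The trade-off is sequence length: every coarse frame costs seven tokens, which is one reason a codec with a single native stream is attractive.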

yukiarimo

8 days ago

Also, please release a 48 kHz version and, if possible, make the codec fine-tunable! Thanks!

kadirnar

Vyvo org 8 days ago

I'm developing the CodecHub library to measure the performance of different Audio Codec models.
https://github.com/Vyvo-Labs/CodecHub

For example, this codec looks promising, but I haven't measured its quality yet. That's why I want to test many codec models and choose a faster one.
https://github.com/zhai-lw/SQCodec

kadirnar

Vyvo org 8 days ago

I haven't trained a 48 kHz model. Gathering a suitable dataset for this might be difficult. The Emilia dataset is at 24 kHz. I could upsample it, but for 150k hours of data, this would take too long and be very costly.
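To give a rough sense of scale, here is a back-of-the-envelope storage estimate. It is only a sketch under assumptions that are not stated in the thread: uncompressed 16-bit mono PCM and decimal terabytes. It covers storage only, not the compute time for resampling.

```python
# Rough storage estimate for uncompressed PCM audio.
# Assumptions (not from the thread): 16-bit samples, mono, 1 TB = 1e12 bytes.

def pcm_storage_terabytes(hours, sample_rate_hz, bytes_per_sample=2, channels=1):
    """Return the size in terabytes of `hours` of raw PCM audio."""
    seconds = hours * 3600
    total_bytes = seconds * sample_rate_hz * bytes_per_sample * channels
    return total_bytes / 1e12


at_24k = pcm_storage_terabytes(150_000, 24_000)
at_48k = pcm_storage_terabytes(150_000, 48_000)
print(f"{at_24k:.2f} TB at 24 kHz, {at_48k:.2f} TB at 48 kHz")
```

Under these assumptions, doubling the sample rate doubles the footprint, from about 26 TB to about 52 TB.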

yukiarimo

8 days ago

Nah, it’s fine. Even if your dataset is 24 kHz, just make sure the codec itself is 48 kHz, so people can fine-tune up to 48 kHz :)

yukiarimo

7 days ago

Solved

yukiarimo changed discussion status to closed 7 days ago
