Question

#7
by mrfakename - opened

Hi,
Thanks for releasing Canary!
I noticed that the model card states:

The canary-1b-flash model is trained on a total of 85K hrs of speech data. It consists of 31K hrs of public data, 20K hrs collected by Suno, and 34K hrs of in-house data. The datasets below include conversations, videos from the web and audiobook recordings.

Is the Suno dataset for transcribing music lyrics and is it publicly available?
Thanks!

NVIDIA org

No, it's ASR data.

nithinraok changed discussion status to closed