Question
#7, opened by mrfakename
Hi,
Thanks for releasing Canary!
I noticed that on the model card, it stated:
The canary-1b-flash model is trained on a total of 85K hrs of speech data. It consists of 31K hrs of public data, 20K hrs collected by Suno, and 34K hrs of in-house data. The datasets below include conversations, videos from the web and audiobook recordings.
Is the Suno dataset for transcribing music lyrics and is it publicly available?
Thanks!
No, it's ASR data.
nithinraok changed discussion status to closed