Hugging Face
Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up
1
1
1
Miczu
Miczu
Follow
AI & ML interests
None yet
Recent Activity
upvoted
an
article
about 3 hours ago
G2P Shrinks Speech Models
reacted
to
hexgrad
's
post
with 🔥
about 3 hours ago
Wanted: Peak Data. I'm collecting audio data to train another TTS model: + AVM data: ChatGPT Advanced Voice Mode audio & text from source + Professional audio: Permissive (CC0, Apache, MIT, CC-BY) This audio should *impress* most native speakers, not just barely pass their audio Turing tests. Professional-caliber means S or A-tier, not your average bloke off the street. Traditional TTS may not make the cut. Absolutely no low-fi microphone recordings like Common Voice. The bar is much higher than last time, so there are no timelines yet and I expect it may take longer to collect such mythical data. Raising the bar means evicting quite a bit of old data, and voice/language availability may decrease. The theme is *quality* over quantity. I would rather have 1 hour of A/S-tier than 100 hours of mid data. I have nothing to offer but the north star of a future Apache 2.0 TTS model, so prefer data that you *already have* and costs you *nothing extra* to send. Additionally, *all* the new data may be used to construct public, Apache 2.0 voicepacks, and if that arrangement doesn't work for you, no need to send any audio. Last time I asked for horses; now I'm asking for unicorns. As of writing this post, I've currently got a few English & Chinese unicorns, but there is plenty of room in the stable. Find me over on Discord at `rzvzn`: https://discord.gg/QuGxSWBfQy
new
activity
9 days ago
hexgrad/Kokoro-82M:
Is v1.0 going to get onnx formatted models?
View all activity
Organizations
None yet
Miczu
's activity
All
Models
Datasets
Spaces
Papers
Collections
Community
Posts
Upvotes
Likes
Articles
liked
a model
10 days ago
hexgrad/Kokoro-82M
Text-to-Speech
•
Updated
8 days ago
•
271k
•
2.96k