Text-to-Speech
coqui
reubenm
Model works best with 6 seconds of reference, not 3
c386dfb