Some Questions About the Data and Fine-Tuning Process

#2 opened by nmcuong

Thank you for sharing this — I'm also interested in this problem for another language. So I have a few questions:

  1. How much data did you need to fine-tune this ByT5 model?
  2. Where can I find documentation or resources on how to fine-tune it?

Thank you very much!

Folx.it org

Hello!
The dataset used here was a subset of our larger dataset, roughly 23k pairs in the format:

sentence,normalized_sentence
"Kto by pomyślał, że ona może ważyć 43 kg i mieć 152 cm wzrostu?","Kto by pomyślał, że ona może ważyć czterdzieści trzy kilogramy i mieć sto pięćdziesiąt dwa centymetry wzrostu?"

You can follow any T5 fine-tuning tutorial (for example, one for machine translation) and teach the model the task:

normalize Kto by pomyślał, że ona może ważyć 43 kg i mieć 152 cm wzrostu? -> Kto by pomyślał, że ona może ważyć czterdzieści trzy kilogramy i mieć sto pięćdziesiąt dwa centymetry wzrostu?
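As a starting point, here is a rough fine-tuning sketch using the `transformers` Seq2SeqTrainer, continuing from the CSV loaded above. The checkpoint name, `"normalize "` prefix handling, and hyperparameters are illustrative assumptions, not the exact setup used for this model:

```python
# A hedged fine-tuning sketch; hyperparameters are illustrative, not tuned.
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

model_name = "google/byt5-small"  # or whichever checkpoint you start from
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def preprocess(batch):
    # ByT5 works directly on bytes, so preprocessing is just the task
    # prefix plus tokenization of source and target.
    inputs = tokenizer(
        ["normalize " + s for s in batch["sentence"]],
        truncation=True, max_length=512,
    )
    labels = tokenizer(
        batch["normalized_sentence"], truncation=True, max_length=512,
    )
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(
    preprocess, batched=True, remove_columns=dataset["train"].column_names
)

args = Seq2SeqTrainingArguments(
    output_dir="byt5-normalizer",
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```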


Thank you. I was able to fine-tune the model for my own language on a fairly large dataset, even though the data was inconsistent and noisy; the accuracy was still within an acceptable range.

That said, I've noticed that inference with the transformers library is quite slow, which makes it difficult to achieve real-time performance in practical settings such as TTS or chatbot pipelines.
Do you have any suggestions regarding deployment? I appreciate your help.

Folx.it org

You can convert the model to the CTranslate2 format, which will speed things up; CT2 supports T5 models.
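For reference, a hedged sketch of the conversion and inference path, based on CTranslate2's documented T5 workflow; the paths and the int8 quantization are placeholder assumptions:

```python
# A hedged sketch: convert the fine-tuned checkpoint to CTranslate2 and run
# inference with it. Paths and int8 quantization are placeholders.
import ctranslate2
import transformers

# One-time conversion (equivalent to the ct2-transformers-converter CLI).
converter = ctranslate2.converters.TransformersConverter("./byt5-normalizer")
converter.convert("./byt5-normalizer-ct2", quantization="int8")

# Inference: CT2 operates on token strings, so round-trip through the tokenizer.
translator = ctranslate2.Translator("./byt5-normalizer-ct2", device="cpu")
tokenizer = transformers.AutoTokenizer.from_pretrained("./byt5-normalizer")

text = "normalize Kto by pomyślał, że ona może ważyć 43 kg i mieć 152 cm wzrostu?"
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(text))
result = translator.translate_batch([tokens])[0]
output_ids = tokenizer.convert_tokens_to_ids(result.hypotheses[0])
print(tokenizer.decode(output_ids, skip_special_tokens=True))
```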

Thank you very much. I’ve finished the deployment and the speed has improved significantly.
