
Sisisi Certo

nicccobb

nicccobb's activity

view reply

There are so many issues that I have been debugging it for two days now. I might come up with more later on.

For the time being, I found these pitfalls:

  1. tokenizer = Wav2Vec2CTCTokenizer.from_pretrained("./", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|") is wrong. It raises an exception if you don't yet have a saved tokenizer.

  2. You don't pass the vocab_file as input, and the tokenizer is not pretrained (at that point you don't have one yet). It should be:

    import transformers

    # TODO: make sure that the tokenizer is restored if pretrained
    tokenizer = transformers.Wav2Vec2CTCTokenizer(
        vocab_file=vocab_path,  # path to your vocabulary JSON file
        unk_token="[UNK]",
        pad_token="[PAD]",
        word_delimiter_token="|",
    )

but what do I know.
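To make the first two points concrete, here is a minimal sketch of how the vocabulary file and the tokenizer could be created before anything is "pretrained". The helper names build_vocab and load_or_create_tokenizer are mine, not from the original post, and using tokenizer_config.json as the marker for a previously saved tokenizer is an assumption about what save_pretrained() leaves behind.

```python
import json
import tempfile
from pathlib import Path


def build_vocab(transcripts, unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|"):
    """Map every character in the transcripts to an id (hypothetical helper)."""
    # Spaces are dropped here because the word delimiter token stands in for them.
    chars = sorted(set("".join(transcripts)) - {" "})
    vocab = {ch: i for i, ch in enumerate(chars)}
    for token in (word_delimiter_token, unk_token, pad_token):
        vocab.setdefault(token, len(vocab))
    return vocab


def load_or_create_tokenizer(vocab_path, save_dir):
    """Restore a saved tokenizer if one exists, otherwise build it from vocab_path."""
    # Imported lazily so that build_vocab also works without transformers installed.
    from transformers import Wav2Vec2CTCTokenizer

    special = dict(unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|")
    # Assumption: tokenizer_config.json marks a directory written by save_pretrained().
    if (Path(save_dir) / "tokenizer_config.json").is_file():
        return Wav2Vec2CTCTokenizer.from_pretrained(save_dir, **special)
    tokenizer = Wav2Vec2CTCTokenizer(vocab_file=str(vocab_path), **special)
    tokenizer.save_pretrained(save_dir)
    return tokenizer


# Toy transcripts, only to show the shape of the vocabulary file.
vocab = build_vocab(["hello world", "hi"])
vocab_path = Path(tempfile.mkdtemp()) / "vocab.json"
vocab_path.write_text(json.dumps(vocab), encoding="utf-8")
```

After that, load_or_create_tokenizer(vocab_path, "./") should build the tokenizer on the first run and restore it on later runs, which is roughly what the TODO above asks for.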

  3. Passing tokenizer=processor.feature_extractor to the Trainer is deprecated; processing_class is needed instead.

  4. There is no real fine-tuning here: we're simply retraining everything, which is not the smartest thing to do, because the CTC head is trained from scratch with the same learning rate that is used to fine-tune the whole model.
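The last two points could be handled along these lines. This is only a sketch: processor_kwarg and split_param_groups are hypothetical helpers, the learning rates are placeholders rather than tuned values, and the "lm_head" prefix assumes the CTC head of Wav2Vec2ForCTC is the part you want to train faster.

```python
import inspect


def processor_kwarg(trainer_cls, processor):
    """Pick the keyword that the given Trainer class accepts for the processor."""
    params = inspect.signature(trainer_cls.__init__).parameters
    key = "processing_class" if "processing_class" in params else "tokenizer"
    return {key: processor}


def split_param_groups(named_params, head_prefix="lm_head", base_lr=1e-5, head_lr=1e-3):
    """Build optimizer parameter groups with a separate learning rate for the head."""
    base, head = [], []
    for name, param in named_params:
        # Parameters whose name starts with head_prefix belong to the new CTC head.
        (head if name.startswith(head_prefix) else base).append(param)
    return [
        {"params": base, "lr": base_lr},  # pretrained weights: small steps
        {"params": head, "lr": head_lr},  # freshly initialised head: larger steps
    ]
```

With a real model you would pass split_param_groups(model.named_parameters()) to something like torch.optim.AdamW, hand the optimizer to the Trainer through its optimizers argument, and add **processor_kwarg(Trainer, processor.feature_extractor) to the Trainer call so it works on both old and new versions.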

Overall, it's not entirely your fault, since you're working with the badly organized and terribly documented HuggingFace library. You clearly didn't test the code, which is kind of a bummer. Maybe some of the errors are not due to your code, but you could have tested it before posting this.

view reply

This is simply a low-effort, copy-pasted, badly coded post. Of course the author had little to no understanding of anything at all when he coded this.