There are so many pitfalls that I've been debugging this for two days now; I may find more later.
For the time being, these are the ones I found:
1)
```python
tokenizer = Wav2Vec2CTCTokenizer.from_pretrained("./", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|")
```
is wrong. This raises an exception if you don't already have a tokenizer saved in that directory.
2) You don't pass the vocab_file as input; the tokenizer is not pretrained (at that point you don't have one yet). It should be
```python
# TODO: make sure that the tokenizer is restored if pretrained
tokenizer = transformers.Wav2Vec2CTCTokenizer(
    vocab_file=vocab_path,
    unk_token="[UNK]",
    pad_token="[PAD]",
    word_delimiter_token="|",
)
```
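For completeness, here's a minimal sketch of producing the vocab file that gets passed in as `vocab_file` (the toy transcripts and character set are my own assumptions, not from the original post):

```python
import json
import os
import tempfile

# Toy transcript corpus -- stands in for the real training text.
transcripts = ["a cab", "b a"]

# CTC vocabularies for Wav2Vec2 are usually character-level,
# with the space character represented by the word delimiter "|".
chars = sorted(set("".join(transcripts)) - {" "})
vocab = {c: i for i, c in enumerate(chars)}
vocab["|"] = len(vocab)
vocab["[UNK]"] = len(vocab)
vocab["[PAD]"] = len(vocab)

# Write it out; this is the vocab_path the tokenizer constructor needs.
vocab_path = os.path.join(tempfile.mkdtemp(), "vocab.json")
with open(vocab_path, "w") as f:
    json.dump(vocab, f)
```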
but what do I know.
3) Passing
```python
tokenizer=processor.feature_extractor
```
to the Trainer is deprecated; `processing_class` is needed instead.

4) There is no fine-tuning here: we're simply retraining everything (which is not the most intelligent thing to do: we're training the CTC head from scratch, but fine-tuning the whole model with the same learning rate).
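On that last point, here's a minimal sketch of what partial freezing looks like, using a toy torch model as a stand-in (the layer sizes are arbitrary assumptions; on the real `Wav2Vec2ForCTC` you'd also use `freeze_feature_encoder()` and/or per-parameter-group learning rates):

```python
import torch

# Toy stand-in: a pretrained "backbone" plus a freshly initialized "CTC head".
backbone = torch.nn.Linear(10, 10)
head = torch.nn.Linear(10, 5)
model = torch.nn.Sequential(backbone, head)

# Fine-tuning setup: freeze the pretrained backbone so only the new head
# is trained, instead of retraining the whole model at one learning rate.
for p in backbone.parameters():
    p.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # only the head's 10*5 + 5 = 55 parameters remain trainable
```

An alternative to outright freezing is giving the backbone a much smaller learning rate than the head via optimizer parameter groups.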
Overall, it's not entirely your fault, since you're working with the badly organized and terribly documented HuggingFace library. But you clearly didn't test the code, which is kind of a bummer: maybe some of the errors are not due to your code, but you could have caught them by running it before posting.