Error trying to fine-tune this model via RAGatouille


Hello,

To save some time and avoid training a ColBERTv2 model on French data from scratch, I want to fine-tune this model on my own data. I am using RAGatouille 0.0.8.post2 with Python 3.12, and my training setup looks like this:

trainer = RAGTrainer(model_name="test", pretrained_model_name="antoinelouis/colbertv2-camembert-L4-mmarcoFR", language_code="fr")

trainer.prepare_training_data(
    raw_data=pairs,
    all_documents=documents,
    num_new_negatives=10,
    mine_hard_negatives=True,
)

trainer.train(batch_size=32,
              nbits=4, # How many bits the trained model will use when compressing indexes
              maxsteps=500000, # Hard stop on the maximum number of training steps
              use_ib_negatives=True, # Use in-batch negatives to calculate the loss
              dim=32, # How many dimensions per embedding. 128 is the default and works well.
              learning_rate=1e-5, # Learning rate; small values in [3e-6, 3e-5] work best if the base model is BERT-like, and 5e-6 is often the sweet spot
              doc_maxlen=160, # Maximum document length. Because of how ColBERT works, smaller chunks (128-256) work very well.
              use_relu=False, # Disable ReLU -- doesn't improve performance
              warmup_steps=20000, # Defaults to 10% of total steps
             )
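
For reference, the pairs and documents variables above are shaped roughly like this (a minimal made-up sketch, assuming the usual RAGatouille input format of (query, relevant passage) tuples plus a list of raw document strings; my real data is French domain data):

# Hypothetical illustration of the inputs to prepare_training_data -- not my actual data.
pairs = [
    ("Quelle est la capitale de la France ?", "Paris est la capitale de la France."),
    ("Qui a écrit Les Misérables ?", "Les Misérables est un roman de Victor Hugo."),
]
documents = [
    "Paris est la capitale de la France.",
    "Les Misérables est un roman de Victor Hugo publié en 1862.",
    "Le Mont-Blanc est le plus haut sommet des Alpes.",
]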

I keep getting the following error:

ValueError: The state dictionary of the model you are trying to load is corrupted. Are you sure it was properly saved?
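
I am not sure yet whether the problem comes from the checkpoint itself or from the way RAGatouille loads it. As a sanity check, I would expect something like the snippet below to load the same checkpoint directly with transformers (just a sketch; I am assuming, without having confirmed it, that the error originates in the weight-loading step):

# Sanity-check sketch: load the base checkpoint directly with transformers,
# outside RAGatouille, to see whether the weights file itself is readable.
from transformers import AutoModel, AutoTokenizer

checkpoint = "antoinelouis/colbertv2-camembert-L4-mmarcoFR"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)  # should not raise the ValueError if the state dict is intact
print(model.config.hidden_size)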

Does anyone have an idea of how to fix this, please?
