
#1
by ClovisSilva - opened

The file cvit2_deepfake_detection_ep_50.pth appears to be documented as if it contains a complete PyTorch model (i.e., saved via torch.save(model)), but in reality, it is a checkpoint dictionary saved via torch.save({...}) and contains a state_dict instead.

This leads to the following problem:

model = torch.load("cvit2_deepfake_detection_ep_50.pth")
model.eval()  # 🔴 Fails: 'dict' object has no attribute 'eval'

Upon inspection, the object contains:

dict_keys(['epoch', 'state_dict', 'optimizer', 'min_loss'])

There is no code available to recreate the original model architecture used to generate the state_dict, making it impossible to use the model unless the exact layer names and dimensions are guessed or reverse-engineered.
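
For reference, here is a minimal sketch of how a checkpoint dictionary with these keys is normally consumed once the architecture is known. TinyNet and load_from_checkpoint are hypothetical stand-ins written for this illustration. TinyNet is not the real CViT2 architecture, and the demo writes its own small checkpoint to a temporary file rather than reading cvit2_deepfake_detection_ep_50.pth.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in model; NOT the real CViT2 architecture."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


def load_from_checkpoint(path, model):
    """Load weights from a checkpoint dict that has a 'state_dict' key."""
    checkpoint = torch.load(path, map_location="cpu")
    if not (isinstance(checkpoint, dict) and "state_dict" in checkpoint):
        raise ValueError("expected a checkpoint dict with a 'state_dict' key")
    # The weights only fit a model whose layer names and shapes match.
    model.load_state_dict(checkpoint["state_dict"])
    model.eval()  # works now: 'model' is an nn.Module, not a dict
    return model


# Demo: write a checkpoint with the same four keys, then restore it.
source_model = TinyNet()
optimizer = torch.optim.SGD(source_model.parameters(), lr=0.1)
ckpt_path = os.path.join(tempfile.mkdtemp(), "demo_checkpoint.pth")
torch.save(
    {
        "epoch": 50,
        "state_dict": source_model.state_dict(),
        "optimizer": optimizer.state_dict(),
        "min_loss": 0.1,
    },
    ckpt_path,
)

restored = load_from_checkpoint(ckpt_path, TinyNet())
```

With the real file, the same pattern would apply only after replacing TinyNet with the actual model class, since load_state_dict raises an error when the keys or tensor shapes do not match.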

Thank you for sharing this model; it shows promising results. With just a few improvements to documentation and packaging, it can be much easier to use for the community.

The inference code is hosted on Hugging Face Spaces. You can explore it here: https://huggingface.co/spaces/mhamza-007/deepfake-video-detection/tree/main
The model's architecture is defined in modelfile.py in that Space.

mhamza-007 changed discussion status to closed
mhamza-007 changed discussion status to open
