Report
The file cvit2_deepfake_detection_ep_50.pth appears to be documented as if it contained a complete PyTorch model (i.e., one saved via torch.save(model)). In reality, it is a checkpoint dictionary saved via torch.save({...}), and the weights are stored inside it as a state_dict.
This leads to the following problem:
model = torch.load("cvit2_deepfake_detection_ep_50.pth")
model.eval()  # 🔴 Fails: 'dict' object has no attribute 'eval'
Upon inspection, the object contains:
dict_keys(['epoch', 'state_dict', 'optimizer', 'min_loss'])
There is no code available to recreate the original model architecture used to generate the state_dict, making it impossible to use the model unless the exact layer names and dimensions are guessed or reverse-engineered.
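As a starting point for that kind of reverse-engineering, the layer names and tensor shapes stored in the state_dict can at least be listed. This is only a sketch: summarize_state_dict is a hypothetical helper name, and it assumes each entry exposes a .shape attribute, as PyTorch tensors do.

```python
def summarize_state_dict(state_dict):
    """Return (parameter name, shape) pairs for every entry in a state_dict.

    Hypothetical helper: assumes tensor-like values with a `.shape`
    attribute; any other value is reported with a shape of None.
    """
    summary = []
    for name, value in state_dict.items():
        shape = getattr(value, "shape", None)
        summary.append((name, tuple(shape) if shape is not None else None))
    return summary


# Usage (requires PyTorch and the checkpoint file):
#   import torch
#   checkpoint = torch.load("cvit2_deepfake_detection_ep_50.pth", map_location="cpu")
#   for name, shape in summarize_state_dict(checkpoint["state_dict"]):
#       print(name, shape)
```

The listing shows what the layers are called and how large they are, but not how they are wired together, so it narrows the guesswork rather than replacing the original model definition.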
Thank you for sharing this model; it shows promising results. With just a few improvements to documentation and packaging, it could be much easier for the community to use.
The inference code is hosted on Hugging Face Spaces. You can explore it here: https://huggingface.co/spaces/mhamza-007/deepfake-video-detection/tree/main
The model's architecture is defined in modelfile.py.
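With the architecture from modelfile.py instantiated, loading the checkpoint might look like the sketch below. The helper names extract_state_dict and load_for_inference are hypothetical, and the model argument stands in for whatever class modelfile.py defines; its name and constructor arguments are not stated here, so they are left to the caller.

```python
def extract_state_dict(checkpoint):
    """Pull the weights out of a checkpoint dictionary.

    Assumes the layout reported in this thread: a dict whose
    'state_dict' key holds the model weights.
    """
    if not isinstance(checkpoint, dict):
        raise TypeError(
            f"expected a checkpoint dict, got {type(checkpoint).__name__}"
        )
    if "state_dict" not in checkpoint:
        raise KeyError(f"no 'state_dict' key; found keys: {list(checkpoint)}")
    return checkpoint["state_dict"]


def load_for_inference(model, path):
    """Load checkpoint weights into an already-constructed model.

    Hypothetical helper: `model` must be an instance of the architecture
    the checkpoint was trained with.
    """
    import torch  # imported here so extract_state_dict works without PyTorch

    # map_location="cpu" lets the file load on machines without a GPU.
    checkpoint = torch.load(path, map_location="cpu")
    model.load_state_dict(extract_state_dict(checkpoint))
    model.eval()  # valid here: model is an nn.Module, not a dict
    return model
```

Recent PyTorch releases default torch.load to weights_only=True. If the extra entries in this checkpoint (optimizer, min_loss) are rejected under that setting, passing weights_only=False may be needed, and only for files from a trusted source.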