Verity-1A: Florence-2 + FLODA Deepfake Detection Model

🎯 Model Description

Verity-1A is an advanced multimodal model combining Microsoft's Florence-2-base with the FLODA-deepfake LoRA adapter for enhanced AI-generated content detection. This fusion creates a specialized model optimized for identifying deepfakes and AI-generated images while maintaining Florence-2's powerful vision-language capabilities.

πŸ—οΈ Model Architecture

  • Base Model: Microsoft Florence-2-base (768d architecture)
  • Enhancement: FLODA-deepfake LoRA adapter
  • Model Size: ~447 MB
  • Optimization: PEFT-based fusion for efficient inference

πŸš€ Key Features

  • βœ… Deepfake Detection: Specialized for AI-generated content identification
  • βœ… Multimodal: Combines vision and language understanding
  • βœ… Compact: 6.7x smaller than Florence-2-large
  • βœ… Production-Ready: Fully validated and optimized

πŸ“Š Performance

  • Architecture: 768-dimensional embeddings
  • Parameters: ~232M parameters
  • Inference: Optimized for real-time detection
  • Compatibility: Full Transformers ecosystem support

πŸ› οΈ Usage

from transformers import AutoModelForCausalLM, AutoProcessor
import torch

# Load model
model = AutoModelForCausalLM.from_pretrained(
    "zelus82/verity-1A",
    torch_dtype=torch.float16,
    trust_remote_code=True
)

# Load processor
processor = AutoProcessor.from_pretrained(
    "zelus82/verity-1A",
    trust_remote_code=True
)

# Example usage for deepfake detection
def detect_deepfake(image, text_prompt="Is this image AI-generated?"):
    inputs = processor(text=text_prompt, images=image, return_tensors="pt")
    
    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=1024,
            num_beams=3
        )
    
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    return generated_text

πŸŽ“ Training Details

  • Base Training: Microsoft Florence-2-base foundation
  • Specialization: FLODA-deepfake LoRA fine-tuning
  • Fusion Method: PEFT merge_and_unload for optimal performance
  • Validation: Comprehensive 666-tensor validation passed

πŸ“‹ Model Card

Attribute Value
Model Type Multimodal Vision-Language
Base Architecture Florence-2
Specialization Deepfake Detection
Model Size 447 MB
Parameters ~232M
Precision Float16
License MIT

πŸ”§ Technical Specifications

  • Hidden Size: 768
  • Vocabulary Size: 51,289
  • Vision Encoder: Advanced transformer-based
  • Language Model: Optimized for detection tasks
  • LoRA Rank: 8 (optimal efficiency/performance)

⚠️ Limitations

  • Optimized specifically for deepfake detection tasks
  • Based on Florence-2-base architecture (768d)
  • Not compatible with Florence-2-large components
  • Requires trust_remote_code=True for full functionality

πŸ“„ Citation

@model{verity1a2024,
  title={Verity-1A: Florence-2 Enhanced Deepfake Detection},
  author={zelus82},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/zelus82/verity-1A}
}

🀝 Acknowledgments

  • Microsoft for the Florence-2 foundation model
  • FLODA team for the deepfake detection adapter
  • Hugging Face for the ecosystem and hosting

πŸ“ž Contact

For questions or collaborations, please reach out through the Hugging Face community discussions.


Built with ❀️ for safer AI content detection

Downloads last month
12
Safetensors
Model size
232M params
Tensor type
F16
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support