Verity-1A: Florence-2 + FLODA Deepfake Detection Model
Model Description
Verity-1A is a multimodal model that merges Microsoft's Florence-2-base with the FLODA-deepfake LoRA adapter to detect AI-generated content. The merged model is specialized for identifying deepfakes and AI-generated images while retaining Florence-2's vision-language capabilities.
Model Architecture
- Base Model: Microsoft Florence-2-base (768d architecture)
- Enhancement: FLODA-deepfake LoRA adapter
- Model Size: ~447 MB
- Optimization: PEFT-based fusion for efficient inference
Key Features
- Deepfake Detection: Specialized for AI-generated content identification
- Multimodal: Combines vision and language understanding
- Compact: 6.7x smaller than Florence-2-large
- Production-Ready: Fully validated and optimized
Performance
- Architecture: 768-dimensional embeddings
- Parameters: ~232M parameters
- Inference: Optimized for real-time detection
- Compatibility: Full Transformers ecosystem support
Usage
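from transformers import AutoModelForCausalLM, AutoProcessor
import torch

# Use float16 on GPU; fall back to float32 on CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Load model
model = AutoModelForCausalLM.from_pretrained(
    "zelus82/verity-1A",
    torch_dtype=dtype,
    trust_remote_code=True
).to(device).eval()

# Load processor
processor = AutoProcessor.from_pretrained(
    "zelus82/verity-1A",
    trust_remote_code=True
)

# Example usage for deepfake detection
def detect_deepfake(image, text_prompt="Is this image AI-generated?"):
    # Move inputs to the model's device and cast pixel values to its dtype;
    # float32 pixel values would otherwise clash with float16 weights
    inputs = processor(text=text_prompt, images=image, return_tensors="pt").to(device, dtype)
    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=1024,
            num_beams=3
        )
    # Special tokens such as <s> and </s> are kept in the decoded string
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    return generated_text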
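Because the function above decodes with skip_special_tokens=False, the returned string still contains markers such as `<s>` and `</s>`. The following is a minimal post-processing sketch, not part of the model's API. It assumes the model answers with a short text whose first word is "fake" or "real"; the model card does not document the output format, and the helper names `clean_generated_text` and `to_label` are hypothetical.

```python
import re

# Markers that batch_decode leaves in place when skip_special_tokens=False.
SPECIAL_TOKEN_PATTERN = re.compile(r"</?s>|<pad>")


def clean_generated_text(generated_text: str) -> str:
    """Remove <s>, </s> and <pad> markers and surrounding whitespace."""
    return SPECIAL_TOKEN_PATTERN.sub("", generated_text).strip()


def to_label(generated_text: str) -> str:
    """Map a raw decoded string to 'fake', 'real', or 'unknown'.

    ASSUMPTION: the answer starts with the word 'fake' or 'real'.
    Anything else, including longer free-text answers, maps to 'unknown'.
    """
    answer = clean_generated_text(generated_text).lower()
    words = re.findall(r"[a-z\-]+", answer)
    if words and words[0] in ("fake", "real"):
        return words[0]
    return "unknown"
```

With the model loaded, this could be applied as `to_label(detect_deepfake(image))`. Returning "unknown" instead of guessing keeps unexpected outputs visible to the caller.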
Training Details
- Base Training: Microsoft Florence-2-base foundation
- Specialization: FLODA-deepfake LoRA fine-tuning
- Fusion Method: PEFT merge_and_unload for optimal performance
- Validation: Comprehensive 666-tensor validation passed
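The merge step named in the list above folds each LoRA update into its base weight, so no adapter is needed at inference time. Below is a plain-Python sketch of that arithmetic, assuming the standard LoRA formulation W' = W + (lora_alpha / r) * B * A. The helper names `matmul` and `merge_lora` and all numeric values are illustrative and not taken from this model.

```python
from typing import List

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Plain-Python matrix product of x (m x k) and y (k x n)."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def merge_lora(weight: Matrix, lora_a: Matrix, lora_b: Matrix,
               lora_alpha: float, rank: int) -> Matrix:
    """Return weight + (lora_alpha / rank) * (lora_b @ lora_a)."""
    scaling = lora_alpha / rank
    delta = matmul(lora_b, lora_a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: 2x2 weight, rank-1 adapter, lora_alpha = 2 (all values made up).
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # r x d_in  = 1 x 2
lora_b = [[0.5], [0.25]]   # d_out x r = 2 x 1
merged = merge_lora(weight, lora_a, lora_b, lora_alpha=2.0, rank=1)
print(merged)
```

After merging, the layer is an ordinary dense weight matrix of the original shape, which is why the merged checkpoint loads without the PEFT adapter files.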
Model Card
Attribute | Value |
---|---|
Model Type | Multimodal Vision-Language |
Base Architecture | Florence-2 |
Specialization | Deepfake Detection |
Model Size | 447 MB |
Parameters | ~232M |
Precision | Float16 |
License | MIT |
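As a rough consistency check on the table, the checkpoint size can be estimated from the parameter count and the precision. The sketch below assumes every parameter is stored as a 2-byte float16 value and ignores file metadata; `checkpoint_size_bytes` is a hypothetical helper, and 232,000,000 is the rounded figure from the table.

```python
def checkpoint_size_bytes(num_params: int, bytes_per_param: int) -> int:
    """Estimate raw weight storage, ignoring headers and metadata."""
    return num_params * bytes_per_param


params = 232_000_000          # "~232M" from the table (rounded)
float16_bytes = 2             # float16 uses 2 bytes per value

size = checkpoint_size_bytes(params, float16_bytes)
size_mb = size / 1e6          # decimal megabytes
size_mib = size / 2**20       # binary mebibytes

print(f"{size_mb:.0f} MB, {size_mib:.1f} MiB")
```

Under these assumptions the estimate is about 464 MB (roughly 442 MiB), which is in the same range as the listed 447 MB. The exact figure depends on the true parameter count and on how the file size is reported.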
Technical Specifications
- Hidden Size: 768
- Vocabulary Size: 51,289
- Vision Encoder: DaViT transformer-based (inherited from Florence-2-base)
- Language Model: Optimized for detection tasks
- LoRA Rank: 8 (optimal efficiency/performance)
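To illustrate why a rank of 8 is cheap at a hidden size of 768, the sketch below counts the weights LoRA adds to a single linear layer. It assumes a square 768 x 768 projection. Which layers the adapter actually targets is not stated here, so the layer shape is illustrative and `lora_param_count` is a hypothetical helper.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Weights LoRA adds to one d_out x d_in linear layer (A: r x d_in, B: d_out x r)."""
    return rank * (d_in + d_out)


hidden_size = 768   # from the specification list
rank = 8            # from the specification list

full = hidden_size * hidden_size
added = lora_param_count(hidden_size, hidden_size, rank)

print(f"full matrix: {full:,} weights")
print(f"LoRA update: {added:,} weights ({added / full:.2%} of the full matrix)")
```

For this layer shape the adapter trains 12,288 weights against 589,824 in the full matrix, about 2% of the layer.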
Limitations
- Optimized specifically for deepfake detection tasks
- Based on Florence-2-base architecture (768d)
- Not compatible with Florence-2-large components
- Requires trust_remote_code=True for full functionality
Citation
@misc{verity1a2024,
title={Verity-1A: Florence-2 Enhanced Deepfake Detection},
author={zelus82},
year={2024},
publisher={Hugging Face},
url={https://huggingface.co/zelus82/verity-1A}
}
Acknowledgments
- Microsoft for the Florence-2 foundation model
- FLODA team for the deepfake detection adapter
- Hugging Face for the ecosystem and hosting
Contact
For questions or collaborations, please reach out through the Hugging Face community discussions.
Built with ❤️ for safer AI content detection