Model weights for the paper "Data-Augmented Phrase-Level Alignment for Mitigating Object Hallucination"
Pritam Sarkar
pritamqu
AI & ML interests
multimodal learning with vision, language, and audio; generative modeling; large multimodal models (LMMs); multimodal LLMs (MLLMs); AI agents; alignment; representation learning; self-supervised and unsupervised learning; vision-language models; audio-visual models; foundation models; computer vision
Recent Activity
commented on a paper 10 days ago: VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models
updated a dataset 10 days ago: pritamqu/VCRBench
published a dataset 11 days ago: pritamqu/VCRBench
Organizations
None yet
Collections: 1
Models: 10

pritamqu/LongVU_Qwen2_7B-RRPO-16f
pritamqu/LLaVA-Video-7B-Qwen2-RRPO-32f
pritamqu/LLaVA-Video-7B-Qwen2-RRPO-16f
pritamqu/VideoChat2_stage3_Mistral_7B-RRPO-16f-LORA
pritamqu/LLaVA-Video-7B-Qwen2-RRPO-32f-LORA
pritamqu/LLaVA-Video-7B-Qwen2-RRPO-16f-LORA
pritamqu/LongVU_Qwen2_7B-RRPO-16f-LORA
pritamqu/halva13b-lora
pritamqu/halva7b-lora
pritamqu/halva13b384-lora