license: mit
base_model:
  - VITA-MLLM/VITA

MAVEN

We provide the LoRA fine-tuned model weights and configuration files for MAVEN, built on the VITA-MLLM/VITA base model. These files can be used to reproduce our experimental results.
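Before loading the weights into a training or inference pipeline, it can help to check what a downloaded checkpoint actually contains. The following is a minimal sketch, assuming the weights are distributed as `.safetensors` files and have already been downloaded to a local directory. The function names and the directory path are hypothetical and are not part of the MAVEN release. The sketch relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header), so it needs nothing beyond the Python standard library.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # The next header_len bytes are a UTF-8 JSON object describing tensors.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_checkpoint(directory):
    """Map each .safetensors file in `directory` to {tensor_name: shape}."""
    summary = {}
    for path in sorted(Path(directory).glob("*.safetensors")):
        header = read_safetensors_header(path)
        summary[path.name] = {
            name: info["shape"]
            for name, info in header.items()
            if name != "__metadata__"  # optional free-form metadata entry
        }
    return summary
```

For example, `summarize_checkpoint("./MAVEN")` (a hypothetical local path) returns the tensor names and shapes for each weight file without reading any tensor data into memory, which is a quick way to confirm that the LoRA adapter tensors are present.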

Resources

Citation

If you use this model in your research, please cite the following paper:

@article{ma2025fortisavqa,
  title={FortisAVQA and MAVEN: a Benchmark Dataset and Debiasing Framework for Robust Multimodal Reasoning},
  author={Ma, Jie and Gao, Zhitao and Chai, Qi and Liu, Jun and Wang, Pinghui and Tao, Jing and Su, Zhou},
  journal={arXiv preprint arXiv:2504.00487},
  year={2025}
}
@inproceedings{malook,
  title={Look, Listen, and Answer: Overcoming Biases for Audio-Visual Question Answering},
  author={Ma, Jie and Hu, Min and Wang, Pinghui and Sun, Wangchun and Song, Lingyun and Pei, Hongbin and Liu, Jun and Du, Youtian},
  booktitle={NeurIPS},
  year={2024}
}