Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning
AI & ML interests
Computer Vision
Papers
ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution
ExpVid: A Benchmark for Experiment Video Understanding & Reasoning
Organization Card
OpenGVLab
Welcome to OpenGVLab! We are a research group from Shanghai AI Lab focused on vision-centric AI research. The GV in OpenGVLab stands for general vision: a general understanding of vision that needs little effort to adapt to new vision-based tasks.
Models
- InternVL: a pioneering open-source alternative to GPT-4V.
- InternImage: a large-scale vision foundation model with deformable convolutions.
- InternVideo: large-scale video foundation models for multimodal understanding.
- VideoChat: an end-to-end chat assistant for video comprehension.
- All-Seeing-Project: towards panoptic visual recognition and understanding of the open world.
Datasets
- ShareGPT4o: a large-scale resource that we plan to open-source, with 200K meticulously annotated images, 10K videos with highly descriptive captions, and 10K audio files with detailed descriptions.
- InternVid: a large-scale video-text dataset for multimodal understanding and generation.
- MMPR: a high-quality, large-scale multimodal preference dataset.
Benchmarks
- MVBench: a comprehensive benchmark for multimodal video understanding.
- CRPE: a benchmark covering all elements of the relation triplets (subject, predicate, object), providing a systematic platform for the evaluation of relation comprehension ability.
- MM-NIAH: a comprehensive benchmark for long multimodal document comprehension.
- GMAI-MMBench: a comprehensive multimodal evaluation benchmark towards general medical AI.
Spaces (13)
- ScaleCUA Demo 📚: Display web content in a Streamlit app (Running • 4)
- InternVideo2.5 💬: Hierarchical Compression for Long-Context Video Modeling (Runtime error)
- InternVL ⚡: Interact with a multimodal chatbot that analyzes images and text (Running • 500)
- MVBench Leaderboard 🐨: Submit and view model evaluations (Running • 40)
- InternVideo2 Chat 8B HD 👁: Upload a video to chat about its contents (Runtime error • 18)
Models (286)
- OpenGVLab/Vlaser-2B-VLA • 3
- OpenGVLab/Vlaser-8B • 8B • 90 • 2
- OpenGVLab/Vlaser-2B • 2B • 55 • 1
- OpenGVLab/VeBrain • 8B • 59
- OpenGVLab/NaViL-9B • 16B • 53
- OpenGVLab/NaViL-2B • 4B • 58
- OpenGVLab/SDLM-32B-D4 • Text Generation • 33B • 423 • 11
- OpenGVLab/SDLM-3B-D8 • Text Generation • 3B • 384 • 3
- OpenGVLab/SDLM-3B-D4 • Text Generation • 3B • 395 • 4
- OpenGVLab/VideoChat-R1_5-7B • Video-Text-to-Text • 8B • 1.35k • 7
Datasets (49)
- OpenGVLab/ExpVid • Preview • 924 • 4
- OpenGVLab/GenExam • 293 • 3
- OpenGVLab/ScaleCUA-Data • Preview • 6.32k • 22
- OpenGVLab/VRBench • Preview • 209 • 4
- OpenGVLab/MMPR-v1.2 • 7.16k • 37
- OpenGVLab/MMPR-Tiny • 239 • 6
- OpenGVLab/MMPR-v1.2-prompts • 3.97k • 2
- OpenGVLab/MMBench-GUI • Preview • 126 • 36
- OpenGVLab/GUI-Odyssey • Viewer • 7.74k • 6.92k • 25
- OpenGVLab/LORIS • 914 • 3