Update description
app.py
CHANGED
@@ -53,15 +53,15 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
     gr.Markdown("# VLMVibeEval")
     gr.Markdown(
         """
-    A lightweight leaderboard for evaluating Vision Language Models (VLMs) – based on vibes.
+    A lightweight leaderboard for evaluating Vision Language Models (VLMs) – based on vibes. π
 
-    Traditional benchmarks
+    Traditional benchmarks don't give concrete signal for your use case and models are often saturated over them. Instead, we let you **vibe test** models across curated, in-the-wild examples:
 
     1. Predefined categories with images and prompts.
     2. Check any model on these examples.
-    3. Explore the generations and judge for yourself.
+    3. Explore the generations and judge for yourself, as different models have different styles and strengths. 🗣️
 
-    This is not about scores – it's about *how it feels*.
+    This is not about scores – it's about *how it feels*. You can submit new models in the community tab and we'll shortly update the app! 🤗
         """
     )
 