Update description
app.py
CHANGED
@@ -53,15 +53,15 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
     gr.Markdown("# VLMVibeEval")
     gr.Markdown(
         """
-        A lightweight leaderboard for evaluating Vision Language Models (VLMs) — based on vibes.
+        A lightweight leaderboard for evaluating Vision Language Models (VLMs) — based on vibes. 🌞
 
-        Traditional benchmarks
+        Traditional benchmarks don't give concrete signal for your use case and models are often saturated over them. Instead, we let you **vibe test** models across curated, in-the-wild examples:
 
         1. Predefined categories with images and prompts.
         2. Check any model on these examples.
-        3. Explore the generations and judge for yourself.
+        3. Explore the generations and judge for yourself, as different models have different styles and strengths. 🗣️
 
-        This is not about scores — it's about *how it feels*.
+        This is not about scores — it's about *how it feels*. You can submit new models in the community tab and we'll shortly update the app! 🤗
         """
     )
 