TIGER-Lab/VideoScore

hexuan21 committed on Jun 20, 2024

Commit 57f345d · verified · 1 Parent(s): 7ef7cf6

Update README.md

Files changed (1)

README.md +1 -1

@@ -36,7 +36,7 @@ averaged among all the evaluation aspects as indicator.
 For GenAI-Bench and VBench, which include human preference data among two or more videos,
 we employ the model's output to predict preferences and use pairwise accuracy as the performance indicator.
 | metric            | Final Sum Score | VideoEval-test | EvalCrafter | GenAI-Bench | VBench |
-|-------------------|----------------:|---------------:|------------:|-------------|--------|
+|-------------------|:---------------:|:--------------:|:-----------:|:-----------:|:------:|
 | MantisScore (reg) |           278.3 |           75.7 |        51.1 |        78.5 |   73.0 |
 | MantisScore (gen) |           222.4 |           77.1 |        27.6 |        59.0 |   58.7 |
 | Gemini-1.5-Pro    |           158.8 |           22.1 |        22.9 |        60.9 |   52.9 |
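
Reviewer note: the README context in this hunk describes pairwise accuracy on preference data and a "Final Sum Score" column. The sketch below is one plausible reading of both, not the authors' evaluation code. The function names, the `(preferred, other)` index-pair layout, and the choice to count ties as wrong are all assumptions made for illustration. The final check only confirms that each "Final Sum Score" in the table equals the sum of its four benchmark columns.

```python
from typing import Dict, List, Sequence, Tuple


def pairwise_accuracy(
    scores: Sequence[float],
    preferences: Sequence[Tuple[int, int]],
) -> float:
    """Fraction of human-labelled pairs that the model scores in the same order.

    `scores[i]` is the model's score for video i. Each preference is a
    hypothetical (preferred_index, other_index) pair taken from human labels.
    Ties count as incorrect here, which is an assumption.
    """
    if not preferences:
        raise ValueError("at least one preference pair is required")
    correct = sum(1 for win, lose in preferences if scores[win] > scores[lose])
    return correct / len(preferences)


def final_sum_score(per_benchmark: Sequence[float]) -> float:
    """Sum the per-benchmark numbers, rounded to one decimal as in the table."""
    return round(sum(per_benchmark), 1)


# Rows copied from the table in this diff:
# (VideoEval-test, EvalCrafter, GenAI-Bench, VBench)
TABLE: Dict[str, List[float]] = {
    "MantisScore (reg)": [75.7, 51.1, 78.5, 73.0],
    "MantisScore (gen)": [77.1, 27.6, 59.0, 58.7],
    "Gemini-1.5-Pro": [22.1, 22.9, 60.9, 52.9],
}
REPORTED: Dict[str, float] = {
    "MantisScore (reg)": 278.3,
    "MantisScore (gen)": 222.4,
    "Gemini-1.5-Pro": 158.8,
}

if __name__ == "__main__":
    for name, row in TABLE.items():
        # Each reported total should match the recomputed sum.
        assert final_sum_score(row) == REPORTED[name], name
    # Toy example: three videos, three human preference pairs.
    print(pairwise_accuracy([3.0, 1.0, 2.0], [(0, 1), (2, 1), (1, 0)]))
```

The change itself only touches the table's delimiter row, switching the numeric columns to centered alignment (`:---:`) and giving the last two columns an explicit alignment. No values in the table are modified.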