Spaces: allenai/ZebraLogic

yuchenlin committed on Jul 19, 2024

Commit 3e5d61f (1 parent: 7302659)

add blog link

Files changed (2)

ZeroEval-main/result_dirs/zebra-grid.summary.json

@@ -372,16 +372,5 @@
     "Hard Puzzle Acc": "0.00",
     "Total Puzzles": 1000,
     "Reason Lens": "1592.60"
-  },
-  {
-    "Model": "gemma-2-27b-it@vllm",
-    "Mode": "greedy",
-    "Puzzle Acc": "0.47",
-    "Cell Acc": "0.31",
-    "No answer": "96.23",
-    "Easy Puzzle Acc": "2.08",
-    "Hard Puzzle Acc": "0.00",
-    "Total Puzzles": 212,
-    "Reason Lens": "1280.62"
   }
 ]


_header.md

@@ -2,5 +2,5 @@
 # 🦓 ZebraLogic: Benchmarking the Logical Reasoning Ability of Language Models
 <!-- [📑 FnF Paper](https://arxiv.org/abs/2305.18654) |  -->
-[📰 Blog]() [💻 GitHub](https://github.com/yuchenlin/ZeroEval) | [🤗 HuggingFace](https://huggingface.co/collections/allenai/zebra-logic-bench-6697137cbaad0b91e635e7b0) | [🐦 X](https://twitter.com/billyuchenlin/) | [💬 Discussion](https://huggingface.co/spaces/allenai/ZebraLogicBench-Leaderboard/discussions) | Updated: **{LAST_UPDATED}**

+[📰 Blog](https://huggingface.co/blog/yuchenlin/zebra-logic) [💻 GitHub](https://github.com/yuchenlin/ZeroEval) | [🤗 HuggingFace](https://huggingface.co/collections/allenai/zebra-logic-bench-6697137cbaad0b91e635e7b0) | [🐦 X](https://twitter.com/billyuchenlin/) | [💬 Discussion](https://huggingface.co/spaces/allenai/ZebraLogicBench-Leaderboard/discussions) | Updated: **{LAST_UPDATED}**