Update src/about.py

src/about.py CHANGED (+10 -10)
@@ -12,9 +12,9 @@ TITLE = """<h1>🇹🇭 Thai LLM Leaderboard</h1>"""
 # <a href="url"></a>
 
 INTRODUCTION_TEXT = """
-The Thai
+The Thai LLM Leaderboard 🇹🇭 aims to standardize evaluation methods for large language models (LLMs) in the Thai language, building on <a href="https://github.com/SEACrowd">SEACrowd</a>.
 As part of an open community project, we welcome you to submit new evaluation tasks or models.
-This leaderboard is developed in collaboration with <a href="https://www.scb10x.com">SCB 10X</a>, <a href="https://www.vistec.ac.th/">
+This leaderboard is developed in collaboration with <a href="https://www.scb10x.com">SCB 10X</a>, <a href="https://www.vistec.ac.th/">VISTEC</a>, and <a href="https://github.com/SEACrowd">SEACrowd</a>.
 """
 
 LLM_BENCHMARKS_TEXT = f"""
@@ -35,25 +35,25 @@ The leaderboard currently consists of the following benchmarks:
 - <a href="https://huggingface.co/datasets/iapp/iapp_wiki_qa_squad">iapp Wiki QA Squad</a>: iapp Wiki QA Squad is an extractive question-answering dataset derived from Thai Wikipedia articles.
 
 
-Metric Implementation Details
+<b>Metric Implementation Details</b>:
 - BLEU is calculated using flores200's tokenizer using HuggingFace `evaluate` <a href="https://huggingface.co/spaces/evaluate-metric/sacrebleu">implementation</a>.
 - ROUGEL is calculated using PyThaiNLP newmm tokenizer and HuggingFace `evaluate` <a href="https://huggingface.co/spaces/evaluate-metric/rouge">implementation</a>.
 - LLM-as-a-judge rating is based on OpenAI's gpt-4o-2024-05-13 using the prompt defined in <a href="https://github.com/lm-sys/FastChat/blob/main/fastchat/llm_judge/data/judge_prompts.jsonl">lmsys MT-Bench</a>.
 
-Reproducibility
+<b>Reproducibility</b>:
 
-- For reproducibility of results, we have open-sourced the evaluation pipeline. Please check out the repository <a href="https://github.com/scb-10x/seacrowd-eval">seacrowd-experiments</a>.
+- For the reproducibility of results, we have open-sourced the evaluation pipeline. Please check out the repository <a href="https://github.com/scb-10x/seacrowd-eval">seacrowd-experiments</a>.
 
-Acknowledgements
+<b>Acknowledgements</b>:
 
 - We are grateful to previous open-source projects that released datasets, tools, and knowledge. We thank community members for tasks and model submissions. To contribute, please see the submit tab.
 """
 
 CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
 CITATION_BUTTON_TEXT = r"""@misc{thaillm-leaderboard,
-author
-title
-year
-publisher
+    author={SCB 10X and VISTEC and SEACrowd},
+    title={Thai LLM Leaderboard},
+    year={2024},
+    publisher={Hugging Face},
     url={https://huggingface.co/spaces/ThaiLLM-Leaderboard/leaderboard}
 }"""
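The ROUGE-L bullet in the metric notes can be illustrated with a minimal, self-contained sketch. This is not the leaderboard's actual pipeline code: it computes LCS-based ROUGE-L F1 on token lists that are assumed to be pre-tokenized, standing in for the real flow where Thai text would first be split with PyThaiNLP's newmm tokenizer (e.g. `pythainlp.tokenize.word_tokenize(text, engine="newmm")`) and then scored with the HuggingFace `evaluate` rouge implementation.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ta in enumerate(a, 1):
        for j, tb in enumerate(b, 1):
            # Extend the LCS on a match, otherwise carry the best so far.
            dp[i][j] = dp[i - 1][j - 1] + 1 if ta == tb else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


def rouge_l(pred_tokens, ref_tokens):
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall.

    Tokens are expected to come from a word tokenizer (for Thai, the
    pipeline described above uses PyThaiNLP's newmm engine).
    """
    lcs = lcs_length(pred_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Toy example with placeholder tokens:
print(rouge_l(["a", "b", "c", "d"], ["a", "c", "d", "e"]))  # → 0.75
```

Because ROUGE-L is subsequence-based rather than n-gram-based, the choice of word tokenizer matters a great deal for an unsegmented script like Thai, which is why the pipeline fixes the tokenizer explicitly.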