rzanoli committed on
Commit 7aacef3 · 1 Parent(s): b5e0623

Small Changes

Files changed (1)
  1. src/about.py +2 -2
src/about.py CHANGED
@@ -93,8 +93,8 @@ TITLE = """<h1 align="center" id="space-title">🚀 EVALITA-LLM Leaderboard 🚀
  INTRODUCTION_TEXT = """
  Evalita-LLM is a benchmark designed to evaluate Large Language Models (LLMs) on Italian tasks. The distinguishing features of Evalita-LLM are the following: (i) **all tasks are native Italian**, avoiding translation issues and potential cultural biases; (ii) the benchmark includes **generative** tasks, enabling more natural interaction with LLMs; (iii) **all tasks are evaluated against multiple prompts**, this way mitigating the model sensitivity to specific prompts and allowing a fairer evaluation.
 
- **Multiple Choice**: 📊TE (Textual Entailment), 😃SA (Sentiment Analysis), ⚠️HS (Hate Speech Detection), 🏥AT (Admission Test), 🔤WIC (Word in Context), ❓FAQ (Frequently Asked Questions)
- **Generative**: 🔄LS (Lexical Substitution), 📝SU (Summarization), 🏷️NER (Named Entity Recognition), 🔗REL (Relation Extraction)
+ **<small>Multiple Choice:</small>** <small> 📊TE (Textual Entailment), 😃SA (Sentiment Analysis), ⚠️HS (Hate Speech Detection), 🏥AT (Admission Test), 🔤WIC (Word in Context), ❓FAQ (Frequently Asked Questions) </small><br>
+ **<small>Generative:</small>** <small>🔄LS (Lexical Substitution), 📝SU (Summarization), 🏷️NER (Named Entity Recognition), 🔗REL (Relation Extraction) </small>
  """
 
  # Which evaluations are you running? how can people reproduce what you have?