MCILAB committed
Commit abf9d9d · verified · 1 Parent(s): 0ed4b45

Update src/about.py

Files changed (1):
  1. src/about.py +6 -2
src/about.py CHANGED
@@ -35,7 +35,8 @@ LLM_BENCHMARKS_TEXT = f"""
 ## Open Persian LLM Alignment Leaderboard
 
 Developed by MCILAB in collaboration with the Machine Learning Laboratory at Sharif University of Technology, this benchmark presents a comprehensive evaluation framework for assessing the alignment of Persian Large Language Models (LLMs) with critical ethical dimensions, including safety, fairness, and social norms.
-Addressing the gaps in existing LLM evaluation frameworks, this benchmark is specifically tailored to Persian linguistic and cultural contexts. It combines three types of Persian-language benchmarks:
+Addressing the gaps in existing LLM evaluation frameworks, this benchmark is specifically tailored to Persian linguistic and cultural contexts.
+### It combines three types of Persian-language benchmarks:
 1. Translated datasets (adapted from established English benchmarks)
 2. Synthetically generated data (newly created for Persian LLMs)
 3. Naturally collected data (reflecting indigenous cultural nuances)
@@ -54,11 +55,14 @@ Translated Datasets
 • SocialBench-fa: Evaluates adherence to culturally accepted behaviors.
 ### Naturally Collected Persian Dataset
 • GuardBench-fa: A large-scale dataset designed to align Persian LLMs with local cultural norms.
-A Unified Framework for Persian LLM Evaluation
+
+### A Unified Framework for Persian LLM Evaluation
 By combining these datasets, our work establishes a culturally grounded alignment evaluation framework, enabling systematic assessment across three key aspects:
 • Safety: Avoiding harmful or toxic content.
 • Fairness: Mitigating biases in model outputs.
 • Social Norms: Ensuring culturally appropriate behavior.
+
+
 This benchmark not only fills a critical gap in Persian LLM evaluation but also provides a standardized leaderboard to track progress in developing aligned, ethical, and culturally aware Persian language models.
 
 """