|
INTRODUCTION_TEXT = """ |
|
|
|
<span style="font-size:16px; font-family: 'Times New Roman', serif;"> <b> Welcome to the ChineseSafe Leaderboard! |
|
On this leaderboard, we share the evaluation results of LLMs on ChineseSafe, a brand-new content moderation benchmark for Chinese.</b>
|
</span> |
|
|
|
# Dataset |
|
<span style="font-size:16px; font-family: 'Times New Roman', serif"> |
|
To evaluate the safety risks of large language models (LLMs), we present ChineseSafe, a Chinese safety benchmark to facilitate research
|
on the content safety of LLMs for Chinese (Mandarin). |
|
To align with the regulations for Chinese Internet content moderation, our ChineseSafe contains 205,034 examples |
|
across 4 classes and 10 sub-classes of safety issues. For Chinese contexts, we add several special types of illegal content: political sensitivity, pornography, |
|
and variant/homophonic words. In particular, the benchmark is constructed as a balanced dataset, containing safe and unsafe data collected from internet resources and public datasets [1,2,3]. |
|
We hope the evaluation can provide a guideline for developers and researchers to improve the safety of LLMs. A publicly accessible test set comprising 20,000 examples is released at <a href="https://huggingface.co/datasets/SUSTech/ChineseSafe" target="_blank">ChineseSafe</a>.<br>
|
|
|
The leaderboard is under construction and maintained by <a href="https://hongxin001.github.io/" target="_blank">Hongxin Wei's</a> research group at SUSTech.

Comments, issues, contributions, and collaborations are all welcome!
|
Email: [email protected] |
|
</span> |
|
""" |
|
|
|
|
|
METRICS_TEXT = """ |
|
# Metrics |
|
<span style="font-size:16px; font-family: 'Times New Roman', serif"> |
|
We report the results with five metrics: overall accuracy, and precision/recall for safe and unsafe content.

In particular, the results are shown in <b>metric/std</b> format in the table,

where <b>std</b> indicates the standard deviation of the results over various random seeds.
|
</span> |
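
As an illustration of the <b>metric/std</b> format, a minimal sketch (with made-up accuracy numbers, not actual leaderboard results) that aggregates scores from several seeded runs:

```python
import statistics

def format_metric(values, digits=1):
    # values: one score per random seed (hypothetical numbers here;
    # real values come from repeated evaluation runs)
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    # Render as "metric/std", as shown in the leaderboard table
    return f"{mean:.{digits}f}/{std:.{digits}f}"

print(format_metric([85.1, 85.6, 84.9]))  # → "85.2/0.4"
```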
|
""" |
|
|
|
EVALUTION_TEXT = """
|
# Evaluation |
|
<span style="font-size:16px; font-family: 'Times New Roman', serif"> |
|
We evaluate the models using two methods: perplexity (multiple choice) and generation.

For perplexity, we select the label with the lowest perplexity as the predicted result.

For generation, we use the content generated by the model to make the prediction.
|
The following are the results of the evaluation.
|
</span> <br><br> |
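
The perplexity-based label selection above can be sketched as follows. This is a simplified illustration with hypothetical per-token log-probabilities; in practice, these scores come from the model's forward pass over each candidate label:

```python
import math

def perplexity(token_logprobs):
    # Perplexity = exp(negative mean log-probability per token)
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def predict_label(label_logprobs):
    # label_logprobs: dict mapping each candidate label to the
    # per-token log-probs the model assigns to that label
    # (hypothetical values below, for illustration only)
    return min(label_logprobs, key=lambda lbl: perplexity(label_logprobs[lbl]))

scores = {
    "safe":   [-0.2, -0.4, -0.3],
    "unsafe": [-1.1, -0.9, -1.4],
}
print(predict_label(scores))  # → "safe" (lowest perplexity)
```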
|
|
|
|
|
""" |
|
|
|
REFERENCE_TEXT = """ |
|
# References |
|
<span style="font-size:16px; font-family: 'Times New Roman', serif"> |
|
[1] Sun H, Zhang Z, Deng J, et al. Safety assessment of chinese large language models[J]. arXiv preprint arXiv:2304.10436, 2023. <br> |
|
[2] https://github.com/konsheng/Sensitive-lexicon <br> |
|
[3] https://www.cluebenchmarks.com/static/pclue.html <br> |
|
|
|
""" |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
ACKNOWLEDGEMENTS_TEXT = """ |
|
# Acknowledgements |
|
This research is supported by the Shenzhen Fundamental Research Program (Grant No. |
|
JCYJ20230807091809020). We gratefully acknowledge the support of "Data+AI" Data Intelligent Laboratory, a joint lab constructed by Deepexi and the Department of Statistics and Data Science |
|
at Southern University of Science and Technology. |
|
""" |
|
|
|
CONTACT_TEXT = """ |
|
# Contact |
|
<span style="font-size:16px; font-family: 'Times New Roman', serif"> |
|
The leaderboard is under construction and maintained by <a href="https://hongxin001.github.io/" target="_blank">Hongxin Wei's</a> research group at SUSTech.

We will release the technical report in the near future.

Comments, issues, contributions, and collaborations are all welcome!
|
Email: [email protected] |
|
""" |
|
|