Update README.md
README.md CHANGED
@@ -29,7 +29,7 @@ Current Version: `14.04.2025`
# Llama-SEA-LION-v3-70B-IT
-SEA-LION is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
+[SEA-LION](https://arxiv.org/abs/2504.05747) is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.
SEA-LION stands for _Southeast Asian Languages In One Network_.
@@ -49,7 +49,7 @@ For tokenisation, the model employs the default tokenizer used in Llama 3.1 70B
We evaluated Llama-SEA-LION-v3-70B-IT on both general language capabilities and instruction-following capabilities.
#### General Language Capabilities
-For the evaluation of general language capabilities, we employed the [SEA-HELM
+For the evaluation of general language capabilities, we employed the [SEA-HELM evaluation benchmark](https://arxiv.org/abs/2502.14301) across a variety of tasks.
These tasks include Question Answering (QA), Sentiment Analysis (Sentiment), Toxicity Detection (Toxicity), Translation in both directions (Eng>Lang & Lang>Eng), Abstractive Summarisation (Abssum), Causal Reasoning (Causal) and Natural Language Inference (NLI).
Note: SEA-HELM is implemented using prompts to elicit answers in a strict format. For all tasks, the model is expected to provide an answer tag from which the answer is automatically extracted. For tasks where options are provided, the answer should comprise one of the pre-defined options. The scores for each task are normalised to account for baseline performance due to random chance.
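
For illustration, the answer-tag extraction and random-chance normalisation described in the note could look roughly like the sketch below. This is a hypothetical example, not SEA-HELM's actual implementation: the `<ANSWER>...</ANSWER>` tag format, the helper names, and the exact normalisation formula are assumptions.

```python
import re


def extract_answer(model_output: str) -> str | None:
    """Pull the final answer out of an answer tag.

    Assumes the prompt instructs the model to wrap its answer in
    <ANSWER>...</ANSWER>; the real SEA-HELM tag format may differ.
    """
    match = re.search(r"<ANSWER>(.*?)</ANSWER>", model_output, re.DOTALL)
    return match.group(1).strip() if match else None


def normalise_score(raw_accuracy: float, num_options: int) -> float:
    """Rescale accuracy so random guessing maps to 0 and a perfect run to 100,
    i.e. account for baseline performance due to random chance."""
    baseline = 1.0 / num_options
    return max(0.0, (raw_accuracy - baseline) / (1.0 - baseline)) * 100.0


# Example: a sentiment task with three pre-defined options (positive / negative / neutral).
output = "The review is clearly upbeat. <ANSWER>positive</ANSWER>"
print(extract_answer(output))    # -> positive
print(normalise_score(0.75, 3))  # -> 62.5 (0.75 raw accuracy vs. 1/3 chance), up to float rounding
```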