Files changed (1) hide show
  1. README.md +4 -5
README.md CHANGED
@@ -24,8 +24,6 @@ This model reaches [Arena Hard](https://github.com/lmarena/arena-hard-auto) of 8
24
 
25
  As of 1 Oct 2024, this model is #1 on all three automatic alignment benchmarks (verified tab for AlpacaEval 2 LC), edging out strong frontier models such as GPT-4o and Claude 3.5 Sonnet.
26
 
27
- As of Oct 24th, 2024 the model has Elo Score of 1267(+-7), rank 9 and style controlled rank of 26 on [ChatBot Arena leaderboard](https://lmarena.ai/?leaderboard).
28
-
29
  This model was trained using RLHF (specifically, REINFORCE), [Llama-3.1-Nemotron-70B-Reward](https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Reward) and [HelpSteer2-Preference prompts](https://huggingface.co/datasets/nvidia/HelpSteer2) on a Llama-3.1-70B-Instruct model as the initial policy.
30
 
31
  Llama-3.1-Nemotron-70B-Instruct-HF has been converted from [Llama-3.1-Nemotron-70B-Instruct](https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct) to support it in the HuggingFace Transformers codebase. Please note that evaluation results might be slightly different from the [Llama-3.1-Nemotron-70B-Instruct](https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct) as evaluated in NeMo-Aligner, which the evaluation results below are based on.
@@ -54,9 +52,10 @@ There are **3 “R”s** in the word “strawberry”.
54
  Note: This model is a demonstration of our techniques for improving helpfulness in general-domain instruction following. It has not been tuned for performance in specialized domains such as math.
55
 
56
 
57
- ## License
58
- Your use of this model is governed by the [NVIDIA Open Model License](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf).
59
- Additional Information: [Llama 3.1 Community License Agreement](https://www.llama.com/llama3_1/license/). Built with Llama.
 
60
 
61
  ## Evaluation Metrics
62
 
 
24
 
25
  As of 1 Oct 2024, this model is #1 on all three automatic alignment benchmarks (verified tab for AlpacaEval 2 LC), edging out strong frontier models such as GPT-4o and Claude 3.5 Sonnet.
26
 
 
 
27
  This model was trained using RLHF (specifically, REINFORCE), [Llama-3.1-Nemotron-70B-Reward](https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Reward) and [HelpSteer2-Preference prompts](https://huggingface.co/datasets/nvidia/HelpSteer2) on a Llama-3.1-70B-Instruct model as the initial policy.
28
 
29
  Llama-3.1-Nemotron-70B-Instruct-HF has been converted from [Llama-3.1-Nemotron-70B-Instruct](https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct) to support it in the HuggingFace Transformers codebase. Please note that evaluation results might be slightly different from the [Llama-3.1-Nemotron-70B-Instruct](https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct) as evaluated in NeMo-Aligner, which the evaluation results below are based on.
 
52
  Note: This model is a demonstration of our techniques for improving helpfulness in general-domain instruction following. It has not been tuned for performance in specialized domains such as math.
53
 
54
 
55
+ ## Terms of use
56
+
57
+ By accessing this model, you are agreeing to the LLama 3.1 terms and conditions of the [license](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE), [acceptable use policy](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/USE_POLICY.md) and [Meta’s privacy policy](https://www.facebook.com/privacy/policy/)
58
+
59
 
60
  ## Evaluation Metrics
61