---
library_name: transformers
license: other
license_name: nvidia-internal-scientific-research-and-development-model-license
license_link: >-
  https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-internal-scientific-research-and-development-model-license/
pipeline_tag: text-generation
language:
- en
- de
- es
- fr
- it
- ko
- pt
- ru
- ja
- zh
tags:
- nvidia
- pytorch
- nemotron-h
---

# Nemotron-H-47B-Base-8K

## Model Overview

NVIDIA Nemotron-H-47B-Base-8K is a large language model (LLM) developed by NVIDIA, designed as a completion model for a given piece of text. It uses a hybrid model architecture that consists primarily of Mamba-2 and MLP layers combined with just five Attention layers. The model was pruned and distilled from Nemotron-H-56B-Base-8K using 63B tokens, and features an 8K context length. The supported languages include: English, German, Spanish, French, Italian, Korean, Portuguese, Russian, Japanese, and Chinese. For more detailed information on the model architecture, training, and evaluation, please see the [project page](https://research.nvidia.com/labs/adlr/nemotronh/) and the [technical report](https://arxiv.org/abs/2504.03624).
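
The hybrid layer layout can be inspected from the released configuration. The sketch below assumes the checkpoint's config exposes a `hybrid_override_pattern` string with one character per layer, which is how released Nemotron-H configs encode the Mamba-2/Attention/MLP ordering; treat the field name as an assumption, since it may differ across releases.

```python
from collections import Counter

from transformers import AutoConfig

# Load only the configuration (no model weights are needed for this check).
config = AutoConfig.from_pretrained("nvidia/Nemotron-H-47B-Base-8K", trust_remote_code=True)

# Assumed encoding: 'M' = Mamba-2, '*' = Attention, '-' = MLP, one character per layer.
pattern = getattr(config, "hybrid_override_pattern", None)
if pattern is not None:
    print(Counter(pattern))  # expect mostly 'M' and '-', with five '*' entries
```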

For best performance on a given task, users are encouraged to customize the model using the [NeMo Framework](https://docs.nvidia.com/nemo-framework/index.html) suite of customization tools, including Parameter-Efficient Fine-Tuning (P-tuning, Adapters, LoRA, and more) and Model Alignment (SFT, SteerLM, RLHF, and more) using [NeMo-Aligner](https://github.com/NVIDIA/NeMo-Aligner).
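
As a rough illustration of this kind of lightweight customization (not NVIDIA's recommended NeMo recipe), here is a minimal LoRA sketch using the Hugging Face `peft` library; the `target_modules` names are hypothetical and should be matched against the checkpoint's actual module names.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base model; LoRA adapters are attached to selected linear layers.
model = AutoModelForCausalLM.from_pretrained(
    "nvidia/Nemotron-H-47B-Base-8K",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

# Hypothetical target names: inspect model.named_modules() to find the real
# linear projections (Mamba-2 in/out projections, attention and MLP layers).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["in_proj", "out_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```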

This model is for research and development only.

This model is part of the Nemotron-H Collection. You can find the models in this family here:
- [Nemotron-H-56B-Base-8K](https://huggingface.co/nvidia/Nemotron-H-56B-Base-8K)
- [Nemotron-H-47B-Base-8K](https://huggingface.co/nvidia/Nemotron-H-47B-Base-8K)
- [Nemotron-H-8B-Base-8K](https://huggingface.co/nvidia/Nemotron-H-8B-Base-8K)

## License/Terms of Use

GOVERNING TERMS: Use of this model is governed by the [NVIDIA Internal Scientific Research and Development Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-internal-scientific-research-and-development-model-license/).

**Model Developer:** NVIDIA

**Model Dates:**

[…]

The pretraining data has a cutoff date of September 2024.

## Use Case:

This model is intended for developers and researchers building LLMs.

## Release Date:

4/12/2025

## References

- [\[2504.03624\] Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models](https://arxiv.org/abs/2504.03624)

## Model Architecture

- Architecture Type: Hybrid Mamba-Transformer
- Network Architecture: Nemotron-H

This model has 47B model parameters.
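
As a quick sanity check, the parameter count can be read from the loaded checkpoint. A minimal sketch (loading the full model requires enough GPU or CPU memory for roughly 94 GB of bf16 weights):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "nvidia/Nemotron-H-47B-Base-8K", torch_dtype=torch.bfloat16, trust_remote_code=True
)

# Sum the element counts of every parameter tensor in the model.
total = sum(p.numel() for p in model.parameters())
print(f"{total / 1e9:.1f}B parameters")  # expected to print roughly 47B
```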

## Input
- Input Type(s): Text

[…]

Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.

## Software Integration
- Runtime Engine(s): NeMo 24.12
- Supported Hardware Microarchitecture Compatibility: NVIDIA H100-80GB, NVIDIA A100
- Operating System(s): Linux

[…]

As this is a base model, no explicit prompt format is recommended or required.

### Example

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("nvidia/Nemotron-H-47B-Base-8K", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "nvidia/Nemotron-H-47B-Base-8K",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

prompt = "When was NVIDIA founded?"

# Tokenize the prompt and generate a completion
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0]))
```
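
Note that `generate` is called with default settings here, so the completion will be short; pass `max_new_tokens` (for example, `model.generate(**inputs, max_new_tokens=64)`) to control the length of the generated continuation.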

## Training, Testing, and Evaluation Datasets

### Training & Testing Datasets:

[…]

**Data Labeling for Training Datasets:**
Hybrid: Automated, Human, Synthetic

#### Commonsense Understanding Evaluations:

| ARC Challenge 25-shot | Hellaswag 10-shot | Winogrande 5-shot | CommonsenseQA 7-shot |
|-------------|--------------|-----------------|------------------|
| 94.6 | 87.9 | 83.9 | 87.3 |

- ARC (AI2 Reasoning Challenge)-Challenge - The challenge set of questions from a benchmark that contains grade-school-level, multiple-choice science questions to assess the question-answering ability of language models. [Dataset](https://huggingface.co/datasets/allenai/ai2_arc)
- Hellaswag - Tests the ability of a language model to correctly finish the provided context from a choice of possible options. [Dataset](https://huggingface.co/datasets/Rowan/hellaswag)
- Winogrande - Tests the ability to choose the right option for a given sentence, which requires commonsense reasoning. [Dataset](https://huggingface.co/datasets/allenai/winogrande)
- CommonsenseQA - A multiple-choice question-answering dataset that requires different types of commonsense knowledge to predict the correct answers. [Dataset](https://huggingface.co/datasets/tau/commonsense_qa)
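
Few-shot scores like those above are typically reproduced with a harness such as EleutherAI's lm-evaluation-harness. The sketch below is an approximation, not the official evaluation recipe; the task name and settings follow lm-evaluation-harness v0.4.x conventions and are assumptions.

```python
import lm_eval

# Evaluate the checkpoint on ARC-Challenge with 25 few-shot examples.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=nvidia/Nemotron-H-47B-Base-8K,trust_remote_code=True,dtype=bfloat16",
    tasks=["arc_challenge"],
    num_fewshot=25,
    batch_size=8,
)
print(results["results"]["arc_challenge"])  # accuracy metrics for the task
```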

#### Coding Evaluations:

| MBPP | MBPP+ | HumanEval | HumanEval+ |
|-------------|--------------|-----------------|------------------|
| 75.9 | 65.6 | 61.0 | 56.1 |

- MBPP (Mostly Basic Python Programming Problems) - Evaluates the ability to generate solutions for Python programming tasks. [Dataset](https://github.com/google-research/google-research/tree/master/mbpp)
- MBPP+ - Extended version of MBPP with additional validation. [Dataset](https://huggingface.co/datasets/evalplus/mbppplus)
- HumanEval - Tests code generation and completion abilities in Python. [Dataset](https://github.com/openai/human-eval)

#### Math Evaluations:

| GSM8K | MATH | MATH Lvl 5 | MATH-500 |
|--------------|------------|------------|------------|
| 93.3 | 57.4 | 34.2 | 57.9 |

- GSM8K (Grade School Math 8K) - Evaluates grade-school-level mathematical word-problem solving. [Dataset](https://github.com/openai/grade-school-math)
- MATH Lvl 5 - Only the most difficult questions from the MATH dataset. [Dataset](https://github.com/hendrycks/math)
- MATH-500 - Tests advanced mathematical problem solving across algebra, geometry, and calculus. [Dataset](https://huggingface.co/datasets/HuggingFaceH4/MATH-500)

#### Other Evaluations:

| MMLU | MMLU Pro |
|------------------|------------|
| 83.6 | 61.8 |

- MMLU - Tests knowledge across 57 subjects including science, humanities, math, and more. [Dataset](https://github.com/hendrycks/test)
- MMLU Pro - Evaluates language understanding models across a broad range of challenging, reasoning-focused questions from 14 diverse domains. [Dataset](https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro)

## Potential Known Risks for Usage

[…]