---
license: llama3
---
# 🔹 Key Highlights

- 13% Fewer Parameters: nyun-c2-llama3-61B has approximately 13% fewer parameters than the popular Llama-3-70B.
- Better Performance: Despite having fewer parameters, the model scores higher than Llama-3-70B on Winogrande, BoolQ, and GSM8K, and on the six-benchmark average (80.7 vs. 79.2). It trails by at most 1.2 points on MMLU, Hellaswag, and Arc Challenge.
- No Fine-Tuning Required: The model was not fine-tuned after optimization, showcasing the raw potential of our optimization techniques.
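
The 13% figure in the first highlight can be reproduced with a quick calculation. This is a minimal sketch that assumes the nominal sizes implied by the model names (61B and 70B); the exact parameter counts are not stated here and may differ slightly. The function name `relative_reduction` is illustrative only.

```python
# Nominal sizes in billions of parameters, read off the model names.
# Assumption: the exact counts are not given in this card.
BASE_PARAMS_B = 70.0    # Llama-3-70B
PRUNED_PARAMS_B = 61.0  # nyun-c2-llama3-61B


def relative_reduction(base: float, pruned: float) -> float:
    """Return the fraction of parameters removed relative to the base model."""
    return (base - pruned) / base


reduction = relative_reduction(BASE_PARAMS_B, PRUNED_PARAMS_B)
print(f"Parameter reduction: {reduction:.1%}")  # prints "Parameter reduction: 12.9%"
```

Under these nominal sizes the reduction is 9/70, which rounds to the 13% quoted above.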

## Pipeline and Collaboration

For details on the pipeline and the methods used to optimize these models, see our [PruneGPT repository](https://github.com/nyunAI/PruneGPT).
Companies and organizations interested in working with us to release more open-source variants like this one are welcome to reach out at [email protected].

### Model Performance

| Dataset | nyun-c2-llama3-61B | Meta-Llama3-70B | Meta-Llama2-70B | MBZUAI K2-65B |
| --- | --- | --- | --- | --- |
| MMLU (5-shot) | 78.8 | 79.5 | 69.7 | 67.9 |
| Winogrande (5-shot) | 86.2 | 83.1 | 81.8 | 77.0 |
| BoolQ (0-shot) | 85.1 | 79.0 | 73.1 | 83.0 |
| Hellaswag (10-shot) | 87.4 | 88.0 | 86.9 | 85.5 |
| Arc Challenge (25-shot) | 67.6 | 68.8 | 67.2 | 64.8 |
| GSM8K (5-shot) | 79.4 | 76.9 | 52.6 | 50.2 |
| Average | 80.7 | 79.2 | 71.9 | 71.4 |
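
The Average row and the per-benchmark comparison against Meta-Llama3-70B can be checked directly from the table. The sketch below copies the scores above verbatim and assumes that Average is the unweighted mean of the six benchmarks; the variable names are illustrative.

```python
# Scores copied from the table, in this benchmark order.
benchmarks = ["MMLU", "Winogrande", "BoolQ", "Hellaswag", "Arc Challenge", "GSM8K"]
scores = {
    "nyun-c2-llama3-61B": [78.8, 86.2, 85.1, 87.4, 67.6, 79.4],
    "Meta-Llama3-70B": [79.5, 83.1, 79.0, 88.0, 68.8, 76.9],
    "Meta-Llama2-70B": [69.7, 81.8, 73.1, 86.9, 67.2, 52.6],
    "MBZUAI K2-65B": [67.9, 77.0, 83.0, 85.5, 64.8, 50.2],
}
reported_average = {
    "nyun-c2-llama3-61B": 80.7,
    "Meta-Llama3-70B": 79.2,
    "Meta-Llama2-70B": 71.9,
    "MBZUAI K2-65B": 71.4,
}

# Assumption: "Average" is the unweighted mean over the six benchmarks.
averages = {model: sum(vals) / len(vals) for model, vals in scores.items()}

# Per-benchmark difference between the pruned model and Llama-3-70B.
pruned, baseline = "nyun-c2-llama3-61B", "Meta-Llama3-70B"
deltas = {
    name: round(p - b, 1)
    for name, p, b in zip(benchmarks, scores[pruned], scores[baseline])
}
wins = [name for name, delta in deltas.items() if delta > 0]

for model, avg in averages.items():
    print(f"{model}: computed {avg:.2f}, reported {reported_average[model]}")
print("Benchmarks where the pruned model leads:", wins)
```

Each computed mean agrees with the reported Average to within rounding, and the pruned model leads on three of the six benchmarks.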

- Developed by: [Nyun AI](https://nyunai.com/)
- Repository: [GitHub](https://github.com/nyunAI/PruneGPT)