nyunai/nyun-c1-llama3-60B

---
license: llama3
---
# 🔹 Key Highlights

- 14% Fewer Parameters: nyun-llama3-60B has approximately 14% fewer parameters than the popular Llama-3-70B.
- Intact Performance: Despite having fewer parameters, our model stays close to Llama-3-70B on the benchmarks listed under Model Performance (average 77.7 vs. 79.2) and outperforms it on Winogrande and BoolQ.
- No Fine-Tuning Required: This model was not fine-tuned, showcasing the raw potential of our optimization techniques.
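
The 14% figure can be checked with a short calculation. This is a minimal sketch that assumes the nominal sizes implied by the model names (60B and 70B); the card does not give exact parameter counts, and `relative_reduction` is a hypothetical helper name used only for illustration.

```python
# Nominal parameter counts in billions, read off the model names.
# Assumption: exact counts are not stated in this card.
PRUNED_PARAMS_B = 60.0    # nyun-llama3-60B
BASELINE_PARAMS_B = 70.0  # Llama-3-70B


def relative_reduction(pruned: float, baseline: float) -> float:
    """Return the fraction of the baseline's parameters that was removed."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - pruned) / baseline


reduction = relative_reduction(PRUNED_PARAMS_B, BASELINE_PARAMS_B)
print(f"Parameter reduction: {reduction:.1%}")  # prints "Parameter reduction: 14.3%"
```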

## Pipeline and Collaboration

For insights into the pipeline and the methods used to optimize these models, see our [PruneGPT repository](https://github.com/nyunAI/PruneGPT).
Companies and organizations interested in working with us to release more such open-source variants are welcome to reach out at [email protected].

### Model Performance

| Dataset | Nyun-Llama3-60B | Meta-Llama3-70B | Meta-Llama2-70B | MBZUAI K2-65B |
| --- | --- | --- | --- | --- |
| MMLU (5-shot) | 78.6 | 79.5 | 69.7 | 67.9 |
| Winogrande (5-shot) | 83.4 | 83.1 | 81.8 | 77.0 |
| BoolQ (0-shot) | 85.2 | 79.0 | 73.1 | 83.0 |
| HellaSwag (10-shot) | 85.7 | 88.0 | 86.9 | 85.5 |
| ARC Challenge (25-shot) | 64.4 | 68.8 | 67.2 | 64.8 |
| GSM8K (5-shot) | 68.7 | 76.9 | 52.6 | 50.2 |
| Average | 77.7 | 79.2 | 71.9 | 71.4 |
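
The Average row and the per-benchmark gaps to Meta-Llama3-70B can be recomputed from the table. The sketch below copies the scores as listed; the function names (`column_averages`, `deltas_vs_baseline`) are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean

MODELS = ["Nyun-Llama3-60B", "Meta-Llama3-70B", "Meta-Llama2-70B", "MBZUAI K2-65B"]

# Scores copied from the table, one list per benchmark in MODELS order.
SCORES = {
    "MMLU (5-shot)": [78.6, 79.5, 69.7, 67.9],
    "Winogrande (5-shot)": [83.4, 83.1, 81.8, 77.0],
    "BoolQ (0-shot)": [85.2, 79.0, 73.1, 83.0],
    "HellaSwag (10-shot)": [85.7, 88.0, 86.9, 85.5],
    "ARC Challenge (25-shot)": [64.4, 68.8, 67.2, 64.8],
    "GSM8K (5-shot)": [68.7, 76.9, 52.6, 50.2],
}


def column_averages(scores: dict) -> dict:
    """Average each model's column, rounded to one decimal like the table."""
    columns = zip(*scores.values())
    return {model: round(mean(col), 1) for model, col in zip(MODELS, columns)}


def deltas_vs_baseline(scores: dict, model: str, baseline: str) -> dict:
    """Per-benchmark score difference (model minus baseline)."""
    i, j = MODELS.index(model), MODELS.index(baseline)
    return {name: round(row[i] - row[j], 1) for name, row in scores.items()}


averages = column_averages(SCORES)
deltas = deltas_vs_baseline(SCORES, "Nyun-Llama3-60B", "Meta-Llama3-70B")
wins = [name for name, delta in deltas.items() if delta > 0]

print(averages)
print(deltas)
print("Benchmarks where the 60B model scores higher:", wins)
```

Run as written, the averages match the Average row, and the 60B model comes out ahead of Meta-Llama3-70B on Winogrande and BoolQ, with the largest gap in the other direction on GSM8K.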


- Developed by: [Nyun AI](https://nyunai.com/)
- Repository: [GitHub](https://github.com/nyunAI/PruneGPT)
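
For readers who want to try the model, here is a minimal loading sketch. It assumes the checkpoint loads through the standard `transformers` causal-LM classes and that enough accelerator memory is available for a 60B-parameter model in bfloat16. Neither point is confirmed by this card, and `generate` is a hypothetical helper name.

```python
MODEL_ID = "nyunai/nyun-c1-llama3-60B"  # repo id as shown on the model page


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Return a completion for `prompt`. Downloads the full checkpoint on first use."""
    # Imported lazily so this file can be read or imported without the heavy deps.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16 is suitable for your hardware
        device_map="auto",           # requires the `accelerate` package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Pruning a language model means")` would then return the prompt followed by the model's continuation. The model is reloaded on every call in this sketch, so a real application would load it once and reuse it.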