Quantized GGUF model Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge
This model has been quantized using llama-quantize from llama.cpp
Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge
Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge is a merge of the following models using mergekit:
🧩 Merge Configuration
slices:
- sources:
- model: NousResearch/Hermes-2-Pro-Llama-3-8B
layer_range: [0, 31]
- model: shenzhi-wang/Llama3-8B-Chinese-Chat
layer_range: [0, 31]
merge_method: slerp
base_model: NousResearch/Hermes-2-Pro-Llama-3-8B
parameters:
t:
- filter: self_attn
value: [0, 0.5, 0.3, 0.7, 1]
- filter: mlp
value: [1, 0.5, 0.7, 0.3, 0]
- value: 0.5
dtype: float16
Model Features
This fusion model combines the robust generative capabilities of NousResearch/Hermes-2-Pro-Llama-3-8B with the refined tuning of shenzhi-wang/Llama3-8B-Chinese-Chat, creating a versatile model suitable for a variety of text generation tasks. Leveraging the strengths of both parent models, Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge provides enhanced context understanding, nuanced text generation, and improved performance across diverse NLP tasks, including multilingual capabilities and structured outputs.
Evaluation Results
Hermes-2-Pro-Llama-3-8B
- Scored 90% on function calling evaluation.
- Scored 84% on structured JSON output evaluation.
Llama3-8B-Chinese-Chat
- Significant improvements in roleplay, function calling, and math capabilities compared to previous versions.
- Achieved high performance in both Chinese and English tasks, surpassing ChatGPT in certain benchmarks.
Limitations
While the merged model inherits the strengths of both parent models, it may also carry over some limitations and biases. For instance, the model may exhibit inconsistencies in responses when handling complex queries or when generating content that requires deep contextual understanding. Additionally, the model's performance may vary based on the language used, with potential biases present in the training data affecting the quality of outputs in less represented languages or dialects. Users should remain aware of these limitations when deploying the model in real-world applications.
- Downloads last month
- 116