--- |
|
license: mit |
|
datasets: |
|
- bitext/Bitext-customer-support-llm-chatbot-training-dataset |
|
- MohammadOthman/mo-customer-support-tweets-945k |
|
- taskydata/baize_chatbot |
|
language: |
|
- en |
|
base_model: |
|
- deepseek-ai/DeepSeek-R1-0528 |
|
- unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit |
|
new_version: Aeshp/deepseekR1_tunedchat |
|
pipeline_tag: text-generation |
|
library_name: transformers |
|
tags: |
|
- bitsandbytes |
|
- deepseek |
|
- unsloth |
|
- tensorboard |
|
- text-generation-inference |
|
- llama |
|
- 8B
|
--- |
|
|
|
|
|
# Aeshp/deepseekR1_tunedchat |
|
|
|
This model is a fine-tuned version of [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B), loaded in 4-bit via Unsloth as [unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit). It was trained on the following customer-support and general-chat datasets:
|
|
|
- [taskydata/baize_chatbot](https://huggingface.co/datasets/taskydata/baize_chatbot) |
|
- [MohammadOthman/mo-customer-support-tweets-945k](https://huggingface.co/datasets/MohammadOthman/mo-customer-support-tweets-945k) |
|
- [bitext/Bitext-customer-support-llm-chatbot-training-dataset](https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset) |
|
|
|
Training was performed in three steps; the final weights were merged into the base model and pushed to this repository.

The result is a relatively lightweight 8B-parameter model.
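
For reference, here is a minimal inference sketch using `transformers` (assuming this repo id, `Aeshp/deepseekR1_tunedchat`, and a GPU; adjust the dtype and generation settings to your hardware):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Aeshp/deepseekR1_tunedchat"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # use torch.float16 on GPUs without bf16 support
    device_map="auto",
)

# Build a chat prompt with the model's chat template
messages = [{"role": "user", "content": "My order hasn't arrived yet. What should I do?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.6)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that R1-distilled models emit their chain of thought in `<think>` tags before the final answer, so expect some reasoning text in the output.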
|
|
|
## 📝 License |
|
|
|
This model is released under the MIT license, allowing free use, modification, and further fine-tuning. |
|
|
|
## 💡 How to Fine-Tune Further |
|
|
|
All code and instructions for further fine-tuning, inference, and pushing to the Hugging Face Hub are available in the open-source GitHub repository: |
|
**[https://github.com/Aeshp/deepseekR1finetune](https://github.com/Aeshp/deepseekR1finetune)** |
|
|
|
- You can fine-tune this model on your own domain-specific data. |
|
- Please adjust hyperparameters and dataset size as needed. |
|
- Example scripts and notebooks are provided for both base-model and checkpoint-based fine-tuning; a minimal sketch follows below.
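
As a starting point, the sketch below shows the general Unsloth + TRL recipe this setup follows. The dataset file, text field, and hyperparameters are placeholders, and the exact `SFTTrainer` signature varies across `trl` versions; see the repository above for the actual scripts.

```python
# Minimal LoRA fine-tuning sketch with Unsloth + TRL.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 4-bit base model (the same one this model started from)
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of weights is trained
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Hypothetical dataset: one JSON object per line with a "text" field
dataset = load_dataset("json", data_files="your_domain_data.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",   # adjust to your dataset schema
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()

# Merge the LoRA adapters into the base weights and push to the Hub
# (placeholder repo name; run `huggingface-cli login` first):
# model.push_to_hub_merged("your-username/your-model", tokenizer, save_method="merged_16bit")
```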
|
|
|
## ⚠️ Notes |
|
|
|
- The model may sometimes hallucinate, as is common with LLMs. |
|
- For best results, fine-tune on a large, high-quality dataset to reduce the risk of overfitting.
|
|
|
## 📚 References |
|
|
|
### Hugging Face Models |
|
- [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) |
|
- [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) |
|
- [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) |
|
- [unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit) |
|
|
|
### Datasets |
|
- [taskydata/baize_chatbot](https://huggingface.co/datasets/taskydata/baize_chatbot) |
|
- [MohammadOthman/mo-customer-support-tweets-945k](https://huggingface.co/datasets/MohammadOthman/mo-customer-support-tweets-945k) |
|
- [bitext/Bitext-customer-support-llm-chatbot-training-dataset](https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset) |
|
|
|
### GitHub Repositories |
|
- [Aeshp/deepseekR1finetune](https://github.com/Aeshp/deepseekR1finetune) |
|
- [meta-llama/llama](https://github.com/meta-llama/llama) |
|
- [deepseek-ai/DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) |
|
- [Unsloth Documentation](https://docs.unsloth.ai/) |
|
|
|
### Papers |
|
- [DeepSeek R1 Paper](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf) |
|
|
|
--- |
|
|
|
For all usage instructions, fine-tuning guides, and code, please see the [GitHub repository](https://github.com/Aeshp/deepseekR1finetune). |