	---
	license: mit
	base_model: jpacifico/Chocolatine-3B-Instruct-DPO-Revised
	pipeline_tag: text-generation
	inference: false
	model_creator: jpacifico
	model_name: Chocolatine-3B-Instruct-DPO-Revised
	model_type: phi3
	language:
	- fr
	- en
	datasets:
	- jpacifico/french-orca-dpo-pairs-revised
	library_name: transformers
	quantized_by: ThiloteE
	tags:
	- text-generation-inference
	- transformers
	- GGUF
	- GPT4All-community
	- GPT4All
	- conversational
	- french
	- chocolatine


	---

	> [!NOTE]
	>This model is expected to perform well, but it may need more testing and user feedback. Only models featured within the GPT4All GUI are curated and officially supported by Nomic. Use at your own risk.


	# About

	<!-- ### quantize_version: 3 -->
	<!-- ### convert_type: hf -->


	- Static quants of https://huggingface.co/jpacifico/Chocolatine-3B-Instruct-DPO-Revised at commit [fa3e742](https://huggingface.co/jpacifico/Chocolatine-3B-Instruct-DPO-Revised/commit/fa3e742dd80b3f38127fb62f5fc66eaf468fb95c)
	- Quantized by [ThiloteE](https://huggingface.co/ThiloteE) with llama.cpp commit [e09a800](https://github.com/ggerganov/llama.cpp/commit/e09a800f9a9b19c73aa78e03b4c4be8ed988f3e6)

	These quants were created with a customized configuration that has been shown not to produce visible end-of-sequence (EOS) tokens during inference with [GPT4All](https://www.nomic.ai/gpt4all).
	The config.json, generation_config.json and tokenizer_config.json differ from the versions found in the original model's repository at the time these quants were created.


	# Prompt Template (for GPT4All)

	Example System Prompt:
	```
	<|system|>
	Vous trouverez ci-dessous une instruction décrivant une tâche. Rédigez une réponse qui réponde de manière appropriée à la demande.<|end|>

	```

	Chat Template:
	```
	<|user|>
	%1<|end|>
	<|assistant|>
	%2<|end|>

	```
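As a rough illustration of how the chat template above is filled in, the following sketch substitutes a user message for `%1` and, optionally, a model reply for `%2`. It assumes that `%1` stands for the user prompt and `%2` for the response, which is how GPT4All templates of this style are generally read; `render_turn` is a hypothetical helper, not part of GPT4All.

```python
# Chat template from this model card; %1 = user message, %2 = model reply (assumed).
CHAT_TEMPLATE = "<|user|>\n%1<|end|>\n<|assistant|>\n%2<|end|>\n"


def render_turn(user_message, assistant_reply=None):
    """Fill the template for one turn.

    With no reply, the result stops right after the assistant tag,
    which is the point where the model would start generating.
    """
    # Split on the placeholders instead of chaining str.replace, so a
    # literal "%2" inside the user message is never substituted.
    head, rest = CHAT_TEMPLATE.split("%1", 1)
    middle, tail = rest.split("%2", 1)
    if assistant_reply is None:
        return head + user_message + middle
    return head + user_message + middle + assistant_reply + tail


print(render_turn("Qu'est-ce qu'un grand modèle de langage ?"))
```

Passing a second argument renders a completed turn, which is useful when rebuilding earlier conversation history by hand.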

	# Context Length

	`4096`

	Use a lower value during inference if you do not have enough RAM or VRAM.
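To give a rough sense of why a smaller context saves memory, the sketch below estimates the size of the key/value cache at different context lengths. The architecture values (32 layers, hidden size 3072) are assumptions taken from the Phi-3-mini base architecture and are not stated in this card, and a 16-bit cache is assumed as well. Actual memory use in GPT4All or llama.cpp depends on the backend and its settings, so treat the numbers as an order-of-magnitude guide only.

```python
def kv_cache_bytes(n_ctx, n_layers=32, hidden_size=3072, bytes_per_value=2):
    """Estimate KV cache size in bytes (all defaults are assumptions)."""
    # Each layer stores two tensors (keys and values) of shape n_ctx x hidden_size.
    return 2 * n_layers * n_ctx * hidden_size * bytes_per_value


for n_ctx in (4096, 2048, 1024):
    gib = kv_cache_bytes(n_ctx) / 2**30
    print(f"n_ctx={n_ctx}: ~{gib:.2f} GiB for the KV cache")
```

The estimate scales linearly with the context length, so halving the context roughly halves the cache on top of the fixed size of the model weights.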

	# Provided Quants


	| Link | Type | Size/GB | Notes |
	|:-----|:-----|--------:|:------|
	| [GGUF](https://huggingface.co/GPT4All-Community/Chocolatine-3B-Instruct-DPO-Revised-GGUF/resolve/main/Chocolatine-3B-Instruct-DPO-Revised-Q4_0.gguf?download=true) | Q4_0 | 2.44 | fast, recommended |
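For scripted downloads, a minimal sketch using the `huggingface_hub` package might look like the following. The repository ID and file name are taken from the link in the table; `direct_url` and `download_quant` are hypothetical helper names, and the download helper assumes `huggingface_hub` is installed.

```python
REPO_ID = "GPT4All-Community/Chocolatine-3B-Instruct-DPO-Revised-GGUF"
FILENAME = "Chocolatine-3B-Instruct-DPO-Revised-Q4_0.gguf"


def direct_url(repo_id=REPO_ID, filename=FILENAME):
    """Build the 'resolve/main' URL that the table links to."""
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}"


def download_quant(repo_id=REPO_ID, filename=FILENAME):
    """Download the GGUF file into the local Hugging Face cache and return its path."""
    # Imported lazily so direct_url() works without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)


print(direct_url())
```

Calling `download_quant()` fetches about 2.44 GB, so it is left out of the example run above.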




	# About GGUF

	If you are unsure how to use GGUF files, refer to one of [TheBloke's
	READMEs](https://huggingface.co/TheBloke/DiscoLM_German_7b_v1-GGUF) for
	more details, including on how to concatenate multi-part files.

	Here is a handy graph by ikawrakow comparing some quant types (lower is better):

	![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

	And here are Artefact2's thoughts on the matter:
	https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

	# Thanks

	I thank Mradermacher and TheBloke for inspiring this model card and for their contributions to open source, and 3Simplex for lots of help along the way.
	Shoutout to the GPT4All and llama.cpp communities :-)


	------

	<!-- footer end -->
	<!-- original-model-card start -->


	------
	------

	# Original Model card:

	<!---
	library_name: transformers
	license: mit
	language:
	- fr
	- en
	tags:
	- french
	- chocolatine
	datasets:
	- jpacifico/french-orca-dpo-pairs-revised
	pipeline_tag: text-generation
	--->

	### Chocolatine-3B-Instruct-DPO-Revised

	DPO fine-tune of [microsoft/Phi-3-mini-4k-instruct](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) (3.82B params)
	using the [jpacifico/french-orca-dpo-pairs-revised](https://huggingface.co/datasets/jpacifico/french-orca-dpo-pairs-revised) RLHF dataset.
	Training in French also improves the model in English, surpassing the performance of its base model.
	Context window: 4k tokens

	### Benchmarks

	As of August 2024, Chocolatine is the best-performing 3B model on the [OpenLLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard).

	![image/png](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Assets/openllm_choco3b_revised.png?raw=false)


	| Metric            |Value|
	|-------------------|----:|
	|Avg.               |27.63|
	|IFEval (0-Shot)    |56.23|
	|BBH (3-Shot)       |37.16|
	|MATH Lvl 5 (4-Shot)|14.5|
	|GPQA (0-shot)      |9.62|
	|MuSR (0-shot)      |15.1|
	|MMLU-PRO (5-shot)  |33.21|


	### MT-Bench-French

	Chocolatine-3B-Instruct-DPO-Revised outperforms GPT-3.5-Turbo on average on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french) by Bofeng Huang,
	evaluated with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench). It leads on the second turn and on the overall average, while GPT-3.5-Turbo stays ahead on the first turn.

	```
	########## First turn ##########
	                                           score
	model                                turn
	gpt-3.5-turbo                        1     8.1375
	Chocolatine-3B-Instruct-DPO-Revised  1     7.9875
	Daredevil-8B                         1     7.8875
	Daredevil-8B-abliterated             1     7.8375
	Chocolatine-3B-Instruct-DPO-v1.0     1     7.6875
	NeuralDaredevil-8B-abliterated       1     7.6250
	Phi-3-mini-4k-instruct               1     7.2125
	Meta-Llama-3-8B-Instruct             1     7.1625
	vigostral-7b-chat                    1     6.7875
	Mistral-7B-Instruct-v0.3             1     6.7500
	Mistral-7B-Instruct-v0.2             1     6.2875
	French-Alpaca-7B-Instruct_beta       1     5.6875
	vigogne-2-7b-chat                    1     5.6625
	vigogne-2-7b-instruct                1     5.1375

	########## Second turn ##########
	                                           score
	model                                turn
	Chocolatine-3B-Instruct-DPO-Revised  2     7.937500
	gpt-3.5-turbo                        2     7.679167
	Chocolatine-3B-Instruct-DPO-v1.0     2     7.612500
	NeuralDaredevil-8B-abliterated       2     7.125000
	Daredevil-8B                         2     7.087500
	Daredevil-8B-abliterated             2     6.873418
	Meta-Llama-3-8B-Instruct             2     6.800000
	Mistral-7B-Instruct-v0.2             2     6.512500
	Mistral-7B-Instruct-v0.3             2     6.500000
	Phi-3-mini-4k-instruct               2     6.487500
	vigostral-7b-chat                    2     6.162500
	French-Alpaca-7B-Instruct_beta       2     5.487395
	vigogne-2-7b-chat                    2     2.775000
	vigogne-2-7b-instruct                2     2.240506

	########## Average ##########
	                                     score
	model
	Chocolatine-3B-Instruct-DPO-Revised  7.962500
	gpt-3.5-turbo                        7.908333
	Chocolatine-3B-Instruct-DPO-v1.0     7.650000
	Daredevil-8B                         7.487500
	NeuralDaredevil-8B-abliterated       7.375000
	Daredevil-8B-abliterated             7.358491
	Meta-Llama-3-8B-Instruct             6.981250
	Phi-3-mini-4k-instruct               6.850000
	Mistral-7B-Instruct-v0.3             6.625000
	vigostral-7b-chat                    6.475000
	Mistral-7B-Instruct-v0.2             6.400000
	French-Alpaca-7B-Instruct_beta       5.587866
	vigogne-2-7b-chat                    4.218750
	vigogne-2-7b-instruct                3.698113
	```
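The relationship between the per-turn scores and the average can be checked with a short sketch. It assumes that both turns contain the same number of judged answers, in which case the overall average is the mean of the two turn scores. That holds for the two models used here, but models with missing judgments in one turn will not match this simple mean exactly. `average_scores` is a hypothetical helper.

```python
# Per-turn scores copied from the results above.
first_turn = {
    "Chocolatine-3B-Instruct-DPO-Revised": 7.9875,
    "gpt-3.5-turbo": 8.1375,
}
second_turn = {
    "Chocolatine-3B-Instruct-DPO-Revised": 7.9375,
    "gpt-3.5-turbo": 7.679167,
}


def average_scores(first, second):
    """Mean of the two turn scores (assumes equal judgment counts per turn)."""
    return {model: (first[model] + second[model]) / 2 for model in first}


averages = average_scores(first_turn, second_turn)
# Print models from highest to lowest average score.
for model, score in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {score:.4f}")
```

This shows how a model can trail on the first turn and still lead overall: the second-turn gap in Chocolatine's favor is larger than its first-turn deficit.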

	### Usage

	You can run this model using my [Colab notebook](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Chocolatine_3B_inference_test_colab.ipynb)

	You can also run Chocolatine using the following code:

	```python
	import transformers
	from transformers import AutoTokenizer

	# Hugging Face model ID of the original (unquantized) model
	new_model = "jpacifico/Chocolatine-3B-Instruct-DPO-Revised"

	# Format prompt
	message = [
	    {"role": "system", "content": "You are a helpful assistant chatbot."},
	    {"role": "user", "content": "What is a Large Language Model?"},
	]
	tokenizer = AutoTokenizer.from_pretrained(new_model)
	prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

	# Create pipeline
	pipeline = transformers.pipeline(
	    "text-generation",
	    model=new_model,
	    tokenizer=tokenizer,
	)

	# Generate text
	sequences = pipeline(
	    prompt,
	    do_sample=True,
	    temperature=0.7,
	    top_p=0.9,
	    num_return_sequences=1,
	    max_length=200,
	)
	print(sequences[0]['generated_text'])
	```

	* A 4-bit quantized version is available here: [jpacifico/Chocolatine-3B-Instruct-DPO-Revised-Q4_K_M-GGUF](https://huggingface.co/jpacifico/Chocolatine-3B-Instruct-DPO-Revised-Q4_K_M-GGUF)

	* Ollama: [jpacifico/chocolatine-3b](https://ollama.com/jpacifico/chocolatine-3b)

	```bash
	ollama run jpacifico/chocolatine-3b
	```

	Ollama Modelfile example:

	```bash
	FROM ./chocolatine-3b-instruct-dpo-revised-q4_k_m.gguf
	TEMPLATE """{{ if .System }}<|system|>
	{{ .System }}<|end|>
	{{ end }}{{ if .Prompt }}<|user|>
	{{ .Prompt }}<|end|>
	{{ end }}<|assistant|>
	{{ .Response }}<|end|>
	"""
	# One PARAMETER line per stop sequence
	PARAMETER stop "<|end|>"
	PARAMETER stop "<|user|>"
	PARAMETER stop "<|assistant|>"
	SYSTEM """You are a friendly assistant called Chocolatine."""
	```

	### Limitations

	The Chocolatine model is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance.
	It does not have any moderation mechanism.

	- Developed by: Jonathan Pacifico, 2024
	- Model type: LLM
	- Language(s) (NLP): French, English
	- License: MIT


	<!-- original-model-card end -->
	<!-- end -->