dranger003
/

c4ai-command-r-plus-iMat.GGUF

Text Generation

Inference Endpoints

Model card Files Files and versions Community

c4ai-command-r-plus-iMat.GGUF / README.md

dranger003's picture

Update README.md

97a6ea0 verified 7 months ago

|

2.18 kB

	---
	license: cc-by-nc-4.0
	pipeline_tag: text-generation
	library_name: gguf
	base_model: CohereForAI/c4ai-command-r-plus
	---
	2024-04-05: Support for this model is still being worked on - [`PR#6491`](https://github.com/ggerganov/llama.cpp/pull/6491).
	For now, you can test the model using this fork: [https://github.com/dranger003/llama.cpp/tree/Noeda/commandr-plus](https://github.com/dranger003/llama.cpp/tree/Noeda/commandr-plus)

	* GGUF importance matrix (imatrix) quants for https://huggingface.co/CohereForAI/c4ai-command-r-plus
	* The importance matrix was trained for ~100K tokens (200 batches of 512 tokens) using [wiki.train.raw](https://huggingface.co/datasets/wikitext).
	* [Which GGUF is right for me? (from Artefact2)](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9)
	* The [imatrix is being used on the K-quants](https://github.com/ggerganov/llama.cpp/pull/4930) as well (only for < Q6_K).
	* You can merge GGUFs with `gguf-split --merge <first-chunk> <output-file>` although this is not required since [f482bb2e](https://github.com/ggerganov/llama.cpp/commit/f482bb2e4920e544651fb832f2e0bcb4d2ff69ab).

	> C4AI Command R+ is an open weights research release of a 104B billion parameter model with highly advanced capabilities, this includes Retrieval Augmented Generation (RAG) and tool use to automate sophisticated tasks. The tool use in this model generation enables multi-step tool use which allows the model to combine multiple tools over multiple steps to accomplish difficult tasks. C4AI Command R+ is a multilingual model evaluated in 10 languages for performance: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese. Command R+ is optimized for a variety of use cases including reasoning, summarization, and question answering.

	\| Layers \| Context \| [Template](https://huggingface.co/CohereForAI/c4ai-command-r-plus#tool-use--multihop-capabilities) \|
	\| --- \| --- \| --- \|
	\| <pre>64</pre> \| <pre>131072</pre> \| <pre>\<BOS_TOKEN\>\<\\|START_OF_TURN_TOKEN\\|\>\<\\|USER_TOKEN\\|\>{prompt}\<\\|END_OF_TURN_TOKEN\\|\>\<\\|START_OF_TURN_TOKEN\\|\>\<\\|CHATBOT_TOKEN\\|\>{response}</pre> \|