---
base_model: teknium/OpenHermes-2.5-Mistral-7B
inference: false
license: mit
model_creator: Teknium
model_name: OpenHermes 2.5 - Mistral 7B
model_type: mistral
prompt_template: |
  <|im_start|>system
  {system_message}<|im_end|>
  <|im_start|>user
  {prompt}<|im_end|>
  <|im_start|>assistant
quantized_by: Semantically AI
pruned_by: Semantically AI
---

# OpenHermes 2.5 - Mistral 7B - GGUF

- Model creator: [Teknium](https://huggingface.co/teknium)
- Original model: [OpenHermes 2.5 Mistral 7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)

## Description

This repo contains GGUF format model files for [Teknium's OpenHermes 2.5 Mistral 7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B).

GGUF is a format introduced by the llama.cpp team on August 21st, 2023. It replaces GGML, which is no longer supported by llama.cpp.

## Prompt template: ChatML

```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

## Pruning
The model weights in this repo are pruned to 50% sparsity with [SparseGPT](https://github.com/IST-DASLab/sparsegpt), a one-shot post-training pruning method, before being converted to GGUF.
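SparseGPT itself solves a layer-wise weight-reconstruction problem; as a rough intuition for what 50% unstructured sparsity means, here is a toy magnitude-pruning sketch (a deliberate simplification, not the SparseGPT algorithm):

```python
# Toy illustration only: SparseGPT chooses which weights to zero using a
# Hessian-aware reconstruction objective. This stand-in simply zeroes the
# smallest-magnitude weights, which produces the same *kind* of result:
# a tensor where 50% of the entries are exactly zero.
def prune_to_sparsity(weights, sparsity=0.5):
    """Zero out the fraction `sparsity` of smallest-magnitude weights."""
    k = int(len(weights) * sparsity)
    # Indices of the k smallest-magnitude entries
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.03, 0.2, -0.8]
print(prune_to_sparsity(w))  # → [0.9, 0.0, 0.4, 0.0, -0.7, 0.0, 0.0, -0.8]
```

Zeroed weights compress well and can be skipped by sparse-aware kernels, which is where the size and speed benefits of a pruned model come from.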
## Example usage with llama-cpp-python

```python
from llama_cpp import Llama

# Load the pruned, Q8_0-quantised GGUF file
llm = Llama(model_path="openhermes-2.5-mistral-7b-pruned50-Q8_0.gguf")

# The prompt must follow the ChatML template shown above
output = llm(
    """<|im_start|>system
You are a friendly chatbot who always responds in the style of a pirate.<|im_end|>
<|im_start|>user
How many helicopters can a human eat in one sitting?<|im_end|>
<|im_start|>assistant
"""
)
print(output)
```
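Hard-coding the ChatML string is error-prone; a small helper (hypothetical, not part of llama-cpp-python) can assemble it from the template above:

```python
# Hypothetical convenience function: builds a prompt string matching this
# model's ChatML template, so the special tokens are never mistyped.
def build_chatml_prompt(system_message: str, prompt: str) -> str:
    """Assemble a ChatML prompt for OpenHermes 2.5."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

text = build_chatml_prompt(
    "You are a friendly chatbot who always responds in the style of a pirate.",
    "How many helicopters can a human eat in one sitting?",
)
print(text)
```

The returned string can be passed directly as the prompt argument in the llama-cpp-python call above.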