---
language:
  - en
library_name: transformers
pipeline_tag: text-generation
tags:
  - esper
  - esper-3
  - valiant
  - valiant-labs
  - qwen
  - qwen-3
  - qwen-3-8b
  - 8b
  - deepseek
  - deepseek-r1-0528
  - deepseek-r1
  - reasoning
  - code
  - code-instruct
  - python
  - javascript
  - dev-ops
  - jenkins
  - terraform
  - scripting
  - powershell
  - azure
  - aws
  - gcp
  - cloud
  - problem-solving
  - architect
  - engineer
  - developer
  - creative
  - analytical
  - expert
  - rationality
  - conversational
  - chat
  - instruct
  - llama-cpp
  - gguf-my-repo
base_model: ValiantLabs/DeepSeek-R1-0528-Qwen3-8B-Esper3
datasets:
  - sequelbox/Titanium2.1-DeepSeek-R1
  - sequelbox/Tachibana2-DeepSeek-R1
  - sequelbox/Raiden-DeepSeek-R1
license: apache-2.0
---

Triangle104/DeepSeek-R1-0528-Qwen3-8B-Esper3-Q4_K_M-GGUF

This model was converted to GGUF format from ValiantLabs/DeepSeek-R1-0528-Qwen3-8B-Esper3 using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.


Esper 3 is a coding, architecture, and DevOps reasoning specialist built on Qwen 3.

  • Finetuned on our DevOps, architecture, and code reasoning data generated with DeepSeek R1!
  • Improved general and creative reasoning to supplement problem-solving and general chat performance.
  • Small model sizes allow running on local desktop and mobile, plus super-fast server inference!

Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)

brew install llama.cpp
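
Once installed, you can do a quick sanity check that the binaries are on your PATH (output format varies by llama.cpp build):

llama-cli --version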

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo Triangle104/DeepSeek-R1-0528-Qwen3-8B-Esper3-Q4_K_M-GGUF --hf-file deepseek-r1-0528-qwen3-8b-esper3-q4_k_m.gguf -p "The meaning of life and the universe is"
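
The command above runs a one-shot completion. For an interactive chat session instead, llama-cli also has a conversation mode; the sketch below assumes a recent llama.cpp build where the -cnv (conversation) and -n (max tokens to generate) flags are available:

# -cnv and -n are assumptions about your llama.cpp version; check llama-cli --help if they are not recognized
llama-cli --hf-repo Triangle104/DeepSeek-R1-0528-Qwen3-8B-Esper3-Q4_K_M-GGUF --hf-file deepseek-r1-0528-qwen3-8b-esper3-q4_k_m.gguf -cnv -n 512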

Server:

llama-server --hf-repo Triangle104/DeepSeek-R1-0528-Qwen3-8B-Esper3-Q4_K_M-GGUF --hf-file deepseek-r1-0528-qwen3-8b-esper3-q4_k_m.gguf -c 2048
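
llama-server exposes an OpenAI-compatible HTTP API, by default on port 8080. As a rough example, assuming the default host and port, you can query the chat endpoint with curl:

# assumes llama-server is running locally on the default port 8080
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "Write a Terraform snippet that creates an S3 bucket."}
  ],
  "max_tokens": 512
}'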

Note: You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.

git clone https://github.com/ggerganov/llama.cpp

Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1 flag along with any other hardware-specific flags (for example, LLAMA_CUDA=1 for Nvidia GPUs on Linux).

cd llama.cpp && LLAMA_CURL=1 make
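
Newer llama.cpp releases have moved from the Makefile to CMake, so depending on the version you clone, the equivalent build may look like the following (a hedged sketch; option names such as GGML_CUDA can differ between versions):

# add -DGGML_CUDA=ON (or the flag your version documents) for Nvidia GPU support
cmake -B build -DLLAMA_CURL=ON
cmake --build build --config Release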

Step 3: Run inference through the main binary.

./llama-cli --hf-repo Triangle104/DeepSeek-R1-0528-Qwen3-8B-Esper3-Q4_K_M-GGUF --hf-file deepseek-r1-0528-qwen3-8b-esper3-q4_k_m.gguf -p "The meaning of life and the universe is"

or

./llama-server --hf-repo Triangle104/DeepSeek-R1-0528-Qwen3-8B-Esper3-Q4_K_M-GGUF --hf-file deepseek-r1-0528-qwen3-8b-esper3-q4_k_m.gguf -c 2048
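
The -c 2048 flag sets the context window to 2048 tokens; you can raise it for longer coding or reasoning sessions, and offload layers to a GPU if you built with GPU support. A hedged example (the -ngl value and the maximum usable context depend on your hardware and build):

# -ngl 99 offloads all layers to the GPU; lower it if you run out of VRAM
./llama-server --hf-repo Triangle104/DeepSeek-R1-0528-Qwen3-8B-Esper3-Q4_K_M-GGUF --hf-file deepseek-r1-0528-qwen3-8b-esper3-q4_k_m.gguf -c 8192 -ngl 99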