---
license: apache-2.0
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation-inference
- abliterated
- math
- moderately abliterated
- RL
- code
- R1
---
# Sombrero-R1-14B-Elite13

Sombrero-R1-14B-Elite13 is a fine-tuned variant of [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B), enhanced through reinforcement learning to serve as a high-performance reasoning assistant. It excels at both mathematical problem-solving and general-purpose conversational tasks, combining distilled efficiency with refined instruction following for a balance of speed, capability, and coherence on complex interactive tasks.
## Key Enhancements

- **Reinforcement Learning Fine-Tuning**: Trained with reinforcement learning objectives to optimize for alignment, reward-guided reasoning, and helpfulness in conversation.
- **Mathematical Reasoning Proficiency**: Delivers accurate solutions and step-by-step breakdowns for algebra, calculus, number theory, logic puzzles, and applied mathematics (see the worked example after this list).
- **Instruction Adherence**: Understands and follows multi-part instructions, including structured tasks and iterative refinement prompts.
- **Expanded Context Handling**: Supports up to 128K tokens of context with output lengths up to 8K tokens, well suited to technical and educational use cases.
- **Cross-Domain Knowledge**: Offers broad general knowledge, making it suitable for tutoring, research, and exploratory conversation across topics.
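As a concrete illustration of the kind of step-by-step derivation described above, the integral used in the quickstart prompt below is solved by applying integration by parts twice (standard calculus worked out here for reference, not model output):

$$
\begin{aligned}
\int x^2 e^x \, dx &= x^2 e^x - 2 \int x e^x \, dx \\
&= x^2 e^x - 2 \left( x e^x - \int e^x \, dx \right) \\
&= e^x \left( x^2 - 2x + 2 \right) + C
\end{aligned}
$$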
## Quickstart with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Sombrero-R1-14B-Elite13"

# Load the model in its native precision and shard it across available devices
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Solve: Integrate (x^2 * e^x) dx"
messages = [
    {"role": "system", "content": "You are a helpful AI assistant skilled in math and reasoning."},
    {"role": "user", "content": prompt}
]

# Render the conversation with the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated completion is decoded
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
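For long chains of reasoning, it can help to stream tokens as they are generated rather than waiting for the full completion. A minimal sketch using the `TextStreamer` utility from `transformers`, reusing `model`, `tokenizer`, and `model_inputs` from the quickstart above; `max_new_tokens=8192` is an assumption matching the 8K output budget noted earlier:

```python
from transformers import TextStreamer

# Print decoded tokens to stdout as they are generated,
# skipping the echoed prompt and any special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

_ = model.generate(
    **model_inputs,
    max_new_tokens=8192,  # assumed ceiling, matching the stated 8K output length
    streamer=streamer,
)
```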
## Intended Use Cases

- **Mathematics Problem Solving**: Ideal for step-by-step derivations, symbolic computation, numerical explanations, and LaTeX-formatted output.
- **Educational and Instructional Support**: Helpful in classrooms and learning platforms, offering guided explanations for students and instructors.
- **Chat-Based Reasoning**: Designed for coherent, context-aware dialogue with structured logic and continuity across turns.
- **Multilingual Knowledge Assistance**: Supports 29+ languages, including English, Chinese, French, German, and Arabic, for multilingual learning.
- **Document and Code Explanation**: Explains complex documents, code snippets, and structured logic flows in natural language (a prompt sketch follows this list).
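Code explanation follows the same chat-template pattern as the quickstart; only the message content changes. A minimal sketch, with a hypothetical snippet standing in for real input:

```python
# Hypothetical code the model is asked to explain
code_snippet = '''
def fib(n, memo={}):
    if n < 2:
        return n
    if n not in memo:
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]
'''

messages = [
    {"role": "system", "content": "You are a helpful AI assistant skilled in math and reasoning."},
    {"role": "user", "content": f"Explain what this function does and state its time complexity:\n{code_snippet}"},
]
# From here, apply the chat template and generate exactly as in the quickstart.
```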
## Known Limitations

- **Compute Intensive**: Requires high-memory hardware (e.g., ≥48 GB VRAM) to make full use of the context length and generation capacity; a quantization sketch follows this list.
- **Potential for Bias and Hallucinations**: Despite alignment tuning, some responses may still reflect pretraining biases or produce inaccuracies in edge cases.
- **Drift in Long Responses**: Output structure and accuracy may occasionally degrade over very long generations.
- **Static Knowledge**: Has no real-time awareness of events or research developments after its training cutoff.
- **Creative Task Variability**: Optimized for logic; performance on narrative or subjective content may be inconsistent.
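If ≥48 GB of VRAM is unavailable, 4-bit quantization can reduce the memory footprint considerably, at some cost in output quality. A minimal sketch using `bitsandbytes` through `transformers` (an assumed workaround, not a configuration validated for this model; requires `pip install bitsandbytes`):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Store weights in 4-bit NF4 and run compute in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "prithivMLmods/Sombrero-R1-14B-Elite13",
    quantization_config=bnb_config,
    device_map="auto",
)
```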