Hicoder-R1-Distill-Gemma-27B
Notably, this CoT-enabled model was trained on a single RTX 4090D, made possible by careful management of both GPU VRAM and system RAM, together with memory-saving techniques applied during the training steps.
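The exact recipe is not published. As a rough illustration of how a 27B fine-tune can fit on one 24 GB card, here is a minimal sketch using standard ingredients (4-bit base weights, LoRA adapters, gradient checkpointing, and a paged 8-bit optimizer that spills optimizer state to system RAM); the base-model id and every hyperparameter below are assumptions, not the actual training configuration:

```python
# Illustrative sketch only: the author's actual training setup is not
# published. These are standard ingredients for fitting a 27B fine-tune
# on a single 24 GB GPU; the base-model id and all hyperparameters here
# are assumptions, not the real recipe.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA-style 4-bit base weights
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-27b-pt",                # assumed base checkpoint
    quantization_config=bnb,
    device_map="auto",
)
model.gradient_checkpointing_enable()       # trade compute for activation VRAM
model = get_peft_model(model, LoraConfig(   # train small adapters, not full weights
    r=16, lora_alpha=32, target_modules="all-linear", task_type="CAUSAL_LM",
))
args = TrainingArguments(
    output_dir="hicoder-out",
    per_device_train_batch_size=1,          # tiny micro-batches...
    gradient_accumulation_steps=16,         # ...accumulated into an effective batch
    optim="paged_adamw_8bit",               # pages optimizer state to system RAM
)
```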
Model Overview
Hicoder-R1-Distill-Gemma-27B is a large language model fine-tuned from Google's Gemma-3 27B base model. This model is specifically optimized for Chain-of-Thought (CoT) reasoning and code generation tasks.
- Base Model: google/gemma-3-27b
- Fine-tuned by: tonyli8623
- Focus Areas: Chain-of-Thought (CoT), Code Generation, Code Explanation, Debugging
- Language: Primarily English for prompts and reasoning, generates code in multiple languages.
Key Features
- Enhanced CoT Reasoning: Explicitly trained to break down complex problems into intermediate steps before providing a final answer, particularly useful for complex coding or algorithmic tasks.
- Strong Coding Capabilities: Generates, explains, debugs, and translates code across various programming languages (e.g., Python, JavaScript, Java, C++, SQL, etc.).
- Gemma-3 Foundation: Built upon the powerful and efficient architecture of Google's Gemma-3 27B model.
- Distillation Enhanced (implied by the name): likely benefits from knowledge distillation, improving performance on the target tasks relative to standard fine-tuning.
How to Use
You can use this model with the Hugging Face `transformers` library.
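Gemma-3 architectures require a recent `transformers` release. A quick sanity check first (the 4.50 version floor is our assumption, based on when Gemma-3 support landed upstream, and is not stated in the original card):

```python
# Gemma-3 model classes were added to transformers in the 4.50 line, so
# verify the installed version before loading the checkpoint.
import transformers
from packaging import version  # shipped as a transformers dependency

assert version.parse(transformers.__version__) >= version.parse("4.50.0"), \
    "Please upgrade: pip install -U transformers"
```

With the environment in place, the full example: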
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
# Specify the path to your fine-tuned model (local or Hugging Face Hub ID)
model_id = "tonyli8623/Hicoder-R1-Distill-Gemma-27B"
# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.bfloat16, # Use bfloat16 for efficiency if supported
device_map="auto" # Automatically distribute across available GPUs
)
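# Low-VRAM alternative (an assumption, not part of the original card): the
# bf16 27B checkpoint needs well over 24 GB, so on a single consumer GPU you
# can load 4-bit quantized weights via bitsandbytes instead:
#
#   from transformers import BitsAndBytesConfig
#   bnb_config = BitsAndBytesConfig(load_in_4bit=True,
#                                   bnb_4bit_compute_dtype=torch.bfloat16)
#   model = AutoModelForCausalLM.from_pretrained(model_id,
#                                                quantization_config=bnb_config,
#                                                device_map="auto")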
# --- Example 1: Simple Code Generation ---
prompt_simple = "Write a Python function to calculate the factorial of a number."
# Note: apply the chat template that the base model expects (here, the Gemma-3
# instruct format via tokenizer.apply_chat_template):
messages_simple = [
{"role": "user", "content": prompt_simple}
]
input_ids_simple = tokenizer.apply_chat_template(messages_simple, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs_simple = model.generate(
input_ids_simple,
max_new_tokens=150,
do_sample=True,
temperature=0.7,
top_k=50,
top_p=0.95
)
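# generate() returns the prompt tokens followed by the new tokens, so slice
# off the first input_ids_simple.shape[1] positions to decode only the reply.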
response_simple = tokenizer.decode(outputs_simple[0][input_ids_simple.shape[1]:], skip_special_tokens=True)
print("--- Simple Code Generation ---")
print(response_simple)
# --- Example 2: Code Generation with CoT ---
prompt_cot = """Think step-by-step to write a Python function that finds all prime numbers up to a given integer 'n' using the Sieve of Eratosthenes algorithm. Then, provide the function.
Let's break this down:
1. Understand the Sieve of Eratosthenes.
2. Outline the steps needed in the function.
3. Write the Python code based on the outline."""
messages_cot = [
{"role": "user", "content": prompt_cot}
]
input_ids_cot = tokenizer.apply_chat_template(messages_cot, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs_cot = model.generate(
input_ids_cot,
max_new_tokens=500, # Allow more tokens for CoT + code
do_sample=True,
temperature=0.6,
top_k=50,
top_p=0.95
)
response_cot = tokenizer.decode(outputs_cot[0][input_ids_cot.shape[1]:], skip_special_tokens=True)
print("\n--- Code Generation with CoT ---")
print(response_cot)
```
Prompting: For best results, especially when seeking CoT reasoning, explicitly ask the model to "think step-by-step" or to "provide your reasoning process before the code". In the system prompt, add: "You are a code engineer proficient in various programming languages. Before answering, please carefully consider the question and create a logically coherent thought process, starting with `<think>` and ending with `</think>`. After thinking, provide the answer."
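Putting that system prompt to work, a minimal sketch (assuming the fine-tune follows the R1-style convention of wrapping its reasoning in `<think>`...`</think>`, and reusing the `tokenizer` and `model` loaded above):

```python
# Sketch: set the recommended system prompt, generate, then split the
# reasoning block from the final answer. Assumes the model wraps its
# chain-of-thought in <think>...</think> as instructed.
messages = [
    {"role": "system", "content": (
        "You are a code engineer proficient in various programming languages. "
        "Before answering, please carefully consider the question and create a "
        "logically coherent thought process, starting with <think> and ending "
        "with </think>. After thinking, provide the answer."
    )},
    {"role": "user", "content": "Write a Python function that reverses a string."},
]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=400, do_sample=True, temperature=0.6)
text = tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)

# If the closing tag is present, everything after it is the final answer.
reasoning, sep, answer = text.partition("</think>")
print(answer.strip() if sep else text)
```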
Limitations and Bias
- This model is based on Gemma-3 and inherits its capabilities and limitations.
- While fine-tuned for coding, it may still generate incorrect, inefficient, or insecure code. Always review and test generated code thoroughly.
- The model's knowledge is limited to its training data cutoff.
- Like all LLMs, it may exhibit biases present in the underlying training data.
- Chain-of-Thought reasoning may not always be perfect or logical.
License
The license for this model follows the base Gemma-3 model's license, plus any additional terms added by the fine-tuner. Gemma-3 models are governed by the Gemma Terms of Use; please consult the license file included with the model repository.
- Gemma Terms of Use: https://ai.google.dev/gemma/terms
- Fine-tuning specific license: none specified; unless the repository states otherwise, assume the base model's terms apply.
Citation
If you use this model in your research or work, please consider citing:
@misc{hicoder_r1_distill_gemma_27b_2025,
title={Hicoder-R1-Distill-Gemma-27B: A Chain-of-Thought and Code Generation Focused Model},
author={tonyli8623},
year={2025},
howpublished={\url{https://huggingface.co/tonyli8623/Hicoder-R1-Distill-Gemma-27B}}
}
@misc{gemma3_2025,
title={Gemma 3 Technical Report},
author={Gemma Team, Google},
year={2025},
howpublished={\url{https://ai.google.dev/gemma}}
}
Contact
For questions, feedback, or issues, please contact [email protected].