Finetune any base model (e.g. Qwen3-8B) on any given code repository

Model Description

This model is a fine-tuned version of Qwen/Qwen3-8B, trained to understand and answer questions about a specific code repository. The approach is intended to work for any private or new project repository; this release is trained on Laddr, a framework for building scalable multi-agent systems.

The fine-tuning was performed using LoRA (Low-Rank Adaptation) with an innovative training data generation approach that does not rely on LLM-generated synthetic data, avoiding circular dependencies and hallucination issues.

Key Features

  • Project-Specific Knowledge: Deep understanding of Laddr's architecture, codebase, and APIs
  • Code Location: Accurately locates functions, classes, and modules (+30.0 percentage points over the base model)
  • Code Understanding: Explains code functionality with detailed context (+19.3 percentage points over the base model)
  • Maintains General Abilities: Retains base model's general knowledge capabilities
  • Zero Hallucination Training Data: Generated from real code via AST parsing, not LLM synthesis

Model Details

Base Model

  • Model: Qwen/Qwen3-8B
  • Parameters: 8 Billion
  • Architecture: Transformer-based causal language model

Fine-tuning Specifications

  • Method: LoRA (Low-Rank Adaptation)
  • LoRA Rank: 64
  • LoRA Alpha: 128
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Training Framework: DeepSpeed ZeRO-3
  • Precision: BF16
  • Epochs: 3
  • Dataset Size: 650+ samples in total (520+ of them used for training; see Data Split)
  • Training Time: ~2-3 hours on 2x GPUs (48GB each)
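
To give a rough sense of what these settings imply, the sketch below estimates the number of trainable LoRA parameters. Each adapted linear layer with d_in inputs and d_out outputs gains two low-rank matrices, so it adds rank × (d_in + d_out) parameters. The layer dimensions used here (hidden size 4096, MLP size 12288, 32 query heads and 8 key/value heads of size 128, 36 layers) are assumptions about Qwen3-8B and should be checked against the model's config.json. `lora_param_count` is a hypothetical helper, not part of the training code.

```python
# Sketch: estimate trainable LoRA parameters for the settings listed above.
# All dimensions below are ASSUMED values for Qwen3-8B; verify in config.json.

RANK = 64  # LoRA rank from the specification above

HIDDEN = 4096        # assumed hidden size
Q_DIM = 32 * 128     # assumed: 32 query heads x head size 128
KV_DIM = 8 * 128     # assumed: 8 key/value heads x head size 128
MLP = 12288          # assumed MLP intermediate size
NUM_LAYERS = 36      # assumed number of transformer layers

# (input features, output features) of each targeted projection.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, Q_DIM),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (Q_DIM, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_param_count(shapes, rank, num_layers):
    """Return total LoRA parameters: rank * (d_in + d_out) per module per layer."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


if __name__ == "__main__":
    total = lora_param_count(TARGET_SHAPES, RANK, NUM_LAYERS)
    print(f"Estimated trainable LoRA parameters: {total:,}")
```

Under these assumed dimensions the estimate comes to roughly 175 million trainable parameters, on the order of 2% of the base model's 8 billion.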

Training Data

The training dataset was automatically generated from the Laddr repository using:

  • Python AST parsing for code structure extraction
  • Real docstrings and code comments
  • Function signatures and parameter information
  • Call graph relationships
  • Project statistics and module structure
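
As an illustration of the first three items, the sketch below uses Python's standard `ast` module to pull function names, parameters, return annotations, line numbers, and docstrings out of source text. It is a minimal example of the general technique, not the project's actual analyzer; `extract_functions`, the record layout, and the sample source are hypothetical.

```python
import ast


def extract_functions(source: str, path: str = "<memory>"):
    """Return one record per function definition found in `source`."""
    records = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            records.append({
                "name": node.name,
                "file": path,
                "line": node.lineno,
                # Positional parameter names only; *args/**kwargs are omitted here.
                "params": [a.arg for a in node.args.args],
                # ast.unparse requires Python 3.9 or newer.
                "returns": ast.unparse(node.returns) if node.returns else None,
                "docstring": ast.get_docstring(node),
            })
    return records


# Hypothetical sample source used only for demonstration.
SAMPLE = '''
def greet(name, punctuation="!"):
    """Return a greeting for `name`."""
    return "Hello, " + name + punctuation
'''

if __name__ == "__main__":
    for rec in extract_functions(SAMPLE, "example.py"):
        print(rec)
```

Because every field comes straight from the parsed syntax tree, the resulting records contain nothing that is not present in the source file.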

Data Composition:

  • Code Explanation: 300+ samples (46%)
  • API Usage: 150+ samples (23%)
  • Code Location: 100+ samples (15%)
  • Project Overview: 50+ samples (8%)
  • Design Proposals: 50+ samples (8%)

Data Split:

  • Training: 80% (520+ samples)
  • Validation: 10% (65+ samples)
  • Test: 10% (65+ samples)
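
An 80/10/10 split like the one above can be reproduced with a seeded shuffle. The following is a sketch only: `split_dataset`, the seed value, and the placeholder samples are assumptions, and the actual pipeline may split differently (for example, stratified by task type).

```python
import random


def split_dataset(samples, train=0.8, val=0.1, seed=42):
    """Shuffle deterministically and split into train/validation/test lists."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # local RNG keeps the split reproducible
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    # Whatever remains after train and validation becomes the test set.
    return (
        items[:n_train],
        items[n_train:n_train + n_val],
        items[n_train + n_val:],
    )


if __name__ == "__main__":
    # Placeholder samples standing in for the generated QA pairs.
    data = [f"sample-{i}" for i in range(650)]
    train_set, val_set, test_set = split_dataset(data)
    print(len(train_set), len(val_set), len(test_set))
```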

Performance

Overall Results

Metric               Base Model   Fine-tuned   Improvement (pct. points)
Overall Score        49.4%        71.5%        +22.1
Code Location        60.0%        90.0%        +30.0
Code Understanding   59.3%        78.6%        +19.3
Project Overview     35.0%        51.7%        +16.7
General Knowledge    10.0%        30.0%        +20.0
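
The improvement figures in the table above are absolute differences between the two scores (percentage points), not relative gains. The short sketch below recomputes both from the reported numbers so the two are not confused; `improvement` is a hypothetical helper written for this illustration.

```python
# Base and fine-tuned scores copied from the results table above.
SCORES = {
    "Overall Score": (49.4, 71.5),
    "Code Location": (60.0, 90.0),
    "Code Understanding": (59.3, 78.6),
    "Project Overview": (35.0, 51.7),
    "General Knowledge": (10.0, 30.0),
}


def improvement(base, tuned):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = round(tuned - base, 1)
    relative = round((tuned - base) / base * 100, 1)
    return absolute, relative


if __name__ == "__main__":
    for metric, (base, tuned) in SCORES.items():
        absolute, relative = improvement(base, tuned)
        print(f"{metric}: +{absolute} points ({relative}% relative to base)")
```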

Detailed Performance by Task Type

Code Location Tasks (+30.0%):

  • Accurately identifies file locations of functions/classes
  • Provides complete file paths with line numbers
  • Reduces hedged or uncertain answers to location queries

Code Understanding Tasks (+19.3%):

  • Explains code functionality with context
  • Includes function signatures and parameters
  • Extracts and presents real docstrings

Project Overview Tasks (+16.7%):

  • Describes project purpose and architecture
  • Lists technology stack and dependencies
  • Provides project statistics and structure

General Knowledge (+20.0%):

  • Maintains ability to answer general programming questions
  • Improves conciseness (from verbose to clear)
  • Preserves base model capabilities

Use Cases

1. Code Assistant for Laddr Project

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = "tensense/code_repo_finetuning"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Format the question with the model's chat template
question = "Where is the create_system_tools function located in Laddr?"
messages = [{"role": "user", "content": question}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Keep the inputs on the same device as the model
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)

2. API Documentation Assistant

Query about function usage, parameters, and return values.

3. Codebase Onboarding

Help new developers quickly understand the Laddr project structure and key components.

4. Code Review Support

Provide context and explanations for code review processes.

Example Queries

Query 1: Code Location

Question: "Where is create_system_tools located in Laddr?"

Response:

create_system_tools is located in lib/laddr/src/laddr/core/system_tools.py.

Function Description: Create system tools for task delegation and artifact management. This function checks for user-provided overrides before registering the default implementations.

Function Signature:

def create_system_tools(message_bus, storage_backend=None, agent=None) -> dict[str, tuple[Any, list[str]]]:

Parameters:

  • message_bus (Any): Message queue backend
  • storage_backend (Any): Optional storage backend
  • agent (Any): Optional agent instance

Returns: dict[str, tuple[Any, list[str]]]

Query 2: Project Overview

Question: "What is Laddr's main functionality?"

Response:

Laddr is a Python project that serves as a framework for building scalable multi-agent systems.

Main Technologies: bandit, black, boto3-stubs, build, click, etc.

The project contains 279 code elements, consisting of 67 classes and 212 functions/methods.

Core Modules:

  • core (279 elements)
  • cli (52 elements)
  • llms (39 elements)

Limitations

  • Project-Specific: Optimized for Laddr project; may not perform as well on other codebases
  • Knowledge Cutoff: Based on the Laddr repository as of training time (2025-01)
  • Language Focus: Primarily trained on Python code and English/Chinese documentation
  • Limited General Coding: While it maintains general knowledge, it's optimized for Laddr-specific queries

Training Methodology

Innovation: LLM-Free Training Data Generation

Unlike traditional approaches that use LLMs to generate synthetic training data, this project employs a novel methodology:

  1. AST-Based Code Parsing: Python Abstract Syntax Tree analysis extracts accurate code structure
  2. Real Documentation: Utilizes actual docstrings, comments, and code signatures
  3. Call Graph Analysis: Builds function dependency relationships
  4. Pattern Extraction: Identifies code patterns (implementation, usage, interaction)
  5. Template-Based QA: Generates question-answer pairs using templates with real code context
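
Step 3 above can be approximated with a few lines of `ast` traversal. The sketch below records, for each top-level function, the names it calls. It is a simplified, name-based illustration: it does not resolve imports, methods, or aliases to their definitions, and `build_call_graph` and the sample source are hypothetical rather than taken from the project.

```python
import ast


def build_call_graph(source: str):
    """Map each top-level function name to the set of simple names it calls."""
    tree = ast.parse(source)
    graph = {}
    for func in tree.body:
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        calls = graph.setdefault(func.name, set())
        for node in ast.walk(func):
            if isinstance(node, ast.Call):
                if isinstance(node.func, ast.Name):         # plain call: foo()
                    calls.add(node.func.id)
                elif isinstance(node.func, ast.Attribute):  # attribute call: obj.foo()
                    calls.add(node.func.attr)
    return graph


# Hypothetical sample source used only for demonstration.
SAMPLE = '''
def load(path):
    return open(path).read()

def main():
    text = load("a.txt")
    print(text.upper())
'''

if __name__ == "__main__":
    for caller, callees in build_call_graph(SAMPLE).items():
        print(caller, "->", sorted(callees))
```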

Benefits:

  • ✅ Avoids circular dependency (training an LLM on LLM-generated data)
  • ✅ Keeps hallucinated content out of the training data, since answers are built from parsed code
  • ✅ Ensures factual accuracy
  • ✅ Provides complete reasoning traces
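
These benefits follow from how the answers are assembled: a template is filled only with fields extracted from the code, so the answer cannot contain anything the parser did not find. The sketch below shows the idea for a code-location question, using the `create_system_tools` details from the example query in this card. The template wording, `make_location_qa`, and the record layout are assumptions made for illustration, not the project's actual templates.

```python
# Hypothetical question template for code-location samples.
LOCATION_TEMPLATE = "Where is {name} located in {project}?"


def make_location_qa(record, project):
    """Build a code-location QA pair using only fields extracted from the code."""
    question = LOCATION_TEMPLATE.format(name=record["name"], project=project)
    answer = f"{record['name']} is located in {record['file']}."
    # Optional sections are added only when the extracted record has the data.
    if record.get("docstring"):
        answer += f"\n\nFunction Description: {record['docstring']}"
    if record.get("params"):
        answer += "\n\nParameters: " + ", ".join(record["params"])
    return {"question": question, "answer": answer}


if __name__ == "__main__":
    record = {
        "name": "create_system_tools",
        "file": "lib/laddr/src/laddr/core/system_tools.py",
        "docstring": "Create system tools for task delegation and artifact management.",
        "params": ["message_bus", "storage_backend", "agent"],
    }
    qa = make_location_qa(record, "Laddr")
    print(qa["question"])
    print(qa["answer"])
```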

Training Pipeline

GitHub Repository
    ↓
[1. Repository Analyzer]
    → Extracts code elements, patterns, call graph
    ↓
[2. Data Generator]
    → Creates QA pairs with code context
    ↓
[3. Model Fine-tuner]
    → LoRA + DeepSpeed ZeRO-3 training
    ↓
[4. LoRA Merger]
    → Merges adapter into base model
    ↓
[5. Model Evaluator]
    → Compares base vs fine-tuned
    ↓
Fine-tuned Model
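
Step 4 folds the trained adapter back into the base weights so the result can be loaded as a single model. In the standard LoRA formulation, which is assumed here to be what the merge step applies, each adapted weight matrix is updated as follows:

```latex
W_{\text{merged}} = W_0 + \frac{\alpha}{r}\, B A,
\qquad
B \in \mathbb{R}^{d_{\text{out}} \times r},\quad
A \in \mathbb{R}^{r \times d_{\text{in}}}
```

With the rank r = 64 and alpha = 128 listed under Fine-tuning Specifications, the scaling factor alpha / r is 2.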

Extensibility

The training methodology is repository-agnostic and can be applied to any codebase:

Adapt to Your Repository

# 1. Update configuration
python utils/config_manager.py https://github.com/your-org/your-repo

# 2. Analyze repository
python scripts/01_analyze_repo.py

# 3. Generate training data
python scripts/02_generate_data.py

# 4. Fine-tune model
deepspeed --num_gpus=2 scripts/03_train_model.py

# 5. Merge LoRA weights
python scripts/04_merge_weights.py

# 6. Evaluate
python scripts/05_evaluate.py
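
The six steps above can also be chained from a small Python driver that stops at the first failing step. This is a convenience sketch, not a script shipped with the project: `build_steps` and `run_pipeline` are hypothetical names, and the script paths are simply the ones listed above.

```python
import subprocess
import sys


def build_steps(repo_url, num_gpus=2):
    """Return the pipeline commands in execution order."""
    py = sys.executable  # run the scripts with the current interpreter
    return [
        [py, "utils/config_manager.py", repo_url],
        [py, "scripts/01_analyze_repo.py"],
        [py, "scripts/02_generate_data.py"],
        ["deepspeed", f"--num_gpus={num_gpus}", "scripts/03_train_model.py"],
        [py, "scripts/04_merge_weights.py"],
        [py, "scripts/05_evaluate.py"],
    ]


def run_pipeline(repo_url, num_gpus=2):
    """Run each step in order; check=True raises on the first failure."""
    for cmd in build_steps(repo_url, num_gpus):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python run_pipeline.py https://github.com/your-org/your-repo
    run_pipeline(sys.argv[1])
```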

Supported Languages (currently):

  • Python (primary)
  • Markdown (documentation)

Extensible to:

  • JavaScript/TypeScript
  • Java
  • Go
  • Rust

Ethical Considerations

  • Code Attribution: All training data comes from the open-source Laddr repository
  • License Compliance: Respects Apache 2.0 license of both base model and Laddr project
  • No Private Data: Only uses publicly available code
  • Reproducibility: Complete methodology documented for transparency

Citation

If you use this model or methodology in your research, please cite:

@misc{qwen3-code-repo-finetuned-2025,
  title={Finetune any base model (e.g. Qwen3-8B) on any given code repository},
  author={Tensense},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/tensense/code_repo_finetuning}
}

Acknowledgments

  • Base Model: Qwen Team for Qwen3-8B
  • Laddr Project: AgnetLabs for the multi-agent framework
  • Training Framework: HuggingFace Transformers, DeepSpeed, PEFT (LoRA)

License

This model is released under the Apache 2.0 License, consistent with:

  • Qwen3-8B base model license
  • Laddr project license

Model Card Authors

[Tensense]

Model Card Contact

For questions or issues, please contact:


Additional Resources

Version History

v1.0 (2025-11-15)

  • Initial release
  • Fine-tuned on Laddr repository
  • 650+ training samples
  • LoRA rank 64, alpha 128
  • 3 epochs training
  • Overall improvement: +22.1 percentage points (49.4% to 71.5%)

Note: This is a demonstration of repository-specific fine-tuning methodology. The approach can be adapted to any codebase for creating custom code assistants.

