Finetune any base model (e.g. Qwen3-8B) on any given code repository
Model Description
This model is a fine-tuned version of Qwen/Qwen3-8B, trained specifically to understand and answer questions about a given private or newly created project repository, for example Laddr, a framework for building scalable multi-agent systems.
The fine-tuning was performed with LoRA (Low-Rank Adaptation) using an innovative training data generation approach that does not rely on LLM-generated synthetic data, which avoids circular dependencies and hallucination issues.
Key Features
- ✅ Project-Specific Knowledge: Deep understanding of Laddr's architecture, codebase, and APIs
- ✅ Code Location: Accurately locates functions, classes, and modules (+30% improvement)
- ✅ Code Understanding: Explains code functionality with detailed context (+19.3% improvement)
- ✅ Maintains General Abilities: Retains base model's general knowledge capabilities
- ✅ Zero Hallucination Training Data: Generated from real code via AST parsing, not LLM synthesis
Model Details
Base Model
- Model: Qwen/Qwen3-8B
- Parameters: 8 Billion
- Architecture: Transformer-based causal language model
Fine-tuning Specifications
- Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 64
- LoRA Alpha: 128
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Training Framework: DeepSpeed ZeRO-3
- Precision: BF16
- Epochs: 3
- Training Samples: 650+
- Training Time: ~2-3 hours on 2x GPUs (48GB each)
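For reference, a minimal PEFT setup matching these hyperparameters might look like the sketch below (illustrative only; the repository's actual training script is the source of truth):

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base model in BF16, matching the precision listed above
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B", torch_dtype=torch.bfloat16
)

# LoRA configuration mirroring the specifications above
lora_config = LoraConfig(
    r=64,            # LoRA rank
    lora_alpha=128,  # LoRA alpha
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```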
Training Data
The training dataset was automatically generated from the Laddr repository using:
- Python AST parsing for code structure extraction
- Real docstrings and code comments
- Function signatures and parameter information
- Call graph relationships
- Project statistics and module structure
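For illustration, a generated sample might look like the following (the field names here are an assumption for illustration, not the project's actual schema):

```python
# A hypothetical generated sample; field names are illustrative only.
sample = {
    "task_type": "code_location",
    "question": "Where is the create_system_tools function located in Laddr?",
    "answer": (
        "create_system_tools is located in "
        "lib/laddr/src/laddr/core/system_tools.py. It creates system tools "
        "for task delegation and artifact management."
    ),
}
```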
Data Composition:
- Code Explanation: 300+ samples (46%)
- API Usage: 150+ samples (23%)
- Code Location: 100+ samples (15%)
- Project Overview: 50+ samples (8%)
- Design Proposals: 50+ samples (8%)
Data Split:
- Training: 80% (520+ samples)
- Validation: 10% (65+ samples)
- Test: 10% (65+ samples)
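A split like this can be reproduced with a simple seeded shuffle (a minimal sketch, assuming the samples are held in a Python list):

```python
import random

def split_dataset(samples: list, seed: int = 42) -> tuple[list, list, list]:
    """Seeded shuffle followed by an 80/10/10 train/validation/test split."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * 0.8)
    n_val = int(len(shuffled) * 0.1)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )
```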
Performance
Overall Results
| Metric | Base Model | Fine-tuned | Improvement |
|---|---|---|---|
| Overall Score | 49.4% | 71.5% | +22.1% ✅ |
| Code Location | 60.0% | 90.0% | +30.0% ⭐ |
| Code Understanding | 59.3% | 78.6% | +19.3% |
| Project Overview | 35.0% | 51.7% | +16.7% |
| General Knowledge | 10.0% | 30.0% | +20.0% |
Detailed Performance by Task Type
Code Location Tasks (+30.0%):
- Accurately identifies file locations of functions/classes
- Provides complete file paths with line numbers
- Substantially reduces uncertainty in location queries
Code Understanding Tasks (+19.3%):
- Explains code functionality with context
- Includes function signatures and parameters
- Extracts and presents real docstrings
Project Overview Tasks (+16.7%):
- Describes project purpose and architecture
- Lists technology stack and dependencies
- Provides project statistics and structure
General Knowledge (+20.0%):
- Maintains ability to answer general programming questions
- Gives more concise answers (less verbose than the base model)
- Preserves base model capabilities
Use Cases
1. Code Assistant for Laddr Project
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "tensense/code_repo_finetuning"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

question = "Where is the create_system_tools function located in Laddr?"
messages = [{"role": "user", "content": question}]

# Build the chat-formatted prompt and generate a response
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
2. API Documentation Assistant
Query about function usage, parameters, and return values.
3. Codebase Onboarding
Help new developers quickly understand the Laddr project structure and key components.
4. Code Review Support
Provide context and explanations for code review processes.
Example Queries
Query 1: Code Location
Question: "Where is create_system_tools located in Laddr?"
Response:
`create_system_tools` is located in `lib/laddr/src/laddr/core/system_tools.py`.

Function Description: Create system tools for task delegation and artifact management. This function checks for user-provided overrides before registering the default implementations.

Function Signature:

```python
def create_system_tools(message_bus, storage_backend=None, agent=None) -> dict[str, tuple[Any, list[str]]]:
```

Parameters:
- `message_bus` (Any): Message queue backend
- `storage_backend` (Any): Optional storage backend
- `agent` (Any): Optional agent instance

Returns: `dict[str, tuple[Any, list[str]]]`
Query 2: Project Overview
Question: "What is Laddr's main functionality?"
Response:
Laddr is a Python project that serves as a framework for building scalable multi-agent systems.
Main Technologies: bandit, black, boto3-stubs, build, click, etc.
The project contains 279 code elements, consisting of 67 classes and 212 functions/methods.
Core Modules:
- `core` (279 elements)
- `cli` (52 elements)
- `llms` (39 elements)
Limitations
- Project-Specific: Optimized for Laddr project; may not perform as well on other codebases
- Knowledge Cutoff: Based on the Laddr repository as of training time (2025-01)
- Language Focus: Primarily trained on Python code and English/Chinese documentation
- Limited General Coding: While it maintains general knowledge, it's optimized for Laddr-specific queries
Training Methodology
Innovation: LLM-Free Training Data Generation
Unlike traditional approaches that use LLMs to generate synthetic training data, this project employs a novel methodology:
- AST-Based Code Parsing: Python Abstract Syntax Tree analysis extracts accurate code structure
- Real Documentation: Utilizes actual docstrings, comments, and code signatures
- Call Graph Analysis: Builds function dependency relationships
- Pattern Extraction: Identifies code patterns (implementation, usage, interaction)
- Template-Based QA: Generates question-answer pairs using templates with real code context
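As an illustrative sketch of steps 1, 2, and 5, the snippet below uses Python's built-in ast module to pull real function names, paths, and docstrings from a repository and fill a question template (the template strings and record layout are assumptions, not the project's actual generator):

```python
import ast
from pathlib import Path

# Hypothetical templates; the real generator covers many more task types.
QUESTION_TEMPLATE = "Where is the {name} function located in {project}?"
ANSWER_TEMPLATE = "`{name}` is located in `{path}`. {summary}"

def generate_location_qa(repo_root: str, project: str = "Laddr") -> list[dict]:
    """Parse every Python file with the ast module and emit template-based
    question-answer pairs built from real names, paths, and docstrings."""
    root = Path(repo_root)
    pairs = []
    for path in root.rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                docstring = ast.get_docstring(node) or ""
                pairs.append({
                    "question": QUESTION_TEMPLATE.format(name=node.name, project=project),
                    "answer": ANSWER_TEMPLATE.format(
                        name=node.name,
                        path=path.relative_to(root),
                        summary=docstring.splitlines()[0] if docstring else "",
                    ),
                })
    return pairs
```

Because every answer is assembled from facts the parser actually extracted, the resulting data contains no model-invented content.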
Benefits:
- ✅ Avoids circular dependencies (training an LLM on LLM-generated data)
- ✅ Eliminates hallucination in training data
- ✅ Ensures factual accuracy
- ✅ Provides complete reasoning traces
Training Pipeline
```
GitHub Repository
    ↓
[1. Repository Analyzer]
    → Extracts code elements, patterns, call graph
    ↓
[2. Data Generator]
    → Creates QA pairs with code context
    ↓
[3. Model Fine-tuner]
    → LoRA + DeepSpeed ZeRO-3 training
    ↓
[4. LoRA Merger]
    → Merges adapter into base model
    ↓
[5. Model Evaluator]
    → Compares base vs fine-tuned
    ↓
Fine-tuned Model
```
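Step 4 corresponds to the standard PEFT merge flow. A minimal sketch (the adapter path `output/lora_adapter` is an assumed example):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, attach the trained LoRA adapter, and fold the weights in
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(base, "output/lora_adapter").merge_and_unload()

# Save a standalone fine-tuned model that no longer needs PEFT at load time
merged.save_pretrained("output/merged_model")
AutoTokenizer.from_pretrained("Qwen/Qwen3-8B").save_pretrained("output/merged_model")
```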
Extensibility
The training methodology is repository-agnostic and can be applied to any codebase:
Adapt to Your Repository
```bash
# 1. Update configuration
python utils/config_manager.py https://github.com/your-org/your-repo

# 2. Analyze repository
python scripts/01_analyze_repo.py

# 3. Generate training data
python scripts/02_generate_data.py

# 4. Fine-tune model
deepspeed --num_gpus=2 scripts/03_train_model.py

# 5. Merge LoRA weights
python scripts/04_merge_weights.py

# 6. Evaluate
python scripts/05_evaluate.py
```
Supported Languages (currently):
- Python (primary)
- Markdown (documentation)
Extensible to:
- JavaScript/TypeScript
- Java
- Go
- Rust
Ethical Considerations
- Code Attribution: All training data comes from the open-source Laddr repository
- License Compliance: Respects Apache 2.0 license of both base model and Laddr project
- No Private Data: Only uses publicly available code
- Reproducibility: Complete methodology documented for transparency
Citation
If you use this model or methodology in your research, please cite:
```bibtex
@misc{qwen3-code-repo-finetuned-2025,
  title={Finetune any base model (e.g. Qwen3-8B) on any given code repository},
  author={Tensense},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/tensense/code_repo_finetuning}
}
```
Acknowledgments
- Base Model: Qwen Team for Qwen3-8B
- Laddr Project: AgnetLabs for the multi-agent framework
- Training Framework: HuggingFace Transformers, DeepSpeed, PEFT (LoRA)
License
This model is released under the Apache 2.0 License, consistent with:
- Qwen3-8B base model license
- Laddr project license
Model Card Authors
Tensense
Model Card Contact
For questions or issues, please contact:
- Email: [email protected]
- GitHub: TopologyApplied
- HuggingFace: tensense
Additional Resources
- Base Model: Qwen/Qwen3-8B
- Training Code: GitHub Repository
- Checkpoint & Finetuned Model: Huggingface
- Laddr Project: GitHub
- Evaluation Report: [Link to comparison_report.json]
- Design Documentation: [Link to design docs]
Version History
v1.0 (2025-11-15)
- Initial release
- Fine-tuned on Laddr repository
- 650+ training samples
- LoRA rank 64, alpha 128
- 3 epochs training
- Overall improvement: +22.1%
Note: This is a demonstration of repository-specific fine-tuning methodology. The approach can be adapted to any codebase for creating custom code assistants.