---
base_model: Qwen/Qwen2.5-3B-Instruct
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:Qwen/Qwen2.5-3B-Instruct
- lora
- transformers
- custom-llm
- knowledge-llm
- tony-stark
- fine-tuning
license: mit
language:
- en
---


# 🧠 Custom Knowledge LLM: Tony Stark Edition
![Banner](./banner.png)

This is a fine-tuned version of the [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) model, adapted to answer domain-specific questions related to **Tony Stark**, using the LoRA (Low-Rank Adaptation) method for parameter-efficient fine-tuning.

---

## πŸ“Œ Model Details

### Model Description

This project is a fun + educational experiment that fine-tunes a base LLM using a fictional dataset based on Tony Stark from the Marvel universe.

- **Developed by:** [Aviral Srivastava](https://www.linkedin.com/in/aviral-srivastava26/)
- **Model type:** Causal Language Model (Instruction-tuned)
- **Language:** English
- **License:** MIT
- **Finetuned from model:** [`Qwen/Qwen2.5-3B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct)

---

## πŸ§‘β€πŸ’» Uses

### Direct Use

This model is fine-tuned to answer Tony Stark–related prompts such as:

- "Who is Tony Stark?"
- "What suits did Iron Man build?"
- "What are leadership traits of Stark?"

### Downstream Use

The methodology can be directly reused for:
- Corporate knowledge assistants
- Domain-specific customer support
- Educational tutors trained on custom material
- Healthcare, law, and e-commerce Q&A bots

### Out-of-Scope Use

This model is not designed for:
- Real-world advice in medical, legal, or financial domains
- Factual accuracy outside of Tony Stark lore
- Handling unrelated general-purpose queries

---

## ⚠️ Bias, Risks, and Limitations

- This model is trained on fictional data and is not meant for serious tasks.
- It reflects only the content provided in the custom dataset.
- It may "hallucinate" facts if asked general questions.

### Recommendations

Please do not use this for any commercial or factual purpose without re-training on a verified dataset.

---

## πŸš€ How to Use

```python
from transformers import pipeline

# Loading this repo applies the LoRA adapter on top of the
# Qwen2.5-3B-Instruct base model (requires the `peft` package).
qa = pipeline(
    "text-generation",
    model="Avirallm/Custom-Knowledge-LLM-Tony-Stark-Edition",
    tokenizer="Avirallm/Custom-Knowledge-LLM-Tony-Stark-Edition",
    device="cuda",  # or "cpu" if not using a GPU
)

qa("List all Iron Man suits and their features.")
```

## πŸ‹οΈβ€β™‚οΈ Training Details

### πŸ“¦ Training Data  
A custom JSON dataset of prompt-completion pairs related to Tony Stark. Example entry:

```json
{
  "prompt": "Who is Tony Stark?",
  "completion": "Tony Stark is a fictional billionaire inventor from Marvel..."
}
```
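Pairs in this shape are typically joined into a single training string before tokenization. A minimal sketch using only the standard library is below; the exact prompt template used in training is not documented here, so the `### Question:` / `### Answer:` markers are illustrative assumptions, not the model's real format.

```python
import json

# Prompt-completion pairs in the same shape as the dataset example above.
pairs = [
    {
        "prompt": "Who is Tony Stark?",
        "completion": "Tony Stark is a fictional billionaire inventor from Marvel...",
    }
]

def format_example(pair: dict) -> str:
    """Join a prompt and its completion into one training string.

    The template markers here are hypothetical, for illustration only.
    """
    return f"### Question:\n{pair['prompt']}\n### Answer:\n{pair['completion']}"

# Round-trip through JSON, as the dataset would be stored on disk.
serialized = json.dumps(pairs)
texts = [format_example(p) for p in json.loads(serialized)]
print(texts[0])
```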

### πŸ”§ Training Hyperparameters  
- **Epochs:** 10  
- **Batch Size:** 1  
- **Optimizer:** AdamW  
- **Learning Rate:** 0.001  
- **Mixed Precision:** FP16  
- **Framework:** Hugging Face `Trainer` + PEFT LoRA  

### πŸ–₯️ Training Setup  
- Trained fully on **Google Colab Free Tier**  
- Using **Qwen/Qwen2.5-3B-Instruct** with LoRA adapters  
- Fine-tuned only **adapter layers** (not full model)  
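Training only the adapter layers is what makes the free Colab tier viable: a rank-`r` LoRA adapter on a `d_out × d_in` weight matrix trains `r·(d_in + d_out)` parameters instead of `d_in·d_out`. A back-of-envelope check, with illustrative dimensions rather than the exact Qwen2.5-3B shapes:

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    # LoRA adds A (r x d_in) and B (d_out x r); only these are trained.
    return r * d_in + d_out * r

def full_params(d_in: int, d_out: int) -> int:
    # Full fine-tuning would update every entry of the weight matrix.
    return d_in * d_out

d = 2048  # hypothetical hidden size, for illustration
r = 8     # a typical LoRA rank

print(lora_params(d, d, r))                       # 32768 trainable params
print(full_params(d, d))                          # 4194304 trainable params
print(lora_params(d, d, r) / full_params(d, d))   # 0.0078125, under 1%
```

Per matrix, the adapter trains well under 1% of the parameters, which is why only the small `adapter_model.safetensors` file needs to be saved and shared.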

---

## πŸ“Š Evaluation

This project is **primarily exploratory** and not evaluated on public benchmarks.

---

## 🌱 Environmental Impact

- **Hardware:** Google Colab Free GPU (Tesla T4)  
- **Training Time:** ~380 seconds (10 epochs, 1580 steps)  
- **Carbon Emission:** Negligible (low-compute, single GPU)  

---

## 🧠 Architecture

- **Base Model:** Qwen2.5-3B-Instruct (Alibaba Cloud)  
- **Fine-Tuning:** LoRA adapters on top of base weights  
- **Task Type:** Text generation, instruction following  
- **Token Limit:** 128 tokens (during training)  
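The "LoRA adapters on top of base weights" idea above can be sketched numerically: the frozen base weight `W` gets a low-rank update `B·A` scaled by `alpha / r`, and `B` is zero-initialized so the adapted model starts out identical to the base model. Shapes here are toy values, not the real model's.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 16, 4, 8

W = rng.normal(size=(d_out, d_in))     # frozen base weight (not trained)
A = rng.normal(size=(r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))               # trainable up-projection, zero-init

def lora_forward(x: np.ndarray) -> np.ndarray:
    # y = Wx + (alpha / r) * B(Ax): base path plus low-rank adapter path
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# With B = 0, the adapter contributes nothing: output equals the base model's.
assert np.allclose(lora_forward(x), W @ x)
```

During fine-tuning only `A` and `B` receive gradients; the adapter can later be merged into `W` (as `W + (alpha/r)·B·A`) for zero-overhead inference.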

---

## ✨ Example Applications

- Fan-based AI chatbot (Iron Man Assistant)  
- Fictional universe assistants for games and comics  
- Domain-specific tutors for educational platforms  
- Startup knowledge bots (replace "Tony Stark" with your brand)  

---

## πŸ“ Repository Structure

- `adapter_model.safetensors` – LoRA adapter weights  
- `tokenizer_config.json`, `tokenizer.json`, `vocab.json` – Tokenizer files  
- `README.md` – Project overview  
- `training_args.bin` – Training arguments  
- `tonyst.json` (optional) – Custom dataset (if shared)  

---

## πŸ“¬ Get in Touch

Have a use case in mind? Want your own custom-trained LLM?  
πŸ“§ **Email:** [[email protected]](mailto:[email protected])  
πŸ”— **LinkedIn:** [Aviral Srivastava](https://www.linkedin.com/in/aviral-srivastava26/)  
πŸ’» **GitHub:** [aviral-sri](https://github.com/aviral-sri)  

---

## πŸ™ Credits

- **Base Model:** [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct)  
- **Fine-Tuning:** PEFT + LoRA  
- **Tools Used:**  
  - Hugging Face Transformers  
  - Hugging Face Datasets  
  - Google Colab  
  - W&B for tracking  

**Inspired by:** Marvel's Tony Stark (for learning only, non-commercial)

---

## πŸͺͺ License

This project is licensed under the MIT License.  
Feel free to modify, share, and build upon it.