---
base_model: theprint/Tom-Qwen-7B-Instruct
library_name: peft
pipeline_tag: text-generation
language: en
license: apache-2.0
tags:
- lora
- sft
- transformers
- trl
- unsloth
- fine-tuned
datasets:
- theprint/ReWiz
---

# Rewiz-Tom-7B

A fine-tuned 7B parameter model specialized in reasoning (ReWiz), based on a model that was already fine-tuned for step-by-step instruction following and conversation (Tom).

## Model Details

This model is a fine-tuned version of theprint/Tom-Qwen-7B-Instruct using the Unsloth framework with LoRA (Low-Rank Adaptation) for efficient training.

- **Developed by:** theprint
- **Model type:** Causal Language Model (fine-tuned with LoRA)
- **Language:** en
- **License:** apache-2.0
- **Base model:** theprint/Tom-Qwen-7B-Instruct
- **Fine-tuning method:** LoRA with rank 128

## Intended Use

Conversation, brainstorming, and general instruction following.

## Training Details

### Training Data

The ReWiz dataset is a curated mix of 20,000 reasoning-based entries.

- **Dataset:** theprint/ReWiz
- **Format:** Alpaca (see the example below)
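
For reference, an Alpaca-formatted entry pairs an instruction (plus an optional input) with a target response. The record below is purely illustrative and is not taken from ReWiz:

```python
# Hypothetical Alpaca-style record (illustrative; actual ReWiz entries may differ)
example = {
    "instruction": "Explain why the sum of two odd numbers is always even.",
    "input": "",
    "output": (
        "Write the odd numbers as 2a + 1 and 2b + 1. Their sum is "
        "(2a + 1) + (2b + 1) = 2(a + b + 1), which is divisible by 2, so it is even."
    ),
}
```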

### Training Procedure

- **Training epochs:** 2
- **LoRA rank:** 128
- **Learning rate:** 0.0002
- **Batch size:** 4
- **Framework:** Unsloth + transformers + PEFT
- **Hardware:** NVIDIA RTX 5090
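
The exact training script is not published. Below is a minimal sketch of how a run with the hyperparameters above might look using Unsloth with TRL's `SFTTrainer`; the `target_modules`, `lora_alpha`, dataset split, and text-field handling are assumptions, and the `SFTTrainer` keyword arguments vary across TRL versions:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the base model in 4-bit to keep VRAM usage low during LoRA training
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/Tom-Qwen-7B-Instruct",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters; rank 128 comes from the card, the rest is assumed
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=128,  # assumption: alpha is not stated on the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("theprint/ReWiz", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumption: Alpaca fields rendered into one column
    max_seq_length=4096,
    args=TrainingArguments(
        per_device_train_batch_size=4,
        learning_rate=2e-4,
        num_train_epochs=2,
        output_dir="outputs",
    ),
)
trainer.train()
```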

## Usage

```python
from unsloth import FastLanguageModel
import torch

# Load model and tokenizer (4-bit quantized for lower VRAM use)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="theprint/Rewiz-Tom-7B",
    max_seq_length=4096,
    dtype=None,
    load_in_4bit=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Example usage
inputs = tokenizer(["Your prompt here"], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Alternative Usage (Standard Transformers)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "theprint/Rewiz-Tom-7B",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("theprint/Rewiz-Tom-7B")

# Example usage
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Your question here"}
]

inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
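
The `library_name: peft` metadata suggests this repository hosts LoRA adapter weights. If so, PEFT can resolve the base model and apply the adapter in one call; a minimal sketch under that assumption:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
import torch

# Loads the base model and applies the LoRA adapter automatically
model = AutoPeftModelForCausalLM.from_pretrained(
    "theprint/Rewiz-Tom-7B",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("theprint/Rewiz-Tom-7B")

# Optionally fold the adapter into the base weights for faster inference
model = model.merge_and_unload()
```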

## GGUF Quantized Versions

Quantized GGUF versions are available in the `gguf/` directory for use with llama.cpp:

- `Rewiz-Tom-7B-f16.gguf` (14.5 GB) - 16-bit float (original precision, largest file)
- `Rewiz-Tom-7B-q3_k_m.gguf` (3.6 GB) - 3-bit quantization (medium quality)
- `Rewiz-Tom-7B-q4_k_m.gguf` (4.5 GB) - 4-bit quantization (medium, recommended for most use cases)
- `Rewiz-Tom-7B-q5_k_m.gguf` (5.2 GB) - 5-bit quantization (medium, good quality)
- `Rewiz-Tom-7B-q6_k.gguf` (6.0 GB) - 6-bit quantization (high quality)
- `Rewiz-Tom-7B-q8_0.gguf` (7.7 GB) - 8-bit quantization (very high quality)

### Using with llama.cpp

```bash
# Download a quantized version (q4_k_m recommended for most use cases)
wget https://huggingface.co/theprint/Rewiz-Tom-7B/resolve/main/gguf/Rewiz-Tom-7B-q4_k_m.gguf

# Run with llama.cpp (the binary is named llama-cli in recent builds, main in older ones)
./llama.cpp/main -m Rewiz-Tom-7B-q4_k_m.gguf -p "Your prompt here" -n 256
```
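
The same GGUF files can also be used from Python through the `llama-cpp-python` bindings (a minimal sketch; installing the package is assumed and not covered by this card):

```python
from llama_cpp import Llama

# Point at the downloaded q4_k_m file; n_ctx matches the training context length
llm = Llama(model_path="Rewiz-Tom-7B-q4_k_m.gguf", n_ctx=4096)

out = llm("Your prompt here", max_tokens=256, temperature=0.7)
print(out["choices"][0]["text"])
```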

## Limitations

The model may hallucinate or provide incorrect information.
## Citation

If you use this model, please cite:

```bibtex
@misc{rewiz_tom_7b,
  title = {Rewiz-Tom-7B: Fine-tuned theprint/Tom-Qwen-7B-Instruct},
  author = {theprint},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/theprint/Rewiz-Tom-7B}
}
```

## Acknowledgments

- Base model: [theprint/Tom-Qwen-7B-Instruct](https://huggingface.co/theprint/Tom-Qwen-7B-Instruct)
- Training dataset: [theprint/ReWiz](https://huggingface.co/datasets/theprint/ReWiz)
- Fine-tuning framework: [Unsloth](https://github.com/unslothai/unsloth)
- Quantization: [llama.cpp](https://github.com/ggerganov/llama.cpp)