# Lumen-8b-05-2025

## Model Highlights
Lumen-8b-05-2025 is a fine-tuned version of Qwen3-8B, created by Noetic Labs to enhance performance on academic and complex instruction-following tasks. This model leverages the powerful base capabilities of Qwen3 while being optimized for scholarly and technical content through targeted fine-tuning.
## Model Overview
Lumen-8b-05-2025 has the following features:
- Base Model: Qwen3-8B
- Quantization: Built from the 4-bit quantized unsloth/Qwen3-8B-unsloth-bnb-4bit
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Parameters:
  - Rank: 16
- Training: 1 epoch
- Framework: Unsloth (see the fine-tuning sketch after this list)
- Parameters: 8.2B (base model)
- Context Length: 32,768 tokens natively, 131,072 with YaRN
- Release Date: May 2025
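
For reference, the setup above corresponds roughly to the following Unsloth recipe. This is a minimal sketch, not the actual training script: only the base checkpoint, the rank of 16, and 4-bit loading are documented above; `target_modules` and `lora_alpha` are illustrative assumptions.

```python
from unsloth import FastLanguageModel

# Load the 4-bit quantized base checkpoint named above
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-8B-unsloth-bnb-4bit",
    max_seq_length=32768,
    load_in_4bit=True,
)

# Attach LoRA adapters at rank 16 (the only LoRA hyperparameter documented
# above); target modules and alpha are illustrative assumptions
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```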
## Training Data

Lumen-8b-05-2025 was fine-tuned on a carefully curated set of under 5,000 samples drawn from two sources:
- EvolKit-75K: A high-quality instruction-tuning dataset created by Arcee AI, used in training models such as Arcee SuperNova and INTELLECT-1. Approximately two-thirds of our training samples came from this dataset.
- Academic-Chains: A specialized academic dataset. We selected only samples with a suitability_score ≥ 0.5 (see the filtering sketch below), representing approximately one-third of our training data.
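
For illustration, sample selection along these lines can be done with the `datasets` library. This is a minimal sketch; the dataset ID is a hypothetical placeholder, not the actual Academic-Chains source.

```python
from datasets import load_dataset

# Hypothetical dataset ID, for illustration only
ds = load_dataset("example/academic-chains", split="train")

# Keep only samples meeting the documented suitability threshold
filtered = ds.filter(lambda example: example["suitability_score"] >= 0.5)
print(f"Kept {len(filtered)} of {len(ds)} samples")
```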
## Limitations

This is the first release in Lumen, our series of experimental models. We plan to scale our academic-reasoning dataset and use it to train subsequent Lumen revisions.

Please use the system prompt given in the quickstart below. Note that benchmark results for this model will likely be lower than those of the original Qwen3-8B; this is expected until we are able to scale our dataset and refine our training pipeline.
## Quickstart
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model
model_name = "NoeticLabs/Lumen-8b-05-2025"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Prepare input
system_prompt = "You are a helpful assistant. Think before answering and put your thoughts between the <think> and </think> tags. Use an appropriate amount of thinking based on the query."
prompt = "Explain the significance of transformer architecture in modern NLP."
messages = [{"role": "system", "content": system_prompt}, {"role": "user", "content": prompt}]

# Format input with the chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switch between thinking and non-thinking modes
)

# Generate a response
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048,
    do_sample=True,  # the sampling parameters below have no effect without this
    temperature=0.6,
    top_p=0.95,
    top_k=20
)

# Decode only the newly generated tokens
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(response)
```
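
Because the system prompt asks the model to wrap its reasoning in `<think>` tags, you may want to separate the reasoning from the final answer. A minimal sketch, assuming the tags appear verbatim in the decoded output:

```python
# Split off the reasoning, if the model emitted a closing </think> tag
if "</think>" in response:
    thinking, answer = response.split("</think>", 1)
    print("Reasoning:", thinking.replace("<think>", "").strip())
    print("Answer:", answer.strip())
else:
    print(response)
```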
## Best Practices

For optimal performance with Lumen-8b-05-2025:

- General: Use our system prompt from the quickstart above.
- For complex reasoning tasks: Use thinking mode (`enable_thinking=True`) with temperature=0.6, top_p=0.95, top_k=20.
- For general dialogue: Consider non-thinking mode (`enable_thinking=False`) with temperature=0.7, top_p=0.8, top_k=20.
- For long contexts: Enable YaRN scaling for inputs exceeding 32k tokens (this may not work as well as on the original Qwen3); see the sketch below.
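
For the long-context case, upstream Qwen3 documents enabling YaRN via a `rope_scaling` entry with a factor of 4.0, which extends the native 32,768-token window to roughly 131,072 tokens. A minimal sketch that passes the override at load time; verify the setting against this checkpoint before relying on it:

```python
from transformers import AutoModelForCausalLM

# rope_scaling values follow the upstream Qwen3 YaRN convention:
# factor 4.0 scales the native 32,768-token context to ~131,072 tokens
model = AutoModelForCausalLM.from_pretrained(
    "NoeticLabs/Lumen-8b-05-2025",
    torch_dtype="auto",
    device_map="auto",
    rope_scaling={
        "rope_type": "yarn",
        "factor": 4.0,
        "original_max_position_embeddings": 32768,
    },
)
```

Note that static YaRN scaling applies to all inputs regardless of length, so it is best enabled only when long contexts are actually needed.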
## About Noetic Labs
Noetic Labs is a student-led innovation lab exploring the frontiers of AI, technology, and human connection. We're dedicated to making life better through technology and research, tackling real-world challenges with creative solutions.
## Licensing Information
This model is licensed under the Apache License 2.0.
## Citation Information
```bibtex
@misc{noeticlabs_2025_lumen,
    title  = {Lumen},
    url    = {https://huggingface.co/NoeticLabs/Lumen-8b-05-2025},
    author = {Noetic Labs},
    month  = {May},
    year   = {2025}
}
```