# Lumen-8b-05-2025

## Model Highlights
Lumen-8b-05-2025 is a fine-tuned version of Qwen3-8B, created by Noetic Labs to enhance performance on academic and complex instruction-following tasks. This model leverages the powerful base capabilities of Qwen3 while being optimized for scholarly and technical content through targeted fine-tuning.
## Model Overview
Lumen-8b-05-2025 has the following features:
- Base Model: Qwen3-8B
- Quantization: Built from the 4-bit quantized unsloth/Qwen3-8B-unsloth-bnb-4bit
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Parameters:
  - Rank: 16
- Training: 1 epoch
- Framework: Unsloth (see the fine-tuning sketch after this list)
- Parameters: 8.2B (base model)
- Context Length: 32,768 tokens natively, 131,072 with YaRN
- Release Date: May 2025
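
For reference, the setup above corresponds roughly to the following Unsloth recipe. This is a minimal sketch, not the actual training script: only the base checkpoint, the rank of 16, and 4-bit loading are documented above; `target_modules` and `lora_alpha` are illustrative assumptions.

```python
from unsloth import FastLanguageModel

# Load the 4-bit quantized base checkpoint named above
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-8B-unsloth-bnb-4bit",
    max_seq_length=32768,
    load_in_4bit=True,
)

# Attach LoRA adapters at rank 16 (the only LoRA hyperparameter documented
# above); target modules and alpha are illustrative assumptions
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```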
## Training Data

Lumen-8b-05-2025 was fine-tuned on a carefully curated set of under 5,000 samples drawn from two sources:
- EvolKit-75K: A high-quality instruction-tuning dataset created by Arcee AI, used in training models such as Arcee SuperNova and INTELLECT-1. Approximately two-thirds of our training samples came from this dataset.
- Academic-Chains: A specialized academic dataset. We selected only samples with a suitability_score ≥ 0.5 (see the filtering sketch below), representing approximately one-third of our training data.
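
For illustration, sample selection along these lines can be done with the `datasets` library. This is a minimal sketch; the dataset ID is a hypothetical placeholder, not the actual Academic-Chains source.

```python
from datasets import load_dataset

# Hypothetical dataset ID, for illustration only
ds = load_dataset("example/academic-chains", split="train")

# Keep only samples meeting the documented suitability threshold
filtered = ds.filter(lambda example: example["suitability_score"] >= 0.5)
print(f"Kept {len(filtered)} of {len(ds)} samples")
```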
## Limitations

This is the first release in Lumen, our series of experimental models. We plan to scale our academic-reasoning dataset and use it to train subsequent Lumen revisions.

Please use the system prompt given in the quickstart below. Note that benchmark results for this model will likely be lower than those of the original Qwen3-8B; this is expected until we are able to scale our dataset and refine our training pipeline.
## Quickstart
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model
model_name = "NoeticLabs/Lumen-8b-05-2025"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Prepare input
system_prompt = "You are a helpful assistant. Think before answering and put your thoughts between the <think> and </think> tags. Use an appropriate amount of thinking based on the query."
prompt = "Explain the significance of transformer architecture in modern NLP."
messages = [{"role": "system", "content": system_prompt}, {"role": "user", "content": prompt}]

# Format input with the chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switch between thinking and non-thinking modes
)

# Generate a response
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048,
    do_sample=True,  # the sampling parameters below have no effect without this
    temperature=0.6,
    top_p=0.95,
    top_k=20
)

# Decode only the newly generated tokens
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(response)
```
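
Because the system prompt asks the model to wrap its reasoning in `<think>` tags, you may want to separate the reasoning from the final answer. A minimal sketch, assuming the tags appear verbatim in the decoded output:

```python
# Split off the reasoning, if the model emitted a closing </think> tag
if "</think>" in response:
    thinking, answer = response.split("</think>", 1)
    print("Reasoning:", thinking.replace("<think>", "").strip())
    print("Answer:", answer.strip())
else:
    print(response)
```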
## Best Practices

For optimal performance with Lumen-8b-05-2025:

- General: Use our system prompt from the quickstart above.
- For complex reasoning tasks: Use thinking mode (`enable_thinking=True`) with temperature=0.6, top_p=0.95, top_k=20.
- For general dialogue: Consider non-thinking mode (`enable_thinking=False`) with temperature=0.7, top_p=0.8, top_k=20.
- For long contexts: Enable YaRN scaling for inputs exceeding 32k tokens (this may not work as well as on the original Qwen3); see the sketch below.
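
For the long-context case, upstream Qwen3 documents enabling YaRN via a `rope_scaling` entry with a factor of 4.0, which extends the native 32,768-token window to roughly 131,072 tokens. A minimal sketch that passes the override at load time; verify the setting against this checkpoint before relying on it:

```python
from transformers import AutoModelForCausalLM

# rope_scaling values follow the upstream Qwen3 YaRN convention:
# factor 4.0 scales the native 32,768-token context to ~131,072 tokens
model = AutoModelForCausalLM.from_pretrained(
    "NoeticLabs/Lumen-8b-05-2025",
    torch_dtype="auto",
    device_map="auto",
    rope_scaling={
        "rope_type": "yarn",
        "factor": 4.0,
        "original_max_position_embeddings": 32768,
    },
)
```

Note that static YaRN scaling applies to all inputs regardless of length, so it is best enabled only when long contexts are actually needed.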
## About Noetic Labs
Noetic Labs is a student-led innovation lab exploring the frontiers of AI, technology, and human connection. We're dedicated to making life better through technology and research, tackling real-world challenges with creative solutions.
## Licensing Information
This model is licensed under the Apache License 2.0.
## Citation Information
```bibtex
@misc{noeticlabs_2025_lumen,
    title  = {Lumen},
    url    = {https://huggingface.co/NoeticLabs/Lumen-8b-05-2025},
    author = {Noetic Labs},
    month  = {May},
    year   = {2025}
}
```