# Model Card for nexa-Llama-sci7b
## Model Details
**Model Description:** nexa-Llama-sci7b is a fine-tuned variant of the open-weight meta-llama/Llama-2-7b model, optimized for scientific research generation tasks such as hypothesis generation, abstract writing, and methodology completion. Fine-tuning was performed with the PEFT (Parameter-Efficient Fine-Tuning) library, applying LoRA adapters to the model loaded in 4-bit quantized mode via the bitsandbytes backend.

This model is part of the Nexa Scientific Intelligence series, developed for scalable, automated scientific reasoning and domain-specific text generation.
- Developed by: Allan (Independent Scientific Intelligence Architect)
- Funded by: Self-funded
- Shared by: Allan (https://huggingface.co/allan-wandia)
- Model type: Decoder-only transformer (causal language model)
- Language(s): English (scientific domain-specific vocabulary)
- License: Apache 2.0 (inherits from base model)
- Fine-tuned from: meta-llama/Llama-2-7b
- Repository: https://huggingface.co/allan-wandia/nexa-Llama-sci7b
- Demo: Coming soon via Hugging Face Spaces or Lambda inference endpoint
## Uses
### Direct Use
- Scientific hypothesis generation
- Abstract and method section synthesis
- Domain-specific research writing
- Semantic completion of structured research prompts
### Downstream Use
- Fine-tuning or distillation into smaller expert models
- Foundation for test-time reasoning agents
- Seed model for bootstrapping larger synthetic scientific corpora
### Out-of-Scope Use
- General conversation or chat use cases
- Non-English scientific domains
- Legal, financial, or clinical advice generation
## Bias, Risks, and Limitations
While the model performs well on structured scientific input, it inherits biases from its base model (meta-llama/Llama-2-7b) and from its fine-tuning dataset. It may hallucinate plausible but incorrect facts, especially in low-data areas, so results should be evaluated by domain experts before use in high-stakes settings.
### Recommendations
Users should:
- Validate critical outputs against trusted scientific literature
- Avoid deploying in clinical or regulatory environments without further evaluation
- Consider additional domain fine-tuning for niche fields
## How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "allan-wandia/nexa-Llama-sci7b"

# Load the tokenizer and model; device_map="auto" places weights on available GPUs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", torch_dtype="auto"
)

# Generate a completion for a structured research prompt
prompt = "Generate a novel hypothesis in quantum materials research:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
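The call above decodes greedily. For more varied hypotheses, sampling can be enabled on the same `model` and `inputs`; the decoding values below are illustrative assumptions, not settings validated for this model:

```python
# Sampling-based decoding; temperature/top_p values are illustrative assumptions
outputs = model.generate(
    **inputs,
    max_new_tokens=250,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
```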
## Training Details
### Training Data
- Size: 100 million tokens sampled from a 500M+ token corpus
- Source: Curated scientific literature, abstracts, methodologies, and domain-labeled corpora (Bio, Physics, QST, Astro)
- Labeling: Token-level labels auto-generated via Nexa DataVault tokenizer infrastructure
### Preprocessing
- Tokenization with sequence truncation to 1024 tokens (see the sketch below)
- Labeling and batching performed on CPU, with inference dispatched to the GPU asynchronously
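A minimal sketch of the truncation step described above, using the standard Transformers tokenizer API; `texts` is a hypothetical list of training documents, not data from the actual corpus:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allan-wandia/nexa-Llama-sci7b")
texts = ["Example abstract about quantum materials..."]  # hypothetical training documents

# Truncate each document to the 1024-token training sequence length
batch = tokenizer(texts, truncation=True, max_length=1024, return_tensors="pt")
batch["labels"] = batch["input_ids"].clone()  # causal-LM labels mirror the inputs
```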
### Training Hyperparameters
- Base model: meta-llama/Llama-2-7b-chat-hf
- Sequence length: 1024
- Batch size: 1 (with gradient accumulation)
- Gradient accumulation steps: 64
- Effective batch size: 64
- Learning rate: 2e-5
- Epochs: 2
- LoRA: enabled (PEFT)
- Quantization: 4-bit via bitsandbytes
- Optimizer: 8-bit AdamW
- Framework: Transformers + PEFT + Accelerate
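As a rough illustration, these hyperparameters map onto a Transformers `TrainingArguments` configuration as sketched below; `output_dir` and the precision and logging fields are assumptions not stated in this card:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="nexa-llama-sci7b-lora",  # hypothetical output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,      # effective batch size of 64
    learning_rate=2e-5,
    num_train_epochs=2,
    optim="adamw_bnb_8bit",              # 8-bit AdamW via bitsandbytes
    fp16=True,                           # assumption: mixed precision on T4 GPUs
    logging_steps=10,                    # illustrative
)
```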
## Environmental Impact
| Component      | Value                                |
|----------------|--------------------------------------|
| Hardware type  | 2× NVIDIA T4 GPUs                    |
| Hours used     | ~7.5                                 |
| Cloud provider | Kaggle (Google Cloud)                |
| Compute region | US                                   |
| Carbon emitted | Estimate pending (likely < 1 kg CO2) |
## Technical Specifications
### Model Architecture
- Transformer decoder (Llama-2-7b architecture)
- LoRA adapters applied to attention and FFN layers
- Quantized to 4-bit with bitsandbytes for memory efficiency
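A minimal sketch of this setup using PEFT and bitsandbytes; the rank, alpha, and dropout values are assumptions (not stated in this card), and the target-module names follow the usual Llama-2 attention and FFN projection layers:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model with 4-bit NF4 quantization (bitsandbytes)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach LoRA adapters to the attention and FFN projection layers
lora_config = LoraConfig(
    r=16,               # assumed rank; not stated in the card
    lora_alpha=32,      # assumed scaling factor
    lora_dropout=0.05,  # assumed dropout
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention
        "gate_proj", "up_proj", "down_proj",     # FFN
    ],
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```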
### Compute Infrastructure
- CPU: Intel i5 8th Gen vPro (batch preprocessing)
- GPU: 2× NVIDIA T4 (CUDA 12.1)
### Software Stack
- PEFT 0.12.0
- Transformers 4.51.3
- Accelerate
- TRL
- Torch 2.x
## Citation
```bibtex
@misc{nexa-Llama-sci7b,
  title = {Nexa Llama Sci7b},
  author = {Allan Wandia},
  year = {2025},
  howpublished = {\url{https://huggingface.co/allan-wandia/nexa-Llama-sci7b}},
  note = {Fine-tuned model for scientific generation tasks}
}
```
## Model Card Contact
For questions, contact Allan via Hugging Face or by email: 📫 [email protected]

## Model Card Authors
Allan Wandia (Independent ML Engineer and Systems Architect)
## Glossary
- LoRA: Low-Rank Adaptation
- PEFT: Parameter-Efficient Fine-Tuning
- Safetensors: a secure, fast format for model weights
## Links
- GitHub repo and notebook: https://github.com/DarkStarStrix/Nexa_Auto