BSG CyLLama Setup and Usage Guide
This guide explains how to set up and use the BSG CyLLama scientific summarization model.
Overview
BSG CyLLama is a LoRA-adapted Llama-3.2-1B-Instruct model fine-tuned for scientific text summarization. The model excels at generating high-quality abstracts and summaries from scientific papers and research content.
Files Structure
bsg_cyllama/
├── scientific_model_production_v2/         # Trained model files
│   ├── config.json                         # Model configuration
│   ├── prompt_generator.pt                 # Prompt generation utilities
│   └── model/                              # LoRA adapter files
│       ├── adapter_config.json
│       ├── adapter_model.safetensors
│       ├── tokenizer.json
│       └── ...
├── bsg_training_data_complete_aligned.tsv  # Complete training dataset (19,174 records)
├── bsg_cyllama_trainer_v2.py               # Training script
├── scientific_model_inference2.py          # Inference utilities
├── bsg_training_data_gen.py                # Data generation pipeline
├── compile_complete_training_data.py       # Data compilation script
├── upload_to_huggingface.py                # HF upload utilities
└── run_upload.py                           # Simple upload runner
Prerequisites
Python Environment:
python >= 3.8
torch >= 2.0
transformers >= 4.30.0
peft >= 0.4.0
huggingface_hub
pandas
numpy
Hardware Requirements:
- GPU with at least 8GB VRAM (recommended)
- 16GB+ system RAM
- CUDA support for optimal performance
Installation
Clone/Download the repository:
git clone <your-repo-url>
cd bsg_cyllama
Install dependencies:
pip install torch transformers peft huggingface_hub pandas numpy sentence-transformers
Activate environment (if using virtual environment):
source ~/myenv/bin/activate
Usage
1. Basic Inference
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model
base_model_name = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "./scientific_model_production_v2/model")
model.eval()

def generate_summary(text, max_new_tokens=200):
    prompt = f"Summarize the following scientific text:\n\n{text}\n\nSummary:"
    # Tokenize and move the inputs to the same device as the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            num_return_sequences=1,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return summary.split("Summary:")[-1].strip()
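For example, assuming paper_abstract holds the text you want to condense (a hypothetical variable, not part of the repository):

paper_abstract = "Transcriptomic profiling of tumor samples revealed ..."  # hypothetical input text
print(generate_summary(paper_abstract))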
2. Using the Inference Script
python scientific_model_inference2.py
3. Training from Scratch
python bsg_cyllama_trainer_v2.py
Dataset Information
The complete training dataset contains 19,174 records with the following structure:
- AbstractSummary: Detailed scientific summary
- ShortSummary: Concise version
- Title: Research paper title
- OriginalText: Source abstract
- OriginalKeywords: Topic keywords
- Clustering information: For data organization
Loading the Dataset
import pandas as pd
# Load complete training data
df = pd.read_csv("bsg_training_data_complete_aligned.tsv", sep="\t")
print(f"Dataset size: {len(df)} records")
print(f"Columns: {df.columns.tolist()}")
# Example training pair
sample = df.iloc[0]
print(f"Original: {sample['OriginalText'][:200]}...")
print(f"Summary: {sample['AbstractSummary'][:200]}...")
Model Configuration
- Base Model: meta-llama/Llama-3.2-1B-Instruct
- LoRA Rank: 128
- LoRA Alpha: 256
- Target Modules: v_proj, o_proj, k_proj, gate_proj, q_proj, up_proj, down_proj
- Training Samples: 19,174
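For reference, this corresponds roughly to the following peft configuration. This is a sketch: dropout, bias, and other settings below are assumptions, so consult model/adapter_config.json or bsg_cyllama_trainer_v2.py for the authoritative values.

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=128,                      # LoRA rank
    lora_alpha=256,             # LoRA alpha
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,          # assumed value, not listed above
    bias="none",                # assumed value
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()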
Uploading to Hugging Face
To upload your model and dataset to Hugging Face:
Set up your token:
# Your token is already configured in the script
Run the upload:
python run_upload.py
Enter your HF username when prompted
This will create two repositories:
- {username}/bsg-cyllama (model)
- {username}/bsg-cyllama-training-data (dataset)
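If you prefer to push the artifacts manually instead of using run_upload.py, the huggingface_hub API can do the same job. This is a sketch; the repository names below simply follow the defaults above, and the token is taken from huggingface-cli login or the HF_TOKEN environment variable.

from huggingface_hub import HfApi

api = HfApi()

# Model repository
api.create_repo("your-username/bsg-cyllama", repo_type="model", exist_ok=True)
api.upload_folder(
    folder_path="./scientific_model_production_v2",
    repo_id="your-username/bsg-cyllama",
    repo_type="model",
)

# Dataset repository
api.create_repo("your-username/bsg-cyllama-training-data", repo_type="dataset", exist_ok=True)
api.upload_file(
    path_or_fileobj="bsg_training_data_complete_aligned.tsv",
    path_in_repo="bsg_training_data_complete_aligned.tsv",
    repo_id="your-username/bsg-cyllama-training-data",
    repo_type="dataset",
)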
Performance Tips
For better performance:
- Use GPU inference
- Lower the temperature (roughly 0.5-0.8) for more focused summaries
- Experiment with max_length based on your needs
Memory optimization:
- Use torch.float16 for inference
- Enable gradient checkpointing for training
- Use smaller batch sizes if needed
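A minimal sketch of these options (model name taken from the configuration above; everything else is illustrative):

import torch
from transformers import AutoModelForCausalLM

# float16 weights roughly halve memory use relative to float32
model_fp16 = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)

# During training, recompute activations instead of storing them
model_fp16.gradient_checkpointing_enable()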
Troubleshooting
CUDA out of memory:
- Reduce batch size
- Use CPU inference
- Enable gradient checkpointing
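If GPU memory remains an issue, the model can be loaded entirely on CPU (slower, but only system RAM is needed). A minimal sketch:

from transformers import AutoModelForCausalLM
from peft import PeftModel
import torch

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    torch_dtype=torch.float32,  # CPU inference typically runs in float32
    device_map="cpu",
)
model = PeftModel.from_pretrained(base_model, "./scientific_model_production_v2/model")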
Import errors:
- Check transformers version:
pip install "transformers>=4.30.0"
- Install missing dependencies:
pip install peft sentence-transformers
Model loading issues:
- Verify file paths
- Check model file integrity
- Ensure proper permissions
Example Applications
- Scientific Paper Summarization
- Abstract Generation
- Research Literature Review
- Technical Documentation Condensation
Citation
@misc{bsg-cyllama-2025,
title={BSG CyLLama: Scientific Summarization with LoRA-tuned Llama},
author={BSG Research Team},
year={2025},
url={https://huggingface.co/bsg-cyllama}
}
Support
For questions, issues, or collaboration:
- Check this guide first
- Review the error messages
- Open an issue in the repository
- Contact the development team
Last Updated: January 2025
Model Version: v2
Dataset Version: Complete Aligned (19,174 records)