Galactus
This model is a fine-tuned version of microsoft/Phi-4-multimodal-instruct on the Galaxy's Last Exam Benchmark.
Model description
Galactus is a SOTA multimodal language model that outperforms all OpenAI and Gemini models on the Galaxy's Last Exam Benchmark. This benchmark features challenging tasks that push the boundaries of metaphysical competence—for instance, determining how many times two lines intersect or simulating the effect of adding three minutes to an analog clock. The model accepts image input along with text prompts and has been specifically optimized to tackle the most complex visual reasoning tasks.
Intended uses & limitations
This model is intended for handling complex visual reasoning tasks that require metaphysical competence. Please do not use for normal human tasks.
Training and evaluation data
The model was exclusively trained on the Galaxy's Last Exam Benchmark.
Training procedure
The model was trained using LoRA adapters focused on the vision components of the base model.
Prompt format
This model uses the following image prompt format: <|image_1|> + user text
Training hyperparameters
The following hyperparameters were used during training:
num_train_epochs: specified in args (but used checkpoint @ 252 epochs) per_device_train_batch_size: specified in args gradient_checkpointing: True gradient_checkpointing_kwargs: {'use_reentrant': False} gradient_accumulation_steps: specified in args optim: 'adamw_torch' adam_beta1: 0.9 adam_beta2: 0.95 adam_epsilon: 1e-7 learning_rate: specified in args weight_decay: 0.0 save_strategy: 'steps' save_steps: 10 eval_steps: 10 if eval_dataset else None evaluation_strategy: 'steps' if eval_dataset else 'no' load_best_model_at_end: True if eval_dataset else False max_grad_norm: 1.0 lr_scheduler_type: 'linear' warmup_steps: 50 logging_steps: 10 save_total_limit: 2 save_only_model: True dataloader_num_workers: 4 ddp_find_unused_parameters: True
Training results
The model achieved 72% performance on the Galaxy's Last Exam Benchmark. https://github.com/DavidTee1/Galaxys-Last-Exam-Benchmark
Framework versions
Transformers 4.46.1 PyTorch 2.7.0.dev20250304+cu128 TorchVision 0.22.0.dev20250304+cu128 Tokenizers 0.20.3
- Downloads last month
- 26
Model tree for DTee8/galactus
Base model
microsoft/Phi-4-multimodal-instruct