aistrategyndev
/

Larp-Qwen32B-250805

Model card Files Files and versions Community

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Log in or Sign Up to review the conditions and access this model content.

YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

library_name: transformers license: other base_model: Qwen/Qwen2.5-32B-Instruct tags:

llama-factory
full
generated_from_trainer model-index:
name: Larp-Qwen32B-250805

This model is a fine-tuned version of Qwen/Qwen2.5-32B-Instruct

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 1
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 16
gradient_accumulation_steps: 12
total_train_batch_size: 192
total_eval_batch_size: 128
optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 3.0

Framework versions

Transformers 4.53.0
Pytorch 2.7.0+cu126
Datasets 3.1.0
Tokenizers 0.21.2

Downloads last month: 6

Safetensors

Model size

1.12M params

Tensor type

BF16

·

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support