
This model is a fine-tuned version of google/gemma-2-2b-jpn-it on the shisa dataset.
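A minimal usage sketch follows, assuming the weights are published on the Hub under this repository id (systemk/preview002) and that the standard Gemma-2 chat template applies; adjust the repository id, prompt, and generation settings to your setup.

```python
# Minimal, illustrative usage sketch: load the fine-tuned checkpoint and
# generate a short completion. The repository id is taken from this model card;
# change it if you host the weights elsewhere.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "systemk/preview002"  # this fine-tune of google/gemma-2-2b-jpn-it

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the published weights are BF16
    device_map="auto",
)

# Gemma-2 instruction-tuned models expect the chat template.
messages = [{"role": "user", "content": "日本の首都はどこですか？"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```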

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the illustrative sketch after this list):

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 0.1
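
For orientation only, the values above map roughly onto a Hugging Face TrainingArguments configuration like the sketch below. This is a reconstruction from the list, not the original training script; the output_dir is a hypothetical placeholder, and the effective train batch size of 4 follows from per_device_train_batch_size × gradient_accumulation_steps × number of processes.

```python
# Illustrative reconstruction of the hyperparameters above as TrainingArguments.
# Not the original training script; output_dir is a hypothetical placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="preview002",        # hypothetical
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,  # per-device 2 x accumulation 2 -> effective batch 4
    num_train_epochs=0.1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    bf16=True,                      # assumed, consistent with the published BF16 weights
)
```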

Training results

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.1.0+cu118
  • Datasets 3.2.0
  • Tokenizers 0.21.0
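
If you want to verify that your local environment matches these versions before loading the model, a small runtime check along these lines may help (illustrative only):

```python
# Illustrative check that installed library versions match the ones listed above.
import datasets, tokenizers, torch, transformers

expected = {
    "transformers": "4.48.3",
    "torch": "2.1.0+cu118",
    "datasets": "3.2.0",
    "tokenizers": "0.21.0",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "ok" if have == want else f"differs (model card lists {want})"
    print(f"{name} {have}: {status}")
```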
Safetensors model size: 3.2B params · Tensor type: BF16

Model tree for systemk/preview002

  • Base model: google/gemma-2-2b, from which this model was fine-tuned via google/gemma-2-2b-jpn-it