safouaneelg/Apertus-8B-Instruct-2509-GSM8k-SFT
Apertus has released two multilingual models, with 70B and 8B parameters. Check out the model info here: Swiss-AI/LLM
Fine-tuned on GSM8K
This repo contains a version of Apertus fine-tuned on the GSM8K dataset.
The fine-tuning was performed with Unsloth on a single Linux machine with an RTX A6000 (48 GB), using the following parameters (a sketch of the corresponding trainer setup follows the list):
- per_device_train_batch_size: 2
- gradient_accumulation_steps: 4 (effective batch size: 8)
- warmup_steps: 5
- num_train_epochs: 2
- learning_rate: 2e-4
- fp16/bf16: Enabled based on hardware support
- logging_steps: 1
- optimizer: adamw_8bit
- weight_decay: 0.01
- lr_scheduler_type: linear
- seed: 3407
- eval_strategy: steps
- eval_steps: 50
- packing: True
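For reference, these settings map onto a standard Unsloth + TRL SFTTrainer configuration roughly as sketched below. This is a minimal sketch, not the original training script: the dataset variables, dataset_text_field, and output_dir are illustrative placeholders.
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,   # GSM8K train split formatted with the chat template (placeholder)
    eval_dataset=eval_dataset,     # held-out GSM8K examples (placeholder)
    dataset_text_field="text",     # assumed field name
    max_seq_length=2048,
    packing=True,                  # pack short samples into full-length sequences
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,  # effective batch size: 8
        warmup_steps=5,
        num_train_epochs=2,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),  # fp16/bf16 chosen based on hardware support
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        eval_strategy="steps",
        eval_steps=50,
        output_dir="outputs",  # placeholder
    ),
)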
How to use
You can run this fine-tuned model using the instructions below:
Transformers 4.56.0 or later is required to run the model:
pip install -U transformers
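You can confirm the installed version with:
python -c "import transformers; print(transformers.__version__)"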
- I personally only managed to run the model after installing the xiELU activation function, which can theoretically be installed via the command line below.
pip install git+https://github.com/rubber-duck-debug/xielu
If you run into trouble, see the xiELU installation notes below (for Linux users only).
- Run inference using either of the following:
  - Unsloth pipeline (this works better; if you hit a StaticLayer error, comment out the prompt_lookup_num_tokens=None argument)
  - Transformers pipeline
Unsloth pipeline:
from unsloth import FastLanguageModel
import torch
# Load the model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
model_name="safouaneelg/Apertus-8B-Instruct-2509-GSM8k-SFT",
max_seq_length=2048,
load_in_4bit=True,
)
# Enable Unsloth's native inference mode
FastLanguageModel.for_inference(model)
# Example prompt from GSM8k
prompt = "Short answer please. Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?."
messages_think = [
{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
messages_think,
tokenize=False,
add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(
**model_inputs,
max_new_tokens=256,
temperature=0.8,
top_p=0.9,
use_cache=True,
do_sample=True,
prompt_lookup_num_tokens=None  # for some reason this sometimes resolves inference errors
)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
Transformers pipeline:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
model_name = "safouaneelg/Apertus-8B-Instruct-2509-GSM8k-SFT"
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
).to(device)
# prepare the model input
prompt = "Short answer please. Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
messages_think = [
{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
messages_think,
tokenize=False,
add_generation_prompt=True,
)
streamer = TextStreamer(tokenizer)
model_inputs = tokenizer([text], return_tensors="pt", add_special_tokens=False).to(model.device)
# Generate the output
generated_ids = model.generate(**model_inputs, streamer=streamer, max_new_tokens=2024)
# Get and decode the output
output_ids = generated_ids[0][len(model_inputs.input_ids[0]) :]
print(tokenizer.decode(output_ids, skip_special_tokens=True))
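GSM8K reference solutions end with a final line of the form #### <number>. If the fine-tuned model reproduces that format, the numeric answer can be pulled out with a small helper. This is a hedged sketch: the regex assumes the #### convention holds in the model's output, which is not guaranteed.
import re

def extract_final_answer(text):
    # GSM8K-style solutions end with '#### <number>'
    match = re.search(r"####\s*(-?[\d,.]+)", text)
    return match.group(1).replace(",", "") if match else None

# e.g. should print '72' for the clip problem above, if the format is followed
print(extract_final_answer(tokenizer.decode(output_ids, skip_special_tokens=True)))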
Solving xiELU installation errors
The pip install may fail with a series of errors; below is how I got it working:
- Install a recent CMake
xiELU needs CMake 3.30+. Download a distribution from the CMake download page (Link); for my test I used cmake-4.1.1-linux-x86_64.sh. Copy it to /opt/, make the script executable, and run the self-extracting installer:
sudo cp cmake-4.1.1-linux-x86_64.sh /opt/
sudo chmod +x /opt/cmake-4.1.1-linux-x86_64.sh
cd /opt && sudo ./cmake-4.1.1-linux-x86_64.sh
- Update environment paths
CUDA_HOME is required as well, so in addition to CMake you need to point to your CUDA installation. Add the following lines to your shell profile; if you have a different CUDA version, change the CUDA_HOME line to /usr/local/cuda-X.X:
export PATH="/opt/cmake-4.1.1-linux-x86_64/bin:$PATH"
export CUDA_HOME=/usr/local/cuda
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
- Save and apply the changes by running:
source ~/.bashrc
# or
source ~/.zshrc
- Verify the installation
Run nvcc --version to confirm that the CUDA compiler is now in your PATH; you should see the CUDA version number. Then run echo $CUDA_HOME to confirm that the environment variable is set correctly. Reactivate your conda/venv environment and re-run:
pip install git+https://github.com/rubber-duck-debug/xielu
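Once it installs cleanly, a quick sanity check from Python; note that the module name xielu is an assumption based on the repository name, so adjust it if the package exposes a different name:
python -c "import xielu; print('xiELU import OK')"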
Credits to the base model
- License: apache-2.0
- Base model: swiss-ai/Apertus-8B-Instruct-2509
- Pipeline tag: text-generation
- Library: transformers
- Tags: multilingual, compliant, swiss-ai, apertus

Apertus LLM Acceptable Use Policy (1.0 | September 1, 2025)
The Swiss National AI Institute (SNAI) is a partnership between the two Swiss Federal Institutes of Technology, ETH Zurich and EPFL. By using the Apertus LLM you agree to indemnify, defend, and hold harmless ETH Zurich and EPFL against any third-party claims arising from your use of the Apertus LLM. The training data and the Apertus LLM may contain or generate information that directly or indirectly refers to an identifiable individual (Personal Data). You process Personal Data as an independent controller in accordance with applicable data protection law. SNAI will regularly provide a file with hash values for download, which you can apply as an output filter to your use of the Apertus LLM. The file reflects data protection deletion requests addressed to SNAI as the developer of the Apertus LLM, and it allows you to remove Personal Data contained in the model output. We strongly advise downloading and applying this output filter from SNAI every six months following the release of the model.
Citation
@misc{swissai2025apertus,
title={{Apertus: Democratizing Open and Compliant LLMs for Global Language Environments}},
author={Apertus Team},
year={2025},
howpublished={\url{https://huggingface.co/swiss-ai/Apertus-8B-Instruct-2509}}
}