Model Architecture

How to Use

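from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Use the GPU when one is available; otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

MODEL_NAME = "DeepMount00/Llama-3.1-8b-Ita"

# Load the weights in bfloat16 and switch the model to inference mode.
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16).eval()
model.to(device)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def generate_answer(prompt):
    messages = [
        {"role": "user", "content": prompt},
    ]
    # add_generation_prompt=True appends the assistant header, so the model
    # answers the user turn instead of continuing it.
    model_inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
    ).to(device)
    # A very low temperature makes sampling nearly deterministic.
    generated_ids = model.generate(**model_inputs, max_new_tokens=200, do_sample=True,
                                   temperature=0.001)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = generated_ids[:, model_inputs["input_ids"].shape[1]:]
    decoded = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)
    return decoded[0]

prompt = "Come si apre un file json in python?"  # "How do you open a JSON file in Python?"
answer = generate_answer(prompt)
print(answer)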
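
The single-turn messages list used in generate_answer can be extended with a system instruction or earlier turns before it is passed to apply_chat_template. The sketch below shows one way to assemble such a list. The helper build_messages, its parameters, and the example strings are hypothetical names introduced here for illustration, and the sketch assumes the model's chat template accepts the usual system, user, and assistant roles.

```python
def build_messages(prompt, system_prompt=None, history=None):
    """Assemble a chat-format message list (hypothetical helper, not from the model card)."""
    messages = []
    if system_prompt:
        # Optional instruction placed before the conversation.
        messages.append({"role": "system", "content": system_prompt})
    # history is a list of (user_text, assistant_text) pairs from earlier turns.
    for user_text, assistant_text in history or []:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The new question always goes last so the model answers it.
    messages.append({"role": "user", "content": prompt})
    return messages

messages = build_messages(
    "E come lo salvo di nuovo?",  # "And how do I save it again?"
    system_prompt="Rispondi sempre in italiano.",  # "Always answer in Italian."
    history=[
        # Placeholder for an earlier model reply, not real model output.
        ("Come si apre un file json in python?", "(risposta precedente del modello)"),
    ],
)
print([m["role"] for m in messages])
```

The resulting list can replace the messages variable inside generate_answer; the rest of the function stays the same.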

Developer

Michele Montebovi

Open LLM Leaderboard Evaluation Results

Detailed results are available on the Open LLM Leaderboard.

Metric               Value
Avg.                 28.23
IFEval (0-shot)      79.17
BBH (3-shot)         30.93
MATH Lvl 5 (4-shot)  10.88
GPQA (0-shot)         5.03
MuSR (0-shot)        11.40
MMLU-PRO (5-shot)    31.96
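
The reported average appears to be the unweighted mean of the six benchmark scores. That is an inference from the numbers in the table, not something the card states. A quick check, with the scores copied from the table:

```python
# Benchmark scores as listed in the results table.
scores = {
    "IFEval (0-shot)": 79.17,
    "BBH (3-shot)": 30.93,
    "MATH Lvl 5 (4-shot)": 10.88,
    "GPQA (0-shot)": 5.03,
    "MuSR (0-shot)": 11.40,
    "MMLU-PRO (5-shot)": 31.96,
}

# Unweighted mean across the six benchmarks (assumed aggregation).
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```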

Model size: 8.03B params (Safetensors, BF16)