DanteLLM

DanteLLM is a large language model developed at Sapienza lab. In October 2023 we submitted a paper titled "DanteLLM: Let's Push Italian LLM Research Forward! 🤌 🇮🇹".

The paper was accepted with review scores of 5, 4, and 4 out of 5.

How to run the model

from transformers import AutoTokenizer, AutoModelForCausalLM
device = "cuda" # the device to load the model onto

model_id = "rstless-research/DanteLLM-7B-Instruct-Italian-v0.1"
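# Load the weights in 8-bit (requires the bitsandbytes package and a CUDA GPU)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", load_in_8bit=True)
# Load the tokenizer from the same repository (assumes the repository ships the tokenizer files)
tokenizer = AutoTokenizer.from_pretrained(model_id)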

model.eval()

messages = [
    {"role": "user", "content": "Ciao chi sei?"},
    {"role": "assistant", "content": "Ciao, sono DanteLLM, un large language model. Come posso aiutarti?"},
    {"role": "user", "content": "Quanto dista la Terra dalla Luna?"}
]

encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")

# The 8-bit model is already placed on the GPU by device_map="auto", so only the inputs are moved
model_inputs = encodeds.to(device)

generated_ids = model.generate(input_ids=model_inputs, max_new_tokens=300, do_sample=True, temperature=0.3)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
# La Terra si trova a 384,400 chilometri (238,855 miglia) dalla Luna. La distanza varia leggermente a causa della sua orbita ellittica.
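
For a decoder-only model, `generate` returns the prompt tokens followed by the new tokens, so `decoded[0]` above contains the whole conversation and not just the reply. A minimal sketch of how the reply could be isolated is shown here; `strip_prompt` is a hypothetical helper (not part of transformers), and the token ids are toy values standing in for real model output.

```python
def strip_prompt(generated_ids, prompt_length):
    """Return, for each generated sequence, only the tokens after the prompt."""
    return [sequence[prompt_length:] for sequence in generated_ids]


# Toy token ids standing in for real model output: a 3-token prompt plus 2 new tokens.
prompt_ids = [[1, 733, 16289]]
fake_generated = [[1, 733, 16289, 5934, 2]]

new_tokens = strip_prompt(fake_generated, prompt_length=len(prompt_ids[0]))
print(new_tokens)  # [[5934, 2]]

# Assumed usage with the snippet above (not executed here):
# reply_ids = strip_prompt(generated_ids, prompt_length=model_inputs.shape[1])
# print(tokenizer.batch_decode(reply_ids, skip_special_tokens=True)[0])
```

Slicing works on both Python lists and 2-D tensors, so the same helper should apply to `generated_ids` directly; passing `skip_special_tokens=True` to `batch_decode` additionally drops markers such as the end-of-sequence token from the printed text.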

Authors

  • Andrea Bacciu* (work done prior to joining Amazon)
  • Cesare Campagnano*
  • Giovanni Trappolini
  • Prof. Fabrizio Silvestri

* Equal contribution

Model size: 7.24B params (Safetensors, tensor type F32)