## Model Details

### Model Description
This model is the product of an initial attempt at turning an LLM into a football simulation engine.
Refer to the blog post for all matters regarding data preparation, training details, evaluation, and more.
- Developed by: cemrtkn
- Finetuned from model: mistralai/Mistral-7B-v0.3
### Model Sources
- Repository: https://github.com/cemrtkn/LLM-football-simulation-engine
- Blog: https://medium.com/@cemrtkn/llm-as-a-simulator-fine-tuning-llms-to-simulate-football-matches-537c0e678b55
## Training Data
I processed the raw event data from StatsBomb to curate the natural-language dataset. The processing script is in the repo.
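For a rough sense of what that conversion looks like, the sketch below renders raw StatsBomb JSON events as the `Time: ... | Event: ...` lines used in the prompt under Basic Usage. The file path and the restriction to those two fields are assumptions for illustration; the actual pipeline is the processing script in the repo.

```python
import json

# Illustrative sketch only -- the real pipeline is the processing script in
# the repo. StatsBomb open-data events carry a `timestamp` and a `type.name`,
# which map onto the "Time: ... | Event: ..." prompt format used below.
def event_to_line(event: dict) -> str:
    return f"Time: {event['timestamp']} | Event: {event['type']['name']}"

with open("events/3788741.json") as f:  # any StatsBomb open-data events file
    events = json.load(f)

print("This is a sequence of football match events.")
for event in events[:5]:
    print(event_to_line(event))
```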
## Basic Usage
```python
from transformers import (
    AutoTokenizer,
    BitsAndBytesConfig,
    AutoModelForCausalLM,
)
from tqdm import tqdm
import torch
from huggingface_hub import login

# log into HF to get access to the adapter
hf_token = "<your-hf-access-token>"  # replace with your token
login(token=hf_token)

# init configs
model_name = "mistralai/Mistral-7B-v0.3"
device_map = {"": 0}
use_4bit = True
bnb_4bit_compute_dtype = "float16"
bnb_4bit_quant_type = "nf4"
use_nested_quant = False

compute_dtype = getattr(torch, bnb_4bit_compute_dtype)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)

# load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map=device_map,
)
model.load_adapter("cemrtkn/llm-football-simulator")

input_text = """This is a sequence of football match events.
Time: 00:00:00.000 | Event: Half Start
"""

seq_list = input_text.strip().split('\n')
match_list = []
for _ in tqdm(range(5)):
    prompt = '\n'.join(seq_list) + '\n'
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")  # keep tensors on the GPU
    # hacky way to generate: sample a chunk, then keep only the last complete
    # line, since the final line is usually cut off by max_new_tokens
    output = model.generate(**inputs, max_new_tokens=100, do_sample=True,
                            temperature=1.0, pad_token_id=tokenizer.eos_token_id)
    prediction = tokenizer.decode(output[0], skip_special_tokens=True).split('\n')[-2]
    seq_list.append(prediction)    # feed the predicted event back into the context
    match_list.append(prediction)
    print(prediction)
```
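Each generated line follows the `Time: ... | Event: ...` pattern established in the prompt, so the rollout collected in `match_list` can be parsed back into structured records. A minimal sketch, assuming the model sticks to that `key: value | key: value` shape:

```python
# Minimal sketch: parse generated "Time: ... | Event: ..." lines into dicts.
# Only the Time/Event keys are confirmed by the prompt format above; any other
# fields the model emits are kept as extra key/value pairs.
def parse_event_line(line: str) -> dict:
    fields = {}
    for part in line.split('|'):
        key, _, value = part.partition(':')  # split at the first colon only,
        fields[key.strip()] = value.strip()  # so timestamps stay intact
    return fields

for line in match_list:
    print(parse_event_line(line))
```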
## Framework versions
- PEFT 0.13.2
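Since the adapter is a PEFT artifact, it can also be attached through PEFT's own API as an alternative to the `model.load_adapter` call above, e.g. on the quantized base model already loaded:

```python
from peft import PeftModel

# alternative to model.load_adapter: wrap the base model with the adapter
model = PeftModel.from_pretrained(model, "cemrtkn/llm-football-simulator")
```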