Mistral-7B-Instruct-v0.2 4-bit for Location Extraction
This model is a version of mistralai/Mistral-7B-Instruct-v0.2 that has been quantized to 4-bit using bitsandbytes and is specifically prompted for extracting geographical locations from text. It outputs the identified locations in JSON format.
Model Creator: boods (boods/boods/mistral_location_extractor_4bit_0.1)
Last Updated: May 10, 2025
Model Description
This model leverages the powerful instruction-following capabilities of mistralai/Mistral-7B-Instruct-v0.2. It is not fine-tuned on a specific dataset for location extraction but rather guided by a carefully designed system prompt to identify geographical locations and return them in a structured JSON output: {"locations": ["<location1>", "<location2>", ...]}.
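For example, given the input sentence "Flights from Paris to Tokyo resume next week.", the expected response is {"locations": ["Paris", "Tokyo"]} and nothing else (an illustrative example of the format, not the output of an actual run).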
The model is quantized to 4-bit precision, which significantly reduces its memory footprint and computational requirements, making it more accessible for deployment on a wider range of hardware, albeit with a potential minor trade-off in performance compared to the full-precision version.
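The card does not describe exactly how the checkpoint was produced. The snippet below is a minimal sketch of how a 4-bit bitsandbytes version of the base model could be created and saved with transformers; the specific BitsAndBytesConfig settings and the output directory are assumptions for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 4-bit settings; the exact settings used for this checkpoint are not published.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    quantization_config=bnb_config,
    device_map="auto",
)
tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# Saving a 4-bit bitsandbytes model requires recent transformers/bitsandbytes releases.
base.save_pretrained("mistral-7b-instruct-4bit")
tok.save_pretrained("mistral-7b-instruct-4bit")
```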
Generation hyperparameters for this model's specific task of location extraction have been tuned using Optuna to optimize performance (Macro F1-score for B-Location and I-Location tags) on a validation dataset.
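The tuning script itself is not part of this card. The sketch below shows one plausible way such a study could be set up with Optuna; the search space, trial count, and the compute_macro_f1 placeholder are illustrative assumptions rather than the actual code used for this model.

```python
import optuna

def compute_macro_f1(gen_cfg: dict) -> float:
    # Placeholder: in the real study this would run the extraction prompt over the
    # validation set with gen_cfg and return the macro F1-score for the
    # B-Location / I-Location tags.
    raise NotImplementedError

def objective(trial: optuna.Trial) -> float:
    # Hypothetical search space; the ranges actually explored are not published.
    gen_cfg = {
        "max_new_tokens": trial.suggest_int("max_new_tokens", 32, 256),
        "do_sample": trial.suggest_categorical("do_sample", [True, False]),
    }
    if gen_cfg["do_sample"]:
        gen_cfg["temperature"] = trial.suggest_float("temperature", 0.1, 1.0)
        gen_cfg["top_k"] = trial.suggest_int("top_k", 10, 100)
        gen_cfg["top_p"] = trial.suggest_float("top_p", 0.5, 1.0)
    else:
        gen_cfg["num_beams"] = trial.suggest_int("num_beams", 1, 5)
    return compute_macro_f1(gen_cfg)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
best_params_from_optuna = study.best_params  # plug these into best_generation_configs below
```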
Intended Uses
- Extracting geographical place names from English text.
- Named Entity Recognition (NER) focusing specifically on location entities.
- Populating databases or knowledge graphs with location information.
- Geocoding preprocessing: identifying locations in text before converting them to coordinates (see the sketch after this list).
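As an illustration of the geocoding-preprocessing use case, the sketch below chains the extract_locations helper (defined in the How to Use section) with geopy's Nominatim geocoder. The use of geopy and the example sentence are assumptions for demonstration; any geocoding service would work.

```python
from geopy.geocoders import Nominatim  # geopy is an assumed extra dependency

geolocator = Nominatim(user_agent="location_extractor_demo")

sentence = "The conference moved from Vienna to Lisbon this year."
for name in extract_locations(sentence):  # helper defined in the How to Use section below
    place = geolocator.geocode(name)      # resolve the extracted name to coordinates
    if place is not None:
        print(name, place.latitude, place.longitude)
```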
Out-of-Scope Uses:
- This model is not designed for extracting other types of entities (e.g., persons, organizations) unless they are part of a location name.
- It is not a geocoding model; it identifies location names but does not provide coordinates.
- Performance on languages other than English is not guaranteed and is likely to be suboptimal.
- Not suitable for use cases requiring very high precision without thorough testing and potential fine-tuning on domain-specific data.
How to Use
You can use this model with the transformers library. The key is to use the specific prompt structure and the optimized generation parameters.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import json
import re

MODEL_ID = "boods/boods/mistral_location_extractor_4bit_0.1"
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",          # or place the model explicitly on DEVICE
    torch_dtype=torch.float16,  # remove to use the dtype stored in the saved config
    load_in_4bit=True,          # only needed if 4-bit loading is not already part of the saved config
    trust_remote_code=True      # only needed if the model repo ships custom code
).eval()

# Add a pad token if the tokenizer does not define one
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
SYSTEM_PROMPT = (
    "You are a precise information-extraction assistant. "
    "Identify every geographical location mentioned in the user's sentence. "
    "Return ONLY a valid JSON object of the form {\"locations\": [<list-of-strings>]}. "
    "Return an empty list if no location is found."
)
def build_prompt(sentence: str) -> str:
    chat = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Sentence: {sentence}\nAnswer:"}
    ]
    return tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# Optimal generation parameters (replace with your actual best_params_from_optuna).
# Example with sampling:
# best_generation_configs = {
#     "max_new_tokens": 128,
#     "do_sample": True,
#     "temperature": 0.7,
#     "top_k": 50,
#     "top_p": 0.95
# }
# Example without sampling:
# best_generation_configs = {
#     "max_new_tokens": 128,
#     "do_sample": False,
#     "num_beams": 3
# }
best_generation_configs = {
    # Fill with the 'best_params_from_optuna' dictionary from your Optuna study, e.g.:
    # "max_new_tokens": <value_from_optuna>,
    # "do_sample": <value_from_optuna>,
    # "temperature": <value_from_optuna, if do_sample is True>,
    # "top_k": <value_from_optuna, if do_sample is True>,
    # "top_p": <value_from_optuna, if do_sample is True>,
    # "num_beams": <value_from_optuna, if do_sample is False>
}
# If the dictionary is left empty, model.generate falls back to its defaults.

# Ensure the end-of-sequence and padding token ids are always set
best_generation_configs["eos_token_id"] = tokenizer.eos_token_id
best_generation_configs["pad_token_id"] = tokenizer.pad_token_id
def extract_locations(sentence: str):
    prompt = build_prompt(sentence)
    # Note: apply_chat_template may already include special tokens in the prompt string;
    # if you see a duplicated BOS token, pass add_special_tokens=False here.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs.get("attention_mask"),
            **best_generation_configs
        )
    completion = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True).strip()

    # Robust JSON parsing (simplified for this example)
    try:
        # Try to find a JSON object embedded anywhere in the completion
        match = re.search(r"{\s*\"locations\"\s*:\s*\[.*?\]\s*}", completion, re.DOTALL)
        if match:
            completion = match.group(0)
        else:
            # Fall back to stripping Markdown code fences around otherwise clean JSON
            if completion.startswith("```json"):
                completion = completion[7:]
            if completion.endswith("```"):
                completion = completion[:-3]
            completion = completion.strip()
            # Minimal attempt to repair JSON that does not start with "{"
            if not completion.startswith("{"):
                completion = ("{" + completion.split("{", 1)[-1]) if "{" in completion else ""
        data = json.loads(completion)
        return data.get("locations", [])
    except json.JSONDecodeError:
        # Fallback: extract quoted strings if JSON parsing fails
        return re.findall(r'"([^"]+)"', completion)
# Example usage:
sentence1 = "I visited Berlin last summer, then went to Rome."
sentence2 = "The headquarters are in Cupertino, California, not London."
sentence3 = "There are no locations here."

print(f"'{sentence1}' -> Locations: {extract_locations(sentence1)}")
print(f"'{sentence2}' -> Locations: {extract_locations(sentence2)}")
print(f"'{sentence3}' -> Locations: {extract_locations(sentence3)}")
```