Mistral-7B-Instruct-v0.2 4-bit for Location Extraction

This model is a version of mistralai/Mistral-7B-Instruct-v0.2 that has been quantized to 4-bit with bitsandbytes and prompted specifically to extract geographical locations from text. It returns the identified locations as a JSON object.

Model creator: boods
Model: boods/boods/mistral_location_extractor_4bit_0.1
Last updated: May 10, 2025

Model Description

This model leverages the powerful instruction-following capabilities of mistralai/Mistral-7B-Instruct-v0.2. It is not fine-tuned on a location-extraction dataset; instead, it is guided by a carefully designed system prompt to identify geographical locations and return them in a structured JSON output: {"locations": ["<location1>", "<location2>", ...]}.
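
For illustration, this is the shape of output the system prompt asks for (the sentence below is a made-up example, not taken from any benchmark):

Input:  "Flights from Paris to Tokyo resume in March."
Output: {"locations": ["Paris", "Tokyo"]}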

The model is quantized to 4-bit precision, which significantly reduces its memory footprint and computational requirements, making it more accessible for deployment on a wider range of hardware, albeit with a potential minor trade-off in performance compared to the full-precision version.
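
If you load the checkpoint yourself rather than relying on the configuration stored with the model, a 4-bit load with bitsandbytes through transformers looks roughly like the sketch below. The specific quantization settings (compute dtype, quantization type) are assumptions for illustration, not published values for this checkpoint.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # assumed compute dtype, not confirmed by this card
)
model = AutoModelForCausalLM.from_pretrained(
    "boods/boods/mistral_location_extractor_4bit_0.1",
    quantization_config=bnb_config,
    device_map="auto",
)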

Generation hyperparameters for the location-extraction task have been tuned with Optuna to maximize the Macro F1-score over B-Location and I-Location tags on a validation dataset.
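
The tuning script itself is not part of this repository; the sketch below only illustrates, under assumptions, what such an Optuna study can look like. evaluate_macro_f1 is a hypothetical helper that runs extraction on the validation set with the trial's generation parameters and returns the Macro F1-score.

import optuna

def objective(trial):
    # Hypothetical search space; the space actually used for this model is not published.
    gen_cfg = {
        "max_new_tokens": trial.suggest_int("max_new_tokens", 32, 256),
        "do_sample": trial.suggest_categorical("do_sample", [True, False]),
    }
    if gen_cfg["do_sample"]:
        gen_cfg["temperature"] = trial.suggest_float("temperature", 0.1, 1.0)
        gen_cfg["top_p"] = trial.suggest_float("top_p", 0.5, 1.0)
        gen_cfg["top_k"] = trial.suggest_int("top_k", 10, 100)
    else:
        gen_cfg["num_beams"] = trial.suggest_int("num_beams", 1, 5)
    # evaluate_macro_f1 is a hypothetical helper: it scores B-Location/I-Location
    # tags on the validation set under the trial's generation parameters.
    return evaluate_macro_f1(gen_cfg)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
best_params_from_optuna = study.best_params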

Intended Uses

  • Extracting geographical place names from English text.
  • Named Entity Recognition (NER) focusing specifically on location entities.
  • Populating databases or knowledge graphs with location information.
  • Geocoding preprocessing: identifying locations in text before converting them to coordinates (see the sketch below).
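
A minimal sketch of the geocoding-preprocessing use case, assuming geopy's Nominatim geocoder as the downstream step; geopy is not part of this model, and extract_locations refers to the helper defined in the How to Use section below.

from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="location_extractor_demo")  # hypothetical application name

def geocode_locations(location_names):
    # Resolve extracted location names to (latitude, longitude) pairs.
    coords = {}
    for name in location_names:
        hit = geolocator.geocode(name)
        if hit is not None:
            coords[name] = (hit.latitude, hit.longitude)
    return coords

# e.g. geocode_locations(extract_locations("I visited Berlin last summer."))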

Out-of-Scope Uses:

  • This model is not designed for extracting other types of entities (e.g., persons, organizations) unless they are part of a location name.
  • It is not a geocoding model; it identifies location names but does not provide coordinates.
  • Performance on languages other than English is not guaranteed and is likely to be suboptimal.
  • Not suitable for use cases requiring very high precision without thorough testing and potential fine-tuning on domain-specific data.

How to Use

You can use this model with the transformers library. The key is to use the specific prompt structure and the optimized generation parameters.

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import json
import re

MODEL_ID = "boods/boods/mistral_location_extractor_4bit_0.1"
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",          # or place the model explicitly on DEVICE
    torch_dtype=torch.float16,  # remove to keep the dtype stored in the checkpoint
    load_in_4bit=True,          # omit if 4-bit loading is already part of the saved config
    trust_remote_code=True      # only needed if the repo ships custom code
).eval()

# Add pad_token if it doesn't exist
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

SYSTEM_PROMPT = (
    "You are a precise information-extraction assistant. "
    "Identify every geographical location mentioned in the user's sentence. "
    "Return ONLY a valid JSON object of the form {\"locations\": [<list-of-strings>]}. "
    "Return an empty list if no location is found."
)

def build_prompt(sentence: str) -> str:
    chat = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user",   "content": f"Sentence: {sentence}\nAnswer:"}
    ]
    return tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# Optimal generation parameters: use the 'best_params_from_optuna' dictionary
# produced by your Optuna study. For example, with sampling:
#   {"max_new_tokens": 128, "do_sample": True, "temperature": 0.7, "top_k": 50, "top_p": 0.95}
# or, with beam search instead of sampling:
#   {"max_new_tokens": 128, "do_sample": False, "num_beams": 3}
best_generation_configs = {
    # Fill with the 'best_params_from_optuna' dictionary from your Optuna study
}
# Ensure all required params for model.generate are included
best_generation_configs["eos_token_id"] = tokenizer.eos_token_id
best_generation_configs["pad_token_id"] = tokenizer.pad_token_id


def extract_locations(sentence: str):
    prompt = build_prompt(sentence)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs.get("attention_mask"),
            **best_generation_configs
        )

    # Decode only the newly generated tokens, i.e. everything after the prompt
    completion = tokenizer.decode(
        output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ).strip()

    # Robust JSON parsing (simplified for example)
    try:
        # Try to find JSON object within the string if it's embedded
        match = re.search(r"{\s*\"locations\"\s*:\s*\[.*?\]\s*}", completion, re.DOTALL)
        if match:
            completion = match.group(0)
        else: # Fallback for cleaner JSON attempts
            if completion.startswith("```json"): completion = completion[7:]
            if completion.endswith("```"): completion = completion[:-3]
            completion = completion.strip()
            # Minimal attempt to fix if JSON is not perfectly formed at the start
            if not completion.startswith("{"):
                completion = "{" + completion.split("{", 1)[-1] if "{" in completion else ""
        
        data = json.loads(completion)
        return data.get("locations", [])
    except json.JSONDecodeError:
        # Fallback: regex for quoted strings if JSON fails
        return re.findall(r'"([^"]+)"', completion)


# Example usage:
sentence1 = "I visited Berlin last summer, then went to Rome."
sentence2 = "The headquarters are in Cupertino, California, not London."
sentence3 = "There are no locations here."

print(f"'{sentence1}' -> Locations: {extract_locations(sentence1)}")
print(f"'{sentence2}' -> Locations: {extract_locations(sentence2)}")
print(f"'{sentence3}' -> Locations: {extract_locations(sentence3)}")

Model size: 3.86B params (Safetensors)
Tensor types: F32, F16, U8