Model Card for Llama Query Expansion Fine-Tuned

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct, specialized for query expansion and reformulation tasks. It is designed to improve search performance in multimedia applications by generating expanded or reformulated queries from a given input.

Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

  • Developed by: Aygün Varol
  • Funded by: Ministry of National Education of the Republic of Türkiye and the Jane and Aatos Erkko Foundation (EVIL-AI project)
  • Shared by: Aygün Varol
  • Model type: Causal Language Model / Instruction-Tuned LM
  • Language(s) (NLP): English
  • License: MIT
  • Finetuned from model: meta-llama/Llama-3.2-1B-Instruct

Model Sources

  • Repository:
  • Paper : -
  • Demo : -

Uses

Direct Use

This model can be used to optimize and expand user queries to improve search performance. It is particularly useful in systems where query understanding and expansion can enhance retrieval accuracy.

Downstream Use

The fine-tuned model can be integrated into larger systems, for example:

  • In research settings to study query reformulation techniques.

Out-of-Scope Use

  • The model is not designed for general-purpose text generation outside of query optimization.
  • It may not perform well on queries in languages other than English.
  • It is not intended for applications where absolute factual correctness is critical.

Bias, Risks, and Limitations

  • Bias:
    The model may reflect biases present in the training data. Users should be cautious of potential overgeneralizations or biased query expansions.

  • Risks:
    Generated query expansions may sometimes include irrelevant or redundant information. It is recommended to review outputs before deploying them in high-stakes applications.

  • Limitations:

    • The model's performance may degrade on queries that differ significantly from those seen during fine-tuning.
    • It might generate multiple variations when a single concise output is preferable.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. It is recommended to implement post-processing steps to filter or verify the generated queries before using them in production.
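One simple post-processing step is to deduplicate the generated expansions and drop degenerate candidates before they reach a retrieval system. The helper below is a hypothetical sketch (not part of the released model); the name `filter_expansions` and the length cutoff are illustrative assumptions:

```python
def filter_expansions(expansions, original_query, max_words=12):
    """Deduplicate generated expansions and drop degenerate candidates.

    Keeps the original order; removes empty strings, case-insensitive
    duplicates, verbatim echoes of the input query, and overly long
    candidates (a hypothetical post-processing sketch).
    """
    seen = set()
    kept = []
    for exp in expansions:
        norm = " ".join(exp.lower().split())
        if not norm or norm == original_query.lower():
            continue  # skip empty lines and echoes of the original query
        if norm in seen or len(norm.split()) > max_words:
            continue  # skip duplicates and suspiciously long expansions
        seen.add(norm)
        kept.append(exp.strip())
    return kept


filtered = filter_expansions(
    ["healthy breakfast recipes", "healthy breakfast recipes",
     "healthy breakfast ideas", "quick nutritious morning meals"],
    "healthy breakfast ideas",
)
print(filtered)  # ['healthy breakfast recipes', 'quick nutritious morning meals']
```

In a production setting the same hook is a natural place for further verification, such as checking that each expansion still retrieves relevant documents.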

How to Get Started with the Model

To use the model, install the transformers library and load the model using the code below:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hub.
model = AutoModelForCausalLM.from_pretrained("Aygun/llama-query-expansion-finetuned")
tokenizer = AutoTokenizer.from_pretrained("Aygun/llama-query-expansion-finetuned")

# Build the prompt and generate an optimized version of the query.
prompt = "Generate an optimized version of this query: healthy breakfast ideas"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
optimized_query = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(optimized_query)

Training Details

Training Data

The model was fine-tuned on the s-emanuilov/query-expansion dataset available on Hugging Face. This dataset consists of query-expansion pairs where each sample includes:

  • query: The original user query.
  • expansions: A list of expanded versions of the query.

This dataset was curated to reflect realistic search queries and their corresponding expansions, making it well-suited for training models aimed at query optimization.

Training Procedure

The model was fine-tuned using the LoRA (Low-Rank Adaptation) technique.

Preprocessing

Data was preprocessed to create prompt–completion pairs where:

Prompt: "Generate expanded versions of this query: <query>\n\nExpanded queries:"

Completion: A formatted list of expanded queries.
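The preprocessing described above can be sketched as a small helper that turns one dataset sample into a prompt-completion pair. The function name and the dash-list completion format below are illustrative assumptions; the exact formatting used during fine-tuning may differ:

```python
def build_example(query, expansions):
    """Turn one (query, expansions) dataset sample into a
    prompt-completion pair, following the template described above."""
    prompt = (
        f"Generate expanded versions of this query: {query}"
        "\n\nExpanded queries:"
    )
    # Join the expansions into a simple newline-separated list.
    completion = "\n".join(f"- {e}" for e in expansions)
    return {"prompt": prompt, "completion": completion}


example = build_example(
    "healthy breakfast ideas",
    ["healthy breakfast recipes", "quick nutritious morning meals"],
)
print(example["prompt"])
print(example["completion"])
```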

Training Hyperparameters

  • Base Model: meta-llama/Llama-3.2-1B-Instruct
  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Target Modules: ["q_proj", "k_proj", "v_proj", "o_proj"]
  • LoRA Dropout: 0.05
  • Number of Epochs: 3
  • Per-Device Batch Size: 2
  • Gradient Accumulation Steps: 4
  • Learning Rate: 2e-4
  • Warmup Steps: 100
  • Mixed Precision: Enabled (fp16)
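Note that with a per-device batch size of 2 and 4 gradient-accumulation steps, the effective batch size per device is 8, and the LoRA scaling factor is alpha / rank = 32 / 16 = 2.0. The snippet below collects the hyperparameters in a plain dictionary for illustration (field names loosely follow common `peft`/`transformers` conventions but are assumptions, not the actual training script):

```python
# Hypothetical summary of the fine-tuning hyperparameters listed above.
training_config = {
    "base_model": "meta-llama/Llama-3.2-1B-Instruct",
    "lora_r": 16,                  # LoRA rank
    "lora_alpha": 32,              # LoRA scaling: alpha / r = 2.0
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "lora_dropout": 0.05,
    "num_epochs": 3,
    "per_device_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "warmup_steps": 100,
    "fp16": True,
}

# Effective batch size per device = micro-batch size x accumulation steps.
effective_batch = (training_config["per_device_batch_size"]
                   * training_config["gradient_accumulation_steps"])
print(effective_batch)  # 8
```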

Citation

BibTeX:

@misc{llama_query_expansion_finetuned,
  title={Llama Query Expansion Fine-Tuned},
  author={Aygün Varol},
  year={2025},
  note={Fine-tuned version of meta-llama/Llama-3.2-1B-Instruct using LoRA for query expansion.}
}

APA:

Varol, A. (2025). Llama Query Expansion Fine-Tuned (fine-tuned version of meta-llama/Llama-3.2-1B-Instruct using LoRA for query expansion) [Model]. Hugging Face Hub.
