---
license: apache-2.0
tags:
- unsloth
- query-expansion
datasets:
- s-emanuilov/query-expansion
base_model:
- Qwen/Qwen2.5-3B-Instruct
---

# Query Expansion Model based on Qwen2.5-3B

Fine-tuned Qwen2.5-3B model for generating search query expansions. Part of a collection of query expansion models available in different architectures and sizes.

## Overview

**Task:** Search query expansion

**Base model:** [Qwen2.5-3B](https://huggingface.co/Qwen/Qwen2.5-3B)

**Training data:** [Query Expansion Dataset](https://huggingface.co/datasets/s-emanuilov/query-expansion)

Query Expansion Model

## Variants

### Fine-tuned models

- [Qwen2.5-7B](https://huggingface.co/s-emanuilov/query-expansion-Qwen2.5-7B)
- [Llama-3.2-3B](https://huggingface.co/s-emanuilov/query-expansion-Llama-3.2-3B)

### GGUF variants

- [Qwen2.5-3B-GGUF](https://huggingface.co/s-emanuilov/query-expansion-Qwen2.5-3B-GGUF)
- [Qwen2.5-7B-GGUF](https://huggingface.co/s-emanuilov/query-expansion-Qwen2.5-7B-GGUF)
- [Llama-3.2-3B-GGUF](https://huggingface.co/s-emanuilov/query-expansion-Llama-3.2-3B-GGUF)

Each GGUF model is available in several quantization formats: F16, Q8_0, Q5_K_M, Q4_K_M, Q3_K_M.

## Details

This model enhances search and retrieval systems by generating semantically relevant query expansions. It could be useful for:

- Advanced RAG systems
- Search enhancement
- Query preprocessing
- Low-latency query expansion

## Usage

```python
import torch
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Model configuration
MODEL_NAME = "s-emanuilov/query-expansion-Qwen2.5-3B"
MAX_SEQ_LENGTH = 2048
DTYPE = torch.float16  # pass a torch dtype, not the string "float16"
LOAD_IN_4BIT = True

# Load model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=MODEL_NAME,
    max_seq_length=MAX_SEQ_LENGTH,
    dtype=DTYPE,
    load_in_4bit=LOAD_IN_4BIT,
)

# Enable faster inference
FastLanguageModel.for_inference(model)

# Define prompt template
PROMPT_TEMPLATE = """Below is a search query.
Generate relevant expansions and related terms that would help broaden and enhance the search results.

### Query:
{query}

### Expansions:
{output}"""

# Prepare input
query = "apple stock"
inputs = tokenizer(
    [PROMPT_TEMPLATE.format(query=query, output="")],
    return_tensors="pt",
).to("cuda")

# Generate with streaming output
streamer = TextStreamer(tokenizer)
output = model.generate(
    **inputs,
    streamer=streamer,
    max_new_tokens=128,
)
```

## Example

**Input:** "apple stock"

**Expansions:**

- "apple stock price"
- "how to invest in apple stocks"
- "apple stock analysis"
- "what is the future of apple stocks?"
- "understanding apple's stock market performance"

## Citation

If you find my work helpful, feel free to give me a citation.
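The generated text still contains the prompt, so downstream code usually needs to extract just the expansion lines. A minimal sketch of such post-processing, assuming the prompt template from the Usage section above; `parse_expansions` is a hypothetical helper, not part of this model or any library:

```python
def parse_expansions(generated_text: str) -> list[str]:
    """Extract expansion strings from the model's raw output.

    Hypothetical helper: assumes expansions follow the '### Expansions:'
    marker, one per line, possibly prefixed with list markers ('-', '*')
    and wrapped in quotes, as in the Example section of this card.
    """
    marker = "### Expansions:"
    # Everything before (and including) the marker is the echoed prompt
    _, _, tail = generated_text.partition(marker)
    expansions = []
    for line in tail.splitlines():
        # Strip whitespace, leading list markers, and surrounding quotes
        cleaned = line.strip().lstrip("-*").strip().strip('"')
        if cleaned:
            expansions.append(cleaned)
    return expansions


# Example with output shaped like this card's sample response
raw = """Below is a search query.
Generate relevant expansions and related terms that would help broaden and enhance the search results.

### Query:
apple stock

### Expansions:
- "apple stock price"
- "how to invest in apple stocks"
- "apple stock analysis"
"""
print(parse_expansions(raw))
# → ['apple stock price', 'how to invest in apple stocks', 'apple stock analysis']
```

In practice you would pass `tokenizer.decode(output[0], skip_special_tokens=True)` into the helper instead of a hard-coded string.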