---
license: apache-2.0
tags:
- retrieval
- tv-show-recommendation
- sentence-transformers
- semantic-search
library_name: sentence-transformers
model-index:
- name: fine-tuned TV show retriever
  results:
  - task:
      type: retrieval
      name: Information Retrieval
    metrics:
    - name: Recall@1
      type: recall
      value: 0.454
    - name: Recall@3
      type: recall
      value: 0.676
    - name: Recall@5
      type: recall
      value: 0.730
    - name: Recall@10
      type: recall
      value: 0.797
metrics:
- recall
base_model:
- sentence-transformers/all-MiniLM-L6-v2
---
# 🎬 Fine-Tuned TV Show Retriever (Rich Semantic & Metadata Queries + Smart Negatives)
This is a custom fine-tuned sentence-transformer optimized for high-quality vector retrieval in a movie and TV show recommendation RAG pipeline. Fine-tuning used ~32K synthetic natural-language queries spanning metadata-based and vibe-based prompts:
- Enriched vibe-style natural-language queries (e.g., "Emotionally powerful space exploration film with themes of love and sacrifice")
- Metadata-based natural-language queries (e.g., "Any crime movies from the 1990s directed by Quentin Tarantino about a heist?")
- Smarter negative sampling (genre contrast, theme mismatch, star-topic confusion)
- A dataset of over 32,000 triplets (query, positive doc, negative doc)
## 🧠 Training Details
- Base model: `sentence-transformers/all-MiniLM-L6-v2`
- Loss function: `MultipleNegativesRankingLoss`
- Epochs: 4
- Optimized for: top-k semantic retrieval in RAG systems
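`MultipleNegativesRankingLoss` pulls each query toward its positive document while treating the other documents in the batch as negatives. A minimal NumPy sketch of the scoring (illustrative only, not the actual training code; the real loss also appends each triplet's explicit hard negative to the candidate set):

```python
import numpy as np

def mnr_loss(query_emb, doc_emb, scale=20.0):
    """query_emb, doc_emb: (batch, dim); doc_emb[i] is the positive for query i.
    Every other document in the batch serves as an in-batch negative."""
    # Normalize so dot products are cosine similarities
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    scores = scale * (q @ d.T)                    # (batch, batch) similarity matrix
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability for softmax
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))           # cross-entropy with diagonal targets
```

Minimizing this loss pushes the similarity between a query and its positive document above its similarity to every other document in the batch.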
## 📊 Evaluation: Fine-tuned vs Base Model
| Metric | Fine-Tuned Model Score | Base Model Score |
|-------------|:----------------------:|:----------------:|
| Recall@1 | 0.454 | 0.133 |
| Recall@3 | 0.676 | 0.230 |
| Recall@5 | 0.730 | 0.279 |
| Recall@10 | 0.797 | 0.349 |
| MRR | 0.583 | 0.207 |
**Evaluation setup**:
- Dataset: 3,600 held-out metadata and vibe-style natural queries
- Method: rank candidate documents by cosine similarity to each query and check whether the positive document appears in the top k
- Goal: assess top-k retrieval quality in recommendation-like settings
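The setup above can be sketched as follows (an assumed reconstruction, not the exact evaluation script): embed queries and the candidate corpus, rank by cosine similarity, and record where each query's positive document lands.

```python
import numpy as np

def recall_and_mrr(query_emb, corpus_emb, positive_idx, ks=(1, 3, 5, 10)):
    """positive_idx[i] is the corpus index of query i's positive document."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    c = corpus_emb / np.linalg.norm(corpus_emb, axis=1, keepdims=True)
    order = np.argsort(-(q @ c.T), axis=1)   # best-first document ranking per query
    # 0-based rank of the positive document for each query
    ranks = np.array([int(np.where(order[i] == positive_idx[i])[0][0])
                      for i in range(len(positive_idx))])
    recalls = {f"Recall@{k}": float(np.mean(ranks < k)) for k in ks}
    mrr = float(np.mean(1.0 / (ranks + 1)))  # mean reciprocal rank
    return recalls, mrr
```

Recall@k is the fraction of queries whose positive document ranks in the top k; MRR averages the reciprocal of each positive's 1-based rank.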
## 📦 Usage
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jjtsao/fine-tuned_tv_show_retriever")

# Returns a 384-dimensional embedding (the MiniLM-L6 hidden size)
query_embedding = model.encode("mind-bending sci-fi thrillers from the 2000s about identity")
```
## 🎯 Ideal Use Cases
- RAG-style movie recommendation apps
- Semantic filtering of large movie catalogs
- Query-document reranking pipelines
## 📄 License
Apache 2.0, open for personal and commercial use.