---
license: apache-2.0
tags:
- retrieval
- tv-show-recommendation
- sentence-transformers
- semantic-search
library_name: sentence-transformers
model-index:
- name: fine-tuned TV show retriever
  results:
  - task:
      type: retrieval
      name: Information Retrieval
    metrics:
    - name: Recall@1
      type: recall
      value: 0.454
    - name: Recall@3
      type: recall
      value: 0.676
    - name: Recall@5
      type: recall
      value: 0.730
    - name: Recall@10
      type: recall
      value: 0.797
metrics:
- recall
base_model:
- sentence-transformers/all-MiniLM-L6-v2
---
# 🎬 Fine-Tuned TV Show Retriever (Rich Semantic & Metadata Queries + Smart Negatives)
This is a custom fine-tuned sentence-transformer optimized for high-quality vector retrieval in a movie and TV show recommendation RAG pipeline. Fine-tuning used ~32K synthetic natural-language queries spanning metadata-based and vibe-based prompts:
- Enriched vibe-style natural-language queries (e.g., "Emotionally powerful space exploration film with themes of love and sacrifice")
- Metadata-based natural-language queries (e.g., "Any crime movies from the 1990s directed by Quentin Tarantino about a heist?")
- Smarter negative sampling (genre contrast, theme mismatch, star-topic confusion)
- A dataset of over 32,000 triplets (query, positive doc, negative doc)
## 🧠 Training Details
- Base model: `sentence-transformers/all-MiniLM-L6-v2`
- Loss function: `MultipleNegativesRankingLoss`
- Epochs: 4
- Optimized for: top-k semantic retrieval in RAG systems
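`MultipleNegativesRankingLoss` pulls each query toward its positive document while treating the other documents in the batch as negatives. A minimal NumPy sketch of the scoring (illustrative only, not the actual training code; the real loss also appends each triplet's explicit hard negative to the candidate set):

```python
import numpy as np

def mnr_loss(query_emb, doc_emb, scale=20.0):
    """query_emb, doc_emb: (batch, dim); doc_emb[i] is the positive for query i.
    Every other document in the batch serves as an in-batch negative."""
    # Normalize so dot products are cosine similarities
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    scores = scale * (q @ d.T)                    # (batch, batch) similarity matrix
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability for softmax
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))           # cross-entropy with diagonal targets
```

Minimizing this loss pushes the similarity between a query and its positive document above its similarity to every other document in the batch.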
## 📊 Evaluation: Fine-tuned vs Base Model
| Metric | Fine-Tuned Model Score | Base Model Score |
|-------------|:----------------------:|:----------------:|
| Recall@1 | 0.454 | 0.133 |
| Recall@3 | 0.676 | 0.230 |
| Recall@5 | 0.730 | 0.279 |
| Recall@10 | 0.797 | 0.349 |
| MRR | 0.583 | 0.207 |
**Evaluation setup**:
- Dataset: 3,600 held-out metadata and vibe-style natural queries
- Method: rank candidate documents by cosine similarity to each query and check whether the positive document appears in the top k
- Goal: assess top-k retrieval quality in recommendation-like settings
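The setup above can be sketched as follows (an assumed reconstruction, not the exact evaluation script): embed queries and the candidate corpus, rank by cosine similarity, and record where each query's positive document lands.

```python
import numpy as np

def recall_and_mrr(query_emb, corpus_emb, positive_idx, ks=(1, 3, 5, 10)):
    """positive_idx[i] is the corpus index of query i's positive document."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    c = corpus_emb / np.linalg.norm(corpus_emb, axis=1, keepdims=True)
    order = np.argsort(-(q @ c.T), axis=1)   # best-first document ranking per query
    # 0-based rank of the positive document for each query
    ranks = np.array([int(np.where(order[i] == positive_idx[i])[0][0])
                      for i in range(len(positive_idx))])
    recalls = {f"Recall@{k}": float(np.mean(ranks < k)) for k in ks}
    mrr = float(np.mean(1.0 / (ranks + 1)))  # mean reciprocal rank
    return recalls, mrr
```

Recall@k is the fraction of queries whose positive document ranks in the top k; MRR averages the reciprocal of each positive's 1-based rank.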
## 📦 Usage
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jjtsao/fine-tuned_tv_show_retriever")

# Returns a 384-dimensional embedding (the MiniLM-L6 hidden size)
query_embedding = model.encode("mind-bending sci-fi thrillers from the 2000s about identity")
```
## 🎯 Ideal Use Cases
- RAG-style movie recommendation apps
- Semantic filtering of large movie catalogs
- Query-document reranking pipelines
## 📄 License
Apache 2.0, open for personal and commercial use.