🎬 Fine-Tuned TV Show Retriever (Rich Semantic & Metadata Queries + Smart Negatives)

This is a sentence-transformer model fine-tuned for movie and TV show recommendation systems. It is optimized for vector retrieval in a movie and TV show recommendation RAG pipeline. Fine-tuning used ~32K synthetic natural language queries spanning metadata-based and vibe-based prompts. The training data features:

Enriched vibe-style natural language queries (e.g., Emotionally powerful space exploration film with themes of love and sacrifice.)
Metadata-based natural language queries (e.g., Any crime movies from the 1990s directed by Quentin Tarantino about heist?)
Smarter negative sampling (genre contrast, theme mismatch, star-topic confusion)
A dataset of over 32,000 triplets (query, positive doc, negative doc)
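
To make the triplet format above concrete, here is a minimal sketch of how one (query, positive doc, negative doc) triplet might be assembled with a genre-contrast negative. The card does not describe the actual sampling code, so the `Title` record, the toy catalog, and the rule "a negative shares no genre with the positive" are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Title:
    """Hypothetical catalog record; the real document schema is not given in this card."""
    name: str
    genres: frozenset
    text: str  # the document string that would be embedded


def genre_contrast_negative(positive, catalog, rng):
    """Pick a title sharing no genre with the positive (an assumed reading of 'genre contrast')."""
    candidates = [
        t for t in catalog
        if t.name != positive.name and not (t.genres & positive.genres)
    ]
    if not candidates:
        return None  # no contrasting title available in this catalog
    return rng.choice(candidates)


def build_triplet(query, positive, catalog, rng):
    """Return (query, positive doc, negative doc), or None if no negative can be found."""
    negative = genre_contrast_negative(positive, catalog, rng)
    if negative is None:
        return None
    return (query, positive.text, negative.text)


# Toy catalog with placeholder titles, not real training data.
catalog = [
    Title("Toy Title A", frozenset({"sci-fi", "drama"}),
          "Toy Title A: sci-fi drama about a deep-space mission."),
    Title("Toy Title B", frozenset({"crime", "thriller"}),
          "Toy Title B: crime thriller about a heist gone wrong."),
    Title("Toy Title C", frozenset({"sci-fi", "thriller"}),
          "Toy Title C: sci-fi thriller about stolen memories."),
    Title("Toy Title D", frozenset({"comedy", "romance"}),
          "Toy Title D: romantic comedy set in a small town."),
]

triplet = build_triplet(
    "Emotionally powerful space exploration film with themes of love and sacrifice.",
    catalog[0],
    catalog,
    random.Random(0),  # seeded so the sketch is reproducible
)
print(triplet)
```

Theme-mismatch and star-topic-confusion negatives would need extra fields (themes, cast) that this sketch leaves out.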

🧠 Training Details

Base model: sentence-transformers/all-MiniLM-L6-v2
Loss function: MultipleNegativesRankingLoss
Epochs: 4
Optimized for: top-k semantic retrieval in RAG systems
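
As a rough illustration of what MultipleNegativesRankingLoss optimizes, the sketch below computes the same kind of objective in plain Python: for each query, its own positive should outscore every other candidate in the batch, including any explicit negatives from the triplets. This is not the sentence-transformers implementation. The function name `mnr_loss` is hypothetical, and the scale of 20 with cosine similarity is assumed to match the library defaults rather than taken from this card.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(queries, positives, negatives=(), scale=20.0):
    """Mean cross-entropy where query i must rank positives[i] above all other candidates.

    Candidates are every positive in the batch (in-batch negatives for the
    other queries) followed by any explicit hard negatives.
    """
    candidates = list(positives) + list(negatives)
    total = 0.0
    for i, query in enumerate(queries):
        logits = [scale * cosine(query, c) for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_z - logits[i]  # negative log-probability of the true positive
    return total / len(queries)


# Toy 2-D "embeddings": aligned pairs give a low loss, swapped pairs a high one.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
print(mnr_loss(queries, aligned), mnr_loss(queries, swapped))
```

A hard negative that sits close to the query raises the loss even when the positive is correct, which is why the choice of negatives matters for this objective.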

📈 Evaluation: Fine-tuned vs Base Model

Metric	Fine-Tuned Model Score	Base Model Score
Recall@1	0.454	0.133
Recall@3	0.676	0.230
Recall@5	0.730	0.279
Recall@10	0.797	0.349
MRR	0.583	0.207

Evaluation setup:

Dataset: 3,600 held-out metadata and vibe-style natural language queries
Method: Top-k ranking using cosine similarity between query and positive documents
Goal: Assess top-k retrieval quality in recommendation-like settings
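
For readers who want to reproduce metrics of this kind, here is a minimal sketch of Recall@k and MRR computed from per-query similarity scores. It assumes exactly one positive document per query and breaks score ties in the positive's favor. The candidate pool used in the actual evaluation is not specified in this card, and the scores below are toy values.

```python
def rank_of_positive(scores, positive_index):
    """1-based rank of the positive when candidates are sorted by descending score."""
    target = scores[positive_index]
    return 1 + sum(1 for s in scores if s > target)


def recall_at_k(ranks, k):
    """Fraction of queries whose positive appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy cosine scores: one row per query, one column per candidate document.
score_rows = [
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.5],
    [0.6, 0.7, 0.4],
]
positive_indices = [0, 2, 0]  # which column is the correct document for each query

ranks = [rank_of_positive(row, idx) for row, idx in zip(score_rows, positive_indices)]
print("ranks:", ranks)
print("Recall@1:", recall_at_k(ranks, 1))
print("MRR:", mean_reciprocal_rank(ranks))
```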

📦 Usage

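from sentence_transformers import SentenceTransformer

# Load the fine-tuned model from the Hugging Face Hub
model = SentenceTransformer("jjtsao/fine-tuned_tv_show_retriever")
# Encode a natural language query into a dense vector
query_embedding = model.encode("mind-bending sci-fi thrillers from the 2000s about identity")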
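
Building on the snippet above, the following sketch ranks a handful of candidate documents against a query by cosine similarity. The helper names `rank_documents` and `demo` are hypothetical, and the document strings are placeholders, since this card does not specify how catalog entries were formatted during training. Only `SentenceTransformer(...)` and `model.encode(...)` are library calls.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    return dot / (norm_u * norm_v)


def rank_documents(model, query, documents, top_k=3):
    """Return the top_k (score, document) pairs by cosine similarity to the query.

    `model` is any object whose encode() maps a string to one vector and a
    list of strings to one vector per string, as SentenceTransformer does.
    """
    query_embedding = model.encode(query)
    doc_embeddings = model.encode(documents)
    scored = [
        (cosine(query_embedding, emb), doc)
        for emb, doc in zip(doc_embeddings, documents)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def demo():
    """Not called automatically: needs sentence-transformers and downloads the model."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("jjtsao/fine-tuned_tv_show_retriever")
    documents = [  # placeholder strings, not the training document format
        "Placeholder title 1: sci-fi thriller about a man questioning his memories.",
        "Placeholder title 2: romantic comedy set in a small coastal town.",
        "Placeholder title 3: crime drama about a heist gone wrong.",
    ]
    query = "mind-bending sci-fi thrillers from the 2000s about identity"
    for score, doc in rank_documents(model, query, documents):
        print(f"{score:.3f}  {doc}")
```

Calling `demo()` runs the example end to end. For a large catalog, document embeddings would normally be computed once and stored in a vector index rather than re-encoded per query.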

🔍 Ideal Use Cases

RAG-style movie recommendation apps
Semantic filtering of large movie catalogs
Query-document reranking pipelines

📜 License

Apache 2.0 — open for personal and commercial use.
