metadata
license: apache-2.0
tags:
  - retrieval
  - tv-show-recommendation
  - sentence-transformers
  - semantic-search
library_name: sentence-transformers
model-index:
  - name: fine-tuned movie retriever
    results:
      - task:
          type: retrieval
          name: Information Retrieval
        metrics:
          - name: Recall@1
            type: recall
            value: 0.454
          - name: Recall@3
            type: recall
            value: 0.676
          - name: Recall@5
            type: recall
            value: 0.73
          - name: Recall@10
            type: recall
            value: 0.797
metrics:
  - recall
base_model:
  - sentence-transformers/all-MiniLM-L6-v2

🎬 Fine-Tuned TV Show Retriever (Rich Semantic & Metadata Queries + Smart Negatives)

Model

This is a sentence-transformers model fine-tuned for high-quality vector retrieval in a movie and TV show recommendation RAG pipeline. Fine-tuning used ~32K synthetic natural-language queries spanning metadata-based and vibe-based prompts. The training data combines:

  • Enriched vibe-style natural language queries (e.g., Emotionally powerful space exploration film with themes of love and sacrifice.)
  • Metadata-based natural language queries (e.g., Any crime movies from the 1990s directed by Quentin Tarantino about heist?)
  • Smarter negative sampling (genre contrast, theme mismatch, star-topic confusion)
  • A dataset of over 32,000 triplets (query, positive doc, negative doc)
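
As an illustration of the triplet format above, here is a minimal sketch of how a (query, positive doc, negative doc) triplet with a genre-contrast negative could be assembled. The catalog entries, the document text layout, and the helper names `genre_contrast_negative` and `build_triplet` are all hypothetical; the card does not describe the actual sampling code, and "genre contrast" is read here as "shares no genre with the positive", which is only one plausible interpretation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    query: str
    positive: str
    negative: str


# Hypothetical mini-catalog: title -> (genres, document text).
CATALOG = {
    "Space Drama A": ({"sci-fi", "drama"}, "Space Drama A. Genres: sci-fi, drama."),
    "Heist Film B": ({"crime", "thriller"}, "Heist Film B. Genres: crime, thriller."),
    "Rom-Com C": ({"comedy", "romance"}, "Rom-Com C. Genres: comedy, romance."),
    "Sci-Fi Thriller D": ({"sci-fi", "thriller"}, "Sci-Fi Thriller D. Genres: sci-fi, thriller."),
}


def genre_contrast_negative(positive_title, catalog, rng):
    """Pick a title that shares no genre with the positive (assumed rule)."""
    positive_genres = catalog[positive_title][0]
    candidates = sorted(
        title
        for title, (genres, _) in catalog.items()
        if title != positive_title and not (genres & positive_genres)
    )
    # Raises IndexError if no title qualifies; a real pipeline would need a fallback.
    return rng.choice(candidates)


def build_triplet(query, positive_title, catalog, rng):
    """Assemble one training triplet from a query and its positive title."""
    negative_title = genre_contrast_negative(positive_title, catalog, rng)
    return Triplet(query, catalog[positive_title][1], catalog[negative_title][1])


triplet = build_triplet(
    "Emotionally powerful space exploration film with themes of love and sacrifice.",
    "Space Drama A",
    CATALOG,
    random.Random(0),
)
```

Theme-mismatch and star-topic-confusion negatives would need their own selection rules, which are not sketched here.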

🧠 Training Details

  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Loss function: MultipleNegativesRankingLoss
  • Epochs: 4
  • Optimized for: top-k semantic retrieval in RAG systems
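
To make the loss function concrete, the sketch below computes a multiple-negatives ranking objective in plain Python: each query is scored against every positive in the batch plus any explicit negatives, and the loss is the cross-entropy of picking its own positive. It assumes cosine similarity and a scale of 20, which are the library defaults as far as I know; the card does not state the settings actually used, and the function name `mnr_loss` is made up for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(queries, positives, negatives=(), scale=20.0):
    """Mean cross-entropy where query i must select positives[i].

    Candidates are all in-batch positives followed by the explicit negatives,
    so every other query's positive acts as an extra negative.
    """
    candidates = list(positives) + list(negatives)
    total = 0.0
    for i, query in enumerate(queries):
        logits = [scale * cosine(query, cand) for cand in candidates]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_z - logits[i]
    return total / len(queries)


queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(queries, [[1.0, 0.0], [0.0, 1.0]])
swapped = mnr_loss(queries, [[0.0, 1.0], [1.0, 0.0]])
with_hard_negative = mnr_loss(queries, [[1.0, 0.0], [0.0, 1.0]], negatives=[[1.0, 0.1]])
```

The loss is near zero when each query matches its own positive, large when the pairing is swapped, and it rises when a negative sits close to a query, which is why harder negatives give a stronger training signal.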

📈 Evaluation: Fine-tuned vs Base Model

| Metric | Fine-Tuned Model Score | Base Model Score |
|---|---|---|
| Recall@1 | 0.454 | 0.133 |
| Recall@3 | 0.676 | 0.230 |
| Recall@5 | 0.730 | 0.279 |
| Recall@10 | 0.797 | 0.349 |
| MRR | 0.583 | 0.207 |

Evaluation setup:

  • Dataset: 3,600 held-out metadata and vibe-style natural queries
  • Method: Top-k ranking by cosine similarity between each query embedding and the document embeddings, scored by where the query's positive document lands
  • Goal: Assess top-k retrieval quality in recommendation-like settings
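
For reference, Recall@k and MRR can be computed from the rank of each query's positive document as in the sketch below. This is a generic illustration, not the evaluation script behind the numbers above: the card does not say whether candidates were the full catalog or a sampled pool, and ties are resolved optimistically here. All function names are chosen for this example.

```python
def rank_of_positive(scores, positive_index):
    """1-based rank of the positive document under descending score order."""
    target = scores[positive_index]
    return 1 + sum(1 for s in scores if s > target)


def recall_at_k(ranks, k):
    """Fraction of queries whose positive document appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy example: ranks of the positive document for four queries.
ranks = [1, 3, 12, 2]
recall_1 = recall_at_k(ranks, 1)
recall_3 = recall_at_k(ranks, 3)
mrr = mean_reciprocal_rank(ranks)
```

With one positive per query, Recall@k is simply the share of queries whose positive lands at rank k or better.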

📦 Usage

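from sentence_transformers import SentenceTransformer

# Load the fine-tuned retriever from the Hugging Face Hub
model = SentenceTransformer("jjtsao/fine-tuned_tv_show_retriever")
# Encode a natural-language query into a dense vector for similarity search
query_embedding = model.encode("mind-bending sci-fi thrillers from the 2000s about identity")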
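
Going from an embedding to actual recommendations means ranking catalog documents by cosine similarity to the query. A minimal sketch follows, written against any `encode` callable that maps a list of strings to a list of vectors so it can be tried without downloading the model. The helper name `top_k`, the toy encoder, and the sample documents are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(encode, query, documents, k=5):
    """Return the k (score, document) pairs most similar to the query."""
    query_vec = encode([query])[0]
    doc_vecs = encode(documents)
    scored = [(cosine(query_vec, vec), doc) for vec, doc in zip(doc_vecs, documents)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def toy_encode(texts):
    """Stand-in encoder for demonstration only: two keyword features plus a bias."""
    return [[float("space" in t), float("heist" in t), 0.1] for t in texts]


docs = ["space drama", "heist thriller", "cooking show"]
results = top_k(toy_encode, "space opera", docs, k=2)
```

With the real model, passing `model.encode` in place of `toy_encode` should work the same way, since it accepts a list of strings and returns one vector per string. For large catalogs, document embeddings would normally be precomputed and stored in a vector index rather than re-encoded per query.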

🔍 Ideal Use Cases

  • RAG-style movie recommendation apps
  • Semantic filtering of large movie catalogs
  • Query-document reranking pipelines

📜 License

Apache 2.0: open for personal and commercial use.