---
license: mit
base_model: Xenova/clap-htsat-unfused
tags:
- audio-classification
- transformers.js
- clap
- audio-tagging
library_name: transformers.js
---
# clip-tagger Model

This is a personalized audio tagging model based on CLAP (Contrastive Language-Audio Pre-training). It extends the base Xenova/clap-htsat-unfused model with a lightweight local classifier trained on user feedback and custom tags.
## Model Description
- Base Model: Xenova/clap-htsat-unfused
- Framework: Transformers.js compatible
- Training: User feedback and custom tag integration
- Use Case: Personalized audio content tagging
## Usage
```javascript
import { CLAPProcessor } from './clapProcessor.js';
import { LocalClassifier } from './localClassifier.js';

// Load the model
const processor = new CLAPProcessor();
const classifier = new LocalClassifier();
classifier.loadModel(); // Loads from localStorage or model files

// Process audio
const tags = await processor.processAudio(audioBuffer);
const personalizedTags = classifier.predictAll(features, candidateTags);
```
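The classifier keeps its learned state in the browser, which is what the `// Loads from localStorage or model files` comment refers to. As a rough, illustrative sketch of that pattern (the storage key and state shape below are assumptions, not taken from `localClassifier.js`), persisting a small state object with the standard Web Storage API can look like this:

```javascript
// Illustrative only: persisting lightweight classifier state in the browser.
// The key name and state shape are assumptions, not the repo's actual schema.
const STORAGE_KEY = 'clip-tagger/classifier-state';

function saveState(state) {
  // Serialize the classifier's learned state (e.g. per-tag weights) to localStorage.
  localStorage.setItem(STORAGE_KEY, JSON.stringify(state));
}

function loadState() {
  // Return the saved state, or null so the caller can fall back to bundled model files.
  const raw = localStorage.getItem(STORAGE_KEY);
  return raw ? JSON.parse(raw) : null;
}
```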
## Files
- `localClassifier.js` - Local classifier implementation
- `clapProcessor.js` - CLAP model wrapper
- `userFeedbackStore.js` - User feedback storage system
- `model-config.json` - Model configuration
- `example-usage.html` - Usage example
## Links
- 🚀 Live Demo: clip-tagger Space
- 📦 Model Repository: sohei1l/clip-tagger
- 💻 Source Code: GitHub Repository
## Training Data
This model learns from user corrections and custom tags rather than a fixed training set. The base CLAP model provides the initial audio understanding, while the local classifier adapts to each user's preferences as feedback accumulates; a sketch of this idea follows.
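One simple way a local classifier can adapt to user preferences is to keep a per-tag prototype of CLAP embeddings built from confirmed or corrected examples, then rank candidate tags by similarity to those prototypes. The sketch below is illustrative only; the class and method names are hypothetical and are not taken from `localClassifier.js`.

```javascript
// Illustrative only: a tiny prototype-based classifier over CLAP audio embeddings.
// Names here are hypothetical and do not reflect the actual localClassifier.js API.
class PrototypeClassifier {
  constructor() {
    this.prototypes = new Map(); // tag -> { sum: Float32Array, count: number }
  }

  // Record a user correction or confirmation: fold the clip's embedding into the tag's prototype.
  addExample(tag, embedding) {
    const entry =
      this.prototypes.get(tag) ?? { sum: new Float32Array(embedding.length), count: 0 };
    for (let i = 0; i < embedding.length; i++) entry.sum[i] += embedding[i];
    entry.count += 1;
    this.prototypes.set(tag, entry);
  }

  // Score candidate tags by cosine similarity between the embedding and each tag's mean prototype.
  predict(embedding, candidateTags) {
    return candidateTags
      .map((tag) => ({ tag, score: this.#cosine(embedding, this.prototypes.get(tag)) }))
      .sort((a, b) => b.score - a.score);
  }

  #cosine(embedding, entry) {
    if (!entry || entry.count === 0) return 0;
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < embedding.length; i++) {
      const p = entry.sum[i] / entry.count; // mean prototype component
      dot += embedding[i] * p;
      normA += embedding[i] * embedding[i];
      normB += p * p;
    }
    return normA && normB ? dot / (Math.sqrt(normA) * Math.sqrt(normB)) : 0;
  }
}
```

A prototype-style approach needs no in-browser gradient training and starts reflecting a user's preferences as soon as the first correction is stored.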