Sentiment Analysis Models
This repository contains two logistic regression models that predict sentiment scores from sentence embeddings. Each model is trained on the annotations of one of two experts.
Model Details
- Base embedding model: paraphrase-MiniLM-L6-v2
- Architecture: LogisticRegression (scikit-learn)
- Training data: Custom sentiment dataset with dual expert annotations
- Data split: 70% training, 15% development, 15% test
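The 70/15/15 split can be reproduced in spirit with a shuffle and two cut points. This is only a sketch: the README does not say whether the original split was random, stratified, or seeded, so the function name `split_70_15_15` and the `seed` parameter are hypothetical.

```python
import random


def split_70_15_15(items, seed=0):
    """Shuffle items and split them into 70% train, 15% dev, 15% test.

    Assumption: a plain random split. The actual procedure used for
    these models is not documented here.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    # Integer arithmetic avoids floating-point rounding at the cut points.
    n_train = n * 70 // 100
    n_dev = n * 15 // 100

    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # remainder goes to test
    return train, dev, test
```

Any rounding remainder lands in the test portion, so the three parts always cover the whole dataset.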
Performance Metrics
Development Set
Against Expert 1:
- Exact match: 48.32%
- Within 1 level: 93.70%
Against Expert 2:
- Exact match: 38.95%
- Within 1 level: 92.17%
Test Set
Against Expert 1:
- Exact match: 47.81%
- Within 1 level: 93.63%
Against Expert 2:
- Exact match: 40.75%
- Within 1 level: 92.05%
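The two metrics above can be computed as follows. This sketch assumes that sentiment levels are integer-coded ordinal labels and that "within 1 level" means an absolute difference of at most 1 between prediction and annotation; neither detail is stated in this README, and the function name `agreement_metrics` is hypothetical.

```python
def agreement_metrics(predicted, expert):
    """Return exact-match and within-1-level agreement as percentages.

    Assumption: labels are integers on an ordinal scale, so the
    difference between two labels is meaningful.
    """
    if len(predicted) != len(expert):
        raise ValueError("predicted and expert must have the same length")
    if len(predicted) == 0:
        raise ValueError("cannot compute metrics on empty input")

    n = len(predicted)
    pairs = list(zip(predicted, expert))
    exact = sum(1 for p, e in pairs if p == e)
    within_one = sum(1 for p, e in pairs if abs(p - e) <= 1)

    return {
        "exact_match": 100.0 * exact / n,
        "within_1_level": 100.0 * within_one / n,
    }
```

For example, predictions `[1, 2, 3, 4]` against annotations `[1, 3, 5, 4]` match exactly on two of four items and fall within one level on three of four.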
Usage
See inference.py for an example of how to use these models to predict sentiment for new text.
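For orientation, the prediction flow might look roughly like the sketch below. It is not the contents of inference.py. It assumes the classifiers were fit directly on raw paraphrase-MiniLM-L6-v2 embeddings with no extra preprocessing, and that the `sentence-transformers` and `joblib` packages are installed. The names `predict_sentiment`, `load_components`, and `model_dir` are hypothetical.

```python
def predict_sentiment(texts, encoder, model1, model2):
    """Embed texts once, then score them with both expert-specific models.

    encoder needs an encode(list_of_str) method; each model needs a
    predict(embeddings) method, as scikit-learn estimators have.
    """
    embeddings = encoder.encode(list(texts))
    return {
        "model1": list(model1.predict(embeddings)),  # trained on Expert 1
        "model2": list(model2.predict(embeddings)),  # trained on Expert 2
    }


def load_components(model_dir="."):
    """Load the embedding model and both classifiers.

    Imports are local so that predict_sentiment can be used without
    these packages, for example with stand-in objects in tests.
    """
    import joblib
    from sentence_transformers import SentenceTransformer

    encoder = SentenceTransformer("paraphrase-MiniLM-L6-v2")
    model1 = joblib.load(f"{model_dir}/model1.joblib")
    model2 = joblib.load(f"{model_dir}/model2.joblib")
    return encoder, model1, model2
```

A typical call would be `predict_sentiment(["Great service!"], *load_components())`, which returns one predicted level per text for each model.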
Model Files
- model1.joblib: Model trained on Expert 1 annotations
- model2.joblib: Model trained on Expert 2 annotations
Data Files
- dev_results.csv: Complete predictions on the development set
- test_results.csv: Complete predictions on the test set