Sentiment Analysis Models
This repository contains two logistic regression models that predict sentiment scores from text embeddings. Each model is trained on the annotations of one of two experts.
Model Details
- Base embedding model: mixedbread-ai/mxbai-embed-large-v1
- Architecture: LogisticRegression (scikit-learn)
- Training data: Custom sentiment dataset with dual expert annotations
- Data split: 70% training, 15% development, 15% test
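The 70/15/15 split above could be produced along the following lines. This is a minimal sketch, not the procedure used for these models: the README does not say whether the split was random, stratified, or seeded, so `split_indices`, its parameters, and the seed are all hypothetical.

```python
import random


def split_indices(n_items, train_frac=0.70, dev_frac=0.15, seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/dev/test lists.

    Assumption: a plain random split. The seed and the absence of
    stratification are illustrative choices, not taken from the repository.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)

    n_train = round(n_items * train_frac)
    n_dev = round(n_items * dev_frac)

    train = indices[:n_train]
    dev = indices[n_train:n_train + n_dev]
    test = indices[n_train + n_dev:]  # whatever remains goes to the test set
    return train, dev, test


train_idx, dev_idx, test_idx = split_indices(1000)
print(len(train_idx), len(dev_idx), len(test_idx))
```

The three index lists are disjoint and together cover every example, so the same indices can be used to slice both the embeddings and the two experts' labels.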
Performance Metrics
Development Set
Against Expert 1:
- Exact match: 49.27%
- Within 1 level: 96.05%
Against Expert 2:
- Exact match: 41.00%
- Within 1 level: 93.05%
Test Set
Against Expert 1:
- Exact match: 49.32%
- Within 1 level: 94.93%
Against Expert 2:
- Exact match: 41.44%
- Within 1 level: 91.51%
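The two metrics reported above can be computed as in the sketch below. It assumes that sentiment levels are integers on an ordinal scale and that "within 1 level" means an absolute difference of at most 1 between prediction and annotation; the README does not spell this out, and `agreement_metrics` is a hypothetical helper.

```python
def agreement_metrics(predicted, expert):
    """Return exact-match and within-1-level agreement as percentages.

    Assumption: labels are integer sentiment levels, and "within 1 level"
    means |predicted - expert| <= 1.
    """
    if len(predicted) != len(expert) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(predicted)
    exact = sum(p == e for p, e in zip(predicted, expert))
    within_one = sum(abs(p - e) <= 1 for p, e in zip(predicted, expert))
    return {
        "exact_match": 100.0 * exact / n,
        "within_1_level": 100.0 * within_one / n,
    }


# Toy labels for illustration only, not data from this repository.
print(agreement_metrics([1, 2, 3, 5], [1, 3, 5, 5]))
```

Every exact match also counts as within 1 level, so the second figure is always at least as large as the first.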
Usage
See inference.py for an example of how to use these models to predict sentiment for new text.
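For orientation, the overall flow is likely to be: embed the text, then pass the embeddings to one of the saved classifiers. The sketch below is an assumption about that flow, not a copy of inference.py. It assumes the embeddings were produced with the sentence-transformers library using default settings and that the .joblib files hold fitted scikit-learn classifiers; `predict_sentiment` and `load_components` are hypothetical names.

```python
def predict_sentiment(texts, embed, classifier):
    """Embed the texts, then classify the embeddings.

    `embed` is any callable mapping a list of strings to a 2D array-like;
    `classifier` is any object with a scikit-learn style `predict` method.
    """
    embeddings = embed(list(texts))
    return list(classifier.predict(embeddings))


def load_components(model_path="model1.joblib"):
    """Load the embedding model and one of the saved classifiers.

    Assumption: embeddings come from sentence-transformers with default
    encode() settings. Check inference.py for the settings actually used.
    """
    # Imported here so the module can be imported without these packages.
    import joblib
    from sentence_transformers import SentenceTransformer

    encoder = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
    classifier = joblib.load(model_path)
    return encoder.encode, classifier
```

With both packages installed and the model files present, calling `embed, clf = load_components("model1.joblib")` followed by `predict_sentiment(["Great service!"], embed, clf)` would return one predicted level per input text. Swapping in "model2.joblib" gives predictions from the model trained on Expert 2's annotations.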
Model Files
- model1.joblib: Model trained on Expert 1 annotations
- model2.joblib: Model trained on Expert 2 annotations
Data Files
- dev_results.csv: Complete predictions on the development set
- test_results.csv: Complete predictions on the test set