Sentiment Analysis Models
This repository contains two logistic regression models that predict sentiment scores from text embeddings. Each model is trained on the annotations of one of two expert annotators.
Model Details
- Base embedding model: BAAI/bge-large-en-v1.5
- Architecture: LogisticRegression (scikit-learn)
- Training data: Custom sentiment dataset with dual expert annotations
- Data split: 70% training, 15% development, 15% test
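The 70/15/15 split listed above can be illustrated with a minimal sketch. The README does not say how the split was produced, so the shuffling, the seed, and the function name `split_70_15_15` are assumptions for illustration only, not the procedure actually used.

```python
import random


def split_70_15_15(items, seed=0):
    """Shuffle items and cut them into train/dev/test parts (70/15/15).

    Hypothetical helper: the real split may have been stratified or
    produced with a different tool or seed.
    """
    items = list(items)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(items)

    n = len(items)
    n_train = n * 70 // 100  # integer arithmetic avoids float rounding
    n_dev = n * 15 // 100

    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]  # remainder goes to the test set
    return train, dev, test
```

With 100 examples this yields 70 training, 15 development, and 15 test items. Any remainder from integer division ends up in the test set.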
Performance Metrics
Development Set
Against Expert 1:
- Exact match: 50.95%
- Within 1 level: 95.17%
Against Expert 2:
- Exact match: 37.92%
- Within 1 level: 92.31%
Test Set
Against Expert 1:
- Exact match: 50.21%
- Within 1 level: 95.27%
Against Expert 2:
- Exact match: 41.23%
- Within 1 level: 92.26%
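For readers who want to recompute figures like these, the sketch below shows one plausible way to derive the two metrics. It assumes sentiment levels are integers and that "Within 1 level" means the predicted and annotated levels differ by at most one. Neither point is confirmed by this README, and `agreement_rates` is a hypothetical name.

```python
def agreement_rates(predicted, reference):
    """Return exact-match and within-1-level agreement as percentages.

    Assumes both sequences hold integer sentiment levels of equal length.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one example is required")

    n = len(predicted)
    pairs = list(zip(predicted, reference))
    exact = sum(1 for p, r in pairs if p == r)
    within_one = sum(1 for p, r in pairs if abs(p - r) <= 1)

    return {
        "exact_match": 100.0 * exact / n,
        "within_1_level": 100.0 * within_one / n,
    }
```

Under this definition every exact match also counts as within one level, so the second figure can never be lower than the first, which is consistent with the numbers reported above.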
Usage
See inference.py for an example of how to use these models to predict sentiment for new text.
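As a rough orientation, the sketch below shows how the pieces named in this README could fit together: embed the text with BAAI/bge-large-en-v1.5, then pass the embeddings to each classifier. It is not a copy of inference.py, which may differ. The function names, the dictionary keys `expert1` and `expert2`, and the use of the sentence-transformers package with default `encode` settings are all assumptions. Any normalization or prompt prefix applied at training time would need to be mirrored here.

```python
def predict_sentiment(texts, embed, models):
    """Predict one sentiment level per text for each model.

    texts:  iterable of strings
    embed:  callable mapping a list of strings to a 2-D array of features
    models: dict of name -> fitted classifier exposing .predict()
    """
    texts = list(texts)
    if not texts:
        return {name: [] for name in models}
    features = embed(texts)  # one embedding row per input text
    return {name: list(model.predict(features)) for name, model in models.items()}


def load_components(model_dir="."):
    """Load the encoder and both classifiers (hypothetical helper).

    Imports are local so predict_sentiment can be used without these
    packages installed. Assumes the .joblib files sit in model_dir.
    """
    import os

    import joblib
    from sentence_transformers import SentenceTransformer

    encoder = SentenceTransformer("BAAI/bge-large-en-v1.5")
    models = {
        "expert1": joblib.load(os.path.join(model_dir, "model1.joblib")),
        "expert2": joblib.load(os.path.join(model_dir, "model2.joblib")),
    }
    return encoder.encode, models
```

A typical call would be `embed, models = load_components()` followed by `predict_sentiment(["The service was excellent."], embed, models)`, which returns one list of predicted levels per expert model.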
Model Files
- model1.joblib: Model trained on Expert 1 annotations
- model2.joblib: Model trained on Expert 2 annotations
Data Files
- dev_results.csv: Complete predictions on the development set
- test_results.csv: Complete predictions on the test set