FinDeepSeek: Enhanced Sentiment Analysis with LoRA-Adaptive PEFT Models
Overview
FinDeepSeek uses LoRA (Low-Rank Adaptation), a parameter-efficient fine-tuning (PEFT) technique, to adapt a compact base model for sentiment analysis across multiple financial datasets. The project demonstrates how Hugging Face's transformers library integrates with peft adapters for efficient, scalable training and inference.
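The adapter hyperparameters are not documented in this README; the sketch below shows one plausible way to attach a LoRA adapter for sequence classification with the peft library. The rank, alpha, dropout, and target modules are illustrative assumptions, not the project's actual settings.

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

# Base model used by this project
base = AutoModelForSequenceClassification.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B", num_labels=3
)

# Illustrative LoRA settings -- r, lora_alpha, lora_dropout, and target_modules are assumptions
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,            # sequence-classification head
    r=8,                                   # low-rank dimension
    lora_alpha=16,                         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections in Qwen-style blocks
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```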
Introduction
FinGPT (https://github.com/AI4Finance-Foundation/FinGPT/tree/master) provides the open-source community with the best multi-task financial model, FinGPT v3.3. However, that model has 13 billion parameters. Fine-tuning deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B on financial news headlines for sentiment analysis gives users a 1.5-billion-parameter model with comparable benchmarks, including a new best result on the NWGI dataset (see the benchmark table below).
Features
- Lightweight Sentiment Models: Optimized using LoRA configurations to enhance efficiency.
- Multi-Dataset Support: Fine-tuned and evaluated on diverse financial sentiment datasets, including FPB, FIQA-SA, TFNS, and NWGI.
- Reproducible Benchmarks: Detailed comparison of model performance across popular sentiment models.
Datasets
- FPB (Financial PhraseBank): Structured financial sentences with sentiment labels.
- FIQA-SA (Financial QA Sentiment Analysis): Dataset focused on sentiment classification tasks.
- TFNS (Twitter Financial News Sentiment): Sentiment-annotated tweets related to financial topics.
- NWGI (News with GPT Instructions): News articles processed with sentiment instructions.
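The exact Hub copies and splits used for training are not listed in this README. As a hedged example, the TFNS dataset can be pulled with the datasets library; the Hub ID below is an assumption about which public copy is used, not a documented source.

```python
from datasets import load_dataset

# TFNS (Twitter Financial News Sentiment) -- the Hub ID is an assumption,
# not necessarily the copy this project trained on.
tfns = load_dataset("zeroshot/twitter-financial-news-sentiment", split="train")

print(tfns[0])  # e.g. {'text': '...', 'label': 0|1|2}
```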
Benchmark Results

| Weighted F1 | FPB | FiQA-SA | TFNS | NWGI | Devices | Time | Cost |
| --- | --- | --- | --- | --- | --- | --- | --- |
| **FinDeep R1** | 0.790 | 0.577 | 0.798 | 0.667 | 1 × A100 | 2.57 hours | $12.50 |
| **FinGPT v3.3** | 0.882 | 0.874 | 0.903 | 0.643 | 1 × RTX 3090 | 17.25 hours | $17.25 |
| FinGPT v3.2 | 0.850 | 0.860 | 0.894 | 0.636 | 1 × A100 | 5.5 hours | $22.55 |
| FinGPT v3.1 | 0.855 | 0.850 | 0.875 | 0.642 | 1 × A100 | 5.5 hours | $22.55 |
| FinGPT (8bit) | 0.855 | 0.847 | 0.879 | 0.632 | 1 × RTX 3090 | 6.47 hours | $6.47 |
| FinGPT (QLoRA) | 0.777 | 0.752 | 0.828 | 0.583 | 1 × RTX 3090 | 4.15 hours | $4.15 |
| OpenAI Fine-tune | 0.878 | 0.887 | 0.883 | - | - | - | - |
| GPT-4 | 0.833 | 0.630 | 0.808 | - | - | - | - |
| FinBERT | 0.880 | 0.596 | 0.733 | 0.538 | 4 × NVIDIA K80 | - | - |
| Llama2-7B | 0.390 | 0.800 | 0.296 | 0.503 | 2048 × A100 | 21 days | $4.23 million |
| BloombergGPT | 0.511 | 0.751 | - | - | 512 × A100 | 53 days | $2.67 million |
Installation
Ensure you have the required dependencies:
pip install transformers peft datasets pandas torch
Usage
```python
from peft import PeftModel, PeftConfig
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# PEFT model directory
peft_model_path = "huggnface/FinDeepSeek-R1-Distill-Qwen-1.5B-LoRA-Sentiment"
# Load model and adapter configurations
peft_config = PeftConfig.from_pretrained(peft_model_path)
base_model = AutoModelForSequenceClassification.from_pretrained(
    peft_config.base_model_name_or_path,
    num_labels=3,
    id2label={0: "positive", 1: "negative", 2: "neutral"},
    label2id={"positive": 0, "negative": 1, "neutral": 2},
)
inference_model = PeftModel.from_pretrained(base_model, peft_model_path)
tokenizer = AutoTokenizer.from_pretrained(peft_config.base_model_name_or_path)
# Device allocation
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
inference_model.to(device)
print("Base + PEFT adapter model loaded successfully!")
def perform_inference(input_text):
    # Tokenize the input text
    inputs = tokenizer(
        input_text,
        return_tensors="pt",
        truncation=True,
        padding="max_length",
        max_length=256,  # Adjust max length if necessary
    )

    # Move inputs to the device
    inputs = {key: value.to(device) for key, value in inputs.items()}

    # Perform inference
    with torch.no_grad():
        outputs = inference_model(**inputs)

    # Extract logits and the predicted class id
    logits = outputs.logits
    predicted_class = torch.argmax(logits, dim=-1).item()

    # Map the predicted class id to its sentiment label
    predicted_label = inference_model.config.id2label[predicted_class]
    return predicted_label
# `df` is assumed to be a pandas DataFrame of financial news with a 'title' column
test = df['title'].iloc[0]
print(test)
# Flex (FLEX) Q3 Earnings Preview: What You Should Know Beyond the Headline Estimates

# Perform inference for the first few titles in the DataFrame
predictions = []
for title in df['title'][:3]:
    predictions.append(perform_inference(title))
print(predictions)
# ['neutral', 'positive', 'neutral']
```
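To score an entire DataFrame rather than a few rows, the same function can be applied column-wise. This is a small illustrative extension of the code above; `df` and its 'title' column are assumed to come from your own news data.

```python
# Illustrative: apply the classifier to every headline and store the result.
# Assumes `df` is the same pandas DataFrame used above, with a 'title' column.
df["sentiment"] = df["title"].apply(perform_inference)
print(df[["title", "sentiment"]].head())
```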
License
This project is licensed under the MIT License. Feel free to use, modify, and distribute.
Base Model
- deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B