FinDeepSeek: Enhanced Sentiment Analysis with LoRA-Adaptive PEFT Models

Overview

FinDeepSeek applies parameter-efficient fine-tuning (PEFT), specifically LoRA (Low-Rank Adaptation), to enhance sentiment analysis across multiple financial datasets. The project demonstrates how Hugging Face's transformers library integrates with peft adapters for efficient, scalable training and inference.
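
As a rough sketch of how a LoRA adapter is attached for parameter-efficient fine-tuning (the rank, alpha, dropout, and target-module names below are illustrative assumptions, not the exact configuration used for this model):

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

# Illustrative LoRA hyperparameters; the actual rank, alpha, dropout, and
# target modules used for FinDeepSeek may differ.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,           # 3-way sentiment classification
    r=8,                                  # low-rank update dimension
    lora_alpha=16,                        # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in Qwen-style blocks
)

base = AutoModelForSequenceClassification.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B", num_labels=3
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable
```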

Introduction

FinGPT (https://github.com/AI4Finance-Foundation/FinGPT/tree/master) provides the open-source community with the best multi-task financial model (FinGPT v3.3). That model, however, has 13 billion parameters. Fine-tuning deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B on financial news headlines for sentiment analysis gives users a 1.5 billion parameter model with comparable benchmark scores, including a new best result on the NWGI dataset (see the benchmark table below).

Features

  • Lightweight Sentiment Models: Optimized using LoRA configurations to enhance efficiency.
  • Multi-Dataset Support: Fine-tuned and evaluated on diverse financial sentiment datasets, including FPB, FiQA-SA, TFNS, and NWGI.
  • Reproducible Benchmarks: Detailed comparison of model performance across popular sentiment models.

Datasets

  • FPB (Financial PhraseBank): Structured financial sentences with sentiment labels.
  • FiQA-SA (Financial QA Sentiment Analysis): Dataset focused on sentiment classification tasks.
  • TFNS (Twitter Financial News Sentiment): Sentiment annotated tweets related to financial topics.
  • NWGI (News with GPT Instructions): News articles processed with sentiment instructions.
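
As an example, these corpora can be pulled from the Hugging Face Hub with the datasets library. The Hub IDs below are assumptions based on commonly used copies; substitute the versions you actually train on:

```python
from datasets import load_dataset

# Financial PhraseBank: the config name selects the annotator-agreement subset.
fpb = load_dataset("financial_phrasebank", "sentences_allagree")

# Twitter Financial News Sentiment (Hub ID assumed; adjust if your copy differs).
tfns = load_dataset("zeroshot/twitter-financial-news-sentiment")

print(fpb["train"][0])  # one labelled financial sentence
```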

Benchmark Results

Dataset columns report weighted F1.

| Model            | FPB   | FiQA-SA | TFNS  | NWGI  | Devices        | Time        | Cost          |
|------------------|-------|---------|-------|-------|----------------|-------------|---------------|
| FinDeep R1       | 0.790 | 0.577   | 0.798 | 0.667 | 1 × A100       | 2.57 hours  | $12.50        |
| FinGPT v3.3      | 0.882 | 0.874   | 0.903 | 0.643 | 1 × RTX 3090   | 17.25 hours | $17.25        |
| FinGPT v3.2      | 0.850 | 0.860   | 0.894 | 0.636 | 1 × A100       | 5.5 hours   | $22.55        |
| FinGPT v3.1      | 0.855 | 0.850   | 0.875 | 0.642 | 1 × A100       | 5.5 hours   | $22.55        |
| FinGPT (8bit)    | 0.855 | 0.847   | 0.879 | 0.632 | 1 × RTX 3090   | 6.47 hours  | $6.47         |
| FinGPT (QLoRA)   | 0.777 | 0.752   | 0.828 | 0.583 | 1 × RTX 3090   | 4.15 hours  | $4.15         |
| OpenAI Fine-tune | 0.878 | 0.887   | 0.883 | -     | -              | -           | -             |
| GPT-4            | 0.833 | 0.630   | 0.808 | -     | -              | -           | -             |
| FinBERT          | 0.880 | 0.596   | 0.733 | 0.538 | 4 × NVIDIA K80 | -           | -             |
| Llama2-7B        | 0.390 | 0.800   | 0.296 | 0.503 | 2048 × A100    | 21 days     | $4.23 million |
| BloombergGPT     | 0.511 | 0.751   | -     | -     | 512 × A100     | 53 days     | $2.67 million |
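
A minimal sketch of how weighted F1 (per-class F1 averaged with weights proportional to class support) can be computed with scikit-learn; the labels here are made up for illustration:

```python
from sklearn.metrics import f1_score

y_true = ["positive", "neutral", "negative", "neutral", "positive"]
y_pred = ["positive", "neutral", "neutral",  "neutral", "negative"]

# average="weighted": per-class F1 scores weighted by the number of true
# instances of each class, so class imbalance is taken into account.
print(f1_score(y_true, y_pred, average="weighted"))
```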

Installation

Ensure you have the required dependencies:

```bash
pip install transformers peft datasets pandas torch
```

Then load the fine-tuned LoRA adapter on top of the base model and run inference:

```python
from peft import PeftModel, PeftConfig
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# PEFT adapter repository on the Hugging Face Hub
peft_model_path = "huggnface/FinDeepSeek-R1-Distill-Qwen-1.5B-LoRA-Sentiment"

# Load model and adapter configurations
peft_config = PeftConfig.from_pretrained(peft_model_path)
base_model = AutoModelForSequenceClassification.from_pretrained(
    peft_config.base_model_name_or_path,
    num_labels=3,
    id2label={0: "positive", 1: "negative", 2: "neutral"},
    label2id={"positive": 0, "negative": 1, "neutral": 2},
)
inference_model = PeftModel.from_pretrained(base_model, peft_model_path)
tokenizer = AutoTokenizer.from_pretrained(peft_config.base_model_name_or_path)

# Device allocation
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
inference_model.to(device)

print("Base + PEFT adapter model loaded successfully!")


def perform_inference(input_text):
    # Tokenize the input text
    inputs = tokenizer(
        input_text,
        return_tensors="pt",
        truncation=True,
        padding="max_length",
        max_length=256,  # Adjust max length if necessary
    )

    # Move inputs to the device
    inputs = {key: value.to(device) for key, value in inputs.items()}

    # Perform inference
    with torch.no_grad():
        outputs = inference_model(**inputs)

    # Extract logits and the predicted class id
    logits = outputs.logits
    predicted_class = torch.argmax(logits, dim=-1).item()

    # Map the predicted class id to its sentiment label
    predicted_label = inference_model.config.id2label[predicted_class]

    return predicted_label
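
# df below is assumed to be a pandas DataFrame of financial news headlines with
# a 'title' column; the CSV path is a placeholder for illustration.
import pandas as pd
df = pd.read_csv("financial_headlines.csv")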

test = df['title'].iloc[0]
print(test)
#Flex (FLEX) Q3 Earnings Preview: What You Should Know Beyond the Headline Estimates

predictions = []

# Perform inference for each title in the DataFrame
for title in df['title'][:3]:
    predictions.append(perform_inference(title))

print(predictions)
#['neutral', 'positive', 'neutral']
```

License

This project is licensed under the MIT License. Feel free to use, modify, and distribute.

Base Model

  • DeepSeek-R1-Distill-Qwen-1.5B