FinDeepSeek: Enhanced Sentiment Analysis with LoRA-Adaptive PEFT Models

Overview

FinDeepSeek applies parameter-efficient fine-tuning (PEFT), specifically LoRA (Low-Rank Adaptation), to enhance sentiment analysis across multiple financial datasets. The project demonstrates how Hugging Face's transformers library integrates with peft adapters for efficient, scalable training and inference.
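
As a rough sketch of how a LoRA adapter is attached for parameter-efficient fine-tuning (the rank, alpha, dropout, and target-module names below are illustrative assumptions, not the exact configuration used for this model):

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

# Illustrative LoRA hyperparameters; the actual rank, alpha, dropout, and
# target modules used for FinDeepSeek may differ.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,           # 3-way sentiment classification
    r=8,                                  # low-rank update dimension
    lora_alpha=16,                        # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in Qwen-style blocks
)

base = AutoModelForSequenceClassification.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B", num_labels=3
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable
```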

Introduction

FinGPT (https://github.com/AI4Finance-Foundation/FinGPT/tree/master) provides the open-source community with the best multi-task financial model (FinGPT v3.3). That model, however, has 13 billion parameters. Fine-tuning deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B on financial news headlines for sentiment analysis gives users a 1.5 billion parameter model with comparable benchmark scores, including a new best result on the NWGI dataset (see the benchmark table below).

Features

  • Lightweight Sentiment Models: Optimized using LoRA configurations to enhance efficiency.
  • Multi-Dataset Support: Fine-tuned and evaluated on diverse financial sentiment datasets, including FPB, FiQA-SA, TFNS, and NWGI.
  • Reproducible Benchmarks: Detailed comparison of model performance across popular sentiment models.

Datasets

  • FPB (Financial PhraseBank): Structured financial sentences with sentiment labels.
  • FiQA-SA (Financial QA Sentiment Analysis): Dataset focused on sentiment classification tasks.
  • TFNS (Twitter Financial News Sentiment): Sentiment annotated tweets related to financial topics.
  • NWGI (News with GPT Instructions): News articles processed with sentiment instructions.
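
As an example, these corpora can be pulled from the Hugging Face Hub with the datasets library. The Hub IDs below are assumptions based on commonly used copies; substitute the versions you actually train on:

```python
from datasets import load_dataset

# Financial PhraseBank: the config name selects the annotator-agreement subset.
fpb = load_dataset("financial_phrasebank", "sentences_allagree")

# Twitter Financial News Sentiment (Hub ID assumed; adjust if your copy differs).
tfns = load_dataset("zeroshot/twitter-financial-news-sentiment")

print(fpb["train"][0])  # one labelled financial sentence
```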

Benchmark Results

Dataset columns report weighted F1.

| Model            | FPB   | FiQA-SA | TFNS  | NWGI  | Devices        | Time        | Cost          |
|------------------|-------|---------|-------|-------|----------------|-------------|---------------|
| FinDeep R1       | 0.790 | 0.577   | 0.798 | 0.667 | 1 × A100       | 2.57 hours  | $12.50        |
| FinGPT v3.3      | 0.882 | 0.874   | 0.903 | 0.643 | 1 × RTX 3090   | 17.25 hours | $17.25        |
| FinGPT v3.2      | 0.850 | 0.860   | 0.894 | 0.636 | 1 × A100       | 5.5 hours   | $22.55        |
| FinGPT v3.1      | 0.855 | 0.850   | 0.875 | 0.642 | 1 × A100       | 5.5 hours   | $22.55        |
| FinGPT (8bit)    | 0.855 | 0.847   | 0.879 | 0.632 | 1 × RTX 3090   | 6.47 hours  | $6.47         |
| FinGPT (QLoRA)   | 0.777 | 0.752   | 0.828 | 0.583 | 1 × RTX 3090   | 4.15 hours  | $4.15         |
| OpenAI Fine-tune | 0.878 | 0.887   | 0.883 | -     | -              | -           | -             |
| GPT-4            | 0.833 | 0.630   | 0.808 | -     | -              | -           | -             |
| FinBERT          | 0.880 | 0.596   | 0.733 | 0.538 | 4 × NVIDIA K80 | -           | -             |
| Llama2-7B        | 0.390 | 0.800   | 0.296 | 0.503 | 2048 × A100    | 21 days     | $4.23 million |
| BloombergGPT     | 0.511 | 0.751   | -     | -     | 512 × A100     | 53 days     | $2.67 million |
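
A minimal sketch of how weighted F1 (per-class F1 averaged with weights proportional to class support) can be computed with scikit-learn; the labels here are made up for illustration:

```python
from sklearn.metrics import f1_score

y_true = ["positive", "neutral", "negative", "neutral", "positive"]
y_pred = ["positive", "neutral", "neutral",  "neutral", "negative"]

# average="weighted": per-class F1 scores weighted by the number of true
# instances of each class, so class imbalance is taken into account.
print(f1_score(y_true, y_pred, average="weighted"))
```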

Installation

Ensure you have the required dependencies:

```bash
pip install transformers peft datasets pandas torch
```

Then load the fine-tuned LoRA adapter on top of the base model and run inference:

```python
from peft import PeftModel, PeftConfig
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# PEFT adapter repository on the Hugging Face Hub
peft_model_path = "huggnface/FinDeepSeek-R1-Distill-Qwen-1.5B-LoRA-Sentiment"

# Load model and adapter configurations
peft_config = PeftConfig.from_pretrained(peft_model_path)
base_model = AutoModelForSequenceClassification.from_pretrained(
    peft_config.base_model_name_or_path,
    num_labels=3,
    id2label={0: "positive", 1: "negative", 2: "neutral"},
    label2id={"positive": 0, "negative": 1, "neutral": 2},
)
inference_model = PeftModel.from_pretrained(base_model, peft_model_path)
tokenizer = AutoTokenizer.from_pretrained(peft_config.base_model_name_or_path)

# Device allocation
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
inference_model.to(device)

print("Base + PEFT adapter model loaded successfully!")


def perform_inference(input_text):
    # Tokenize the input text
    inputs = tokenizer(
        input_text,
        return_tensors="pt",
        truncation=True,
        padding="max_length",
        max_length=256,  # Adjust max length if necessary
    )

    # Move inputs to the device
    inputs = {key: value.to(device) for key, value in inputs.items()}

    # Perform inference
    with torch.no_grad():
        outputs = inference_model(**inputs)

    # Extract logits and the predicted class id
    logits = outputs.logits
    predicted_class = torch.argmax(logits, dim=-1).item()

    # Map the predicted class id to its sentiment label
    predicted_label = inference_model.config.id2label[predicted_class]

    return predicted_label
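
# df below is assumed to be a pandas DataFrame of financial news headlines with
# a 'title' column; the CSV path is a placeholder for illustration.
import pandas as pd
df = pd.read_csv("financial_headlines.csv")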

test = df['title'].iloc[0]
print(test)
#Flex (FLEX) Q3 Earnings Preview: What You Should Know Beyond the Headline Estimates

predictions = []

# Perform inference for each title in the DataFrame
for title in df['title'][:3]:
    predictions.append(perform_inference(title))

print(predictions)
#['neutral', 'positive', 'neutral']
```

License

This project is licensed under the MIT License. Feel free to use, modify, and distribute.

Base Model

  • DeepSeek-R1-Distill-Qwen-1.5B