---
title: Tweets Sentiment Analyzer
emoji: πŸš€
colorFrom: indigo
colorTo: gray
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
license: mit
short_description: Real-Time Tweet Sentiment Analyzer
---

# 🧠 Sentiment Analysis from Scratch (BiLSTM + Attention)

Welcome to this live interactive demo of a sentiment analysis model trained completely from scratch using a **Deep Bidirectional LSTM** architecture enhanced with a **custom attention mechanism**. This project classifies short texts or tweets as **Positive** or **Negative** and reports a confidence score.

---

## πŸ“Œ Project Highlights

- βœ… **Trained from scratch**: The embedding layer is trained on the dataset itself (not using pretrained embeddings).
- 🧠 **Model Architecture**: 
  - Bidirectional LSTM layers
  - Custom attention layer (`BetterAttention`)
  - Final dense ANN for binary classification
- πŸ“Š **Output**: Label (Positive/Negative) and confidence score (0–1)
- πŸ”  **Tokenizer**: Also trained from scratch and saved as `tokenizer.joblib`
- πŸ“ **Model Format**: Saved as `.keras` and loaded efficiently during inference

---

## πŸš€ Try it Out

Enter a tweet or short sentence below to see a real-time prediction:

πŸ‘‰ *Example*:  
`"I absolutely loved the performance!"`  
**Output**: Positive (0.91)

---

## πŸ›  Model Files

You can also explore/download the trained artifacts here:
- [`sentiment_model.keras`](https://huggingface.co/MasterShomya/Sentiment_Analysis-Tweets/blob/main/sentiment_model.keras)
- [`tokenizer.joblib`](https://huggingface.co/MasterShomya/Sentiment_Analysis-Tweets/blob/main/tokenizer.joblib)

---

## πŸ§ͺ How It Works

1. The input text is tokenized using the trained tokenizer (loaded via `joblib`).
2. The padded sequence is passed through:
   - `Embedding β†’ BiLSTM β†’ BiLSTM β†’ Attention β†’ Dense Layers`
3. The final sigmoid-activated output represents the **probability of positivity**.
4. A confidence-aware label is returned using Gradio’s `Label` component.
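The preprocessing and output-formatting steps above can be sketched without the trained artifacts. The word index, sequence length, and function names below are illustrative stand-ins only; the real app loads the fitted tokenizer from `tokenizer.joblib` and the network from `sentiment_model.keras`:

```python
# Hypothetical vocabulary; the real tokenizer is fitted on the tweet dataset.
WORD_INDEX = {"i": 1, "absolutely": 2, "loved": 3, "the": 4, "performance": 5}
MAX_LEN = 8  # assumed padded sequence length

def texts_to_padded(text, word_index=WORD_INDEX, max_len=MAX_LEN):
    # Lowercase, strip punctuation, map words to ids (0 = out-of-vocab/pad),
    # then right-pad to a fixed length, as pad_sequences(padding="post") would.
    tokens = [w.strip("!.,?") for w in text.lower().split()]
    ids = [word_index.get(w, 0) for w in tokens][:max_len]
    return ids + [0] * (max_len - len(ids))

def to_label(prob_positive):
    # Gradio's Label component accepts a dict of class -> confidence.
    return {"Positive": prob_positive, "Negative": 1.0 - prob_positive}

seq = texts_to_padded("I absolutely loved the performance!")
print(seq)  # [1, 2, 3, 4, 5, 0, 0, 0]
print(to_label(0.91))
```

In the deployed app, the padded sequence is fed to `model.predict`, and the single sigmoid output becomes the "Positive" confidence in the label dict.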

---

## πŸ“ˆ Model Performance

Despite being trained from scratch without pretrained embeddings (such as GloVe or FastText), the model performs competitively. Experiments with `glove.27B.200d` embeddings yielded **similar accuracy**, so they were omitted to keep the pipeline simple.

Training plots and confusion matrix are available in the original [Kaggle Notebook](https://www.kaggle.com/code/mastershomya/sentiment-analysis-deep-bilstm).

---

## πŸ§‘β€πŸ’» Author

**Shomya Soneji**  
Machine Learning & Deep Learning Enthusiast  
Connect on [Kaggle](https://www.kaggle.com/mastershomya)

---

## 🀝 Support

If you find this project helpful, please consider giving it a 🌟 and sharing it!  
Your feedback and suggestions are always welcome πŸ’¬