🧠 **NeuroFeel** (Where AI Meets Emotion)
NeuroFeel is a tiny emotion detection model built on NeuroBERT.
It's lightweight (~25 MB), works offline, and is well suited to edge and mobile devices.


**🔍 Trained on:**
📊 Boltuix Emotions Dataset, crafted for real, short-text emotional expressions.


🔗 Dataset: [Emotions Dataset](https://huggingface.co/datasets/boltuix/emotions-dataset)


💡 Use NeuroFeel in:
📱 Mobile apps
🏠 Smart homes
⌚ Wearables
💬 Chatbots
🧘 Mental health tools


❤️ Understands 13 emotions, including:
Happiness, Sadness, Anger, Love, Fear, Surprise, and more.

**⚡ Why NeuroFeel?**
Ultra-fast and low memory

**Edge-ready**
Great for emotional intelligence in devices

**🔗 Model:** [NeuroFeel Model](https://huggingface.co/boltuix/NeuroFeel)

In [None]:
import pandas as pd
from transformers import BertTokenizer, BertForSequenceClassification, Trainer, TrainingArguments, TrainerCallback
from sklearn.model_selection import train_test_split
import torch
from torch.utils.data import Dataset
import shutil
from tqdm import tqdm  # Added for progress bar

# === 0. Define common model name ===
MODEL_NAME = "boltuix/NeuroBERT"
OUTPUT_DIR = "./neuro-feel"

# === Custom Callback for Progress Bar ===
class TQDMProgressBarCallback(TrainerCallback):
    def __init__(self):
        super().__init__()
        self.progress_bar = None

    def on_train_begin(self, args, state, control, **kwargs):
        self.total_steps = state.max_steps
        self.progress_bar = tqdm(total=self.total_steps, desc="Training", unit="step")

    def on_step_end(self, args, state, control, **kwargs):
        self.progress_bar.update(1)
        self.progress_bar.set_postfix({
            "epoch": f"{state.epoch:.2f}",
            "step": state.global_step
        })

    def on_train_end(self, args, state, control, **kwargs):
        if self.progress_bar is not None:
            self.progress_bar.close()
            self.progress_bar = None

# === 1. Load and preprocess data ===
dataset_path = '/content/dataset.csv'
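df = pd.read_csv(dataset_path)
df.columns = ['text', 'label']  # assumes exactly two columns: sentence first, label second
df = df.dropna(subset=['text', 'label'])  # drop rows missing either field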

# === 2. Encode labels ===
labels = sorted(df["label"].unique())
label_to_id = {label: idx for idx, label in enumerate(labels)}
id_to_label = {idx: label for label, idx in label_to_id.items()}
df['label'] = df['label'].map(label_to_id)

# === 3. Train/val split ===
train_texts, val_texts, train_labels, val_labels = train_test_split(
    df['text'].tolist(), df['label'].tolist(), test_size=0.2, random_state=42
)

# === 4. Tokenizer ===
tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)

# === 5. Dataset class ===
class SentimentDataset(Dataset):
    def __init__(self, texts, labels, tokenizer, max_length=128):
        self.texts = texts
        self.labels = labels
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        encoding = self.tokenizer(
            self.texts[idx],
            padding='max_length',
            truncation=True,
            max_length=self.max_length,
            return_tensors='pt'
        )
        return {
            'input_ids': encoding['input_ids'].squeeze(0),
            'attention_mask': encoding['attention_mask'].squeeze(0),
            'labels': torch.tensor(self.labels[idx], dtype=torch.long)
        }

# === 6. Load datasets ===
train_dataset = SentimentDataset(train_texts, train_labels, tokenizer)
val_dataset = SentimentDataset(val_texts, val_labels, tokenizer)

# === 7. Load model ===
model = BertForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=len(label_to_id)
)

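# Make every weight tensor contiguous in memory; this is a commonly used
# workaround for save errors raised on non-contiguous tensors.
for param in model.parameters():
    param.data = param.data.contiguous()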

# === 8. Training arguments ===
training_args = TrainingArguments(
    output_dir='./results',
    run_name="NeuroFeel",
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=10,
    eval_strategy="epoch",
    report_to="none"
)

# === 9. Trainer setup ===
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    callbacks=[TQDMProgressBarCallback()]  # Added progress bar callback
)

# === 10. Train and evaluate ===
trainer.train()
eval_metrics = trainer.evaluate()
print(eval_metrics)  # show the final validation metrics instead of discarding them

# === 11. Save model and label mappings ===
model.config.label2id = label_to_id
model.config.id2label = id_to_label
model.config.num_labels = len(label_to_id)

model.save_pretrained(OUTPUT_DIR)
tokenizer.save_pretrained(OUTPUT_DIR)

# === 12. Zip the trained model directory ===
shutil.make_archive("neuro-feel", 'zip', OUTPUT_DIR)

print("✅ Training complete. Model and tokenizer saved to ./neuro-feel")
print("✅ Model directory zipped to neuro-feel.zip")


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at boltuix/NeuroBERT and are newly initialized: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Training:   0%|          | 1/32830 [00:01<10:58:37,  1.20s/step, epoch=0.00, step=1]

Epoch | Training Loss | Validation Loss
1     | 0.8597        | 0.986835
2     | 0.9175        | 0.912361
3     | 0.8556        | 0.881038
4     | 0.7531        | 0.896444
5     | 0.6632        | 0.905136
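
The validation loss in the log above reaches its minimum at epoch 3 and rises afterwards while the training loss keeps falling, which usually points to overfitting. Below is a minimal sketch, using only the numbers logged above, of how one might pick the epoch to keep. The `history` list and the `best_epoch` helper are hypothetical names, not part of the notebook.

```python
# Losses copied from the training log above: (epoch, training loss, validation loss).
history = [
    (1, 0.8597, 0.986835),
    (2, 0.9175, 0.912361),
    (3, 0.8556, 0.881038),
    (4, 0.7531, 0.896444),
    (5, 0.6632, 0.905136),
]

def best_epoch(rows):
    """Return the epoch with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda row: row[2])[0]

print(best_epoch(history))
```

With the `Trainer` API, comparable behaviour is usually obtained by setting `save_strategy="epoch"`, `load_best_model_at_end=True`, and `metric_for_best_model="eval_loss"` in `TrainingArguments`. Whether that helps this particular model is an assumption to verify.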


Training: 100%|██████████| 32830/32830 [30:43<00:00, 17.81step/s, epoch=5.00, step=32830]


✅ Training complete. Model and tokenizer saved to ./neuro-feel
✅ Model directory zipped to neuro-feel.zip
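
Step 2 of the training cell derives the ids from `sorted(df["label"].unique())`, so ids follow alphabetical label order rather than any hand-written order. Here is a small sketch of that mapping, assuming the 13 label names used by the test cases later in this notebook are the ones in the training CSV (the dataset file itself is not shown here).

```python
# The 13 label names used in this notebook's test cases; assumed to match
# the training CSV, which is not shown here.
labels = sorted([
    "sadness", "anger", "love", "surprise", "fear", "happiness", "neutral",
    "disgust", "shame", "guilt", "confusion", "desire", "sarcasm",
])

# Same construction as step 2 of the training cell.
label_to_id = {label: idx for idx, label in enumerate(labels)}
id_to_label = {idx: label for label, idx in label_to_id.items()}

print(label_to_id["anger"], label_to_id["sadness"])
print(id_to_label[12])
```

Under that assumption `sadness` gets id 9 rather than 0, which is why reading `model.config.id2label` at inference time is safer than a hard-coded map.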


In [None]:
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Load the fine-tuned model and tokenizer from the local output directory
try:
    model = BertForSequenceClassification.from_pretrained("./neuro-feel")
    tokenizer = BertTokenizer.from_pretrained("./neuro-feel")
except Exception as e:
    print(f"Error loading model or tokenizer: {e}")
    raise  # re-raise instead of exit(1), which would stop the notebook kernel

# Set model to evaluation mode to disable training-specific layers
model.eval()

# Define a label map to decode numerical predictions to emotion labels
# label_map = {
#     0: "sadness",
#     1: "anger",
#     2: "love",
#     3: "surprise",
#     4: "fear",
#     5: "happiness",
#     6: "neutral",
#     7: "disgust",
#     8: "shame",
#     9: "guilt",
#     10: "confusion",
#     11: "desire",
#     12: "sarcasm"
# }  # Adjust based on the model's training labels 🏷️


label_map = model.config.id2label
label_map = {int(k): v for k, v in label_map.items()}

# Define test sentences with ground truth emotion labels
test_cases = [
    ("I miss her so much it hurts.", "sadness"),
    ("Tears won’t stop falling.", "sadness"),
    ("Everything feels so empty.", "sadness"),
    ("I feel broken inside.", "sadness"),
    ("Waking up is the hardest part.", "sadness"),
    ("It’s been so hard to cope.", "sadness"),
    ("My chest feels heavy all the time.", "sadness"),
    ("Losing him shattered me.", "sadness"),
    ("Why do you always ignore me?!", "anger"),
    ("This is absolutely ridiculous!", "anger"),
    ("You never listen to me!", "anger"),
    ("That’s the last straw!", "anger"),
    ("I’m furious with how they treated me.", "anger"),
    ("I’m done putting up with this nonsense!", "anger"),
    ("Everything you say just makes it worse!", "anger"),
    ("I hate how you always twist my words.", "anger"),
    ("You’re the reason I believe in love.", "love"),
    ("You complete me.", "love"),
    ("What? That’s unbelievable!", "surprise"),
    ("I didn’t see that coming!", "surprise"),
    ("You got me tickets? No way!", "surprise"),
    ("That was totally unexpected.", "surprise"),
    ("You're moving already? That’s so sudden!", "surprise"),
    ("Whoa, that was fast!", "surprise"),
    ("I’m so scared right now.", "fear"),
    ("I can’t do this, I’m terrified.", "fear"),
    ("I’m afraid to even look.", "fear"),
    ("My hands are shaking.", "fear"),
    ("I keep hearing noises in the dark.", "fear"),
    ("I’m panicking just thinking about it.", "fear"),
    ("Today was absolutely perfect!", "happiness"),
    ("I can’t stop smiling!", "happiness"),
    ("This made my whole week!", "happiness"),
    ("I’m full of joy right now.", "happiness"),
    ("That was so much fun!", "happiness"),
    ("I’m just working through the day.", "neutral"),
    ("It was an average lunch.", "neutral"),
    ("Nothing much happened today.", "neutral"),
    ("Just reading some news.", "neutral"),
    ("That food made me gag.", "disgust"),
    ("That’s absolutely revolting.", "disgust"),
    ("Ugh, that’s disgusting!", "disgust"),
    ("It smells horrible in here.", "disgust"),
    ("That behavior is so gross.", "disgust"),
    ("I’m repulsed by what I saw.", "disgust"),
    ("That’s just nasty.", "disgust"),
    ("I can’t even look them in the eye.", "shame"),
    ("Why did I say that? I’m so embarrassed.", "shame"),
    ("I can’t believe I acted that way.", "shame"),
    ("I shouldn’t have lied to her.", "guilt"),
    ("I feel terrible about what I did.", "guilt"),
    ("I feel sick with regret.", "guilt"),
    ("The guilt is eating me alive.", "guilt"),
    ("Wait, what just happened?", "confusion"),
    ("I don’t get it at all.", "confusion"),
    ("I’m not sure what I’m supposed to do.", "confusion"),
    ("This is all so unclear.", "confusion"),
    ("I’m lost in this situation.", "confusion"),
    ("I don’t know how to respond.", "confusion"),
    ("I’m struggling to understand.", "confusion"),
    ("I want to be the best at this.", "desire"),
    ("All I need is one more chance.", "desire"),
    ("I wish I could be with you right now.", "desire"),
    ("I need this more than anything.", "desire"),
    ("I’ve been dreaming of this moment.", "desire"),
    ("I crave your attention.", "desire"),
    ("I just want to be loved.", "desire"),
    ("I desire success more than comfort.", "desire"),
    ("Oh great, another meeting… just what I needed.", "sarcasm"),
    ("Wow, you’re such a genius.", "sarcasm"),
    ("Yeah, because that worked out so well last time.", "sarcasm"),
    ("Lovely, now we’re lost again.", "sarcasm"),
    ("Oh, I’m absolutely thrilled… not.", "sarcasm"),
    ("Absolutely, let’s make another terrible decision.", "sarcasm"),
]



# Prediction function with error handling
def predict_label(text):
    """
    Predict the emotion label for a given text using the fine-tuned BERT model.

    Args:
        text (str): Input text to classify (e.g., "I'm feeling really down today."). 💬

    Returns:
        str: Predicted emotion label (e.g., "sadness") or "error" if prediction fails. 😊
    """
    try:
        inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)
        with torch.no_grad():
            outputs = model(**inputs)
        logits = outputs.logits
        predicted_label = torch.argmax(logits, dim=1).item()
        return label_map.get(predicted_label, "unknown")
    except Exception as e:
        print(f"Error predicting for text '{text}': {e}")
        return "error"

# Run predictions and evaluate performance
correct = 0
print("Prediction Results:\n" + "-"*40)
for idx, (sentence, true_label) in enumerate(test_cases):
    predicted = predict_label(sentence)
    is_correct = predicted == true_label
    if is_correct:
        correct += 1
    print(f"{idx+1}. Sentence: {sentence}")
    print(f"   Predicted Label: {predicted}")
    print(f"   True Label: {true_label}")
    print(f"   Correct: {'Yes' if is_correct else 'No'}\n")

# Calculate and display accuracy
total = len(test_cases)
accuracy = (correct / total) * 100 if total > 0 else 0
print("-"*40)
print(f"Total predictions made: {total}")
print(f"Correct predictions: {correct}")
print(f"Accuracy: {accuracy:.2f}%")
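
The test list is not balanced (for example, `love` has two sentences and `sadness` has eight), so a single accuracy figure can hide weak labels. Below is a minimal sketch of a per-label breakdown. `per_label_accuracy` is a hypothetical helper, and the sample pairs are made up for illustration; they are not model output.

```python
from collections import Counter

def per_label_accuracy(pairs):
    """Map each true label to its fraction of correct predictions.

    `pairs` is an iterable of (true_label, predicted_label) tuples.
    """
    totals, hits = Counter(), Counter()
    for true_label, predicted in pairs:
        totals[true_label] += 1
        if predicted == true_label:
            hits[true_label] += 1
    return {label: hits[label] / totals[label] for label in totals}

# Illustrative pairs only; these are not results from NeuroFeel.
sample = [("sadness", "sadness"), ("sadness", "guilt"), ("love", "love")]
for label, acc in per_label_accuracy(sample).items():
    print(f"{label:<10} {acc:.0%}")
```

In the cell above, the pairs could be collected as `(true_label, predict_label(sentence))` for each entry in `test_cases`.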

Prediction Results:
----------------------------------------
1. Sentence: I miss her so much it hurts.
   Predicted Label: sadness
   True Label: sadness
   Correct: Yes

2. Sentence: Tears won’t stop falling.
   Predicted Label: sadness
   True Label: sadness
   Correct: Yes

3. Sentence: Everything feels so empty.
   Predicted Label: sadness
   True Label: sadness
   Correct: Yes

4. Sentence: I feel broken inside.
   Predicted Label: sadness
   True Label: sadness
   Correct: Yes

5. Sentence: Waking up is the hardest part.
   Predicted Label: sadness
   True Label: sadness
   Correct: Yes

6. Sentence: It’s been so hard to cope.
   Predicted Label: sadness
   True Label: sadness
   Correct: Yes

7. Sentence: My chest feels heavy all the time.
   Predicted Label: sadness
   True Label: sadness
   Correct: Yes

8. Sentence: Losing him shattered me.
   Predicted Label: sadness
   True Label: sadness
   Correct: Yes

9. Sentence: Why do you always ignore me?!
   Predicted Label: an

In [None]:
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import time
import os

model_names = [
    "boltuix/NeuroFeel",  # base model: boltuix/NeuroBERT
    "boltuix/bert-emotion",  # base model: boltuix/bert-lite
    "Varnikasiva/sentiment-classification-bert-mini",  # base model: prajjwal1/bert-mini
]

test_cases = [
    # Sadness 😢
    ("I miss my family so much it hurts.", "sadness"),
    ("Everything feels meaningless lately.", "sadness"),

    # Anger 😠
    ("The driver just cut me off without even signaling!", "anger"),

    # Love ❤️
    ("I love you.", "love"),
    ("i love u very much.", "love"),

    # Surprise 😲
    ("I didn’t expect to win the competition!", "surprise"),

    # Fear 😱
    ("I’m terrified of losing my job.", "fear"),
    ("That noise outside my window scared me to death.", "fear"),

    # Happiness 😄
    ("Spending time with my friends today made me so happy.", "happiness"),

    # Neutral 😐
    ("I had lunch and watched TV. Nothing special.", "neutral"),
    ("Just another ordinary day at the office.", "neutral"),

    # Disgust 🤢
    ("The kitchen smelled awful this morning.", "disgust"),

    # Shame 🙈
    ("I felt so embarrassed after forgetting her name.", "shame"),

    # Guilt 😔
    ("I shouldn’t have yelled at him. I feel guilty.", "guilt"),
    ("I forgot Mom’s birthday. I feel terrible.", "guilt"),

    # Confusion 😕
    ("I don’t understand why she’s upset with me.", "confusion"),
    ("Why is the meeting scheduled twice? I’m confused.", "confusion"),

    # Desire 🔥
    ("I really want to travel the world someday.", "desire"),

    # Sarcasm 🙃
    ("Oh, perfect, rides misplace, pickup puzzle!", "sarcasm"),
]


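def get_folder_size_mb(folder):
    """Return the size of `folder` in MiB, counting each stored file once."""
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(folder):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            # The Hub cache typically links snapshots/ entries to blobs/;
            # skip symlinks so the same file is not counted twice.
            if os.path.islink(fp):
                continue
            total_size += os.path.getsize(fp)
    return total_size / (1024 * 1024)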

def load_model_and_tokenizer(model_name):
    local_dir = f"./downloaded_models/{model_name.replace('/', '_')}"
    model = AutoModelForSequenceClassification.from_pretrained(model_name, cache_dir=local_dir)
    tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir=local_dir)
    model.eval()
    label_map = {int(k): v for k, v in model.config.id2label.items()}
    return model, tokenizer, label_map, local_dir

def predict_label(model, tokenizer, label_map, text):
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    predicted_id = torch.argmax(logits, dim=1).item()
    return label_map.get(predicted_id, "unknown")

results = []
print("Loading models and evaluating...\n")
for name in model_names:
    model, tokenizer, label_map, local_dir = load_model_and_tokenizer(name)
    size_mb = get_folder_size_mb(local_dir)

    times = []
    correct = 0
    failed_cases = []  # NEW: collect failed cases

    for text, true_label in test_cases:
        start = time.perf_counter()  # monotonic, higher-resolution timer
        pred = predict_label(model, tokenizer, label_map, text)
        end = time.perf_counter()
        times.append((end - start) * 1000)  # ms

        if pred == true_label:
            correct += 1
        else:
            failed_cases.append((text, true_label, pred))  # store mismatch

    avg_time = sum(times) / len(times)
    accuracy = (correct / len(test_cases)) * 100

    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

    results.append({
        "model_name": name,
        "size_mb": size_mb,
        "total_params_million": total_params / 1e6,
        "trainable_params_million": trainable_params / 1e6,
        "avg_inference_time_ms": avg_time,
        "accuracy_percent": accuracy,
        "failed_cases": failed_cases,  # add failed info
    })

# Summary Table
print(f"{'Model':<50} | {'Size (MB)':<10} | {'Total Params (M)':<17} | {'Trainable Params (M)':<20} | {'Avg. Inference Time (ms)':<24} | {'Accuracy (%)':<12}")
print("-" * 135)
for res in results:
    print(f"{res['model_name']:<50} | {res['size_mb']:<10.2f} | {res['total_params_million']:<17.2f} | {res['trainable_params_million']:<20.2f} | {res['avg_inference_time_ms']:<24.2f} | {res['accuracy_percent']:<12.2f}")

# Best model by accuracy
best_model = max(results, key=lambda x: x['accuracy_percent'])
print(f"\nBest model by accuracy: {best_model['model_name']} with {best_model['accuracy_percent']:.2f}% accuracy")

# Failed Cases Log
print("\n❌ FAILED CASES PER MODEL:\n")
for res in results:
    print(f"Model: {res['model_name']}")
    if not res["failed_cases"]:
        print("  ✅ All test cases passed.\n")
        continue
    print(f"{'Input':<70} | {'Expected':<10} | {'Predicted'}")
    print("-" * 100)
    for sentence, expected, predicted in res["failed_cases"]:
        print(f"{sentence[:70]:<70} | {expected:<10} | {predicted}")
    print("\n")
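
The Hugging Face cache normally stores each file once under `blobs/` and links to it from `snapshots/`, so summing `os.path.getsize` over every path can count the same bytes twice. Below is a minimal sketch that reproduces the effect in a temporary directory. It assumes a platform where symlinks can be created, such as Linux or macOS. The directory layout only imitates the cache, and `folder_size_bytes` is a hypothetical helper.

```python
import os
import tempfile

def folder_size_bytes(folder, skip_symlinks=True):
    """Sum file sizes under `folder`, optionally ignoring symlinked files."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if skip_symlinks and os.path.islink(path):
                continue
            total += os.path.getsize(path)  # getsize follows symlinks
    return total

with tempfile.TemporaryDirectory() as root:
    # Imitate the cache: one real file plus a symlink that points at it.
    os.makedirs(os.path.join(root, "blobs"))
    os.makedirs(os.path.join(root, "snapshots"))
    blob = os.path.join(root, "blobs", "weights.bin")
    with open(blob, "wb") as fh:
        fh.write(b"\0" * 1000)
    os.symlink(blob, os.path.join(root, "snapshots", "model.bin"))

    naive = folder_size_bytes(root, skip_symlinks=False)
    deduped = folder_size_bytes(root, skip_symlinks=True)
    print(naive, deduped)
```

When the reported folder size is roughly twice the size of the weight file, this kind of double counting is one plausible explanation worth ruling out.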


Loading models and evaluating...

Model                                              | Size (MB)  | Total Params (M)  | Trainable Params (M) | Avg. Inference Time (ms) | Accuracy (%)
---------------------------------------------------------------------------------------------------------------------------------------
boltuix/NeuroFeel                                  | 109.83     | 14.33             | 14.33                | 10.86                    | 100.00      
boltuix/bert-emotion                               | 85.77      | 11.17             | 11.17                | 6.85                     | 68.42       
Varnikasiva/sentiment-classification-bert-mini     | 85.77      | 11.17             | 11.17                | 5.18                     | 73.68       

Best model by accuracy: boltuix/NeuroFeel with 100.00% accuracy
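
The average latency above includes the first call for each model, which often pays one-off setup costs. Below is a hedged sketch of a timing helper that discards warm-up calls and reports the median using `time.perf_counter`. `time_calls` and its parameters are hypothetical, and the stand-in function is not a model.

```python
import statistics
import time

def time_calls(fn, inputs, warmup=2):
    """Return the median latency in ms of fn(x) over inputs, after warm-up calls."""
    for x in inputs[:warmup]:
        fn(x)  # warm-up runs are executed but not timed
    samples = []
    for x in inputs:
        start = time.perf_counter()
        fn(x)
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)

# Stand-in workload; in the benchmark this would wrap predict_label.
median_ms = time_calls(lambda s: s.upper(), ["a", "b", "c", "d"])
print(f"{median_ms:.4f} ms")
```

The median is less sensitive than the mean to a single slow call, which matters with only 19 test sentences per model.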

❌ FAILED CASES PER MODEL:

Model: boltuix/NeuroFeel
  ✅ All test cases passed.

Model: boltuix/bert-emotion
Input                                                   