license: apache-2.0
datasets:
  - mteb/tweet_sentiment_extraction
metrics:
  - accuracy
base_model:
  - openai-community/gpt2
pipeline_tag: text-classification
library_name: transformers

Model Card for gpt2_finetuned_twitter_sentiment_analysis

Model Details

Model Description

This is a fine-tuned version of the GPT-2 model for sentiment analysis on tweets. The model has been trained on the mteb/tweet_sentiment_extraction dataset to classify tweets into three sentiment categories: Positive, Neutral, and Negative. It uses the Hugging Face Transformers library and achieves an evaluation accuracy of 76%.

  • Developed by: Pradeep Vepada
  • Contact: [email protected]
  • Shared by [optional]: [More Information Needed]
  • Model type: GPT-2 fine-tuned for sequence classification
  • Task: Sentiment analysis (Positive, Neutral, Negative)
  • Language(s) (NLP): English
  • License: apache-2.0
  • Finetuned from model [optional]: openai-community/gpt2

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Usage:

This model is designed for sentiment analysis of tweets or other short social media text. Given an input text, it predicts the sentiment as Positive, Neutral, or Negative.

Performance:

Accuracy: 76%
Evaluation Metric: Accuracy
Validation Split: 10% of the dataset
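The card does not say how the 10% validation split was drawn or how accuracy was computed, so the following is only a minimal sketch of the usual approach: shuffle, hold out a fraction, and count exact label matches. The function names and the `seed` parameter are hypothetical and not part of the original training code.

```python
import random


def split_train_validation(examples, validation_fraction=0.10, seed=0):
    """Shuffle a copy of `examples` and hold out a fraction for validation.

    The seed is an assumption; the card does not document one.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * validation_fraction))
    # The first n_val shuffled items form the validation set.
    return shuffled[n_val:], shuffled[:n_val]


def accuracy(predictions, references):
    """Fraction of predictions that exactly match the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)
```

With this helper, the reported figure would correspond to `accuracy(predicted_ids, gold_ids)` returning roughly 0.76 on the held-out examples.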

[More Information Needed]

Training Configuration:

Tokenizer: GPT-2 Tokenizer (with EOS token as pad token)
Optimizer: AdamW
Learning Rate: 1e-5
Epochs: 3
Batch Size: 1
Hardware Used: A100
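The configuration above could be captured roughly as follows. This is a sketch under assumptions, not the original training script: it assumes a `Trainer`-style setup from the Transformers library (whose default optimizer is an AdamW variant), and `FineTuneConfig`, `total_optimizer_steps`, `build_model_and_args`, and the `output_dir` value are hypothetical names introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    # Values copied from the training configuration listed in this card.
    base_model: str = "openai-community/gpt2"
    learning_rate: float = 1e-5
    num_epochs: int = 3
    batch_size: int = 1
    num_labels: int = 3  # Positive, Neutral, Negative


def total_optimizer_steps(num_train_examples, cfg):
    """Optimizer steps for a full run, assuming no gradient accumulation."""
    return math.ceil(num_train_examples / cfg.batch_size) * cfg.num_epochs


def build_model_and_args(cfg, output_dir="gpt2-sentiment"):
    """Assumed Trainer-style setup; `output_dir` is a hypothetical name."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained(cfg.base_model)
    # GPT-2 ships without a pad token, so the EOS token is reused for padding.
    tokenizer.pad_token = tokenizer.eos_token

    model = AutoModelForSequenceClassification.from_pretrained(
        cfg.base_model, num_labels=cfg.num_labels
    )
    # The classification head uses the pad id to locate each sequence's last token.
    model.config.pad_token_id = tokenizer.pad_token_id

    args = TrainingArguments(
        output_dir=output_dir,
        learning_rate=cfg.learning_rate,
        num_train_epochs=cfg.num_epochs,
        per_device_train_batch_size=cfg.batch_size,
    )
    return tokenizer, model, args
```

With a batch size of 1, every training example triggers one optimizer step per epoch, so `total_optimizer_steps(n, FineTuneConfig())` is simply `3 * n`.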

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

  • Biases: The dataset may contain biased or harmful text, potentially influencing predictions.
  • Limitations: Optimized for English tweets; performance may degrade on other text types or languages.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

Training Details

Training Data

The mteb/tweet_sentiment_extraction dataset, with 10% of the examples held out for validation.

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

  • Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

76% accuracy on the 10% validation split.

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: Nvidia A100
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

Nvidia A100

Example Code:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer
model_id = "charlie1898/gpt2_finetuned_twitter_sentiment_analysis"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Example input
text = "I love using Hugging Face models!"
inputs = tokenizer(text, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Index of the highest-scoring class for the single input
predicted_class = torch.argmax(outputs.logits, dim=-1).item()
print(f"Predicted sentiment class: {predicted_class}")
```
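The example prints a class index rather than a sentiment name. The sketch below turns raw logits into a label and a probability. The index-to-label order shown here (0 = Negative, 1 = Neutral, 2 = Positive) is an assumption that this card does not confirm; check `model.config.id2label` or the dataset's label field before relying on it. `ASSUMED_ID2LABEL` and `describe_prediction` are hypothetical names.

```python
import math

# Assumed label order; NOT confirmed by this card. Verify before use.
ASSUMED_ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe_prediction(logits, id2label=ASSUMED_ID2LABEL):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]
```

With the variables from the example, this could be called as `describe_prediction(outputs.logits[0].tolist())`.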

Ethical Considerations

This model should be used responsibly. Be aware of biases in the training data and avoid deploying the model in sensitive or high-stakes applications without further validation.

Acknowledgments

  • Hugging Face Transformers library
  • mteb/tweet_sentiment_extraction dataset

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]