---
library_name: transformers
tags: []
---

# Model Card for DistilBERT Text Classification

This is a DistilBERT model fine-tuned for text classification tasks.

## Model Details

### Model Description

This DistilBERT model is fine-tuned for text classification: it assigns input texts to the categories defined by the fine-tuning dataset.

- **Developed by:** Thiago Adriano
- **Model type:** DistilBERT for Sequence Classification
- **Language(s) (NLP):** Portuguese
- **License:** MIT License
- **Finetuned from model:** distilbert-base-uncased

### Model Sources

- **Repository:** [tadrianonet/distilbert-text-classification](https://huggingface.co/tadrianonet/distilbert-text-classification)


## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned tokenizer and classification model
tokenizer = AutoTokenizer.from_pretrained("tadrianonet/distilbert-text-classification")
model = AutoModelForSequenceClassification.from_pretrained("tadrianonet/distilbert-text-classification")

# Tokenize an input sentence and run a forward pass
inputs = tokenizer("Sample text for classification", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The highest-scoring logit is the predicted class index
predicted_class = outputs.logits.argmax(dim=-1).item()
```
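
If the model's configuration defines an `id2label` mapping, `model.config.id2label[predicted_class]` converts the predicted index back to a human-readable label.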

## Training Details

### Training Data

The training data consists of text-label pairs in Portuguese. The data is preprocessed to tokenize the text and convert labels to numerical format.
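
A minimal preprocessing sketch is shown below; the example texts, label names, and column layout are illustrative assumptions, since the actual training corpus is not published with this card.

```python
from transformers import AutoTokenizer

# Hypothetical examples; the real training corpus is not published here
texts = ["Exemplo de texto positivo", "Exemplo de texto negativo"]
labels = ["positivo", "negativo"]

# Convert string labels to integer ids
label2id = {name: i for i, name in enumerate(sorted(set(labels)))}
numeric_labels = [label2id[name] for name in labels]

# Tokenize the texts with the base model's tokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encodings = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
```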

### Training Procedure

The model is fine-tuned using the Hugging Face `Trainer` API with the following hyperparameters (see the sketch after the list):

- **Training regime:** fp32
- **Learning rate:** 2e-5
- **Batch size:** 16
- **Epochs:** 3
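
The sketch below shows how these hyperparameters map onto `TrainingArguments` and `Trainer`. The toy dataset, `num_labels=2`, and the output directory are illustrative assumptions, not details recovered from the original run.

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Toy two-example dataset; the real training data is not published here
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
raw = Dataset.from_dict(
    {"text": ["Exemplo positivo", "Exemplo negativo"], "label": [1, 0]}
)
dataset = raw.map(lambda ex: tokenizer(ex["text"], truncation=True))

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2  # num_labels=2 is an assumption
)

args = TrainingArguments(
    output_dir="./results",          # placeholder output directory
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    # fp32 is the Trainer default, so no fp16/bf16 flags are set
)

trainer = Trainer(model=model, args=args, train_dataset=dataset, tokenizer=tokenizer)
trainer.train()
```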

#### Speeds, Sizes, Times

- **Training time:** Approximately 10 minutes on a single GPU

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The testing data is a separate set of text-label pairs used to evaluate the model's performance.

#### Factors

Evaluation is reported in aggregate; accuracy and loss (described under Metrics below) are not disaggregated by subpopulation or domain.

#### Metrics

- **Accuracy:** Measures the proportion of correct predictions
- **Loss:** Measures the error in the model's predictions
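
A minimal sketch of an accuracy-computing callback for the `Trainer` follows; the function name and NumPy-based implementation are assumptions, since the card does not document the exact metric code.

```python
import numpy as np

def compute_metrics(eval_pred):
    """Compute accuracy from the Trainer's (logits, labels) pair."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}
```

Passing `compute_metrics=compute_metrics` to the `Trainer` makes it report accuracy at each evaluation step.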

### Results

- **Evaluation Results:**
  - **Loss:** 0.692
  - **Accuracy:** 50%

#### Summary

The model achieves 50% accuracy with a loss of 0.692, close to ln 2 ≈ 0.693; for a binary task this is consistent with chance-level performance, so further fine-tuning and evaluation on a larger, more diverse dataset are likely necessary before practical use.

## Model Examination

[More Information Needed]

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** GPU
- **Hours used:** 0.2 hours
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications

### Model Architecture and Objective

The model is based on DistilBERT, a smaller, faster, and cheaper version of BERT, designed for efficient text classification.

### Compute Infrastructure

#### Hardware

- **Hardware Type:** Single GPU
- **GPU Model:** [More Information Needed]

#### Software

- **Framework:** Transformers 4.x
- **Library:** PyTorch

## Citation

**BibTeX:**

```bibtex
@misc{thiago_adriano_2024_distilbert,
  author = {Thiago Adriano},
  title = {DistilBERT Text Classification},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/tadrianonet/distilbert-text-classification}},
}
```

**APA:**

Adriano, T. (2024). *DistilBERT Text Classification*. Hugging Face. https://huggingface.co/tadrianonet/distilbert-text-classification

## More Information

For more details, visit the [Hugging Face model page](https://huggingface.co/tadrianonet/distilbert-text-classification).

## Model Card Authors

Thiago Adriano

## Model Card Contact

For more information, contact Thiago Adriano at [tadriano.dev@gmail.com](mailto:tadriano.dev@gmail.com).