---
library_name: transformers
tags: []
---
# Model Card for DistilBERT Text Classification
This is a DistilBERT model fine-tuned for text classification tasks.
## Model Details
### Model Description
This DistilBERT model is fine-tuned for sequence classification and classifies Portuguese texts into the categories defined by its fine-tuning dataset.
- **Developed by:** Thiago Adriano
- **Model type:** DistilBERT for Sequence Classification
- **Language(s) (NLP):** Portuguese
- **License:** MIT License
- **Finetuned from model:** distilbert-base-uncased
### Model Sources
- **Repository:** [tadrianonet/distilbert-text-classification](https://huggingface.co/tadrianonet/distilbert-text-classification)
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("tadrianonet/distilbert-text-classification")
model = AutoModelForSequenceClassification.from_pretrained("tadrianonet/distilbert-text-classification")

# Tokenize the input, run inference, and take the highest-scoring class
inputs = tokenizer("Sample text for classification", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()
```
## Training Details
### Training Data
The training data consists of text-label pairs in Portuguese. The data is preprocessed to tokenize the text and convert labels to numerical format.
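A minimal sketch of that preprocessing step, assuming the `datasets` library; the column names, example sentences, and label set here are illustrative assumptions, not the actual training data:

```python
from datasets import Dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Hypothetical raw examples standing in for the real text-label pairs
raw = Dataset.from_dict({
    "text": ["Ótimo produto, recomendo!", "Não gostei do atendimento."],
    "label": ["positivo", "negativo"],
})
label2id = {"negativo": 0, "positivo": 1}

def preprocess(example):
    # Tokenize the text and map the string label to an integer id
    encoded = tokenizer(example["text"], truncation=True, max_length=128)
    encoded["label"] = label2id[example["label"]]
    return encoded

tokenized = raw.map(preprocess)
```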
### Training Procedure
The model is fine-tuned using the Hugging Face `Trainer` API with the following hyperparameters (a minimal sketch of this setup follows the list):
- **Training regime:** fp32
- **Learning rate:** 2e-5
- **Batch size:** 16
- **Epochs:** 3
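A sketch of that `Trainer` setup under the hyperparameters above, reusing the `tokenized` dataset from the preprocessing sketch; `output_dir` and `num_labels` are illustrative assumptions, and fp32 is simply the `TrainingArguments` default (no mixed-precision flag is set):

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2  # num_labels is an assumption
)

# Mirrors the hyperparameters listed above
args = TrainingArguments(
    output_dir="distilbert-text-classification",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(model=model, args=args, train_dataset=tokenized)
trainer.train()
```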
#### Speeds, Sizes, Times
- **Training time:** Approximately 10 minutes on a single GPU
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
The testing data is a separate set of text-label pairs used to evaluate the model's performance.
#### Factors
Results are reported in aggregate over the test set; no disaggregation by subpopulation or domain is provided.
#### Metrics
- **Accuracy:** the proportion of correct predictions on the test set (see the sketch below)
- **Loss:** the cross-entropy loss of the model's predictions
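The actual metric code is not published; one common way to report accuracy with the `Trainer` API, assuming the `evaluate` library, is a `compute_metrics` callback like the following (the evaluation loss is averaged by the `Trainer` itself):

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # The Trainer passes (logits, labels); take the argmax over classes
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```

Passing this as `compute_metrics=compute_metrics` to the `Trainer` reports accuracy alongside the evaluation loss.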
### Results
- **Loss:** 0.692
- **Accuracy:** 50%
#### Summary
The model achieves 50% accuracy with a loss close to ln 2 ≈ 0.693, which is consistent with chance-level performance on a balanced two-class task; further fine-tuning and evaluation on a more diverse dataset are likely necessary.
## Model Examination
[More Information Needed]
## Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** GPU
- **Hours used:** 0.2 hours
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
## Technical Specifications
### Model Architecture and Objective
The model is based on DistilBERT, a smaller, faster, and cheaper version of BERT, designed for efficient text classification.
### Compute Infrastructure
#### Hardware
- **Hardware Type:** Single GPU
- **GPU Model:** [More Information Needed]
#### Software
- **Framework:** Transformers 4.x
- **Library:** PyTorch
## Citation
**BibTeX:**
```bibtex
@misc{thiago_adriano_2024_distilbert,
author = {Thiago Adriano},
title = {DistilBERT Text Classification},
year = {2024},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/tadrianonet/distilbert-text-classification}},
}
```
**APA:**
Thiago Adriano. (2024). DistilBERT Text Classification. Hugging Face. https://huggingface.co/tadrianonet/distilbert-text-classification
## More Information
For more details, visit the [Hugging Face model page](https://huggingface.co/tadrianonet/distilbert-text-classification).
## Model Card Authors
Thiago Adriano
## Model Card Contact
For more information, contact Thiago Adriano at [tadriano.dev@gmail.com](mailto:tadriano.dev@gmail.com).