---
library_name: transformers
license: apache-2.0
datasets:
  - recogna-nlp/ultra-alpaca-ptbr
language:
  - pt
base_model:
  - Felladrin/Minueza-2-96M
tags:
  - llama-factory
---

# Minueza-2-96M-Instruct (Variant 02)

This model is a fine-tuned version of [Felladrin/Minueza-2-96M](https://huggingface.co/Felladrin/Minueza-2-96M) on the Portuguese [recogna-nlp/ultra-alpaca-ptbr](https://huggingface.co/datasets/recogna-nlp/ultra-alpaca-ptbr) dataset.

## Usage

```sh
pip install transformers==4.50.0 torch==2.6.0
```

```python
from transformers import pipeline, TextStreamer
import torch

# Load the model on the GPU when available, otherwise fall back to the CPU.
generate_text = pipeline(
    "text-generation",
    model="Felladrin/Minueza-2-96M-Instruct-Variant-02",
    device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
)

# Example conversation in Portuguese: the system prompt casts the assistant as an expert
# in starting companies, followed by a greeting exchange and a request to list the main
# challenges of opening a business.
messages = [
    {
        "role": "system",
        "content": "Você é um assistente com vasta experiência em abertura de empresas.",
    },
    {
        "role": "user",
        "content": "Olá!",
    },
    {
        "role": "assistant",
        "content": "Olá! Como posso ajudar?",
    },
    {
        "role": "user",
        "content": "Liste os principais desafios de se abrir uma empresa.",
    },
]

# Render the conversation with the model's chat template and stream the generated reply.
generate_text(
    generate_text.tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    ),
    streamer=TextStreamer(generate_text.tokenizer, skip_special_tokens=True),
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    top_k=0,
    min_p=0.1,
    repetition_penalty=1.12,
)
```
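
For finer control over decoding, the checkpoint can also be loaded directly instead of through the pipeline helper. The snippet below is a minimal sketch using the same sampling settings as above; it assumes the default `AutoTokenizer` and `AutoModelForCausalLM` classes resolve this checkpoint.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Felladrin/Minueza-2-96M-Instruct-Variant-02"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).to(device)

messages = [
    {"role": "user", "content": "Liste os principais desafios de se abrir uma empresa."},
]

# Render the chat template and tokenize it for the model.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(device)

# Sample a reply with the same settings as the pipeline example.
outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    min_p=0.1,
    repetition_penalty=1.12,
)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```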

## Training hyperparameters

The following hyperparameters were used during training (an approximate `TrainingArguments` sketch follows the list):

- learning_rate: 7.5e-05
- train_batch_size: 64
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 128
- optimizer: adamw_torch with betas=(0.9, 0.95), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 2990
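
The run itself was configured through LLaMA-Factory; as a rough illustration only, the settings above map to approximately the following Hugging Face `TrainingArguments` (a hedged sketch, not the actual configuration file used, and the `output_dir` value is hypothetical).

```python
from transformers import TrainingArguments

# Approximate equivalent of the listed hyperparameters; illustrative only.
training_args = TrainingArguments(
    output_dir="minueza-2-96m-instruct-variant-02",  # hypothetical output path
    learning_rate=7.5e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,  # 64 * 2 = 128 effective train batch size
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_steps=2990,
)
```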

## Framework versions

- Transformers 4.50.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.0

## License

This model is licensed under the Apache License 2.0.