Text Generation
Transformers
Safetensors
llama
axolotl
Generated from Trainer
conversational
text-generation-inference
hardlyworking committed on
Commit 9edea57 · verified · 1 Parent(s): 82b4d76

End of training

Files changed (1)
  1. README.md +152 -0
README.md ADDED
@@ -0,0 +1,152 @@
---
library_name: transformers
license: cc-by-nc-4.0
base_model: hardlyworking/4BTestRC
tags:
- axolotl
- generated_from_trainer
datasets:
- ResplendentAI/Luna_NSFW_Text
- ResplendentAI/Sissification_Hypno_1k
- ResplendentAI/Synthetic_Soul_1k
model-index:
- name: Final4BRC
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.11.0.dev0`
```yaml
base_model: hardlyworking/4BTestRC

load_in_8bit: false
load_in_4bit: false
strict: false

chat_template: chatml
datasets:
  - path: ResplendentAI/Luna_NSFW_Text
    type: completion
  - path: ResplendentAI/Sissification_Hypno_1k
    type: alpaca
  - path: ResplendentAI/Synthetic_Soul_1k
    type: alpaca

val_set_size: 0
output_dir: ./outputs/out
dataset_prepared_path: last_run_prepared
shuffle_merged_datasets: true

hub_model_id: hardlyworking/Final4BRC
hub_strategy: "all_checkpoints"
push_dataset_to_hub:
hf_use_auth_token: true

plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_fused_linear_cross_entropy: false
cut_cross_entropy: true

sequence_len: 32768
sample_packing: true
eval_sample_packing: true
pad_to_sequence_len: true

wandb_project: Xgen4Bnsfw
wandb_entity:
wandb_watch:
wandb_name: Xgen4Bnsfw
wandb_log_model:

evals_per_epoch:
eval_table_size:
eval_max_new_tokens:

gradient_accumulation_steps: 1
micro_batch_size: 1
num_epochs: 4
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 5e-5

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: offload
gradient_checkpointing_kwargs:
  use_reentrant: false
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
s2_attention:

deepspeed:

warmup_ratio: 0.05
saves_per_epoch: 1
debug:
weight_decay: 0.01
fsdp:
fsdp_config:
special_tokens:
  pad_token:
```

</details><br>

# Final4BRC

This model is a fine-tuned version of [hardlyworking/4BTestRC](https://huggingface.co/hardlyworking/4BTestRC) on the ResplendentAI/Luna_NSFW_Text, ResplendentAI/Sissification_Hypno_1k, and ResplendentAI/Synthetic_Soul_1k datasets.

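Since the card has no usage section yet, here is a minimal, untested inference sketch. It assumes the repository loads with the standard `AutoModelForCausalLM`/`AutoTokenizer` classes and that the tokenizer ships the ChatML chat template set in the config above; the prompt and generation settings are placeholders only.

```python
# Hedged inference sketch (not verified against this checkpoint).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hardlyworking/Final4BRC"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # the config trains in bf16
    device_map="auto",            # requires `accelerate`; drop if loading on CPU
)

# Assumes the tokenizer carries the ChatML template used during training.
messages = [{"role": "user", "content": "Write a short scene set in a rainy city."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
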
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

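Per the axolotl config above, ResplendentAI/Luna_NSFW_Text is consumed as raw `completion` text, while the other two datasets use the `alpaca` instruction format; with `val_set_size: 0`, no evaluation split is held out. As a rough illustration, the record shapes the two loader types expect look roughly like the following (field names follow the usual axolotl conventions and are hypothetical examples, not taken from these specific datasets):

```python
# Hypothetical example records for the two axolotl dataset types used above.
# "completion" = a single free-form text field for plain next-token training.
completion_record = {
    "text": "A single block of free-form prose used as-is for training.",
}

# "alpaca" = instruction / optional input / output fields, rendered into a prompt.
alpaca_record = {
    "instruction": "Rewrite the sentence in a softer tone.",
    "input": "Get out of my office.",
    "output": "Would you mind giving me a moment alone in my office?",
}
```
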
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: adamw_bnb_8bit with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 3
- training_steps: 72

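The reported schedule lines up with the config: with `warmup_ratio: 0.05` over 72 optimizer steps, truncation gives the 3 warmup steps listed above (assuming the ratio is converted to a step count by integer truncation, which has not been verified here).

```python
# Back-of-the-envelope check of the reported values (assumption: the warmup
# ratio is truncated to an integer number of steps).
training_steps = 72      # reported by the Trainer
warmup_ratio = 0.05      # from the axolotl config above

warmup_steps = int(warmup_ratio * training_steps)  # int(3.6) == 3
assert warmup_steps == 3
```
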
### Training results



### Framework versions

- Transformers 4.52.4
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1