Text Generation
Transformers
PyTorch
mistral
Not-For-All-Audiences
nsfw
text-generation-inference
Inference Endpoints
Update README.md
README.md CHANGED
@@ -1,167 +1,36 @@

---
base_model: alpindale/Mistral-7B-v0.2-hf
tags:
- generated_from_trainer
model-index:
- name: out
  results: []
---

<!--
This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment.
-->

<details><summary>See axolotl config</summary>

```yaml
base_model: alpindale/Mistral-7B-v0.2-hf
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: ./datasets/ToxicQAFinal.parquet
    type: sharegpt
    conversation: chatml
  - path: ./datasets/aesir-3-sfw_names-replaced.json
    type: sharegpt
    conversation: chatml
  - path: ./datasets/aesir-3-nsfw_names-replaced.json
    type: sharegpt
    conversation: chatml
  - path: ./datasets/aesir2_modified_sharegpt.json
    type: sharegpt
    conversation: chatml
  - path: ./datasets/aesir_modified_sharegpt.json
    type: sharegpt
    conversation: chatml
  - path: ./datasets/no-robots-sharegpt-fixed.jsonl
    type: sharegpt
    conversation: chatml
  - path: ./datasets/bluemoon.train.json
    type: sharegpt
    conversation: chatml
  - path: ./datasets/toxicsharegpt-NoWarning.jsonl
    type: sharegpt
    conversation: chatml
  - path: ./datasets/LimaRP-ShareGPT.json
    type: sharegpt
    conversation: chatml
  - path: ./datasets/CapybaraPure_Decontaminated-ShareGPT.json
    type: sharegpt
    conversation: chatml
dataset_prepared_path:
val_set_size: 0.05
output_dir: ./out

gradient_checkpointing_kwargs:
  use_reentrant: true

wandb_project: MistralMaid-7B-0.2
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 1
micro_batch_size: 3
num_epochs: 2
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.000005

train_on_inputs: true
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 4
eval_table_size:
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
```

</details><br>
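The `type: sharegpt` entries above point at ShareGPT-style conversation files, which `conversation: chatml` then renders into ChatML turn markers at training time. As a rough sketch of what one such record commonly looks like (the field names follow the usual ShareGPT convention; nothing in this card pins down the exact schema):

```python
import json

# A minimal ShareGPT-style record of the kind a `type: sharegpt` loader
# consumes; with `conversation: chatml`, each turn is wrapped in
# <|im_start|>role ... <|im_end|> markers when training prompts are built.
record = {
    "conversations": [
        {"from": "system", "value": "You are a helpful assistant."},
        {"from": "human", "value": "Write a two-line poem about rain."},
        {"from": "gpt", "value": "Soft drops on the glass,\nthe street exhales its dust."},
    ]
}

# One JSON object per line, matching the .jsonl files in the dataset list.
with open("sample.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```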
# out

This model is a fine-tuned version of [alpindale/Mistral-7B-v0.2-hf](https://huggingface.co/alpindale/Mistral-7B-v0.2-hf) on the None dataset.
It achieves the following results on the evaluation set:
- Loss: 1.1414

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 3
- eval_batch_size: 3
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 6
- total_eval_batch_size: 6
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 2
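The `total_train_batch_size: 6` above is derived rather than set directly: it is the micro batch size times the device count times the gradient accumulation steps from the config. A quick sanity check:

```python
# Effective train batch size implied by the axolotl config above.
micro_batch_size = 3    # micro_batch_size
num_devices = 2         # distributed_type: multi-GPU, num_devices: 2
grad_accum_steps = 1    # gradient_accumulation_steps

total_train_batch_size = micro_batch_size * num_devices * grad_accum_steps
print(total_train_batch_size)  # 6, matching the reported value
```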
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.4494        | 0.0   | 1    | 1.4125          |
| 1.2942        | 0.25  | 296  | 1.1561          |
| 1.3496        | 0.5   | 592  | 1.1433          |
| 1.0723        | 0.75  | 888  | 1.1374          |
| 1.3354        | 1.0   | 1184 | 1.1313          |
| 0.9644        | 1.24  | 1480 | 1.1415          |
| 1.1276        | 1.49  | 1776 | 1.1412          |
| 0.9386        | 1.74  | 2072 | 1.1414          |


### Framework versions

- Pytorch 2.0.1+cu118
- Datasets 2.18.0
- Tokenizers 0.15.0
---
license: cc-by-nc-4.0
tags:
- not-for-all-audiences
- nsfw
---

<!-- description start -->
## Description

This repo contains fp16 files of LewdMistral-7B-0.2.

It's a full finetune (2 epochs) of [Mistral-7B-v0.2](https://huggingface.co/alpindale/Mistral-7B-v0.2-hf) on multiple RP datasets.

It was made to be merged with the old 0.1 models, as an experiment to see whether new data from 0.2 could be folded into 0.1 finetunes; since it turned out usable on its own, I'm leaving it open for further training/merging.

It was used to create [BigL](https://huggingface.co/Undi95/BigL-7B), a model that takes Mistral 7B 0.2 as a base and merges it with Mistral 0.1 finetunes.

<!-- description end -->
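Since the repo ships fp16 weights, it can be loaded directly with `transformers`. A minimal sketch, assuming the repo id is `Undi95/LewdMistral-7B-0.2` (the card never states the exact id, so treat it as a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, inferred from the model name in the description above.
model_id = "Undi95/LewdMistral-7B-0.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # the repo contains fp16 files
    device_map="auto",
)

inputs = tokenizer("Hello,", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```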
<!-- prompt-template start -->
## Prompt template: Alpaca

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{system prompt}

### Input:
{prompt}

### Response:
{output}
```
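Filled in programmatically, the template reads like this; a small helper sketch (the argument names simply mirror the placeholders above):

```python
def build_alpaca_prompt(system_prompt: str, prompt: str) -> str:
    """Render the Alpaca-style template above, leaving the
    Response section open for the model to complete."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{system_prompt}\n\n"
        f"### Input:\n{prompt}\n\n"
        "### Response:\n"
    )

print(build_alpaca_prompt("You are a storyteller.", "Describe a quiet harbor at dawn."))
```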
If you want to support me, you can [here](https://ko-fi.com/undiai).