Nanobit committed on Mar 25, 2024

Commit f1ebaa0 (unverified) · 1 Parent(s): 34ba634

chore(config): refactor old mistral config (#1435)

* chore(config): refactor old mistral config

* chore: add link to colab on readme

Files changed (7)

README.md +5 -0
examples/mistral/Mistral-7b-example/README.md +0 -12
examples/mistral/Mistral-7b-example/code.ipynb +0 -0
examples/mistral/Mistral-7b-example/data.jsonl +0 -10
examples/mistral/config.yml +0 -3
examples/mistral/{Mistral-7b-example/config.yml → lora.yml} +27 -24
examples/mistral/qlora.yml +0 -3

README.md CHANGED

@@ -32,6 +32,7 @@ Features:
   - [Bare Metal Cloud GPU](#bare-metal-cloud-gpu)
   - [Windows](#windows)
   - [Mac](#mac)
+  - [Google Colab](#google-colab)
   - [Launching on public clouds via SkyPilot](#launching-on-public-clouds-via-skypilot)
 - [Dataset](#dataset)
   - [How to Add Custom Prompts](#how-to-add-custom-prompts)
@@ -269,6 +270,10 @@ pip3 install -e '.'
 ```
 More info: [mac.md](/docs/mac.qmd)
+#### Google Colab
+Please use this example [notebook](examples/colab-notebooks/colab-axolotl-example.ipynb).
 #### Launching on public clouds via SkyPilot
 To launch on GPU instances (both on-demand and spot instances) on 7+ clouds (GCP, AWS, Azure, OCI, and more), you can use [SkyPilot](https://skypilot.readthedocs.io/en/latest/index.html):

examples/mistral/Mistral-7b-example/README.md DELETED

@@ -1,12 +0,0 @@
-# Description
-This repository presents an in-depth guide for fine-tuning Mistral-7b or any other compatible model using Axolotl, tailored specifically for chatbot development. It streamlines the process of fine-tuning and uploading the enhanced model to HuggingFace 🤗, thereby serving as an invaluable tool for developers in the AI and chatbot domain.
-**What’s Inside:**
-Beginner-Friendly Instructions: Comprehensive steps to guide you through fine-tuning your chosen model, including details on the data structure (jsonl), configuration, and the code itself.
-Hardware Utilized: For reference, the fine-tuning in this guide was performed using 4x NVIDIA GeForce RTX 3090 (rented 2.1.2-cuda12.1-cudnn8-devel).
-**Uploading to HuggingFace 🤗:**
-To upload your fine-tuned model to Hugging Face, include the following files:
-![Screenshot 2024-01-19 213932](https://github.com/OpenAccess-AI-Collective/axolotl/assets/138583191/d660eb84-2d76-46a1-9846-cf0aeb3006d9)

examples/mistral/Mistral-7b-example/code.ipynb DELETED

The diff for this file is too large to render. See raw diff

examples/mistral/Mistral-7b-example/data.jsonl DELETED

@@ -1,10 +0,0 @@
-{"conversations": [{"from": "Customer", "value": "\"<Customer>: Who is the Founder of Apple\""}, {"from": "gpt", "value": "\"<Chatbot>: The founder of Apple is Steve Jobs\""}]}
-{"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the capital of France?\""}, {"from": "gpt", "value": "\"<Chatbot>: The capital of France is Paris.\""}]}
-{"conversations": [{"from": "Customer", "value": "\"<Customer>: How far is the Moon from Earth?\""}, {"from": "gpt", "value": "\"<Chatbot>: The Moon is approximately 384,400 kilometers from Earth.\""}]}
-{"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the tallest mountain in the world?\""}, {"from": "gpt", "value": "\"<Chatbot>: The tallest mountain in the world is Mount Everest.\""}]}
-{"conversations": [{"from": "Customer", "value": "\"<Customer>: Who wrote Romeo and Juliet?\""}, {"from": "gpt", "value": "\"<Chatbot>: Romeo and Juliet was written by William Shakespeare.\""}]}
-{"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the boiling point of water?\""}, {"from": "gpt", "value": "\"<Chatbot>: The boiling point of water is 100 degrees Celsius.\""}]}
-{"conversations": [{"from": "Customer", "value": "\"<Customer>: When was the first man on the moon?\""}, {"from": "gpt", "value": "\"<Chatbot>: The first man landed on the moon in 1969.\""}]}
-{"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the largest ocean?\""}, {"from": "gpt", "value": "\"<Chatbot>: The largest ocean is the Pacific Ocean.\""}]}
-{"conversations": [{"from": "Customer", "value": "\"<Customer>: Who invented the telephone?\""}, {"from": "gpt", "value": "\"<Chatbot>: The telephone was invented by Alexander Graham Bell.\""}]}
-{"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the formula for water?\""}, {"from": "gpt", "value": "\"<Chatbot>: The chemical formula for water is H2O.\""}]}
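Note on the removed data file: each line is one JSON object with a `conversations` list whose turns carry `from` and `value` keys. A minimal sketch of reading that layout is given here for reference. The function name `parse_conversations` and the constant `SAMPLE_JSONL` are hypothetical and are not part of Axolotl; the two sample records are copied from the deleted lines above.

```python
import json

# Two records copied verbatim from the deleted data.jsonl.
SAMPLE_JSONL = r'''
{"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the capital of France?\""}, {"from": "gpt", "value": "\"<Chatbot>: The capital of France is Paris.\""}]}
{"conversations": [{"from": "Customer", "value": "\"<Customer>: What is the formula for water?\""}, {"from": "gpt", "value": "\"<Chatbot>: The chemical formula for water is H2O.\""}]}
'''


def parse_conversations(jsonl_text):
    """Return one list of (speaker, text) turns per non-empty JSONL line.

    Hypothetical helper for illustration only; it checks the shape seen in
    the deleted file and nothing more.
    """
    records = []
    for line_no, line in enumerate(jsonl_text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        turns = record.get("conversations")
        if not isinstance(turns, list) or not turns:
            raise ValueError(f"line {line_no}: missing 'conversations' list")
        parsed = []
        for turn in turns:
            if "from" not in turn or "value" not in turn:
                raise ValueError(f"line {line_no}: turn needs 'from' and 'value'")
            parsed.append((turn["from"], turn["value"]))
        records.append(parsed)
    return records


if __name__ == "__main__":
    for conversation in parse_conversations(SAMPLE_JSONL):
        for speaker, text in conversation:
            print(f"{speaker}: {text}")
```

The sample values keep their embedded quotation marks and `<Customer>:` / `<Chatbot>:` prefixes, exactly as stored in the deleted file.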

examples/mistral/config.yml CHANGED

@@ -56,6 +56,3 @@ weight_decay: 0.0
 fsdp:
 fsdp_config:
 special_tokens:
-  bos_token: "<s>"
-  eos_token: "</s>"
-  unk_token: "<unk>"

examples/mistral/{Mistral-7b-example/config.yml → lora.yml} RENAMED

@@ -1,4 +1,3 @@
-#Mistral-7b
 base_model: mistralai/Mistral-7B-v0.1
 model_type: MistralForCausalLM
 tokenizer_type: LlamaTokenizer
@@ -8,26 +7,32 @@ load_in_4bit: false
 strict: false
 datasets:
-  - path: tilemachos/Demo-Dataset #Path to json dataset file in huggingface
-    #for type,conversation arguments read axolotl readme and pick what is suited for your project, I wanted a chatbot and put sharegpt and chatml
-    type: sharegpt
-    conversation: chatml
-dataset_prepared_path: tilemachos/Demo-Dataset #Path to json dataset file in huggingface
-val_set_size: 0.05
-output_dir: ./out
-#using lora for lower cost
+  - path: mhenrichsen/alpaca_2k_test
+    type: alpaca
+dataset_prepared_path: last_run_prepared
+val_set_size: 0.1
+output_dir: ./lora-out
 adapter: lora
-lora_r: 8
+lora_model_dir:
+sequence_len: 8192
+sample_packing: true
+pad_to_sequence_len: true
+lora_r: 32
 lora_alpha: 16
 lora_dropout: 0.05
+lora_target_linear: true
+lora_fan_in_fan_out:
 lora_target_modules:
+  - gate_proj
+  - down_proj
+  - up_proj
   - q_proj
   - v_proj
-sequence_len: 512
-sample_packing: false
-pad_to_sequence_len: true
+  - k_proj
+  - o_proj
 wandb_project:
 wandb_entity:
@@ -35,18 +40,17 @@ wandb_watch:
 wandb_name:
 wandb_log_model:
-#only 2 epochs because of small dataset
-gradient_accumulation_steps: 3
+gradient_accumulation_steps: 4
 micro_batch_size: 2
-num_epochs: 2
+num_epochs: 1
 optimizer: adamw_bnb_8bit
 lr_scheduler: cosine
 learning_rate: 0.0002
 train_on_inputs: false
 group_by_length: false
-bf16: true
-fp16: false
+bf16: auto
+fp16:
 tf32: false
 gradient_checkpointing: true
@@ -57,18 +61,17 @@ logging_steps: 1
 xformers_attention:
 flash_attention: true
+loss_watchdog_threshold: 5.0
+loss_watchdog_patience: 3
 warmup_steps: 10
 evals_per_epoch: 4
 eval_table_size:
 eval_max_new_tokens: 128
 saves_per_epoch: 1
 debug:
-#default deepspeed, can use more aggresive if needed like zero2, zero3
-deepspeed: deepspeed_configs/zero1.json
+deepspeed:
 weight_decay: 0.0
 fsdp:
 fsdp_config:
 special_tokens:
-  bos_token: "<s>"
-  eos_token: "</s>"
-  unk_token: "<unk>"
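Note on the hyperparameter changes in this file: the refactor moves `gradient_accumulation_steps` from 3 to 4 and `lora_r` from 8 to 32 while leaving `micro_batch_size` and `lora_alpha` unchanged. A rough sketch of what that implies follows. It assumes the conventional LoRA scaling factor `lora_alpha / lora_r` (variants such as rank-stabilized LoRA scale differently) and the usual definition of effective batch size as micro batch size × accumulation steps × number of GPUs. The helper names and the `num_gpus` parameter are hypothetical and are not Axolotl APIs.

```python
# Values taken from the old (Mistral-7b-example/config.yml) and new (lora.yml) sides of the diff.
OLD = {"micro_batch_size": 2, "gradient_accumulation_steps": 3, "lora_alpha": 16, "lora_r": 8}
NEW = {"micro_batch_size": 2, "gradient_accumulation_steps": 4, "lora_alpha": 16, "lora_r": 32}


def effective_batch_size(cfg, num_gpus=1):
    """Sequences contributing to one optimizer step (num_gpus is an assumption)."""
    return cfg["micro_batch_size"] * cfg["gradient_accumulation_steps"] * num_gpus


def lora_scaling(cfg):
    """Conventional LoRA scaling factor alpha / r (assumed, not read from Axolotl)."""
    return cfg["lora_alpha"] / cfg["lora_r"]


if __name__ == "__main__":
    for name, cfg in (("old", OLD), ("new", NEW)):
        print(
            f"{name}: effective batch per GPU = {effective_batch_size(cfg)}, "
            f"LoRA scaling = {lora_scaling(cfg)}"
        )
```

Under these assumptions the per-GPU effective batch grows from 6 to 8 sequences, and the adapter scaling drops from 2.0 to 0.5 because the rank increased while `lora_alpha` stayed at 16.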

examples/mistral/qlora.yml CHANGED

@@ -75,6 +75,3 @@ weight_decay: 0.0
 fsdp:
 fsdp_config:
 special_tokens:
-  bos_token: "<s>"
-  eos_token: "</s>"
-  unk_token: "<unk>"
