Commit History
support for mamba (#915)
40a6362
Feat(wandb): Refactor to be more flexible (#767)
a1da39c
feature: loss watchdog for terminating training runs that are failing (#899)
58ec8b1
fix: remove FA for qwen examples (#900)
a48dbf6
Feat: Add Qwen (#894)
1115c50
Phi update 202311 (#876)
9bf854e
various bugfixes (#856)
1470650
don't compile deepspeed or bitsandbytes from source (#837)
f544ab2
fix eval_steps to be a sane default (#797)
8b79ff0
disable eval table w sample packing in examples (#778)
9b43e7e
simplify by removing duplicate base_model_config (#772)
2d8def6
Implement fused modules (#747)
15d3a65
Fix: lowercase `True` values in config (#713)
ace70b3
atgctg committed
Get qlora mistral-7b fine tuning working on a single 4090 (#708)
295b266
lukemarsden committed
fix unneeded space (#699)
f91db19
lint
83a950b
new lr, sample pack
4c8ddf2
Fix: Higher vram usage for mistral and sample_packing (#691)
669f1d0
Adding qlora config for Mistral (#675)
d4a88e4
Abhishek Mishra committed
prepared dataset caching, other misc fixes (#665)
e50a64e
Update mistral/README.md (#647)
b88f515
Adarsh Shirawalmath committed
Feat: Add example for Mistral (#644)
eb41f76
eval_table isn't quite stable enough to be in default llama configs (#637)
d887ad8
Feat: Add support for upstream FA2 (#626)
19a600a
default model changed
4fecbfe
support to disable exllama for gptq (#604)
faecff9
more sane defaults for openllama 3b used for quickstarts (#602)
674c576
btlm and falcon monkey patches for flash attn (#566)
6b9b229
make phi training work with Loras (#588)
62eaee7
Support Sample packing for phi arch (#586)
12a2dbb
Fix Codellama examples (#582)
1aa4007
Doan Minh Phuong committed
Phi examples (#569)
2284209
Add training callback to send predictions to WandB table (#521)
5b67ea9
recommend padding when using sample packing (#531)
3437149
Add support for GPTQ using native transformers/peft (#468)
3355706
pad_to_worst_case_seq_len boolean, for testing memory limits (#498)
8e197f6
Feat(cfg): Add code-llama configs for all sizes (#479)
3513071
Add example Llama 2 ReLoRA config (#471)
fe4d6ba
improve llama pad token handling (#475)
cb9797e
don't use mask expansion for inference (#392)
1687be6
new llama-2 default settings (#370)
fdffef5
Add wandb_entity to wandb options, update example configs, update README (#361)
7019509
set group_by_length to false in examples
36fefcf
feat/llama-2 examples (#319)
dc71d88
Add XGen info to README and example config
3881143
Ethan Smith committed