Commit History
support for mamba (#915)
40a6362
Feat(wandb): Refactor to be more flexible (#767)
a1da39c
feature: loss watchdog for terminating training runs that are failing (#899)
58ec8b1
fix: remove FA for qwen examples (#900)
a48dbf6
Feat: Add Qwen (#894)
1115c50
Phi update 202311 (#876)
9bf854e
various bugfixes (#856)
1470650
don't compile deepspeed or bitsandbytes from source (#837)
f544ab2
fix eval_steps to be a sane default (#797)
8b79ff0
disable eval table w sample packing in examples (#778)
9b43e7e
simplify by removing duplicate base_model_config (#772)
2d8def6
Implement fused modules (#747)
15d3a65
Fix: lowercase `True` values in config (#713)
ace70b3
atgctg committed
Get qlora mistral-7b fine tuning working on a single 4090 (#708)
295b266
lukemarsden committed
fix unneeded space (#699)
f91db19
lint
83a950b
new lr, sample pack
4c8ddf2
Fix: Higher vram usage for mistral and sample_packing (#691)
669f1d0
Adding qlora config for Mistral (#675)
d4a88e4
Abhishek Mishra committed
prepared dataset caching, other misc fixes (#665)
e50a64e
Update mistral/README.md (#647)
b88f515
Adarsh Shirawalmath committed
Feat: Add example for Mistral (#644)
eb41f76
eval_table isn't quite stable enough to be in default llama configs (#637)
d887ad8
Feat: Add support for upstream FA2 (#626)
19a600a
default model changed
4fecbfe
support to disable exllama for gptq (#604)
faecff9
more sane defaults for openllama 3b used for quickstarts (#602)
674c576
btlm and falcon monkey patches for flash attn (#566)
6b9b229
make phi training work with Loras (#588)
62eaee7
Support Sample packing for phi arch (#586)
12a2dbb
Fix Codellama examples (#582)
1aa4007
Doan Minh Phuong committed
Phi examples (#569)
2284209
Add training callback to send predictions to WandB table (#521)
5b67ea9
recommend padding when using sample packing (#531)
3437149
Add support for GPTQ using native transformers/peft (#468)
3355706
pad_to_worst_case_seq_len boolean, for testing memory limits (#498)
8e197f6
Feat(cfg): Add code-llama configs for all sizes (#479)
3513071
Add example Llama 2 ReLoRA config (#471)
fe4d6ba
improve llama pad token handling (#475)
cb9797e
don't use mask expansion for inference (#392)
1687be6
new llama-2 default settings (#370)
fdffef5
Add wandb_entity to wandb options, update example configs, update README (#361)
7019509
set group_by_length to false in examples
36fefcf
feat/llama-2 examples (#319)
dc71d88
Add XGen info to README and example config
3881143
Ethan Smith committed