Commit History
40a6362 support for mamba (#915)
fde091c fix(tokenizer): handle fast tokenizer properly for bos/eos (#914)
992e742 Support device_map=sequential & max_memory config parameters (#903)
a1da39c Feat(wandb): Refactor to be more flexible (#767)
58ec8b1 feature: loss watchdog for terminating training runs that are failing (#899)
3e3229e fix for qwen w lora (#906)
71b7ea3 Determine FSDP/deepspeed settings on device select. (#883)
1115c50 Feat: Add Qwen (#894)
7ee3c4c fix: warning should not show if eval_batch_size not provided (#896)
fb12895 Feat: Add warmup_ratio (#893)
575a082 fix: revert local dir dataset load (#878)
9bf854e Phi update 202311 (#876)
797f3dd don't train if eval split is too small (#873)
3cc67d2 Feat: Add dataset loading from S3, GCS (#765)
1bc1186 allow overriding of model_config parameters from the YML (#853)
0c2a630 multipack len should use max, not min (#863)
1470650 various bugfixes (#856)
1a6309c cleanup the old multipack dataloader (#841)
641e6f7 multipack w batch sampler (#795)
b2430ce use accelerate logging for zero/main logging only
4c834bf cleanup verbosity a bit
cdc71f7 update table for rwkv4 support, fix process count for dataset (#822)
964d858 fix model parallel (#816)
10388a8 fix(tokenizer): update log order after update (#806)
637ed09 fix(config): Set eos/bos to tokenizer if different (#801)
827ec3d refactor neft patch to be more re-usable similar to trl's impl (#796)
e50ab07 Create preprocess CLI (#785)
05bd6f1 Threaded MultipackDistributedDataloader with prefetched samples (#759)
11d1d60 chore: refactor truthy check and fix mypy (#780)
6c81c61 refactor setup trainer so we can add more hooks (#773)
2d8def6 simplify by removing duplicate base_model_config (#772)
44c9d01 Fix: Warn when fullfinetune without adapter (#770)
ca84cca convert exponential notation lr to floats (#771)
9923b72 Fix: eval table conflict with eval_sample_packing (#769)
15d3a65 Implement fused modules (#747)
70157cc add a latest tag for regular axolotl image, cleanup extraneous print statement (#746)
440c3ab Fix(model): Linear detected and added to target module with rope linear (#738)
992d57f catch ConnectionError when checking dataset from HuggingFace (#743) (by Napuh)
3553172 fixes for alpaca w chatml, and don't include attention_mask w mistral for flash attention (#728)
3bd9528 add noisy embedding (#721) (by Maxime)
1c412c7 improve handling of the prepared ds path and other cfg defaults (#701)
490923f Save Axolotl config as WandB artifact (#716) (by Jan Philipp Harries)