7b-claude-32k-20250418_213809

This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct on an unknown dataset. It achieves the following results on the evaluation set (taken from the final step of the training results table):

Loss: 0.2545

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
num_devices: 8
total_train_batch_size: 8
total_eval_batch_size: 8
optimizer: adamw_torch with betas=(0.9, 0.95) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.05
num_epochs: 1.0
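
The batch-size and schedule settings above interact in a way that is easier to see with numbers. The following is a minimal sketch, not the actual training code: it assumes no gradient accumulation (consistent with 1 x 8 = 8), takes the total of 71 steps from the results table, and assumes the common linear-warmup-then-cosine-decay shape with the warmup length rounded up. The function names are illustrative only.

```python
import math

# Values copied from the hyperparameter list.
PEAK_LR = 1e-05
PER_DEVICE_BATCH = 1
NUM_DEVICES = 8
WARMUP_RATIO = 0.05
TOTAL_STEPS = 71  # last step in the results table (one epoch)

# Assumption: no gradient accumulation, which matches 1 * 8 = 8.
GRAD_ACCUM_STEPS = 1


def effective_batch_size() -> int:
    """Sequences consumed per optimizer step across all devices."""
    return PER_DEVICE_BATCH * NUM_DEVICES * GRAD_ACCUM_STEPS


def warmup_steps() -> int:
    """Assumption: warmup length is ratio * total steps, rounded up."""
    return math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay to 0."""
    warm = warmup_steps()
    if step < warm:
        return PEAK_LR * step / max(1, warm)
    progress = (step - warm) / max(1, TOTAL_STEPS - warm)
    return PEAK_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    print("warmup steps:", warmup_steps())
    for step in (0, 2, 4, 36, 71):
        print(f"step {step:2d}: lr = {lr_at(step):.2e}")
```

Under these assumptions the learning rate reaches its peak of 1e-05 after only a handful of steps and then decays for the rest of the epoch, which fits the validation loss flattening out toward the end of training.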

Training Loss	Epoch	Step	Validation Loss
0.3222	0.0141	1	0.3511
0.4486	0.0282	2	0.3452
0.2804	0.0423	3	0.3267
0.4094	0.0563	4	0.3112
0.3197	0.0704	5	0.3251
0.2778	0.0845	6	0.3244
0.3485	0.0986	7	0.3308
0.3833	0.1127	8	0.3231
0.3737	0.1268	9	0.3134
0.2953	0.1408	10	0.3019
0.2027	0.1549	11	0.2955
0.3399	0.1690	12	0.2910
0.351	0.1831	13	0.2871
0.2662	0.1972	14	0.2842
0.2418	0.2113	15	0.2817
0.3415	0.2254	16	0.2793
0.2894	0.2394	17	0.2771
0.2592	0.2535	18	0.2755
0.2447	0.2676	19	0.2745
0.2156	0.2817	20	0.2735
0.3091	0.2958	21	0.2721
0.3068	0.3099	22	0.2708
0.2718	0.3239	23	0.2697
0.2443	0.3380	24	0.2688
0.316	0.3521	25	0.2682
0.2631	0.3662	26	0.2677
0.2587	0.3803	27	0.2672
0.2229	0.3944	28	0.2664
0.311	0.4085	29	0.2656
0.2335	0.4225	30	0.2649
0.2943	0.4366	31	0.2642
0.2872	0.4507	32	0.2636
0.2651	0.4648	33	0.2630
0.2843	0.4789	34	0.2626
0.2546	0.4930	35	0.2622
0.2543	0.5070	36	0.2618
0.258	0.5211	37	0.2612
0.2969	0.5352	38	0.2607
0.204	0.5493	39	0.2604
0.3239	0.5634	40	0.2601
0.3191	0.5775	41	0.2597
0.2775	0.5915	42	0.2592
0.2527	0.6056	43	0.2587
0.3035	0.6197	44	0.2583
0.2455	0.6338	45	0.2578
0.2246	0.6479	46	0.2575
0.2679	0.6620	47	0.2572
0.1884	0.6761	48	0.2568
0.2283	0.6901	49	0.2566
0.2048	0.7042	50	0.2564
0.2565	0.7183	51	0.2562
0.1984	0.7324	52	0.2560
0.2199	0.7465	53	0.2559
0.2397	0.7606	54	0.2557
0.3541	0.7746	55	0.2556
0.2457	0.7887	56	0.2554
0.2237	0.8028	57	0.2553
0.2924	0.8169	58	0.2552
0.3507	0.8310	59	0.2551
0.2563	0.8451	60	0.2549
0.2409	0.8592	61	0.2549
0.2262	0.8732	62	0.2548
0.206	0.8873	63	0.2547
0.2973	0.9014	64	0.2546
0.2448	0.9155	65	0.2546
0.2217	0.9296	66	0.2545
0.2712	0.9437	67	0.2545
0.3814	0.9577	68	0.2545
0.2861	0.9718	69	0.2545
0.2529	0.9859	70	0.2545
0.1975	1.0	71	0.2545
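
To summarize a log like the one above, the rows can be parsed and compared programmatically. This is a small sketch using a handful of rows copied from the table; `LogRow`, `parse_rows`, and `relative_improvement` are hypothetical helper names introduced here for illustration.

```python
from typing import List, NamedTuple

# A few rows copied from the table: training loss, epoch, step, validation loss.
ROWS = """\
0.3222\t0.0141\t1\t0.3511
0.4094\t0.0563\t4\t0.3112
0.3197\t0.0704\t5\t0.3251
0.2953\t0.1408\t10\t0.3019
0.2543\t0.5070\t36\t0.2618
0.1975\t1.0\t71\t0.2545
"""


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


def parse_rows(text: str) -> List[LogRow]:
    """Parse whitespace-separated rows, skipping lines without four fields."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue
        train, epoch, step, val = parts
        rows.append(LogRow(float(train), float(epoch), int(step), float(val)))
    return rows


def relative_improvement(rows: List[LogRow]) -> float:
    """Fractional drop in validation loss from the first row to the last."""
    first, last = rows[0].val_loss, rows[-1].val_loss
    return (first - last) / first


if __name__ == "__main__":
    rows = parse_rows(ROWS)
    best = min(rows, key=lambda r: r.val_loss)
    print(f"best validation loss {best.val_loss} at step {best.step}")
    print(f"relative improvement: {relative_improvement(rows):.1%}")
```

The per-step training loss is computed on a single batch of 8 sequences and is noisy, so the validation loss column is the more reliable signal; it falls from 0.3511 at step 1 to 0.2545 at step 71.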