swe-xml
This model is a fine-tuned version of Qwen/Qwen2.5-Coder-7B-Instruct on the swe-xml dataset. It achieves the following results on the evaluation set:
- Loss: 0.1605
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 4
- total_eval_batch_size: 4
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- num_epochs: 1
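As a rough illustration of how these settings fit together, the sketch below recomputes the total train batch size from the per-device batch size and device count, and evaluates a cosine learning-rate schedule at a few points. It is not the training code used for this model. It assumes no gradient accumulation and no warmup, since the list above mentions neither, and the `total_steps` value in the usage lines is a hypothetical placeholder.

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 1  # per device
NUM_DEVICES = 4

# Assumption: no gradient accumulation (none is listed in the card).
GRAD_ACCUM_STEPS = 1


def total_train_batch_size(per_device, num_devices, grad_accum=1):
    """Number of examples consumed per optimizer step across all devices."""
    return per_device * num_devices * grad_accum


def cosine_lr(step, total_steps, base_lr=LEARNING_RATE):
    """Cosine decay from base_lr down to 0. Assumes no warmup phase."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(total_train_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES, GRAD_ACCUM_STEPS))
    total_steps = 6400  # hypothetical placeholder for the run length
    for step in (0, total_steps // 2, total_steps):
        print(step, cosine_lr(step, total_steps))
```

With these assumptions the learning rate starts at 1e-05, falls to half that value at the midpoint, and reaches zero at the final step.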
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.2309 | 0.0156 | 100 | 0.2555 |
0.2066 | 0.0311 | 200 | 0.2431 |
0.2334 | 0.0467 | 300 | 0.2352 |
0.2611 | 0.0622 | 400 | 0.2318 |
0.2485 | 0.0778 | 500 | 0.2280 |
0.2496 | 0.0933 | 600 | 0.2243 |
0.2798 | 0.1089 | 700 | 0.2193 |
0.2143 | 0.1244 | 800 | 0.2171 |
0.2127 | 0.1400 | 900 | 0.2187 |
0.1501 | 0.1555 | 1000 | 0.2137 |
0.1507 | 0.1711 | 1100 | 0.2100 |
0.3055 | 0.1866 | 1200 | 0.2101 |
0.1649 | 0.2022 | 1300 | 0.2087 |
0.1152 | 0.2177 | 1400 | 0.2055 |
0.1799 | 0.2333 | 1500 | 0.2038 |
0.1547 | 0.2488 | 1600 | 0.2037 |
0.2323 | 0.2644 | 1700 | 0.1994 |
0.1962 | 0.2799 | 1800 | 0.1943 |
0.1785 | 0.2955 | 1900 | 0.1958 |
0.1977 | 0.3110 | 2000 | 0.1913 |
0.1919 | 0.3266 | 2100 | 0.1889 |
0.1463 | 0.3421 | 2200 | 0.1894 |
0.1946 | 0.3577 | 2300 | 0.1892 |
0.1867 | 0.3733 | 2400 | 0.1869 |
0.1452 | 0.3888 | 2500 | 0.1855 |
0.1442 | 0.4044 | 2600 | 0.1839 |
0.1449 | 0.4199 | 2700 | 0.1840 |
0.109 | 0.4355 | 2800 | 0.1816 |
0.1445 | 0.4510 | 2900 | 0.1804 |
0.1717 | 0.4666 | 3000 | 0.1797 |
0.1591 | 0.4821 | 3100 | 0.1795 |
0.1177 | 0.4977 | 3200 | 0.1793 |
0.221 | 0.5132 | 3300 | 0.1781 |
0.148 | 0.5288 | 3400 | 0.1780 |
0.1365 | 0.5443 | 3500 | 0.1779 |
0.2491 | 0.5599 | 3600 | 0.1728 |
0.108 | 0.5754 | 3700 | 0.1722 |
0.1334 | 0.5910 | 3800 | 0.1728 |
0.1057 | 0.6065 | 3900 | 0.1714 |
0.1513 | 0.6221 | 4000 | 0.1702 |
0.0988 | 0.6376 | 4100 | 0.1697 |
0.2126 | 0.6532 | 4200 | 0.1681 |
0.2117 | 0.6687 | 4300 | 0.1687 |
0.2683 | 0.6843 | 4400 | 0.1671 |
0.1124 | 0.6998 | 4500 | 0.1649 |
0.2138 | 0.7154 | 4600 | 0.1651 |
0.2013 | 0.7309 | 4700 | 0.1638 |
0.0985 | 0.7465 | 4800 | 0.1646 |
0.1566 | 0.7621 | 4900 | 0.1638 |
0.1004 | 0.7776 | 5000 | 0.1641 |
0.1242 | 0.7932 | 5100 | 0.1632 |
0.1069 | 0.8087 | 5200 | 0.1623 |
0.1956 | 0.8243 | 5300 | 0.1616 |
0.1319 | 0.8398 | 5400 | 0.1616 |
0.0767 | 0.8554 | 5500 | 0.1611 |
0.1163 | 0.8709 | 5600 | 0.1610 |
0.0927 | 0.8865 | 5700 | 0.1607 |
0.1271 | 0.9020 | 5800 | 0.1607 |
0.0913 | 0.9176 | 5900 | 0.1604 |
0.1398 | 0.9331 | 6000 | 0.1603 |
0.1328 | 0.9487 | 6100 | 0.1605 |
0.1169 | 0.9642 | 6200 | 0.1603 |
0.1498 | 0.9798 | 6300 | 0.1604 |
0.1662 | 0.9953 | 6400 | 0.1603 |
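The table can be used for a few back-of-the-envelope estimates. The sketch below takes the first and last rows and derives an approximate number of optimizer steps per epoch, an approximate training-set size, and the relative drop in validation loss. These are estimates, not reported figures: the epoch column is rounded to four decimals, and the example count assumes every step consumes exactly `total_train_batch_size` examples with no gradient accumulation.

```python
# (epoch, step, validation_loss) copied from the first and last table rows.
FIRST_ROW = (0.0156, 100, 0.2555)
LAST_ROW = (0.9953, 6400, 0.1603)

# From the hyperparameter list (total_train_batch_size).
TOTAL_TRAIN_BATCH_SIZE = 4


def estimate_steps_per_epoch(epoch, step):
    """Extrapolate the number of optimizer steps in one full epoch."""
    return step / epoch


def relative_reduction(start, end):
    """Fractional decrease from start to end."""
    return (start - end) / start


steps_per_epoch = estimate_steps_per_epoch(LAST_ROW[0], LAST_ROW[1])
# Assumption: each step sees TOTAL_TRAIN_BATCH_SIZE examples.
approx_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE
val_loss_drop = relative_reduction(FIRST_ROW[2], LAST_ROW[2])

if __name__ == "__main__":
    print(f"steps per epoch (approx): {steps_per_epoch:.0f}")
    print(f"training examples (approx): {approx_examples:.0f}")
    print(f"validation loss reduction: {val_loss_drop:.1%}")
```

Under these assumptions one epoch is roughly 6,430 steps, which suggests a training set of about 25,700 examples, and the validation loss falls by roughly 37% between step 100 and step 6400.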
Framework versions
- Transformers 4.46.1
- Pytorch 2.6.0+cu124
- Datasets 3.1.0
- Tokenizers 0.20.3