How did you manage to train this on a T4 alone?
I was shocked to find that you had managed to train all of this on a single T4 GPU. I tried fine-tuning on Colab with a T4 before, but the process took 7 hours and I only had 2 hours and 30 minutes of compute left. How did your fine-tuning run for 22 days without all of your progress being deleted? I am interested.
I created it 22 days ago; it didn't take 22 days to train. It took around 30 minutes on Colab.
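The thread doesn't say which fine-tuning recipe was used, but a rough memory estimate shows why a 30-minute run on a single 16 GB T4 is plausible with parameter-efficient methods (e.g. QLoRA-style 4-bit training, which is an assumption here, not something the author confirmed) and implausible with full fine-tuning:

```python
# Back-of-envelope VRAM estimate for fine-tuning an 8B-parameter model
# on a 16 GB T4. These are rough rules of thumb, not measurements, and
# they ignore activations, which add several more GB in practice.

PARAMS = 8e9          # Llama 3.1 8B parameter count
GB = 1024 ** 3

# Full fine-tuning: fp16 weights (2 B) + fp16 gradients (2 B) + Adam
# optimizer state kept in fp32 (~8 B per parameter).
full_ft_gb = PARAMS * (2 + 2 + 8) / GB

# QLoRA-style training: the frozen base model is quantized to 4 bits
# (~0.5 B per parameter); the trainable LoRA adapters are a tiny
# fraction (<1%) of the parameters, so they barely register.
qlora_base_gb = PARAMS * 0.5 / GB

print(f"full fp16 fine-tune: ~{full_ft_gb:.0f} GB")   # far beyond a T4
print(f"4-bit base weights:  ~{qlora_base_gb:.0f} GB")  # fits in 16 GB
```

So full fine-tuning needs on the order of 90 GB just for weights and optimizer state, while a 4-bit base plus small adapters leaves comfortable headroom on a T4, which is why Colab runs like this finish in well under the free-tier compute window.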
Just tested the model on Spaces. I can say that the output is much clearer and easier to understand than the base Llama 3.1 8B. Here are some screenshots:
Your model:
Base
As you can see, the base model just spits out the SQL schema without explaining the logic and functionality behind it, whereas Artificium explains how each field works.
Note: I chatted with the base model and Artificium for a while and found that Artificium is more step-by-step than the base model. Maybe it is different for others, but that was my experience.