numind/NuExtract-2.0-4B · Why is NuExtract-2.0-8B inferior to the 4B?

12 days ago

I used this model recently and noticed that the 4B model performs much better than the 8B, even though the underlying VL models are from the same Qwen2.5 family with different capacities (3B and 7B, respectively).

BTW, I love the way this model works. For now, I am passing a field description in place of the data type in the template, and it is working for my use case. However, it would be great if there were a way to provide a description of each field we want to extract.
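For anyone wanting to try the same workaround, here is a minimal sketch of what such a template might look like. It is not an official feature: it only builds a flat JSON template whose values are free-text descriptions where a type name would normally go. The field names, the descriptions, and the `build_template` helper are all hypothetical, and how the template is then passed to the model is not shown here.

```python
import json

# Hypothetical field names and descriptions, for illustration only.
# Each value is a free-text description placed where a data type
# would normally appear in the extraction template.
FIELD_DESCRIPTIONS = {
    "insured_name": "Full name of the insured person or organisation",
    "policy_number": "Policy identifier as printed on the document",
    "start_date": "Date the coverage begins",
    "end_date": "Date the coverage ends",
    "premium_amount": "Total premium charged for the policy period",
}


def build_template(descriptions: dict[str, str]) -> str:
    """Serialize a flat template whose values are descriptions, not type names."""
    return json.dumps(descriptions, indent=4)


template = build_template(FIELD_DESCRIPTIONS)
print(template)
```

Results with this approach may vary from one document type to another, since descriptions are standing in for a slot that was designed to hold type names.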

liamcripwell

NuMind org 12 days ago

Thanks for trying out the models -- it's good to hear you are liking them. :)

I'm not entirely sure why the 4B is out-performing the 8B for you. The 8B is reliably a bit stronger than the 4B across our benchmarks, so it might be something specific to your domain. Can you describe what kind of problem/data you are working on?

Btw, yes, having an official way to provide field descriptions is something we have been asked for a lot and so we are working to implement this feature asap.

Appreciate the feedback!

ikiransuryavanshi

12 days ago

@liamcripwell thanks for the prompt response. I am currently using this model to extract data from insurance documents, such as insured name, start date, end date, policy number, premium amount, etc.

Also, this may not be the right thread, but the flash-attn 2.8.1 package was released recently, on 10th July, and it causes the model to fail. I had to use the older version, 2.7.3, to make it work.
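A quick way to check which flash-attn version is installed before loading the model is sketched below. It relies only on the two data points reported in this post (2.8.1 failing, 2.7.3 working); every other version is labeled "untested" rather than guessed at. The function names and status labels are made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version

# The only two data points reported in this thread.
REPORTED_WORKING = "2.7.3"
REPORTED_FAILING = "2.8.1"


def flash_attn_status(installed: str) -> str:
    """Classify a flash-attn version string against the versions reported here."""
    # Drop any local build suffix such as "+cu12torch2.6".
    base = installed.split("+")[0].strip()
    if base == REPORTED_WORKING:
        return "known-good"
    if base == REPORTED_FAILING:
        return "reported-broken"
    return "untested"


def installed_flash_attn_status() -> str:
    """Look up the installed flash-attn distribution, if any, and classify it."""
    try:
        return flash_attn_status(version("flash-attn"))
    except PackageNotFoundError:
        return "not-installed"


print(installed_flash_attn_status())
```

If the result is "reported-broken", pinning the package with `pip install flash-attn==2.7.3` matches the workaround described above.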