The base of this model is Llama-3.2-3B-Instruct, using TopiOCQA as the training data, and the training method is ConvSearch-R1.

The code is available here. Please refer to the paper here.

Downloads last month
48
Safetensors
Model size
3.21B params
Tensor type
F32
ยท
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support