
Choice of pretrained model and fine-tuning

#3 by Avditvs - opened

Hi!
The technical paper does not really elaborate on the choice of the backbone (the Snowflake model) for training the classifier, or on why it was frozen. Could you give more details about these implementation choices?

HuggingFaceFW org

Hi @Avditvs ! We experimented with RoBERTa, mixedbread-ai/mxbai-embed-large-v1, and the Snowflake models. Because there is significant noise (+/- 1 point) in the Llama annotations, a frozen encoder helped prevent overfitting, and, counter-intuitively, a retrieval-focused Snowflake model worked best. Also, snowflake-arctic-embed-m performed just as well as snowflake-arctic-embed-l, so we went with the smaller model to save on compute.
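For readers who want to see what this setup looks like in code, here is a minimal sketch of a frozen encoder with a trainable linear classification head. The checkpoint name, CLS pooling, and 6-way label count are illustrative assumptions for this sketch, not the released implementation.

```python
import torch
from transformers import AutoModel, AutoTokenizer

BACKBONE = "Snowflake/snowflake-arctic-embed-m"  # assumed checkpoint name

class FrozenEncoderClassifier(torch.nn.Module):
    def __init__(self, num_labels: int = 6):  # label count is an assumption
        super().__init__()
        self.encoder = AutoModel.from_pretrained(BACKBONE)
        # Freeze the backbone so only the linear head receives gradients;
        # this limits overfitting to the noisy (+/- 1 point) Llama scores.
        for p in self.encoder.parameters():
            p.requires_grad = False
        self.head = torch.nn.Linear(self.encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]  # CLS pooling
        return self.head(pooled)

tokenizer = AutoTokenizer.from_pretrained(BACKBONE)
model = FrozenEncoderClassifier().eval()
batch = tokenizer(
    ["An example web page explaining basic algebra."],
    return_tensors="pt", truncation=True, padding=True,
)
with torch.no_grad():
    logits = model(batch["input_ids"], batch["attention_mask"])
print(logits.shape)  # torch.Size([1, 6])
```

During training, only `model.head` would be passed to the optimizer, which is what keeps the small head from memorizing annotation noise.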

@anton-l thanks for the elaboration! Did you consider using regression instead of classification?
