Curiosity-16

Model Summary

  • Parameters: 404M

  • Base: GPT-2 Medium (decoder-only)

  • Tokenizer: GPT-2 BPE, loaded via Hugging Face AutoTokenizer

  • Training: two-phase full-parameter supervised fine-tuning (SFT); see the sketch after this list

  • Purpose: research model (proof of concept)

  • Strengths: short factual responses, small stories, basic reasoning

  • Limitations: responses are hard-limited to 1-2 sentences; tends to misunderstand prompts; no safety filter; prone to hallucination
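The two-phase recipe and the 11 source datasets are not detailed in this card. As a rough illustration only, the following is a minimal sketch of what a single full-parameter SFT phase on a GPT-2 Medium base could look like with the Hugging Face Trainer; the dataset, split, and hyperparameters here are placeholders, not the author's actual configuration.

```python
# Minimal sketch of one full-parameter SFT phase on GPT-2 Medium.
# Dataset choice, split, and hyperparameters are hypothetical placeholders.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2-medium")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained("gpt2-medium")

# Placeholder instruction dataset; Curiosity-16 mixed samples from 11 datasets.
dataset = load_dataset("tatsu-lab/alpaca", split="train[:1%]")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sft-phase-1",
        num_train_epochs=1,
        per_device_train_batch_size=4,
    ),
    train_dataset=tokenized,
    # Causal-LM collator: pads batches and copies input_ids to labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```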

Description

  • Curiosity-16 is a small research model with 404M parameters, built on pre-trained GPT-2 Medium. It was fine-tuned on training samples drawn from 11 diverse Hugging Face datasets.
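
A minimal usage sketch with the standard transformers text-generation API; the prompt and generation settings below are illustrative, not the author's recommended defaults.

```python
# Minimal inference sketch for Curiosity-16 (generation settings are illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ariankharazmi/Curiosity-16"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since the model hard-limits replies to 1-2 sentences, a small max_new_tokens budget is usually sufficient.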