Overview

HyperCLOVAX-SEED-Text-Instruct-1.5B is a model developed by NAVER that can understand and generate text. It demonstrates competitive performance on major benchmarks related to Korean language and culture. In addition, it supports a context length of up to 16k tokens, enabling it to handle a wide range of tasks.

Basic Information

  • Model Architecture: Transformer-based architecture (Dense Model)
  • Number of Parameters: 1.5B
  • Input/Output Format: Text / Text
  • Context Length: 16k
  • Knowledge Cutoff Date: The model was trained on data prior to August 2024.
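
The figures above can be sanity-checked directly from the published checkpoint. The snippet below is a minimal sketch, assuming the repository id used elsewhere in this card and the standard transformers config fields (the exact field name for the context length may differ by architecture).

from transformers import AutoConfig, AutoModelForCausalLM

repo_id = "naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B"

# Read the architecture family and context-length field from the config.
config = AutoConfig.from_pretrained(repo_id)
print(config.model_type)
print(getattr(config, "max_position_embeddings", None))  # expected to be around 16k

# Count parameters (expected to be roughly 1.5B).
model = AutoModelForCausalLM.from_pretrained(repo_id)
print(f"{sum(p.numel() for p in model.parameters()) / 1e9:.2f}B parameters")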

Training and Data

The training data for HyperCLOVAX-SEED-Text-Instruct-1.5B consists of diverse sources, including high-quality datasets. The training process was carried out in four main stages:

  • Pretraining Stage 1: the model learns from a large volume of documents.
  • Pretraining Stage 2: additional training focused on high-quality data.
  • Rejection Sampling Fine-Tuning (RFT): strengthens the model's knowledge across various domains and its complex reasoning abilities.
  • Supervised Fine-Tuning (SFT): improves the model's instruction-following capabilities.

Because smaller models are particularly vulnerable when handling long contexts, long-context training was reinforced from the pretraining stages through to the SFT stage, enabling the model to stably support context lengths of up to 16k tokens.
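
The RFT recipe itself is not published in this card. As a rough, non-authoritative illustration of the general idea, rejection-sampling fine-tuning typically samples several candidate responses per prompt, keeps only those that pass a scoring or verification step, and fine-tunes on the surviving pairs; the score_response function below is hypothetical.

# Illustrative sketch of RFT data selection (not NAVER's actual pipeline).
def build_rft_dataset(prompts, model, tokenizer, score_response, num_samples=8, threshold=0.8):
    accepted = []
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt")
        # Draw several candidate responses per prompt.
        outputs = model.generate(
            **inputs,
            do_sample=True,
            temperature=0.8,
            num_return_sequences=num_samples,
            max_new_tokens=512,
        )
        for candidate in outputs:
            text = tokenizer.decode(candidate[inputs["input_ids"].shape[1]:], skip_special_tokens=True)
            # Keep only responses that the (hypothetical) verifier scores highly.
            if score_response(prompt, text) >= threshold:
                accepted.append({"prompt": prompt, "response": text})
    return accepted  # the accepted pairs then go through a standard fine-tuning loop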

Benchmark

| Model | KMMLU (5-shot, acc) | HAE-RAE (5-shot, acc) | CLIcK (5-shot, acc) | KoBEST (5-shot, acc) |
| --- | --- | --- | --- | --- |
| HyperCLOVAX-SEED-Text-Base-1.5B | 0.4181 | 0.6370 | 0.5373 | 0.6963 |
| HyperCLOVAX-SEED-Text-Instruct-1.5B | 0.3933 | 0.5674 | 0.4947 | 0.6490 |
| Qwen2.5-1.5B-instruct | 0.3696 | 0.5160 | 0.4772 | 0.5968 |
| gemma-3-1b-it | 0.3075 | 0.3648 | 0.3724 | 0.5869 |
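
The scores above are quoted as reported and are not reproduced here. For orientation, "5-shot, acc" means each question is preceded by five worked examples and the model's answer choice is scored for accuracy. The snippet below is a generic log-likelihood scoring sketch for one multiple-choice item; it is not the official evaluation harness, and the prompt format shown is an assumption.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

def choice_logprob(prompt, choice):
    # Summed log-probability of the choice tokens given the prompt.
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    start = prompt_len - 1  # positions that predict the choice tokens
    return log_probs[start:].gather(1, targets[start:, None]).sum().item()

# five_shot_prefix would hold five worked examples in the same format (placeholder here).
five_shot_prefix = "..."
question = five_shot_prefix + "Question: ...\nAnswer:"
choices = [" A", " B", " C", " D"]
prediction = max(choices, key=lambda c: choice_logprob(question, c))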

Huggingface Usage Example

from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub repository id for this model; a local checkpoint path also works here.
model_name = "naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-1.5B"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

chat = [
  {"role": "tool_list", "content": ""},
  # System prompt: 'The AI language model is named "CLOVA X" and was made by NAVER. Today is Thursday, April 24, 2025.'
  {"role": "system", "content": "- AI 언어모델의 이름은 \"CLOVA X\" 이며 네이버에서 만들었다.\n- 오늘은 2025년 04월 24일(목)이다."},
  # User turn: 'Explain the relationship between the Schrödinger equation and quantum mechanics in as much detail as possible.'
  {"role": "user", "content": "슈뢰딩거 방정식과 양자역학의 관계를 최대한 자세히 알려줘."},
]

inputs = tokenizer.apply_chat_template(chat, add_generation_prompt=True, return_dict=True, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_length=1024,
    stop_strings=["<|endofturn|>", "<|stop|>"],
    tokenizer=tokenizer,
)
print(tokenizer.batch_decode(output_ids))
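
batch_decode above returns the prompt together with the completion. If only the model's reply is needed, one common option (shown here as an add-on, not part of the original example) is to decode only the newly generated tokens:

# Decode only the tokens generated after the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(output_ids[0, prompt_len:], skip_special_tokens=True))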