davidkim205 committed on
Commit
dfb22a1
·
verified ·
1 Parent(s): 6c29c97

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,154 @@
1
+ ---
2
+ library_name: transformers
3
+ language:
4
+ - ko
5
+ pipeline_tag: text-generation
6
+ ---
7
+
8
+ # Hunminai-1.0-27b
9
+
10
+ Hunminai-1.0 is a Korean-aligned language model based on [Google's Gemma-3](https://huggingface.co/collections/google/gemma-3-release-67c6c6f89c4f76621268bb6d) architecture. To improve performance on Korean natural language tasks, the model was fine-tuned on a corpus of 100k instruction examples using Supervised Fine-Tuning (SFT) followed by Direct Preference Optimization (DPO). This approach enables the model to better align with user intents in Korean and enhances its applicability to downstream tasks such as dialogue generation, question answering, and long-form text generation.
11
+
12
+ ## Model Details
13
+ - **Base Model**: [google/gemma-3-27b-it](https://huggingface.co/google/gemma-3-27b-it)
14
+ - **Base Model Release Date**: March 12, 2025
15
+ - **Context Length**: 128k
16
+ - **License**: [gemma](https://ai.google.dev/gemma/terms)
17
+ - **Model Type**: Text Generation
18
+ - **Fine-Tuning Techniques**: Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO)
19
+
20
+ ## Usage Application Form
21
+ To use this model, please complete the application form and submit it via email [[email protected]].
22
+ Access will be granted after your application is reviewed and approved.
23
+ We appreciate your cooperation and look forward to assisting you.
24
+ ```
25
+ 1. **Name:**
26
+ - (e.g., John Doe)
27
+ 2. **Date of Birth:**
28
+ - (e.g., January 1, 1990)
29
+ 3. **Affiliation:**
30
+ - Are you applying as a company or an individual? [ ] Company [ ] Individual
31
+ - Company Name (if applicable):
32
+ - Department (if applicable):
33
+ 4. **Position/Role:**
34
+ - (e.g., Data Scientist, Researcher, etc.)
35
+ 5. **Contact Information:**
36
+ - Email:
37
+ - Phone Number:
38
+
39
+ 6. **Purpose of Use:**
40
+ - (e.g., Research and Development, Commercial use, Educational purposes, etc.)
41
+
42
+ 7. **Detailed Reason for Use:**
43
+ - 1. Name and version of the model you wish to use:
44
+ - 2. Reason for selecting this model:
45
+ - 3. Objectives to achieve using this model:
46
+ - 4. Expected use cases (please describe in as much detail as possible):
47
+
48
+ 8. **Data Security and Ethical Use Plan:**
49
+ - (Please describe your plans for data protection and ethical use.)
50
+ ```
51
+
52
+ ## Usage
53
+
54
+ Gemma 3 is supported starting from version 4.50.0 of the Transformers library.
55
+
56
+ To update to the latest version, run the following command:
57
+ ```
58
+ $ pip install -U transformers
59
+ ```
60
+
61
+ Install `accelerate` (required for `device_map="auto"`), then run the example below to load the Hunminai-1.0-27b model and perform a simple Korean-language chat completion.
62
+
63
+ ```python
64
+ # pip install accelerate
65
+
66
+ from transformers import AutoProcessor, Gemma3ForConditionalGeneration
67
+ import torch
68
+
69
+ model_id = "davidkim205/Hunminai-3-27b"
70
+
71
+ model = Gemma3ForConditionalGeneration.from_pretrained(
72
+     model_id, device_map="auto"
73
+ ).eval()
74
+
75
+ processor = AutoProcessor.from_pretrained(model_id)
76
+
77
+ messages = [
78
+     {
79
+         "role": "system",
80
+         "content": [{"type": "text", "text": "당신은 유용한 AI 비서입니다."}]
81
+     },
82
+     {
83
+         "role": "user",
84
+         "content": [
85
+             {"type": "text", "text": "대한민국의 수도는 어디인가요?"}
86
+         ]
87
+     }
88
+ ]
89
+
90
+ inputs = processor.apply_chat_template(
91
+     messages, add_generation_prompt=True, tokenize=True,
92
+     return_dict=True, return_tensors="pt"
93
+ ).to(model.device, dtype=torch.bfloat16)
94
+
95
+ input_len = inputs["input_ids"].shape[-1]
96
+
97
+ with torch.inference_mode():
98
+     generation = model.generate(**inputs, max_new_tokens=128, do_sample=False)
99
+     generation = generation[0][input_len:]
100
+
101
+ decoded = processor.decode(generation, skip_special_tokens=True)
102
+ print(decoded)
103
+ ```
104
+
105
+ ## Training Dataset
106
+
107
+ The model was trained on approximately 100k high-quality Korean instruction examples. The dataset was curated to cover a wide range of Korean language contexts and tasks, with a focus on aligning model outputs with user intent and on natural language generation. It is currently undisclosed.
108
+
109
+
110
+ ## Evaluation
111
+
112
+ ### Benchmark Datasets
113
+
114
+ The table below describes the Korean LLM evaluation benchmark datasets used to evaluate the model. More information on the benchmarks is available on the [blog](https://davidkim205.github.io/).
115
+
116
+ | Benchmark | Description | Abbreviation |
117
+ |------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|
118
+ | [ko-bench](https://huggingface.co/datasets/davidkim205/ko-bench) | Korean-translated dataset of [MT-Bench](https://github.com/lm-sys/FastChat/blob/main/fastchat/llm_judge/data/mt_bench/question.jsonl) questions | bench |
119
+ | [ko-bench-v2](https://huggingface.co/datasets/davidkim205/ko-bench-v2) | Dataset including new questions and answers following the ko-bench format | bench2 |
120
+ | [ko-ged](https://huggingface.co/datasets/davidkim205/ko-ged) | Korean GED (elementary, middle, high school) open-ended question dataset<br/>Subjects: Korean, English, Mathematics, Science, Social Studies | ged |
121
+ | [ko-ged2](https://huggingface.co/datasets/davidkim205/ko-ged2) | Korean GED open-ended question dataset for the 2025 1st Korean GED Exam, covering all subjects | ged2 |
122
+ | [tiny-eval](https://huggingface.co/datasets/davidkim205/tiny-eval) | High-quality evaluation dataset designed to assess overall model performance with a small amount of data | tiny |
123
+ | [ko-ifeval](https://huggingface.co/datasets/davidkim205/ko-ifeval) | Instruction-following evaluation dataset translated from [IFEval](https://github.com/google-research/google-research/tree/master/instruction_following_eval), adapted for Korean language and culture | ifeval |
124
+ | [ko-ged-elementary](https://huggingface.co/datasets/davidkim205/ko-ged-elementary) | Korean elementary school GED multiple-choice question dataset | ged\:E |
125
+ | [ko-ged-middle](https://huggingface.co/datasets/davidkim205/ko-ged-middle) | Korean middle school GED multiple-choice question dataset | ged\:M |
126
+ | [ko-ged-high](https://huggingface.co/datasets/davidkim205/ko-ged-high) | Korean high school GED multiple-choice question dataset | ged\:H |
127
+ | [ko-ged2-elementary](https://huggingface.co/datasets/davidkim205/ko-ged2-elementary) | Korean elementary school GED multiple-choice dataset, updated for the 2025 GED Exam | ged2\:E |
128
+ | [ko-ged2-middle](https://huggingface.co/datasets/davidkim205/ko-ged2-middle) | Korean middle school GED multiple-choice dataset, updated for the 2025 GED Exam | ged2\:M |
129
+ | [ko-ged2-high](https://huggingface.co/datasets/davidkim205/ko-ged2-high) | Korean high school GED multiple-choice dataset, updated for the 2025 GED Exam | ged2\:H |
130
+ | [ko-gpqa](https://huggingface.co/datasets/davidkim205/ko-gpqa) | Korean version of GPQA containing challenging physics questions designed to test deep understanding and logical reasoning | gpqa |
131
+ | [ko-math-500](https://huggingface.co/datasets/davidkim205/ko-math-500) | Korean-translated subset of 500 high school-level math problems from the MATH dataset, including detailed solutions with LaTeX notation | math500 |
132
+
133
+ ### Benchmark Results
134
+
135
+
136
+ | | **davidkim205<br>Hunminai-1.0-27b** | google<br>gemma-3-27b-it | unsloth<br>gemma-3-27b-it | google<br>gemma-2-27b-it |
137
+ |---------|-----------------------------------:|--------------------------:|---------------------------:|--------------------------:|
138
+ | Avg. | **8.83** | 8.74 | 8.56 | 8.08 |
139
+ | bench | 8.26 | 8.06 | **8.27** | 7.59 |
140
+ | bench2 | 8.74 | **8.79** | 8.73 | 8.21 |
141
+ | ged | **9.19** | 9.02 | 9.03 | 8.38 |
142
+ | ged2 | 8.90 | 8.91 | **8.98** | 8.38 |
143
+ | tiny | 8.50 | 9.08 | **9.12** | |
144
+ | ifeval | | | 8.10 | |
145
+ | ged:E | 9.86 | 9.86 | **9.93** | 9.51 |
146
+ | ged:M | 9.67 | 9.63 | **9.76** | 9.10 |
147
+ | ged:H | **9.60** | 9.52 | 9.52 | 9.32 |
148
+ | ged2:E | 9.77 | 9.89 | **9.94** | 9.48 |
149
+ | ged2:M | **9.75** | 9.58 | 9.46 | 9.33 |
150
+ | ged2:H | **9.48** | 9.23 | 9.40 | 9.08 |
151
+ | gpqa | **4.55** | 3.69 | 3.38 | 3.54 |
152
+ | math500 | **8.56** | 8.38 | 6.26 | 5.00 |
153
+
154
+
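As a sanity check on the results table, the `Avg.` row appears to be the unweighted mean of the 14 benchmark rows. This is an assumption, not something the README states. The sketch below recomputes it for the `unsloth/gemma-3-27b-it` column, the only column with every cell filled in this rendering. The names `unsloth_scores` and `mean_score` are illustrative.

```python
# Scores copied from the unsloth/gemma-3-27b-it column of the results table.
unsloth_scores = {
    "bench": 8.27, "bench2": 8.73, "ged": 9.03, "ged2": 8.98,
    "tiny": 9.12, "ifeval": 8.10,
    "ged:E": 9.93, "ged:M": 9.76, "ged:H": 9.52,
    "ged2:E": 9.94, "ged2:M": 9.46, "ged2:H": 9.40,
    "gpqa": 3.38, "math500": 6.26,
}


def mean_score(scores: dict) -> float:
    """Unweighted mean over all benchmarks, rounded to two decimals (assumed convention)."""
    return round(sum(scores.values()) / len(scores), 2)


# Compare the printed value with the Avg. cell of the same column.
print(mean_score(unsloth_scores))
```

Columns with empty cells cannot be checked this way without the missing scores.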
chat_template.json ADDED
@@ -0,0 +1,3 @@
1
+ {
2
+ "chat_template": "{{ bos_token }}\n{%- if messages[0]['role'] == 'system' -%}\n {%- if messages[0]['content'] is string -%}\n {%- set first_user_prefix = messages[0]['content'] + '\n\n' -%}\n {%- else -%}\n {%- set first_user_prefix = messages[0]['content'][0]['text'] + '\n\n' -%}\n {%- endif -%}\n {%- set loop_messages = messages[1:] -%}\n{%- else -%}\n {%- set first_user_prefix = \"\" -%}\n {%- set loop_messages = messages -%}\n{%- endif -%}\n{%- for message in loop_messages -%}\n {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}\n {{ raise_exception(\"Conversation roles must alternate user/assistant/user/assistant/...\") }}\n {%- endif -%}\n {%- if (message['role'] == 'assistant') -%}\n {%- set role = \"model\" -%}\n {%- else -%}\n {%- set role = message['role'] -%}\n {%- endif -%}\n {{ '<start_of_turn>' + role + '\n' + (first_user_prefix if loop.first else \"\") }}\n {%- if message['content'] is string -%}\n {{ message['content'] | trim }}\n {%- elif message['content'] is iterable -%}\n {%- for item in message['content'] -%}\n {%- if item['type'] == 'image' -%}\n {{ '<start_of_image>' }}\n {%- elif item['type'] == 'text' -%}\n {{ item['text'] | trim }}\n {%- endif -%}\n {%- endfor -%}\n {%- else -%}\n {{ raise_exception(\"Invalid content type\") }}\n {%- endif -%}\n {{ '<end_of_turn>\n' }}\n{%- endfor -%}\n{%- if add_generation_prompt -%}\n {{'<start_of_turn>model\n'}}\n{%- endif -%}\n"
3
+ }
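To make the template in `chat_template.json` easier to follow, here is a minimal sketch that mirrors its logic in plain Python. It is a hand-written approximation for reading purposes, not the Transformers implementation; the function name `render_gemma3_prompt` and the default `bos_token="<bos>"` (taken from `special_tokens_map.json`) are assumptions of this sketch.

```python
def render_gemma3_prompt(messages, add_generation_prompt=True, bos_token="<bos>"):
    """Hand-written mirror of the Jinja chat template (illustrative only)."""
    # A leading system message is not given its own turn; its text is
    # prepended to the first remaining turn, followed by a blank line.
    prefix = ""
    if messages and messages[0]["role"] == "system":
        content = messages[0]["content"]
        text = content if isinstance(content, str) else content[0]["text"]
        prefix = text + "\n\n"
        messages = messages[1:]

    parts = [bos_token]
    for index, message in enumerate(messages):
        # Turns must alternate user, assistant, user, ...
        if (message["role"] == "user") != (index % 2 == 0):
            raise ValueError(
                "Conversation roles must alternate user/assistant/user/assistant/..."
            )
        # The template renames the "assistant" role to "model".
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n")
        if index == 0:
            parts.append(prefix)
        content = message["content"]
        if isinstance(content, str):
            parts.append(content.strip())
        elif isinstance(content, list):
            for item in content:
                if item["type"] == "image":
                    parts.append("<start_of_image>")
                elif item["type"] == "text":
                    parts.append(item["text"].strip())
        else:
            raise ValueError("Invalid content type")
        parts.append("<end_of_turn>\n")

    if add_generation_prompt:
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


# The two messages from the README usage example.
messages = [
    {"role": "system", "content": [{"type": "text", "text": "당신은 유용한 AI 비서입니다."}]},
    {"role": "user", "content": [{"type": "text", "text": "대한민국의 수도는 어디인가요?"}]},
]
prompt = render_gemma3_prompt(messages)
print(repr(prompt))
```

Note that the system text ends up inside the first `user` turn, which is why the template has no separate system role.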
config.json ADDED
@@ -0,0 +1,64 @@
1
+ {
2
+ "architectures": [
3
+ "Gemma3ForConditionalGeneration"
4
+ ],
5
+ "boi_token_index": 255999,
6
+ "eoi_token_index": 256000,
7
+ "eos_token_id": [
8
+ 1,
9
+ 106
10
+ ],
11
+ "hidden_size": 5376,
12
+ "image_token_index": 262144,
13
+ "initializer_range": 0.02,
14
+ "mm_tokens_per_image": 256,
15
+ "model_type": "gemma3",
16
+ "text_config": {
17
+ "attention_bias": false,
18
+ "attention_dropout": 0.0,
19
+ "attn_logit_softcapping": null,
20
+ "cache_implementation": "hybrid",
21
+ "final_logit_softcapping": null,
22
+ "head_dim": 128,
23
+ "hidden_activation": "gelu_pytorch_tanh",
24
+ "hidden_size": 5376,
25
+ "initializer_range": 0.02,
26
+ "intermediate_size": 21504,
27
+ "max_position_embeddings": 131072,
28
+ "model_type": "gemma3_text",
29
+ "num_attention_heads": 32,
30
+ "num_hidden_layers": 62,
31
+ "num_key_value_heads": 16,
32
+ "query_pre_attn_scalar": 168,
33
+ "rms_norm_eps": 1e-06,
34
+ "rope_local_base_freq": 10000.0,
35
+ "rope_scaling": {
36
+ "factor": 8.0,
37
+ "rope_type": "linear"
38
+ },
39
+ "rope_theta": 1000000.0,
40
+ "sliding_window": 1024,
41
+ "sliding_window_pattern": 6,
42
+ "torch_dtype": "bfloat16",
43
+ "use_cache": true,
44
+ "vocab_size": 262208
45
+ },
46
+ "torch_dtype": "bfloat16",
47
+ "transformers_version": "4.51.3",
48
+ "use_cache": true,
49
+ "vision_config": {
50
+ "attention_dropout": 0.0,
51
+ "hidden_act": "gelu_pytorch_tanh",
52
+ "hidden_size": 1152,
53
+ "image_size": 896,
54
+ "intermediate_size": 4304,
55
+ "layer_norm_eps": 1e-06,
56
+ "model_type": "siglip_vision_model",
57
+ "num_attention_heads": 16,
58
+ "num_channels": 3,
59
+ "num_hidden_layers": 27,
60
+ "patch_size": 14,
61
+ "torch_dtype": "bfloat16",
62
+ "vision_use_head": false
63
+ }
64
+ }
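A few of the `text_config` values in `config.json` are easier to interpret once combined. The sketch below derives the attention projection widths and a plausible split between sliding-window and global-attention layers. The layer split assumes that every `sliding_window_pattern`-th layer uses global attention and the rest use the 1024-token sliding window; that convention is an assumption of this sketch, as is the helper name `layer_kinds`.

```python
# Values copied from the text_config block of config.json.
text_config = {
    "hidden_size": 5376,
    "head_dim": 128,
    "num_attention_heads": 32,
    "num_key_value_heads": 16,
    "num_hidden_layers": 62,
    "query_pre_attn_scalar": 168,
    "sliding_window": 1024,
    "sliding_window_pattern": 6,
}


def layer_kinds(num_layers, pattern):
    """Label each layer, assuming every `pattern`-th layer is global."""
    return [
        "global" if (i + 1) % pattern == 0 else "sliding"
        for i in range(num_layers)
    ]


kinds = layer_kinds(
    text_config["num_hidden_layers"], text_config["sliding_window_pattern"]
)
n_sliding = kinds.count("sliding")
n_global = kinds.count("global")

# Query and key/value projection widths (grouped-query attention).
q_width = text_config["num_attention_heads"] * text_config["head_dim"]
kv_width = text_config["num_key_value_heads"] * text_config["head_dim"]

print(f"sliding layers: {n_sliding}, global layers: {n_global}")
print(f"query width: {q_width}, key/value width: {kv_width}")
```

The query width (32 × 128) differs from `hidden_size`, and `query_pre_attn_scalar` equals `hidden_size / num_attention_heads` rather than `head_dim`.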
generation_config.json ADDED
@@ -0,0 +1,13 @@
1
+ {
2
+ "bos_token_id": 2,
3
+ "cache_implementation": "hybrid",
4
+ "do_sample": true,
5
+ "eos_token_id": [
6
+ 1,
7
+ 106
8
+ ],
9
+ "pad_token_id": 0,
10
+ "top_k": 64,
11
+ "top_p": 0.95,
12
+ "transformers_version": "4.51.3"
13
+ }
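The defaults in `generation_config.json` enable sampling with `top_k: 64` and `top_p: 0.95`, whereas the README example passes `do_sample=False` and so decodes greedily. The following is a minimal pure-Python sketch of top-k followed by nucleus (top-p) filtering on a toy distribution. It illustrates the idea only and is not the Transformers implementation; `top_k_top_p_filter` is a hypothetical helper.

```python
def top_k_top_p_filter(probs, top_k=64, top_p=0.95):
    """Return the indices that survive top-k, then top-p filtering."""
    # Keep the top_k most probable token indices, most probable first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalise over the survivors before applying the nucleus cut.
    total = sum(probs[i] for i in ranked)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / total
        if cumulative >= top_p:  # smallest prefix whose mass reaches top_p
            break
    return kept


toy_probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_top_p_filter(toy_probs, top_k=3, top_p=0.8))
print(top_k_top_p_filter(toy_probs, top_k=64, top_p=0.9))
```

Sampling then draws the next token from the surviving indices only, with their probabilities renormalised.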
model-00001-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7362853ece462bc312ec687be7017321b379d1692608406989c3cd881d44a7ff
3
+ size 4854573696
model-00002-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8d2913cd34e87ddfa6d542183ea911ea4d120b65c69bdb6901caca676c2dea25
3
+ size 4954792944
model-00003-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b5dce660d7a80292f2acfb1e33ffc9427b5ca33474287a0f9b6dc30e56bbd178
3
+ size 4954792976
model-00004-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dd44980260e7c20bf4d808e46db13fb2344df35c570a0cefbadba70630d2ba8d
3
+ size 4954793016
model-00005-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4bab80685b5a3c31e3a4596d5a6bb5b25ef4da6ef405831d395f31f62daf75ad
3
+ size 4954793016
model-00006-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7f03b2da4a318451e2520e783aed8a334c7bfca96b35bbbdc0e68e6aff4c0ae0
3
+ size 4954793016
model-00007-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ba66a7a7bb63becb9b58d040c2230f70ec0281e6c299c5853b4725684eb90e44
3
+ size 4954793016
model-00008-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b321aa48dacc3d6ab511a432687c5b838bc4ee3f1c1ee92c726a8161c09fe9bf
3
+ size 4954793016
model-00009-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0b2c7fe1eb69688216170faff8abe3dcedc428031e722510254a6752c02d14c6
3
+ size 4954793016
model-00010-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e673a4e2e939349674bf0c9f73d995b42a0c5b0d81318bd4a1a500dcc4a0784a
3
+ size 4954793016
model-00011-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:db314f3fe55b785b904da55460bc55ee8119f59abfbe77dc5510b609f2f2c557
3
+ size 4954793016
model-00012-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dc82510643460995e0dc9b180e5ea73d0b4ad65823fe1c729d0833dc060efd0b
3
+ size 462476696
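The `size` fields of the twelve Git LFS pointers above give the total checkpoint size, and from it a rough parameter count. The sketch below assumes every tensor is stored in bfloat16 (2 bytes per parameter, per `torch_dtype` in `config.json`) and ignores the small safetensors headers, so the result is an estimate only.

```python
# Shard sizes in bytes, copied from the LFS pointer files above.
shard_sizes = [
    4854573696,  # model-00001-of-00012
    4954792944,  # model-00002-of-00012
    4954792976,  # model-00003-of-00012
    4954793016,  # model-00004-of-00012
    4954793016,  # model-00005-of-00012
    4954793016,  # model-00006-of-00012
    4954793016,  # model-00007-of-00012
    4954793016,  # model-00008-of-00012
    4954793016,  # model-00009-of-00012
    4954793016,  # model-00010-of-00012
    4954793016,  # model-00011-of-00012
    462476696,   # model-00012-of-00012
]

BYTES_PER_PARAM = 2  # bfloat16 (assumed for all tensors)

total_bytes = sum(shard_sizes)
approx_params_billion = total_bytes / BYTES_PER_PARAM / 1e9

print(f"total size: {total_bytes / 1e9:.2f} GB")
print(f"approximate parameters: {approx_params_billion:.1f}B")
```

The estimate covers all stored weights, including the vision tower described by `vision_config`.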
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
preprocessor_config.json ADDED
@@ -0,0 +1,29 @@
1
+ {
2
+ "do_convert_rgb": null,
3
+ "do_normalize": true,
4
+ "do_pan_and_scan": null,
5
+ "do_rescale": true,
6
+ "do_resize": true,
7
+ "image_mean": [
8
+ 0.5,
9
+ 0.5,
10
+ 0.5
11
+ ],
12
+ "image_processor_type": "Gemma3ImageProcessor",
13
+ "image_seq_length": 256,
14
+ "image_std": [
15
+ 0.5,
16
+ 0.5,
17
+ 0.5
18
+ ],
19
+ "pan_and_scan_max_num_crops": null,
20
+ "pan_and_scan_min_crop_size": null,
21
+ "pan_and_scan_min_ratio_to_activate": null,
22
+ "processor_class": "Gemma3Processor",
23
+ "resample": 2,
24
+ "rescale_factor": 0.00392156862745098,
25
+ "size": {
26
+ "height": 896,
27
+ "width": 896
28
+ }
29
+ }
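With `do_rescale` and `do_normalize` both enabled, the values in `preprocessor_config.json` imply that each 8-bit pixel is first multiplied by `rescale_factor` (1/255) and then normalised with mean 0.5 and standard deviation 0.5. A minimal per-pixel sketch of that arithmetic, assuming the usual rescale-then-normalise order (`preprocess_pixel` is a hypothetical helper, not a library function):

```python
RESCALE_FACTOR = 0.00392156862745098  # 1 / 255, from preprocessor_config.json
IMAGE_MEAN = 0.5
IMAGE_STD = 0.5


def preprocess_pixel(value):
    """Map one 8-bit channel value to the normalised range."""
    rescaled = value * RESCALE_FACTOR           # [0, 255] -> [0, 1]
    return (rescaled - IMAGE_MEAN) / IMAGE_STD  # [0, 1]   -> [-1, 1]


for raw in (0, 255):
    print(raw, preprocess_pixel(raw))
```

Under these assumptions the full 0 to 255 input range maps onto roughly -1 to 1 for every channel.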
processor_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "image_seq_length": 256,
3
+ "processor_class": "Gemma3Processor"
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,36 @@
1
+ {
2
+ "additional_special_tokens": [
3
+ "<end_of_turn>"
4
+ ],
5
+ "boi_token": "<start_of_image>",
6
+ "bos_token": {
7
+ "content": "<bos>",
8
+ "lstrip": false,
9
+ "normalized": false,
10
+ "rstrip": false,
11
+ "single_word": false
12
+ },
13
+ "eoi_token": "<end_of_image>",
14
+ "eos_token": {
15
+ "content": "<eos>",
16
+ "lstrip": false,
17
+ "normalized": false,
18
+ "rstrip": false,
19
+ "single_word": false
20
+ },
21
+ "image_token": "<image_soft_token>",
22
+ "pad_token": {
23
+ "content": "<pad>",
24
+ "lstrip": false,
25
+ "normalized": false,
26
+ "rstrip": false,
27
+ "single_word": false
28
+ },
29
+ "unk_token": {
30
+ "content": "<unk>",
31
+ "lstrip": false,
32
+ "normalized": false,
33
+ "rstrip": false,
34
+ "single_word": false
35
+ }
36
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4667f2089529e8e7657cfb6d1c19910ae71ff5f28aa7ab2ff2763330affad795
3
+ size 33384568
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff