This model has been uploaded for the final assignment of LLM 2024 (https://weblab.t.u-tokyo.ac.jp/lecture/course-list/large-language-model/). It will be deleted some time after the final assignment has been submitted. Use for any purpose other than evaluating the final assignment is prohibited.

This model was created by merging llm-jp-3-13b (base) with llm-jp-3-13b-instruct via task_arithmetic, using elyza-task100 in the merging process.

At inference time, the model relies on RAG-based task classification followed by prompt engineering appropriate to the classified task (a sketch of this flow follows below).

The inference code is based on "Model_Inference_Template_20241127.ipynb", distributed by the competition organizers, with the processing described above added on top.

A .py file for execution (verified to run on Google Colab) and the Excel file used for RAG are also uploaded.
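As a rough illustration of that classify-then-prompt flow, the following is a minimal sketch. The file name, column names, and prompt templates are assumptions made for the sketch, not the actual uploaded code; the real implementation may use neural sentence embeddings rather than TF-IDF for retrieval.

```python
# Minimal sketch of RAG-style task classification + prompt selection.
# File name, column names, and templates are illustrative assumptions.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Labeled reference examples, e.g. from the uploaded RAG Excel file.
ref = pd.read_excel("tasks.xlsx")  # assumed columns: "example", "task_type"
vectorizer = TfidfVectorizer()
ref_vecs = vectorizer.fit_transform(ref["example"])

# Hypothetical task-specific prompt templates.
TEMPLATES = {
    "summarization": "次の文章を簡潔に要約してください。\n{input}",
    "reasoning": "次の問題について段階的に考え、最後に答えを述べてください。\n{input}",
    "default": "次の指示に答えてください。\n{input}",
}

def build_prompt(question: str) -> str:
    # Retrieve the most similar reference example and reuse its task label.
    q_vec = vectorizer.transform([question])
    best = cosine_similarity(q_vec, ref_vecs).argmax()
    task = ref["task_type"].iloc[best]
    return TEMPLATES.get(task, TEMPLATES["default"]).format(input=question)
```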
---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---
# final_merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).

## Merge Details

### Merge Method
This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, with ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778 as the base.
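Task arithmetic forms a "task vector" by subtracting the base model's weights from the fine-tuned model's weights, scales it, and adds it back onto the base; in the configuration below, each `weight` scales the instruct model's task vector for one two-layer slice. A minimal per-tensor sketch of the idea (not mergekit's actual implementation):

```python
import torch

def task_arithmetic_merge(base: torch.Tensor, tuned: torch.Tensor, weight: float) -> torch.Tensor:
    """Apply a scaled task vector: merged = base + weight * (tuned - base)."""
    task_vector = tuned - base  # what fine-tuning changed
    return base + weight * task_vector
```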
### Models Merged

The following models were included in the merge:

- ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
### Configuration

The following YAML configuration was used to produce this model:
```yaml
base_model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  int8_mask: 1.0
  normalize: 0.0
slices:
- sources:
  - layer_range: [0, 2]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.2951159694588346
  - layer_range: [0, 2]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [2, 4]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.628217046418056
  - layer_range: [2, 4]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [4, 6]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0422707547278394
  - layer_range: [4, 6]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [6, 8]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0683380976074854
  - layer_range: [6, 8]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [8, 10]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.26203994833534333
  - layer_range: [8, 10]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [10, 12]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.1263717498902737
  - layer_range: [10, 12]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [12, 14]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.887708708428289
  - layer_range: [12, 14]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [14, 16]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.2028184670045419
  - layer_range: [14, 16]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [16, 18]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.5253943623966824
  - layer_range: [16, 18]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [18, 20]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.9231084138587686
  - layer_range: [18, 20]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [20, 22]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0382986550795958
  - layer_range: [20, 22]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [22, 24]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0058822243315682
  - layer_range: [22, 24]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [24, 26]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0496562280234227
  - layer_range: [24, 26]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [26, 28]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.4546744316577644
  - layer_range: [26, 28]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [28, 30]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.7126849392596979
  - layer_range: [28, 30]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [30, 32]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.31595188025306903
  - layer_range: [30, 32]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [32, 34]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.2021087899996585
  - layer_range: [32, 34]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [34, 36]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.9651661068819831
  - layer_range: [34, 36]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [36, 38]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.8787595708487486
  - layer_range: [36, 38]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [38, 40]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.3036739676118799
  - layer_range: [38, 40]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
```
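For reference, a configuration like the one above can also be applied programmatically. The following is a minimal sketch assuming mergekit is installed and the referenced input-model paths exist; the API shown here may differ between mergekit versions.

```python
# Minimal sketch of applying the configuration above with mergekit's Python API.
# Assumes the YAML above is saved as config.yaml and the model paths exist;
# API details may vary across mergekit versions.
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    out_path="./final_merge",  # output directory for the merged model
    options=MergeOptions(copy_tokenizer=True),
)
```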