---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---

This model is uploaded for the final assignment of LLM 2024 (https://weblab.t.u-tokyo.ac.jp/lecture/course-list/large-language-model/). It will be deleted a certain period after the final assignment has been submitted. Use for any purpose other than evaluation of the final assignment is prohibited.

Starting from llm-jp-3-13b as the base, llm-jp-3-13b-instruct was merged in with the task_arithmetic method, using elyza-task100. The model works through RAG-based task classification and prompt engineering appropriate to each task. Inference is performed by adding this processing to "Model_Inference_Template_20241127.ipynb", distributed by the competition organizers. A .py file for running inference (verified to work on Google Colab) and the Excel file used for RAG are also uploaded.
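As a rough illustration of the RAG-based task classification and prompt selection described above, the sketch below embeds task examples from an Excel file and routes each input to the prompt template of its most similar task. This is a minimal sketch: the file name, column names, and embedding model are hypothetical placeholders, not the exact ones in the uploaded files.

```python
# Minimal sketch of RAG-based task classification + prompt selection.
# The file name, column names, and embedding model are assumptions for
# illustration; the actual Excel file and prompts are provided separately.
import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer

# Hypothetical knowledge base: one row per task type, with an example input
# and the prompt template to use for that task.
tasks = pd.read_excel("rag_tasks.xlsx")  # assumed columns: task, example, prompt_template

encoder = SentenceTransformer("intfloat/multilingual-e5-large")
task_vecs = encoder.encode(tasks["example"].tolist(), normalize_embeddings=True)

def build_prompt(user_input: str) -> str:
    """Classify the input by its nearest task example, then fill that task's template."""
    query_vec = encoder.encode([user_input], normalize_embeddings=True)[0]
    best = int(np.argmax(task_vecs @ query_vec))  # cosine similarity (unit-norm vectors)
    return tasks.loc[best, "prompt_template"].format(input=user_input)
```

The prompt built this way is then fed to the merged model inside the inference template notebook.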
# final_merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, with ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778 as the base.

### Models Merged

The following models were included in the merge:
* ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  int8_mask: 1.0
  normalize: 0.0
slices:
- sources:
  - layer_range: [0, 2]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.2951159694588346
  - layer_range: [0, 2]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [2, 4]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.628217046418056
  - layer_range: [2, 4]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [4, 6]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0422707547278394
  - layer_range: [4, 6]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [6, 8]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0683380976074854
  - layer_range: [6, 8]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [8, 10]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.26203994833534333
  - layer_range: [8, 10]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [10, 12]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.1263717498902737
  - layer_range: [10, 12]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [12, 14]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.887708708428289
  - layer_range: [12, 14]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [14, 16]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.2028184670045419
  - layer_range: [14, 16]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [16, 18]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.5253943623966824
  - layer_range: [16, 18]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [18, 20]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.9231084138587686
  - layer_range: [18, 20]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [20, 22]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0382986550795958
  - layer_range: [20, 22]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [22, 24]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0058822243315682
  - layer_range: [22, 24]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [24, 26]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0496562280234227
  - layer_range: [24, 26]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [26, 28]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.4546744316577644
  - layer_range: [26, 28]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [28, 30]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.7126849392596979
  - layer_range: [28, 30]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [30, 32]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.31595188025306903
  - layer_range: [30, 32]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [32, 34]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.2021087899996585
  - layer_range: [32, 34]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [34, 36]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.9651661068819831
  - layer_range: [34, 36]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [36, 38]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.8787595708487486
  - layer_range: [36, 38]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [38, 40]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.3036739676118799
  - layer_range: [38, 40]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
```
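For reference, a merge can be run from a configuration like the one above with mergekit's library API, roughly as shown in the mergekit README. This is a sketch, not the exact script used for this model: it assumes the configuration is saved as config.yaml and that the relative input-model paths point at local copies of the two models (the paths above come from the evolutionary-merge run and will not exist elsewhere).

```python
# Sketch of running the merge via mergekit's Python API (per the mergekit
# README); option names can differ between mergekit versions.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./final_merge",            # directory for the merged weights
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU when available
        copy_tokenizer=True,             # copy the base model's tokenizer
    ),
)
```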