
This model has been uploaded for the final assignment of LLM 2024 (https://weblab.t.u-tokyo.ac.jp/lecture/course-list/large-language-model/). It will be deleted some time after the final assignment has been submitted. Use for any purpose other than evaluation of the final assignment is prohibited.

With llm-jp-3-13b as the base, llm-jp-3-13b-instruct was merged in via task_arithmetic, using elyza-task100.

The model works through RAG-based task classification followed by prompt engineering appropriate to each task.
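
As a rough illustration (a minimal sketch, not the uploaded code): classify an incoming request by retrieving the most similar example from a task database, then fill in the matching prompt template. The file name, column names, task labels, and templates below are hypothetical stand-ins for whatever the uploaded Excel file actually contains.

```python
# Hedged sketch of RAG-based task classification with TF-IDF character
# n-grams. "rag_tasks.xlsx", the "example"/"task" columns, and the task
# labels are hypothetical placeholders.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

df = pd.read_excel("rag_tasks.xlsx")  # hypothetical file name
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(2, 3))
example_vecs = vectorizer.fit_transform(df["example"])

PROMPT_TEMPLATES = {  # hypothetical task labels and templates
    "summarization": "次の文章を要約してください。\n{input}",
    "qa": "次の質問に簡潔に答えてください。\n{input}",
}

def build_prompt(user_input: str) -> str:
    """Retrieve the most similar stored example and apply its task's template."""
    query_vec = vectorizer.transform([user_input])
    best = cosine_similarity(query_vec, example_vecs).argmax()
    task = df["task"].iloc[best]
    return PROMPT_TEMPLATES.get(task, "{input}").format(input=user_input)
```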

Inference is run with code based on "Model_Inference_Template_20241127.ipynb", distributed by the competition organizers, with the processing described above added on top.
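
A minimal sketch of the generation step with plain transformers, assuming a standard causal-LM setup; the actual notebook may differ, and the local model path is a placeholder:

```python
# Hedged inference sketch; "./final_merge" is a placeholder path to the
# merged checkpoint, not a published repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./final_merge"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "質問: 日本で一番高い山は何ですか?\n回答:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```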

The Python file used for execution (verified to work on Google Colab) and the Excel file used for RAG are also uploaded.


---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---

# final_merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).

## Merge Details

### Merge Method

This model was merged with the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, using ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778 as the base.
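
For background (a sketch of the method, following the task arithmetic paper): the merged weights are the base weights plus a scaled "task vector", i.e. the difference between the fine-tuned and base weights. With the per-slice weight $w$ from the configuration below, each layer slice is approximately:

$$
\theta_{\text{merged}} = \theta_{\text{base}} + w\,(\theta_{\text{instruct}} - \theta_{\text{base}})
$$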

### Models Merged

The following models were included in the merge:

- ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  int8_mask: 1.0
  normalize: 0.0
slices:
- sources:
  - layer_range: [0, 2]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.2951159694588346
  - layer_range: [0, 2]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [2, 4]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.628217046418056
  - layer_range: [2, 4]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [4, 6]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0422707547278394
  - layer_range: [4, 6]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [6, 8]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0683380976074854
  - layer_range: [6, 8]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [8, 10]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.26203994833534333
  - layer_range: [8, 10]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [10, 12]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.1263717498902737
  - layer_range: [10, 12]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [12, 14]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.887708708428289
  - layer_range: [12, 14]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [14, 16]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.2028184670045419
  - layer_range: [14, 16]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [16, 18]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.5253943623966824
  - layer_range: [16, 18]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [18, 20]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.9231084138587686
  - layer_range: [18, 20]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [20, 22]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0382986550795958
  - layer_range: [20, 22]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [22, 24]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0058822243315682
  - layer_range: [22, 24]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [24, 26]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.0496562280234227
  - layer_range: [24, 26]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [26, 28]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.4546744316577644
  - layer_range: [26, 28]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [28, 30]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.7126849392596979
  - layer_range: [28, 30]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [30, 32]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.31595188025306903
  - layer_range: [30, 32]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [32, 34]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 1.2021087899996585
  - layer_range: [32, 34]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [34, 36]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.9651661068819831
  - layer_range: [34, 36]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [36, 38]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.8787595708487486
  - layer_range: [36, 38]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
- sources:
  - layer_range: [38, 40]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b-instruct_3918994333
    parameters:
      weight: 0.3036739676118799
  - layer_range: [38, 40]
    model: ../evol_merge_storage/input_models/llm-jp-3-13b_2129051778
```
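
To reproduce a merge like this one, a minimal sketch using mergekit's Python API (assuming the MergeConfiguration/run_merge interface from mergekit's README; the relative input-model paths above must exist locally):

```python
# Hedged sketch: apply the YAML config above with mergekit. The config file
# name "merge_config.yaml" and output path "./final_merge" are placeholders.
import torch
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("merge_config.yaml", "r", encoding="utf-8") as fp:
    config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    config,
    out_path="./final_merge",
    options=MergeOptions(cuda=torch.cuda.is_available(), copy_tokenizer=True),
)
```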