Tags: Japanese · llama · causal-lm

This repo contains a low-rank adapter (LoRA) for LLaMA-13b, fine-tuned on the llm-japanese-dataset dataset.

You can try the adapter in the demo Space at https://huggingface.co/spaces/izumi-lab/llama-13b-japanese-lora-v0-1ep

This version of the weights was trained with the following hyperparameters (a rough configuration sketch follows the list):

  • Epochs: 1
  • Batch size: 130
  • Cutoff length: 256
  • Learning rate: 3e-4
  • Lora r: 4
  • Lora target modules: q_proj, v_proj
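The list above roughly corresponds to a PEFT `LoraConfig` like the one below. This is only a sketch: `lora_alpha` and `lora_dropout` are assumed values that are not stated in this card, and the actual training script is not included in this repo.

```python
# Hedged sketch of the LoRA setup implied by the hyperparameters above.
# lora_alpha and lora_dropout are assumptions, not published values.
import torch
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model

base_model = "decapoda-research/llama-13b-hf"
model = LlamaForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)

lora_config = LoraConfig(
    r=4,                                  # Lora r
    target_modules=["q_proj", "v_proj"],  # Lora target modules
    lora_alpha=16,                        # assumed; not stated in this card
    lora_dropout=0.05,                    # assumed; not stated in this card
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Training itself (not shown here) used 1 epoch, batch size 130,
# cutoff length 256, and learning rate 3e-4, per the list above.
```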
This adapter can be loaded on top of the base LLaMA-13b model with Transformers and PEFT:

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_model = "decapoda-research/llama-13b-hf"
# Please note that the special license of decapoda-research/llama-13b-hf applies.

# Load the base model and tokenizer in float16.
model = LlamaForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
tokenizer = LlamaTokenizer.from_pretrained(base_model)

# Apply the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(
    model,
    "izumi-lab/llama-13b-japanese-lora-v0",
    torch_dtype=torch.float16,
)
```
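Once the adapter is loaded, text can be generated in the usual way. The snippet below is a minimal sketch, assuming a CUDA GPU is available and using a plain Japanese prompt; the card does not document a specific prompt template, so none is applied here.

```python
# Minimal generation sketch (assumptions: a CUDA GPU is available and no
# particular prompt template is required; neither is stated in this card).
model.to("cuda")
model.eval()

prompt = "日本の首都はどこですか？"  # "What is the capital of Japan?"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```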

For the latest information, please visit llm.msuzuki.me.

Details

Citation:

@preprint{Hirano2023-llmj,
  title={{llm-japanese-dataset v0: Construction of Japanese Chat Dataset for Large Language Models and its Methodology}},
  author={Masanori HIRANO and Masahiro SUZUKI and Hiroki SAKAJI},
  doi={10.48550/arXiv.2305.12720},
  archivePrefix={arXiv},
  arxivId={2305.12720},
  year={2023}
}

If you have any inquiries, such as joint research, data provision, or other types of support, please email [email protected] .
