---
base_model: NousResearch/Llama-2-7b-hf
library_name: peft
license: mit
language:
  - en
pipeline_tag: text-generation
tags:
  - llama2
  - lora
  - instruction_tuning
  - backtranslation
  - self-alignment
  - llm-course
---

# Model Card for lora-backward-1000

7374-course-llm

## Model Description

**LoRA Backward Model (1000 samples)**

This model is a LoRA fine-tuned version of NousResearch/Llama-2-7b-hf, trained to predict the instruction (x) given the assistant response (y). It implements the backward-model training step from the paper:

*Self-Alignment with Instruction Backtranslation*
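At inference time, the backward model is prompted with a response and asked to complete the instruction. A minimal sketch of building such a prompt, assuming the pair format shown in the Dataset section below (the function name is ours; the commented loading code uses standard transformers/peft calls and the adapter repo id is inferred from this card):

```python
def build_backward_prompt(response: str) -> str:
    """Format an assistant response so the backward model can
    generate the instruction that would have produced it."""
    return (
        "### Output (y)\n"
        f"{response}\n"
        "\n"
        "### Instruction (x)\n"
    )


if __name__ == "__main__":
    # Hypothetical usage with transformers + peft (downloads 7B weights):
    # from transformers import AutoModelForCausalLM, AutoTokenizer
    # from peft import PeftModel
    # base = AutoModelForCausalLM.from_pretrained("NousResearch/Llama-2-7b-hf")
    # model = PeftModel.from_pretrained(base, "sijiasijia/lora-backward-1000")
    print(build_backward_prompt("Paris is the capital of France."))
```

The model's generation after `### Instruction (x)` is the predicted instruction x.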

## Dataset

Training examples come from timdettmers/openassistant-guanaco, from which (response, instruction) pairs are extracted and formatted as:

```
### Output (y)
<assistant's answer>

### Instruction (x)
<human's original question>
```
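The extraction step above can be sketched as follows. This is a hedged illustration, not the card author's script: it assumes guanaco-style records that interleave turns as `### Human: ...### Assistant: ...`, and the function name and splitting logic are ours.

```python
import re


def extract_backward_pairs(text: str):
    """Return (y, x) pairs from a guanaco-style transcript:
    assistant answer first, human question second (reversed for
    backward-model training)."""
    # Split the transcript into alternating role/content chunks.
    turns = re.split(r"### (Human|Assistant): ", text)[1:]
    roles = turns[0::2]
    contents = [c.strip() for c in turns[1::2]]
    pairs = []
    for i in range(len(roles) - 1):
        if roles[i] == "Human" and roles[i + 1] == "Assistant":
            # Reverse the pair: the backward model predicts x from y.
            pairs.append((contents[i + 1], contents[i]))
    return pairs


sample = ("### Human: What is the capital of France?"
          "### Assistant: Paris is the capital of France.")
print(extract_backward_pairs(sample))
# → [('Paris is the capital of France.', 'What is the capital of France?')]
```

Each extracted pair is then rendered into the `### Output (y)` / `### Instruction (x)` template shown above to form a training example.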