---
base_model: NousResearch/Llama-2-7b-hf
library_name: peft
license: mit
language:
- en
pipeline_tag: text-generation
tags:
- llama2
- lora
- instruction_tuning
- backtranslation
- self-alignment
- llm-course
---
# Model Card for 7374-course-llm
## Model Description

**LoRA Backward Model (1000 samples)**

This model is a LoRA-finetuned version of [NousResearch/Llama-2-7b-hf](https://huggingface.co/NousResearch/Llama-2-7b-hf), trained to predict the instruction (x) given the assistant response (y). This is the backward-model step of instruction backtranslation for self-alignment.
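A minimal inference sketch using `transformers` and `peft` is shown below. The adapter repo id `user/7374-course-llm` is a placeholder (swap in the actual path), and the prompt template mirrors the training format described in the Dataset section:

```python
def build_backward_prompt(response: str) -> str:
    """Format an assistant response (y) so the backward model can
    generate the instruction (x) that likely produced it."""
    return f"### Output (y)\n{response}\n\n### Instruction (x)\n"


if __name__ == "__main__":
    # Heavy dependencies imported lazily so the prompt helper stays standalone.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-hf")
    base = AutoModelForCausalLM.from_pretrained("NousResearch/Llama-2-7b-hf")
    # Adapter path is an assumption; replace with the published repo id.
    model = PeftModel.from_pretrained(base, "user/7374-course-llm")

    prompt = build_backward_prompt("Paris is the capital of France.")
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```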
## Dataset

Training pairs are extracted from [timdettmers/openassistant-guanaco](https://huggingface.co/datasets/timdettmers/openassistant-guanaco) and formatted as:
```
### Output (y)
<assistant's answer>

### Instruction (x)
<human's original question>
```