language:
  - en
license: apache-2.0
library_name: peft
tags:
  - generated_from_trainer
base_model: mistralai/Mistral-7B-Instruct-v0.2
datasets:
  - jiacheng-ye/nl2bash
model-index:
  - name: Mistral 7B NL2BASH Agent
    results: []

Mistral 7B NL2BASH Agent

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2 on the jiacheng-ye/nl2bash dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5952

Model description

Mistral 7B NL2BASH Agent is a PEFT adapter for mistralai/Mistral-7B-Instruct-v0.2 that converts natural language queries into Linux commands. Given a request in plain English, it generates the corresponding shell command.

Intended uses & limitations

Intended uses:

  • Generating Linux commands from natural language queries.
  • Helping users draft complex Linux commands quickly.

Limitations:

  • Output quality varies with the complexity and specificity of the natural language query.
  • Edge cases and uncommon scenarios may not be handled correctly.

Installation

pip install transformers accelerate torch bitsandbytes peft  

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch
from peft import PeftModel

# Hugging Face access token with read permission
read_token = "YOUR HUGGINGFACE TOKEN"

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized base model
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    device_map="auto",
    quantization_config=nf4_config,
    use_cache=False,
    token=read_token,
)

# Attach the fine-tuned adapter weights to the base model
model = PeftModel.from_pretrained(
    model,
    "pranay-j/mistral-7b-nl2bash-agent",
    device_map="auto",
    token=read_token,
)

tokenizer = AutoTokenizer.from_pretrained("pranay-j/mistral-7b-nl2bash-agent", add_eos_token=False)

nl = 'Add "execute" to the permissions of all directories in the home directory tree'
prompt = f"[INST] {nl} [/INST]"

# Tokenize and move input_ids and attention_mask to the model's device
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    # Greedy decoding. To sample instead, pass do_sample=True together with
    # top_p=0.5 and temperature=0.7; without do_sample=True they are ignored.
    out = model.generate(**inputs, max_new_tokens=30)

# Decode only the newly generated tokens (everything after the prompt).
# Pass skip_special_tokens=True to drop the trailing </s>.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:]))
# Output: find ~ -type d -exec chmod +x {} </s>
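
The prompt format and the trailing `</s>` in the sample output above can be handled by two small helpers. This is a minimal sketch rather than code from the model repository: `build_prompt` and `clean_completion` are hypothetical names, and both functions only manipulate strings, so they run without loading the model.

```python
EOS_TOKEN = "</s>"  # end-of-sequence marker seen in the sample output above


def build_prompt(nl_query: str) -> str:
    """Wrap a natural language query in the [INST] ... [/INST] format used above."""
    return f"[INST] {nl_query.strip()} [/INST]"


def clean_completion(decoded: str) -> str:
    """Keep the text before the first end-of-sequence marker and trim whitespace."""
    return decoded.split(EOS_TOKEN, 1)[0].strip()


if __name__ == "__main__":
    print(build_prompt("List all files in the current directory"))
    print(clean_completion("find ~ -type d -exec chmod +x {} </s>"))
```

In the usage code, `build_prompt(nl)` would replace the f-string, and `clean_completion` would be applied to the result of `tokenizer.decode(...)`.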

Training and evaluation data

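The model was fine-tuned on the jiacheng-ye/nl2bash dataset. More information is needed on how the training and evaluation splits were built.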

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2.5e-05
  • train_batch_size: 10
  • eval_batch_size: 10
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 40
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 5
  • num_epochs: 2
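
The relationship between these values can be checked with a short sketch. It assumes training on a single device (which is consistent with 10 × 4 = 40) and a total of 404 optimizer steps, taken from the training results table. The function names are illustrative. `linear_lr` follows the usual linear-warmup, linear-decay formula and is not code extracted from the training run.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, base_lr: float = 2.5e-5, warmup_steps: int = 5, total_steps: int = 404) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    # train_batch_size=10, gradient_accumulation_steps=4, one device (assumed)
    print("total_train_batch_size:", effective_batch_size(10, 4))
    for step in (0, 5, 202, 404):
        print(f"step {step:3d}: lr = {linear_lr(step):.3e}")
```

With these settings the learning rate peaks at 2.5e-05 after the 5 warmup steps and decays to zero at the final step.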

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 1.6136        | 1.0   | 202  | 1.6451          |
| 1.5448        | 2.0   | 404  | 1.5952          |
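
Loss values like these are often easier to read as perplexity. The sketch below assumes the reported losses are mean per-token cross-entropy in nats, which the card does not state explicitly. The `perplexity` helper is illustrative.

```python
import math

# (epoch, step, training_loss, validation_loss) copied from the table above
RESULTS = [
    (1.0, 202, 1.6136, 1.6451),
    (2.0, 404, 1.5448, 1.5952),
]


def perplexity(loss: float) -> float:
    """Perplexity of a mean cross-entropy loss measured in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    for epoch, step, train_loss, val_loss in RESULTS:
        print(f"epoch {epoch:.0f} (step {step}): validation perplexity = {perplexity(val_loss):.2f}")
```

Under that assumption, validation perplexity falls from about 5.18 after the first epoch to about 4.93 after the second.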

Framework versions

  • PEFT 0.10.0
  • Transformers 4.39.3
  • PyTorch 2.1.2
  • Datasets 2.18.0
  • Tokenizers 0.15.2
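
To compare a local environment against the versions listed above, a small check with the standard library's `importlib.metadata` may help. This is a sketch: the `PINNED` mapping and `compare_versions` are illustrative names, and it assumes PyTorch is installed under the distribution name `torch`. Local builds of PyTorch may report a suffix such as `+cu121`, so an exact string match is not always expected.

```python
from importlib import metadata

# Versions listed in this model card, keyed by pip distribution name
PINNED = {
    "peft": "0.10.0",
    "transformers": "4.39.3",
    "torch": "2.1.2",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def compare_versions(pinned: dict) -> dict:
    """Map each package to (listed_version, installed_version or None if missing)."""
    report = {}
    for package, wanted in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        report[package] = (wanted, installed)
    return report


if __name__ == "__main__":
    for package, (wanted, installed) in compare_versions(PINNED).items():
        print(f"{package}: listed {wanted}, installed {installed or 'not installed'}")
```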