nmitchko/medfalcon-40b-lora

nmitchko committed on Jun 14, 2023

Commit 496644d • 1 Parent(s): aeef31f

Create README.md

Files changed (1)

README.md +76 -0

README.md ADDED

	@@ -0,0 +1,76 @@

+---
+language:
+- en
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- medical
+license: cc-by-nc-3.0
+---
+# MedFalcon 40b LoRA
+## Model Description
+### Architecture
+`nmitchko/medfalcon-40b-lora` is a LoRA adapter for a large language model, fine-tuned for medical domain tasks.
+It is based on [`Falcon-40b-instruct`](https://huggingface.co/tiiuae/falcon-40b-instruct/), which has 40 billion parameters.
+Its primary goal is to improve performance on medical question answering and medical dialogue tasks.
+It was trained using [LoRA](https://arxiv.org/abs/2106.09685), specifically [QLoRA](https://github.com/artidoro/qlora), to reduce the memory footprint.
+> This LoRA supports 4-bit and 8-bit modes.
+### Requirements
+```
+bitsandbytes>=0.39.0
+peft
+transformers
+```
+Steps to load this model:
+1. Load the base model using QLoRA (4-bit or 8-bit quantization)
+2. Apply the LoRA using peft
+```python
+# Load the quantized base model, then attach the LoRA adapter with peft.
+import torch
+import transformers
+from peft import PeftModel
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Hub repository IDs must not end with a trailing slash.
+base_model = "tiiuae/falcon-40b-instruct"
+lora = "nmitchko/medfalcon-40b-lora"
+load_8bit = True  # load the base model in 8-bit mode via bitsandbytes
+tokenizer = AutoTokenizer.from_pretrained(base_model)
+model = AutoModelForCausalLM.from_pretrained(
+    base_model,
+    load_in_8bit=load_8bit,
+    torch_dtype=torch.float16,
+    trust_remote_code=True,
+    device_map="auto",  # place the quantized weights on the available devices
+)
+# Wrap the base model with the LoRA weights.
+model = PeftModel.from_pretrained(model, lora)
+# The model is already loaded and placed, so the pipeline only needs it and the tokenizer.
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model,
+    tokenizer=tokenizer,
+)
+sequences = pipeline(
+    "What does the drug ceftriaxone do?\nDoctor:",
+    max_length=200,
+    do_sample=True,
+    top_k=40,
+    num_return_sequences=1,
+    eos_token_id=tokenizer.eos_token_id,
+)
+for seq in sequences:
+    print(f"Result: {seq['generated_text']}")
+```
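Since the LoRA is described as supporting both 4-bit and 8-bit modes, the following is a minimal sketch of how the mode could be made selectable. The helper names `quantization_kwargs` and `load_medfalcon` are hypothetical and not part of the original README. The sketch assumes a `transformers` release that accepts `load_in_4bit` and `load_in_8bit` directly in `from_pretrained`; newer releases may prefer passing a `BitsAndBytesConfig` instead.

```python
# Sketch: choose 4-bit or 8-bit loading for the base model before applying the LoRA.
# The helper names below are illustrative assumptions, not part of the original README.

BASE_MODEL = "tiiuae/falcon-40b-instruct"
LORA = "nmitchko/medfalcon-40b-lora"


def quantization_kwargs(mode: str) -> dict:
    """Return the from_pretrained keyword arguments for the requested mode."""
    if mode == "4bit":
        return {"load_in_4bit": True}
    if mode == "8bit":
        return {"load_in_8bit": True}
    raise ValueError(f"unsupported mode: {mode!r} (expected '4bit' or '8bit')")


def load_medfalcon(mode: str = "4bit"):
    """Load the quantized base model and attach the LoRA adapter."""
    # Imported lazily so quantization_kwargs can be used without the heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL,
        torch_dtype=torch.float16,
        trust_remote_code=True,
        device_map="auto",
        **quantization_kwargs(mode),
    )
    return tokenizer, PeftModel.from_pretrained(base, LORA)
```

Calling `load_medfalcon("8bit")` would mirror the 8-bit example above, while `load_medfalcon("4bit")` trades some precision for a smaller memory footprint.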