mjbuehler committed on
Commit 817a93b · verified · 1 Parent(s): e5b3c4a

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -266,7 +266,7 @@ If your GPU allows, load and run inference in half precision (`torch.float16` or

  ```diff
  model = AutoModelForVision2Seq.from_pretrained(
- "lamm-mit/Cephalo-Idefics-2-vision-8b-beta",
+ "lamm-mit/Cephalo-Idefics-2-vision-10b-alpha",
  + torch_dtype=torch.float16,
  ).to(DEVICE)
  ```
@@ -287,7 +287,7 @@ Mke sure to install `flash-attn`. Refer to the [original repository of Flash Att

  ```diff
  model = AutoModelForVision2Seq.from_pretrained(
- "lamm-mit/Cephalo-Idefics-2-vision-8b-beta",
+ "lamm-mit/Cephalo-Idefics-2-vision-10b-alpha",
  + torch_dtype=torch.bfloat16,
  + _attn_implementation="flash_attention_2",
  ).to(DEVICE)
@@ -298,7 +298,7 @@ model = AutoModelForVision2Seq.from_pretrained(
  **4 bit quantization with bitsandbytes**

  <details><summary>Click to expand.</summary>
- It is possible to load Idefics2 in 4bits with `bitsandbytes`. Make sure that you have `accelerate` and `bitsandbytes` installed.
+ It is possible to load Cephalo-Idefics-2-vision-10b-alpha in 4bits with `bitsandbytes`. Make sure that you have `accelerate` and `bitsandbytes` installed.

  ```diff
  + from transformers import BitsAndBytesConfig
@@ -310,7 +310,7 @@ quantization_config = BitsAndBytesConfig(
  bnb_4bit_compute_dtype=torch.bfloat16
  )
  model = AutoModelForVision2Seq.from_pretrained(
- "lamm-mit/Cephalo-Idefics-2-vision-8b-beta",
+ "lamm-mit/Cephalo-Idefics-2-vision-10b-alpha",
  + torch_dtype=torch.bfloat16,
  + quantization_config=quantization_config,
  ).to(DEVICE)
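
For reference, the three loading variants touched by this commit could be collected in one place, which may make the next model rename a one-line change. This is only a sketch: `MODEL_ID`, `build_load_kwargs`, and `load_model` are hypothetical names that do not appear in the README, and `load_in_4bit=True` is an assumption, since the hunks above show only `bnb_4bit_compute_dtype` of the quantization config.

```python
# Hypothetical helper, not part of the README: one model ID, three loading modes.
MODEL_ID = "lamm-mit/Cephalo-Idefics-2-vision-10b-alpha"


def build_load_kwargs(mode: str) -> dict:
    """Return from_pretrained keyword arguments, with dtypes given by name."""
    if mode == "fp16":
        return {"torch_dtype": "float16"}
    if mode == "flash":
        return {
            "torch_dtype": "bfloat16",
            "_attn_implementation": "flash_attention_2",
        }
    if mode == "4bit":
        return {
            "torch_dtype": "bfloat16",
            # Assumption: load_in_4bit is not visible in the diff hunks.
            "quantization": {
                "load_in_4bit": True,
                "bnb_4bit_compute_dtype": "bfloat16",
            },
        }
    raise ValueError(f"unknown mode: {mode!r}")


def load_model(mode: str, device: str = "cuda"):
    """Load the model in the given mode; needs torch and transformers installed."""
    import torch
    from transformers import AutoModelForVision2Seq, BitsAndBytesConfig

    kwargs = build_load_kwargs(mode)
    # Resolve dtype names to torch dtypes only when a model is actually loaded.
    kwargs["torch_dtype"] = getattr(torch, kwargs["torch_dtype"])
    quant = kwargs.pop("quantization", None)
    if quant is not None:
        quant["bnb_4bit_compute_dtype"] = getattr(torch, quant["bnb_4bit_compute_dtype"])
        kwargs["quantization_config"] = BitsAndBytesConfig(**quant)
        # Deviation from the diff: some transformers versions reject .to() on
        # bitsandbytes-quantized models, so placement is left to the loader here.
        return AutoModelForVision2Seq.from_pretrained(MODEL_ID, **kwargs)
    return AutoModelForVision2Seq.from_pretrained(MODEL_ID, **kwargs).to(device)
```

Calling `load_model("fp16")` would correspond to the first hunk, `load_model("flash")` to the second (with `flash-attn` installed), and `load_model("4bit")` to the last two (with `accelerate` and `bitsandbytes` installed).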