Update README.md
#5 by csabakecskemeti - opened
Minor fix to the offline example code: the original was missing the model_name variable.
Also extended the comment to mention the tensor_parallel_size and pipeline_parallel_size options, which allow the model to be loaded across multiple GPUs when it does not fit in a single GPU's VRAM.
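For reference, here is a minimal sketch of what the fixed offline example might look like. It assumes the README example uses vLLM's `LLM` class, which this description does not state explicitly. The `model_name` value is a hypothetical placeholder, and `engine_kwargs`, `required_gpus`, and `run_example` are illustrative helper names, not part of the README or of vLLM.

```python
# Hypothetical placeholder: substitute the repository id used in the README.
model_name = "your-org/your-model"


def engine_kwargs(model_name, tensor_parallel_size=1, pipeline_parallel_size=1):
    """Collect keyword arguments for the engine constructor (illustrative helper)."""
    if tensor_parallel_size < 1 or pipeline_parallel_size < 1:
        raise ValueError("parallel sizes must be >= 1")
    return {
        "model": model_name,
        # Shards each layer's weights across this many GPUs.
        "tensor_parallel_size": tensor_parallel_size,
        # Splits the stack of layers into this many sequential stages.
        "pipeline_parallel_size": pipeline_parallel_size,
    }


def required_gpus(kwargs):
    """GPUs needed in total: one per (tensor shard, pipeline stage) pair."""
    return kwargs["tensor_parallel_size"] * kwargs["pipeline_parallel_size"]


def run_example(kwargs, prompt="Hello, my name is"):
    """Load the model and generate once. Assumes vLLM is installed."""
    # Imported lazily so the helpers above work without vLLM present.
    from vllm import LLM, SamplingParams

    llm = LLM(**kwargs)
    outputs = llm.generate([prompt], SamplingParams(max_tokens=32))
    return outputs[0].outputs[0].text
```

Calling `run_example(engine_kwargs(model_name, tensor_parallel_size=2))` would shard the model across two GPUs. Whether `pipeline_parallel_size` is accepted for offline inference may depend on the installed vLLM version, so it is left at 1 by default here.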
csabakecskemeti changed pull request status to closed