make docker command more robust (#861)
Browse files* make docker command more robust
* update readme with more info
README.md
CHANGED
|
@@ -127,13 +127,15 @@ accelerate launch -m axolotl.cli.inference examples/openllama-3b/lora.yml \
|
|
| 127 |
A more powerful Docker command to run would be this:
|
| 128 |
|
| 129 |
```bash
|
| 130 |
-
docker run --gpus '"all"' --rm -it --name axolotl --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --mount type=volume,src=axolotl,target=/workspace/axolotl -v ${HOME}/.cache/huggingface:/root/.cache/huggingface winglian/axolotl:main-py3.10-cu118-2.0.1
|
| 131 |
```
|
| 132 |
|
| 133 |
It additionally:
|
| 134 |
* Prevents memory issues when running e.g. deepspeed (e.g. you could hit SIGBUS/signal 7 error) through `--ipc` and `--ulimit` args.
|
| 135 |
* Persists the downloaded HF data (models etc.) and your modifications to axolotl code through `--mount`/`-v` args.
|
| 136 |
* The `--name` argument simply makes it easier to refer to the container in vscode (`Dev Containers: Attach to Running Container...`) or in your terminal.
|
|
|
|
|
|
|
| 137 |
|
| 138 |
[More information on nvidia website](https://docs.nvidia.com/deeplearning/frameworks/user-guide/index.html#setincshmem)
|
| 139 |
|
|
|
|
| 127 |
A more powerful Docker command to run would be this:
|
| 128 |
|
| 129 |
```bash
|
| 130 |
+
docker run --privileged --gpus '"all"' --shm-size 10g --rm -it --name axolotl --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --mount type=volume,src=axolotl,target=/workspace/axolotl -v ${HOME}/.cache/huggingface:/root/.cache/huggingface winglian/axolotl:main-py3.10-cu118-2.0.1
|
| 131 |
```
|
| 132 |
|
| 133 |
It additionally:
|
| 134 |
* Prevents memory issues when running e.g. deepspeed (e.g. you could hit SIGBUS/signal 7 error) through `--ipc` and `--ulimit` args.
|
| 135 |
* Persists the downloaded HF data (models etc.) and your modifications to axolotl code through `--mount`/`-v` args.
|
| 136 |
* The `--name` argument simply makes it easier to refer to the container in vscode (`Dev Containers: Attach to Running Container...`) or in your terminal.
|
| 137 |
+
* The `--privileged` flag gives all capabilities to the container.
|
| 138 |
+
* The `--shm-size 10g` argument increases the shared memory size. Use this if you see `exitcode: -7` errors using deepspeed.
|
| 139 |
|
| 140 |
[More information on nvidia website](https://docs.nvidia.com/deeplearning/frameworks/user-guide/index.html#setincshmem)
|
| 141 |
|