Does it run on a CPU instance in SageMaker (ml.m5.2xlarge)?
#2 by arviii
Hey, I am trying to deploy a model on a CPU instance (ml.m5.2xlarge) on SageMaker, but it overflows the storage. The best way to resolve this might be to mount a larger storage volume (EBS, I suppose). To do so, I should ideally pass volume_size=80 in the huggingface_model.deploy parameters, but that doesn't seem to work in my case and it still throws the same error about storage running out.
Model: https://huggingface.co/NumbersStation/nsql-llama-2-7B
Instance: ml.m5.2xlarge (it works perfectly fine on ml.g5.2xlarge)
error: "Error: Download
Error safetensors_rust.SafetensorError: Error while serializing: IoError(Os { code: 28, kind: StorageFull, message: ""No space left on device"" })"
Code:
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    # instance_type="ml.g5.2xlarge",
    instance_type="ml.m5.2xlarge",
    container_startup_health_check_timeout=300,
    volume_size=80,
)
The rest of the code is unchanged and works fine, since the same model deploys successfully on ml.g5.2xlarge.
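For context, a minimal sketch of what the omitted setup code might look like, assuming the model object is built with the Hugging Face LLM (TGI) container and the HF_MODEL_ID environment variable; the role lookup, container version, and env values are placeholders, not the poster's actual code:

# Minimal sketch (assumption): how huggingface_model could be defined before
# the deploy() call quoted above. Role, container version, and env values are
# placeholders, not the original poster's code.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes a SageMaker execution role is available

# Hugging Face TGI (LLM) container image; version is a placeholder
llm_image = get_huggingface_llm_image_uri("huggingface", version="1.0.3")

huggingface_model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env={
        "HF_MODEL_ID": "NumbersStation/nsql-llama-2-7B",  # model from the post
        "MAX_INPUT_LENGTH": "1024",   # placeholder generation limits
        "MAX_TOTAL_TOKENS": "2048",
    },
)

# Deploy call from the post. volume_size requests an 80 GB EBS volume and is
# only applied to instance types without local NVMe storage (e.g. ml.m5).
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.2xlarge",
    container_startup_health_check_timeout=300,
    volume_size=80,
)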
Thank you for sharing this information! It will be helpful for others who are interested in deploying on SageMaker.