Model Card for Model ID

This model is an HF optimum 0.0.28 (AWS Neuron SDK 2.20.2)'s compiled verson, of the Korean fine-tuned model Qwen/Qwen2.5-7B-Instruct , available at https://huggingface.co/Qwen/Qwen2.5-7B-Instruct. It is intended for deployment on Amazon EC2 Inferentia2 and Amazon SageMaker.

Model Details

This model is compiled with HF optimum 0.0.28, neuronx-cc version: 2.15.143 v1.2-hf-tgi-0.0.28-pt-2.1.2-inf-neuronx-py310 Please refer to a guide at https://github.com/aws-samples/aws-ai-ml-workshop-kr/tree/master/neuron/hf-optimum/04-Deploy-Qwen-25-8B-Llama3-8B-HF-TGI-Docker-On-INF2

Hardware

At a minimum hardware, you can use Amazon EC2 inf2.xlarge and more powerful family such as inf2.8xlarge, inf2.24xlarge and inf2.48xlarge and them at SageMaker Inference endpoing. The detailed information is Amazon EC2 Inf2 Instances

Model Card Contact

Gonsoo Moon, [email protected]

Downloads last month
3
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for Gonsoo/AWS-HF-optimum-neuron-0-0-28-Qwen2.5-7B-Instruct

Base model

Qwen/Qwen2.5-7B
Finetuned
(2476)
this model