[Cache Request] meta-llama/Meta-Llama-3-8B

#87
by sanctuaire21 - opened

Please add the following model to the neuron cache: meta-llama/Meta-Llama-3-8B

AWS Inferentia and Trainium org

The model is already available. Please check available cached configurations here: https://huggingface.co/aws-neuron/optimum-neuron-cache/blob/main/inference-cache-config/llama3.json
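For anyone who would rather check the file from a script than in the browser, here is a minimal sketch using only the Python standard library. It rests on two assumptions that are not stated in this thread: that the file can be downloaded by swapping `blob` for `resolve` in the URL above, and that the JSON is an object mapping model IDs to a list of cached configurations. The helper names `fetch_cache_config` and `cached_configs_for` are hypothetical, made up for this illustration.

```python
import json
import urllib.request

# Assumption: the "resolve" form of the URL from the reply serves the raw JSON file.
CONFIG_URL = (
    "https://huggingface.co/aws-neuron/optimum-neuron-cache/"
    "resolve/main/inference-cache-config/llama3.json"
)


def fetch_cache_config(url=CONFIG_URL):
    """Download and parse the cache configuration file (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def cached_configs_for(cache_config, model_id):
    """Return the list of configurations recorded for model_id, or [] if absent.

    Assumption: cache_config maps model IDs to a list of configuration dicts.
    A single dict is wrapped in a list so callers always get a list back.
    """
    entries = cache_config.get(model_id, [])
    return entries if isinstance(entries, list) else [entries]


if __name__ == "__main__":
    config = fetch_cache_config()
    for entry in cached_configs_for(config, "meta-llama/Meta-Llama-3-8B"):
        print(entry)
```

An empty result would mean either that the model is not listed in that file or that the file layout differs from what is assumed here, so it is worth opening the JSON once to confirm its shape.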

dacorvo changed discussion status to closed
