minyichen/Llama-3-Taiwan-70B-Instruct-GPTQ

minyichen committed on Jul 2

Commit 413551f • 1 Parent(s): 135f123

Update README.md

Files changed (1)

README.md +8 -10

@@ -13,8 +13,6 @@ tags:
 - llama-3
 ---
-<img src="https://cdn-uploads.huggingface.co/production/uploads/5df9c78eda6d0311fd3d541f/vlfv5sHbt4hBxb3YwULlU.png" alt="Taiwan LLM Logo" width="600" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
 # Llama-3-Taiwan-70B-Instruct - GPTQ
 - Model creator: [Yen-Ting Lin](https://huggingface.co/yentinglin)
 - Original model: [Llama-3-Taiwan-70B-Instruct](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct)
@@ -24,19 +22,19 @@ tags:
 This repo contains GPTQ model files for [Llama-3-Taiwan-70B-Instruct](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct).
-Multiple GPTQ parameter permutations are provided; see Provided Files below for details of the options provided, their parameters, and the software used to create them.
 <!-- description end -->
 <!-- repositories-available start -->
 * [GPTQ models for GPU inference](minyichen/Llama-3-Taiwan-70B-Instruct-GPTQ)
 * [Yen-Ting Lin's original unquantized  model](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct)
 <!-- repositories-available end -->
-<!-- prompt-template start -->
-## Prompt template: Vicuna
-```
-A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {prompt} ASSISTANT:
-```
-<!-- prompt-template end -->

+## Quantization parameter
+- Bits : 4
+- Group Size : 128
+- Act Order : Yes
+- Damp % : 0.01
+- Seq Len : 2048
+- Size : 37.07 GB
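
For readers who want to use the parameters added above programmatically, here is a minimal sketch in Python. The dictionary keys (`bits`, `group_size`, `desc_act`, `damp_percent`) follow the naming commonly seen in AutoGPTQ-style `quantize_config.json` files; that mapping is an assumption, since the README does not say which tool produced the files. The helper `effective_bits_per_weight` is hypothetical and only gives a rough estimate, assuming one 16-bit scale and one `bits`-wide zero point stored per group.

```python
# Quantization parameters from the README, expressed as a plain dict.
# Key names are an assumption (AutoGPTQ-style); the README does not name the tool.
quantize_config = {
    "bits": 4,             # "Bits : 4"
    "group_size": 128,     # "Group Size : 128"
    "desc_act": True,      # "Act Order : Yes"
    "damp_percent": 0.01,  # "Damp % : 0.01"
}
calibration_seq_len = 2048  # "Seq Len : 2048"


def effective_bits_per_weight(bits, group_size, scale_bits=16, zero_bits=None):
    """Rough storage cost per quantized weight, including per-group metadata.

    Assumes each group stores one scale (scale_bits) and one zero point
    (zero_bits, defaulting to the weight bit width). Hypothetical helper.
    """
    if zero_bits is None:
        zero_bits = bits
    return bits + (scale_bits + zero_bits) / group_size


if __name__ == "__main__":
    est = effective_bits_per_weight(
        quantize_config["bits"], quantize_config["group_size"]
    )
    print(f"approx. bits per quantized weight: {est}")
```

Under these assumptions, the per-group metadata adds only a small fraction of a bit per weight, which is why a group size of 128 is a common trade-off between accuracy and file size. Layers left unquantized (for example embeddings) are not covered by this estimate.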