qaihm-bot committed on
Commit 4199d0d · verified · 1 Parent(s): aca534a

Upload README.md with huggingface_hub

  1. README.md +3 -3
README.md CHANGED
@@ -48,9 +48,9 @@ accross various devices, can be found [here](https://aihub.qualcomm.com/models/b
  - Response Rate: Rate of response generation after the first response token.
  - Tiny MMLU: Tiny MMLU (Massive Multitask Language Understanding) is an English language benchmark designed to measure knowledge acquired during pretraining by evaluating models exclusively in zero-shot and few-shot settings. This makes the benchmark more challenging and more similar to how we evaluate humans.
 
- | Model | Device | Chipset | Target Runtime | Response Rate (tokens per second) | Time To First Token (range, seconds) | Tiny MMLU |
- |---|---|---|---|---|---|---|
- | Baichuan2-7B | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | QNN | 7.72 | 0.20804799999999998 - 6.6575359999999995 | 49.34% | Use Export Script |
+ | Model | Device | Chipset | Target Runtime | Response Rate (tokens per second) | Time To First Token (range, seconds)
+ |---|---|---|---|---|---|
+ | Baichuan2-7B | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite | QNN | 7.72 | 0.20804799999999998 - 6.6575359999999995 | -- | Use Export Script |
 
  ## Deploying Baichuan2-7B on-device