bhenrym14 committed
Commit 8ab4145
Parent: f38c375

Update README.md

Files changed (1)
  1. README.md +8 -21
README.md CHANGED
@@ -37,28 +37,15 @@ Unfortunately it has also been shown that LLM's frequently struggle to attend to
  Here I explore whether training on long sequences that have clear conceptual dependencies residing in the middle of the context helps attenuate the difficulties in attending to middle-context tokens. When/if I have time, I hope to perform a more rigorous assessment of the performance with respect to this specific issue.

  ## Relative Performance (perplexity)
- | Model | Context (tokens) | Perplexity |
- | ---------------------------------------------------- | ----------- | ---------- |
- | TheBloke/airoboros-13B-gpt4-1-4-GPTQ | 512 | **7.42** |
- | TheBloke/airoboros-13B-gpt4-1-4-SuperHOT-8K-GPTQ | 512 | 8.86 |
- | **bhenrym14/airoboros-13b-gpt4-1.4.1-PI-8192-GPTQ** | 512 | 7.94 |
- | ---------------------------------------------------- | ----------- | ---------- |
- | TheBloke/airoboros-13B-gpt4-1-4-GPTQ | 2048 | **5.02** |
- | TheBloke/airoboros-13B-gpt4-1-4-SuperHOT-8K-GPTQ | 2048 | 5.98 |
- | **bhenrym14/airoboros-13b-gpt4-1.4.1-PI-8192-GPTQ** | 2048 | 5.28 |
- | ---------------------------------------------------- | ----------- | ---------- |
- | TheBloke/airoboros-13B-gpt4-1-4-GPTQ | 4096 | 9848.0 |
- | TheBloke/airoboros-13B-gpt4-1-4-SuperHOT-8K-GPTQ | 4096 | 5.80 |
- | **bhenrym14/airoboros-13b-gpt4-1.4.1-PI-8192-GPTQ** | 4096 | **5.15** |
-
- | Context (tokens) | airophin-13b-pntk-16k-fp16 | bhenrym14/airoboros-13b-gpt4-1.4.1-PI-8192-GPTQ | bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-fp16 | TheBloke/airoboros-33B-gpt4-1-4-SuperHOT-8K-GPTQ | jondurbin/airoboros-33B-gpt4-1.4-GPTQ |
+
+ | Context (tokens) | airophin-13b-pntk-16k-fp16 | bhenrym14/airoboros-13b-gpt4-1.4.1-PI-8192-GPTQ | bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-fp16 | jondurbin/airoboros-33B-gpt4-1.4-GPTQ |
  | ---| ------- | -----| ------ | --- | --- |
- | 512 | 7.62 | | 7.90 | 8.24 | **6.36** |
- | 1024 | 6.20 | | 6.17 | 8.06 | **5.12** |
- | 2048 | 5.38 | | 5.23 | 7.02 | **4.43** |
- | 4096 | 5.08 | | **4.91** | 6.56 | 54.5 |
- | 8192 | 4.90 | | -- | -- | -- |
- | 12000 | 4.82 | | -- | -- | -- |
+ | 512 | 7.62 | 8.24 | 7.90 | **6.36** |
+ | 1024 | 6.20 | 6.71 | 6.17 | **5.12** |
+ | 2048 | 5.38 | 5.87 | 5.23 | **4.43** |
+ | 4096 | 5.08 | 5.50 | **4.91** | 54.5 |
+ | 8192 | 4.90 | 5.32 | -- | -- |
+ | 12000 | 4.82 | 56.1 | -- | -- |

  - This model is competitive with the Llama-1 33b variants, outperforming the best long context model for short sequences.
  - Not presented here, but this model outperforms the base llama-2-13b on MMLU-fs with a score of 54.9. While not an appreciable improvement, the fact there wasn't a performance regression despite the context extension is notable.
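
The commit does not show how the perplexity figures above were measured. For readers who want to reproduce numbers of this kind, below is a minimal sketch of the usual strided-perplexity recipe with the Hugging Face `transformers` API. The evaluation corpus (wikitext-2), the stride choice, and the fp16 repository id are assumptions for illustration, not details taken from this commit; the checkpoint's config is assumed to already carry the linear RoPE scaling (position interpolation) used for the 8192-token extension, so no `rope_scaling` override is shown.

```python
# Hypothetical reproduction sketch -- corpus, stride, and repo id are assumptions.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "bhenrym14/airoboros-13b-gpt4-1.4.1-PI-8192"  # assumed fp16 repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

# A long evaluation text; the wikitext-2 test split is a common choice.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids


def strided_perplexity(ids: torch.Tensor, context_len: int, stride: int) -> float:
    """Score sliding windows of `context_len` tokens, counting loss only on the
    last `stride` tokens of each window so no token is scored twice."""
    nlls, n_scored = [], 0
    for begin in range(0, ids.size(1) - context_len, stride):
        window = ids[:, begin : begin + context_len].to(model.device)
        labels = window.clone()
        labels[:, :-stride] = -100  # ignore positions already scored in earlier windows
        with torch.no_grad():
            loss = model(window, labels=labels).loss  # mean NLL over scored tokens
        nlls.append(loss * stride)  # approximate total NLL for this window
        n_scored += stride
    return torch.exp(torch.stack(nlls).sum() / n_scored).item()


for ctx in (512, 2048, 4096, 8192):
    ppl = strided_perplexity(input_ids, ctx, stride=ctx // 2)
    print(f"{ctx:>5} tokens: ppl = {ppl:.2f}")
```

Evaluated this way, a model with no context extension would be expected to show the kind of blow-up seen in the `jondurbin/airoboros-33B-gpt4-1.4-GPTQ` column at 4096 tokens, since those positions fall outside its 2048-token trained range.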