Commit 8716349 (verified) by ssz1111
1 Parent(s): 4f7377b

Update README.md

Files changed (1): README.md (+1, -13)
README.md CHANGED

@@ -4,19 +4,7 @@ license: apache-2.0
 
 # GATEAU-LLaMA-7B-1K-10K
 
-The expansion of large language models to effectively handle instructions with extremely long contexts has yet to be fully investigated.
-The primary obstacle lies in constructing a high-quality long instruction-following dataset devised for long context alignment.
-Existing studies have attempted to scale up the available data volume by synthesizing long instruction-following samples.
-However, indiscriminately increasing the quantity of data without a well-defined strategy for ensuring data quality may introduce low-quality samples and restrict the final performance.
-To bridge this gap, we aim to address the unique challenge of long-context alignment, i.e., modeling the long-range dependencies for handling instructions and lengthy input contexts.
-We propose GATEAU, a novel framework designed to identify the influential and high-quality samples enriched with long-range dependency relations by utilizing crafted
-Homologous Models' Guidance (HMG) and Contextual Awareness Measurement (CAM).
-Specifically, HMG attempts to measure the difficulty of generating corresponding responses due to the long-range dependencies, using the perplexity scores of the response from two homologous models with different context windows.
-Also, the role of CAM is to measure the difficulty of understanding the long input contexts due to long-range dependencies by evaluating whether the model's attention is focused on important segments.
-Built upon both proposed methods, we select the most challenging samples as the influential data to effectively frame the long-range dependencies, thereby achieving better performance of LLMs.
-Comprehensive experiments indicate that GATEAU effectively identifies samples enriched with long-range dependency relations and the model trained on these selected samples exhibits better instruction-following and long-context understanding capabilities.
-
-A simple demo for deployment of the model:
+A simple demo for the deployment of the model:
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
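
The diff truncates the demo right after the imports. For orientation, here is a minimal sketch of how a `transformers` deployment snippet for this model would typically continue; it is not the README's actual code. The Hub model ID below is a hypothetical guess (the real path is not shown in the diff), and the prompt and generation settings are illustrative.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Hypothetical Hub path; the real model ID is not visible in this diff.
model_id = "ssz1111/GATEAU-LLaMA-7B-1K-10K"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps a 7B model within one GPU
    device_map="auto",
)
model.eval()

# Long-context instruction following: prepend the lengthy document to the instruction.
prompt = "Read the document below and answer the question.\n\n<document text>\n\nQuestion: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# Strip the prompt tokens and decode only the generated answer.
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```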
 
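The README paragraph removed in this commit describes HMG concretely enough to sketch: rank candidate training samples by comparing the perplexity of each response under two homologous models that differ only in context window. The snippet below is a minimal illustration of that idea, not the authors' released implementation; the `response_perplexity` and `hmg_score` helpers and the perplexity-ratio scoring are assumptions made for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

def response_perplexity(model, tokenizer, context: str, response: str) -> float:
    """Perplexity of `response` given `context`, with the loss computed on
    response tokens only."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    resp_ids = tokenizer(response, return_tensors="pt", add_special_tokens=False).input_ids
    input_ids = torch.cat([ctx_ids, resp_ids], dim=1).to(model.device)
    labels = input_ids.clone()
    labels[:, : ctx_ids.shape[1]] = -100  # mask context tokens out of the loss
    with torch.no_grad():
        loss = model(input_ids=input_ids, labels=labels).loss
    return torch.exp(loss).item()

def hmg_score(short_model, long_model, tokenizer, context: str, response: str) -> float:
    """Hypothetical HMG-style signal: a response that only the long-window model
    predicts well likely hinges on long-range dependencies. Assumes the sample
    fits in both models' windows; real long inputs would need truncation or
    chunking for the short-window model."""
    ppl_short = response_perplexity(short_model, tokenizer, context, response)
    ppl_long = response_perplexity(long_model, tokenizer, context, response)
    return ppl_short / ppl_long  # higher ratio -> stronger long-range dependency
```

CAM would be computed analogously from the model's attention distribution over annotated important segments of the input; since that requires segment labels, it is omitted from this sketch.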