Update README.md
README.md
@@ -1,3 +1,17 @@
----
-license: apache-2.0
----
+---
+license: apache-2.0
+---
+
+# GATEAU-LLaMA-7B-1K-10K
+
+The expansion of large language models to effectively handle instructions with extremely long contexts has yet to be fully investigated.
+The primary obstacle lies in constructing a high-quality long instruction-following dataset devised for long context alignment.
+Existing studies have attempted to scale up the available data volume by synthesizing long instruction-following samples.
+However, indiscriminately increasing the quantity of data without a well-defined strategy for ensuring data quality may introduce low-quality samples and restrict the final performance.
+To bridge this gap, we aim to address the unique challenge of long-context alignment, i.e., modeling the long-range dependencies for handling instructions and lengthy input contexts.
+We propose **GATEAU**, a novel framework designed to identify the influential and high-quality samples enriched with long-range dependency relations by utilizing crafted
+Homologous Models' Guidance (HMG) and Contextual Awareness Measurement (CAM).
+Specifically, HMG attempts to measure the difficulty of generating corresponding responses due to the long-range dependencies, using the perplexity scores of the response from two homologous models with different context windows.
+Also, the role of CAM is to measure the difficulty of understanding the long input contexts due to long-range dependencies by evaluating whether the model’s attention is focused on important segments.
+Built upon both proposed methods, we select the most challenging samples as the influential data to effectively frame the long-range dependencies, thereby achieving better performance of LLMs.
+Comprehensive experiments indicate that GATEAU effectively identifies samples enriched with long-range dependency relations and the model trained on these selected samples exhibits better instruction-following and long-context understanding capabilities.
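
The README text describes HMG and CAM only in prose, so the following is a minimal Python sketch of the general idea, not the authors' implementation. It assumes per-token log-probabilities and per-segment attention mass have already been extracted from the models. The function names `hmg_score`, `cam_score`, and `select_most_challenging`, the plain perplexity difference, and the cosine-based mismatch are all hypothetical choices made for illustration; the README does not specify the exact formulas.

```python
import math
from typing import List, Mapping, Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity of a response from its per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def hmg_score(short_ctx_logprobs: Sequence[float],
              long_ctx_logprobs: Sequence[float]) -> float:
    """Hypothetical HMG-style score (assumption, not the README's formula).

    Compares the response perplexity under two homologous models with
    different context windows. A larger value means the short-window model
    struggles more than the long-window model on this response.
    """
    return perplexity(short_ctx_logprobs) - perplexity(long_ctx_logprobs)


def cam_score(attention_mass: Sequence[float],
              segment_importance: Sequence[float]) -> float:
    """Hypothetical CAM-style score (assumption, not the README's formula).

    Returns 1 - cosine similarity between the attention each segment
    receives and an externally supplied importance value per segment.
    0 means attention matches importance; values near 1 mean a mismatch.
    """
    if len(attention_mass) != len(segment_importance):
        raise ValueError("one attention value and one importance value per segment")
    dot = sum(a * b for a, b in zip(attention_mass, segment_importance))
    norm = (math.sqrt(sum(a * a for a in attention_mass))
            * math.sqrt(sum(b * b for b in segment_importance)))
    if norm == 0:
        raise ValueError("attention and importance vectors must be non-zero")
    return 1.0 - dot / norm


def select_most_challenging(scores: Mapping[str, float], k: int) -> List[str]:
    """Return the ids of the k samples with the highest difficulty score."""
    return sorted(scores, key=lambda sample_id: scores[sample_id], reverse=True)[:k]
```

In this sketch, `select_most_challenging` accepts any per-sample score, so how the HMG-style and CAM-style values are combined into one number (for example a weighted sum) is left to the caller and is not taken from the README.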