Commit 8716349 (verified) by ssz1111
1 Parent(s): 4f7377b

Update README.md

Files changed (1): README.md (+1, -13)
README.md CHANGED

@@ -4,19 +4,7 @@ license: apache-2.0
 
 # GATEAU-LLaMA-7B-1K-10K
 
-The expansion of large language models to effectively handle instructions with extremely long contexts has yet to be fully investigated.
-The primary obstacle lies in constructing a high-quality long instruction-following dataset devised for long context alignment.
-Existing studies have attempted to scale up the available data volume by synthesizing long instruction-following samples.
-However, indiscriminately increasing the quantity of data without a well-defined strategy for ensuring data quality may introduce low-quality samples and restrict the final performance.
-To bridge this gap, we aim to address the unique challenge of long-context alignment, i.e., modeling the long-range dependencies for handling instructions and lengthy input contexts.
-We propose GATEAU, a novel framework designed to identify the influential and high-quality samples enriched with long-range dependency relations by utilizing crafted
-Homologous Models' Guidance (HMG) and Contextual Awareness Measurement (CAM).
-Specifically, HMG attempts to measure the difficulty of generating corresponding responses due to the long-range dependencies, using the perplexity scores of the response from two homologous models with different context windows.
-Also, the role of CAM is to measure the difficulty of understanding the long input contexts due to long-range dependencies by evaluating whether the model's attention is focused on important segments.
-Built upon both proposed methods, we select the most challenging samples as the influential data to effectively frame the long-range dependencies, thereby achieving better performance of LLMs.
-Comprehensive experiments indicate that GATEAU effectively identifies samples enriched with long-range dependency relations and the model trained on these selected samples exhibits better instruction-following and long-context understanding capabilities.
-
-A simple demo for deployment of the model:
+A simple demo for the deployment of the model:
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
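
The diff truncates the demo right after the imports. For orientation, here is a minimal sketch of how a `transformers` deployment snippet for this model would typically continue; it is not the README's actual code. The Hub model ID below is a hypothetical guess (the real path is not shown in the diff), and the prompt and generation settings are illustrative.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Hypothetical Hub path; the real model ID is not visible in this diff.
model_id = "ssz1111/GATEAU-LLaMA-7B-1K-10K"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps a 7B model within one GPU
    device_map="auto",
)
model.eval()

# Long-context instruction following: prepend the lengthy document to the instruction.
prompt = "Read the document below and answer the question.\n\n<document text>\n\nQuestion: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# Strip the prompt tokens and decode only the generated answer.
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```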
 
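The README paragraph removed in this commit describes HMG concretely enough to sketch: rank candidate training samples by comparing the perplexity of each response under two homologous models that differ only in context window. The snippet below is a minimal illustration of that idea, not the authors' released implementation; the `response_perplexity` and `hmg_score` helpers and the perplexity-ratio scoring are assumptions made for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

def response_perplexity(model, tokenizer, context: str, response: str) -> float:
    """Perplexity of `response` given `context`, with the loss computed on
    response tokens only."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    resp_ids = tokenizer(response, return_tensors="pt", add_special_tokens=False).input_ids
    input_ids = torch.cat([ctx_ids, resp_ids], dim=1).to(model.device)
    labels = input_ids.clone()
    labels[:, : ctx_ids.shape[1]] = -100  # mask context tokens out of the loss
    with torch.no_grad():
        loss = model(input_ids=input_ids, labels=labels).loss
    return torch.exp(loss).item()

def hmg_score(short_model, long_model, tokenizer, context: str, response: str) -> float:
    """Hypothetical HMG-style signal: a response that only the long-window model
    predicts well likely hinges on long-range dependencies. Assumes the sample
    fits in both models' windows; real long inputs would need truncation or
    chunking for the short-window model."""
    ppl_short = response_perplexity(short_model, tokenizer, context, response)
    ppl_long = response_perplexity(long_model, tokenizer, context, response)
    return ppl_short / ppl_long  # higher ratio -> stronger long-range dependency
```

CAM would be computed analogously from the model's attention distribution over annotated important segments of the input; since that requires segment labels, it is omitted from this sketch.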