The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only Paper • 2306.01116 • Published Jun 1, 2023 • 34
LLaMA: Open and Efficient Foundation Language Models Paper • 2302.13971 • Published Feb 27, 2023 • 14
Focus Anywhere for Fine-grained Multi-page Document Understanding Paper • 2405.14295 • Published May 23, 2024 • 1
Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models Paper • 2312.06109 • Published Dec 11, 2023 • 21
ColPali: Efficient Document Retrieval with Vision Language Models Paper • 2407.01449 • Published Jun 27, 2024 • 48
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Paper • 2407.03320 • Published Jul 3, 2024 • 96