Sabiá-2: A New Generation of Portuguese Large Language Models Paper • 2403.09887 • Published Mar 14, 2024
SurveySum: A Dataset for Summarizing Multiple Scientific Articles into a Survey Section Paper • 2408.16444 • Published Aug 29, 2024
ptt5-v2: A Closer Look at Continued Pretraining of T5 Models for the Portuguese Language Paper • 2406.10806 • Published Jun 16, 2024
mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset Paper • 2108.13897 • Published Aug 31, 2021
No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval Paper • 2206.02873 • Published Jun 6, 2022
InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval Paper • 2301.01820 • Published Jan 4, 2023
InPars: Data Augmentation for Information Retrieval using Large Language Models Paper • 2202.05144 • Published Feb 10, 2022
Evaluating GPT-4's Vision Capabilities on Brazilian University Admission Exams Paper • 2311.14169 • Published Nov 23, 2023
MonoByte: A Pool of Monolingual Byte-level Language Models Paper • 2209.11035 • Published Sep 22, 2022
PTT5: Pretraining and validating the T5 model on Brazilian Portuguese data Paper • 2008.09144 • Published Aug 20, 2020
BLUEX: A benchmark based on Brazilian Leading Universities Entrance eXams Paper • 2307.05410 • Published Jul 11, 2023