Everything for high-quality filtering of HPLT3
JQL-AI
community
Organization Card
JQL-AI (pronounced Jackal-AI) is a community of machine learning researchers committed to advancing the development of multilingual foundation models.
Latest Research
- Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
- Tokenizer Choice For LLM Training: Negligible or Crucial?
- Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?
- Do Multilingual Large Language Models Mitigate Stereotype Bias?
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models
- JQL: Judging Quality Across Languages • 🦊 Filter multilingual data for high-quality language models • 5
- Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models • Paper • 2505.22232 • Published • 18
- JQL-AI/JQL-Edu-Heads • Text Ranking • Updated • 2
- JQL-AI/JQL-LLM-Edu-Annotations • Viewer • Updated • 11.4M • 851 • 2
datasets
12
- JQL-AI/HPLT3-198-500k • Updated • 3
- JQL-AI/curated_embeddings • Updated • 1.51k
- JQL-AI/fw2_embeddings • Updated • 4.04k • 2
- JQL-AI/hplt2_embeddings • Updated • 2.56k
- JQL-AI/hplt2_edu_scores • Viewer • Updated • 3.36B • 24.6k • 1
- JQL-AI/fw2_edu_scores • Viewer • Updated • 4.92B • 3.71k • 4
- JQL-AI/curated_edu_scores • Viewer • Updated • 475 • 255
- JQL-AI/JQL-LLM-Edu-Annotations • Viewer • Updated • 11.4M • 851 • 2
- JQL-AI/JQL-Human-Edu-Annotations • Viewer • Updated • 20.4k • 109 • 4
- JQL-AI/Fineweb_2_500k_removed • Viewer • Updated • 11.7M • 974