SkillSpan: Hard and Soft Skill Extraction from English Job Postings Paper • 2204.12811 • Published Apr 27, 2022 • 1
Kompetencer: Fine-grained Skill Classification in Danish Job Postings via Distant Supervision and Transfer Learning Paper • 2205.01381 • Published May 3, 2022
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models Paper • 2505.22232 • Published May 28, 2025 • 18
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit Paper • 2410.05838 • Published Oct 8, 2024 • 1
Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions? Paper • 2402.13703 • Published Feb 21, 2024
Tokenizer Choice For LLM Training: Negligible or Crucial? Paper • 2310.08754 • Published Oct 12, 2023 • 2
Towards Cross-Lingual LLM Evaluation for European Languages Paper • 2410.08928 • Published Oct 11, 2024 • 2
Do Multilingual Large Language Models Mitigate Stereotype Bias? Paper • 2407.05740 • Published Jul 8, 2024
Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks Paper • 2406.13469 • Published Jun 19, 2024
Addressing contingency in algorithmic (mis)information classification: Toward a responsible machine learning agenda Paper • 2210.09014 • Published Oct 5, 2022
Monitoring Model Deterioration with Explainable Uncertainty Estimation via Non-parametric Bootstrap Paper • 2201.11676 • Published Jan 27, 2022 • 1