arxiv:2508.15877

Annif at the GermEval-2025 LLMs4Subjects Task: Traditional XMTC Augmented by Efficient LLMs

Published on Aug 21


Authors: Osma Suominen, Juho Inkinen, Mona Lehtinen

Abstract

The Annif system, using multiple efficient language models and LLMs for ranking, achieved top performance in the LLMs4Subjects shared task at GermEval-2025.

AI-generated summary

This paper presents the Annif system in the LLMs4Subjects shared task (Subtask 2) at GermEval-2025. The task required creating subject predictions for bibliographic records using large language models, with a special focus on computational efficiency. Our system, based on the Annif automated subject indexing toolkit, refines our previous system from the first LLMs4Subjects shared task, which produced excellent results. We further improved the system by using many small and efficient language models for translation and synthetic data generation and by using LLMs for ranking candidate subjects. Our system ranked 1st in the overall quantitative evaluation and 1st in the qualitative evaluation of Subtask 2.
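The abstract mentions using LLMs to rank candidate subjects but does not say how the ranking is combined with the base predictions. As a rough illustration only, the sketch below assumes a simple weighted blend of a base score and an LLM relevance score. The function `rerank_candidates`, the `weight` parameter, and all scores are hypothetical and are not taken from the paper or from the Annif API.

```python
from typing import Callable


def rerank_candidates(
    candidates: dict[str, float],
    llm_score: Callable[[str], float],
    weight: float = 0.5,
    top_k: int = 3,
) -> list[tuple[str, float]]:
    """Blend base scores with LLM relevance scores and return the top_k subjects.

    Assumption: a linear blend; the paper's actual ranking method may differ.
    """
    blended = {
        subject: (1 - weight) * base + weight * llm_score(subject)
        for subject, base in candidates.items()
    }
    # Sort by blended score, highest first, and keep the best top_k.
    return sorted(blended.items(), key=lambda item: item[1], reverse=True)[:top_k]


# Hypothetical scores from a traditional XMTC model for one record.
base_scores = {"Machine learning": 0.5, "Libraries": 0.7, "Indexing": 0.6}

# Toy stand-in for an LLM relevance judgment (made-up values, no model call).
toy_llm_scores = {"Machine learning": 0.9, "Libraries": 0.4, "Indexing": 0.6}

ranked = rerank_candidates(base_scores, toy_llm_scores.get, weight=0.5, top_k=2)
print([subject for subject, _ in ranked])
```

In this toy setup the blend moves "Machine learning" ahead of "Libraries", which the base scores alone would have ranked first. In a real system `llm_score` would wrap a model call, and the blend weight would need tuning on held-out data.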

