GLiClass: Generalist Lightweight Model for Sequence Classification Tasks
Abstract
GLiClass, an adaptation of GLiNER, performs efficient and accurate sequence classification with zero-shot and few-shot capabilities; the authors also adapt PPO for multi-label text classification in data-sparse conditions.
Classification is one of the most widespread tasks in AI applications, often serving as the first step in filtering, sorting, and categorizing data. Because modern AI systems must handle large volumes of input data, and errors in early pipeline stages propagate downstream, high efficiency and accuracy are critical. Moreover, classification requirements can change dynamically with user needs, which calls for models with strong zero-shot capabilities. While generative LLMs have become mainstream for zero-shot classification due to their versatility, they suffer from inconsistent instruction following and computational inefficiency. Cross-encoders, commonly used as rerankers in RAG pipelines, face a different bottleneck: they must process text-label pairs sequentially, which significantly reduces efficiency with large label sets. Embedding-based approaches offer good efficiency but struggle with complex scenarios involving logical and semantic constraints. We propose GLiClass, a novel method that adapts the GLiNER architecture for sequence classification tasks. Our approach achieves strong accuracy and efficiency comparable to embedding-based methods, while maintaining the flexibility needed for zero-shot and few-shot learning scenarios. Additionally, we adapt proximal policy optimization (PPO) for multi-label text classification, enabling classifiers to be trained in data-sparse conditions or from human feedback.
Community
The authors introduce GLiClass, a novel classification method adapted from the GLiNER architecture, aimed at balancing accuracy, efficiency, and zero-shot capability 🚀