Post
593
We’re thrilled to share our latest technical paper on the multi-task GLiNER model. Our research dives into the following exciting and forward-thinking topics:
🔍 Zero-shot NER & Information Extraction: We demonstrate that with diverse and ample data, paired with the right architecture, encoders can achieve impressive results across various extraction tasks;
🛠️ Synthetic Data Generation: Leveraging open labelling by LLMs like Llama, we generated high-quality training data. Our student model even outperformed the teacher model, highlighting the potential of this approach.
🤖 Self-Learning: Our model showed consistent improvements in performance without labelled data, achieving up to a 12% increase in F1 score for initially challenging topics. This ability to learn and improve autonomously is a very perspective direction of future research!
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks (2406.12925)
knowledgator/gliner-multitask-large-v0.5
knowledgator/GLiNER_HandyLab
🔍 Zero-shot NER & Information Extraction: We demonstrate that with diverse and ample data, paired with the right architecture, encoders can achieve impressive results across various extraction tasks;
🛠️ Synthetic Data Generation: Leveraging open labelling by LLMs like Llama, we generated high-quality training data. Our student model even outperformed the teacher model, highlighting the potential of this approach.
🤖 Self-Learning: Our model showed consistent improvements in performance without labelled data, achieving up to a 12% increase in F1 score for initially challenging topics. This ability to learn and improve autonomously is a very perspective direction of future research!
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks (2406.12925)
knowledgator/gliner-multitask-large-v0.5
knowledgator/GLiNER_HandyLab
#!pip install gliner -U
from gliner import GLiNER
model = GLiNER.from_pretrained("knowledgator/gliner-multitask-large-v0.5")
text = """
Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800.
"""
labels = ["founder", "computer", "software", "position", "date"]
entities = model.predict_entities(text, labels)
for entity in entities:
print(entity["text"], "=>", entity["label"])