Iker García-Ferrero
Iker
AI & ML interests
NLP, multilingual models, information extraction
Organizations
NLP for low-resource languages
My contributions towards developing deep learning models and resources that enable NLP in languages with limited or no available resources
- T-Projection: High Quality Annotation Projection for Sequence Labeling Tasks
  Paper • 2212.10548 • Published • 1
- Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings
  Paper • 2210.12623 • Published
- IXA/Cogcomp at SemEval-2023 Task 2: Context-enriched Multilingual Named Entity Recognition using Knowledge Bases
  Paper • 2304.10637 • Published
- A Common Semantic Space for Monolingual and Cross-Lingual Meta-Embeddings
  Paper • 2001.06381 • Published
Datasets
This is a list of datasets I have helped develop.
- This is not a Dataset: A Large Negation Benchmark to Challenge Large Language Models
  Paper • 2310.15941 • Published • 6
- HiTZ/This-is-not-a-dataset
  Viewer • Updated • 381k • 49 • 6
- HiTZ/Multilingual-Opinion-Target-Extraction
  Viewer • Updated • 12.7k • 129 • 1
- HiTZ/Multilingual-Medical-Corpus
  Viewer • Updated • 67.4M • 404 • 33
NoticIA & ClickbaitFighter
We present NoticIA, a dataset consisting of 850 Spanish news articles featuring prominent clickbait headlines paired with a summary
LLMs
This is a list of LLMs I have helped develop.