---
license: mit
datasets:
- eriktks/conll2003
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
base_model:
- openai-community/gpt2
---

# GPT-2 Fine-Tuned on CoNLL2003 for English Named Entity Recognition (NER)

This model is a fine-tuned version of [GPT-2](https://huggingface.co/openai-community/gpt2) on the [CoNLL2003](https://huggingface.co/datasets/eriktks/conll2003) dataset for Named Entity Recognition (NER) in English. The CoNLL2003 dataset contains four types of named entities: Person (PER), Location (LOC), Organization (ORG), and Miscellaneous (MISC).

## Model Details

- Model Architecture: GPT-2 (Generative Pre-trained Transformer)
- Pre-trained Base Model: gpt2
- Dataset: CoNLL2003 (NER task)
- Languages: English
- Fine-tuned for: Named Entity Recognition (NER)
- Entities recognized:
  - PER: Person
  - LOC: Location
  - ORG: Organization
  - MISC: Miscellaneous entities

## Use Cases

This model is ideal for tasks that require identifying and classifying named entities within English text, such as:

- Information extraction from unstructured text
- Content classification and tagging
- Automated text summarization
- Question answering systems with a focus on entity recognition

## How to Use

To use this model in your code, you can load it via Hugging Face’s Transformers library:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

tokenizer = AutoTokenizer.from_pretrained("MrRobson9/gpt2-ner-conll2003-english")
model = AutoModelForTokenClassification.from_pretrained("MrRobson9/gpt2-ner-conll2003-english")

nlp_ner = pipeline("ner", model=model, tokenizer=tokenizer)
result = nlp_ner("John lives in New York and works for the United Nations.")
print(result)
```

## Performance

| accuracy | precision | recall | f1-score |
|:--------:|:---------:|:------:|:--------:|
|  0.973   |   0.783   | 0.840  |  0.810   |

## License

This model is licensed under the same terms as the GPT-2 model and the CoNLL2003 dataset.
Please ensure compliance with all respective licenses when using this model.
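
## Example: Grouping Token Predictions into Entities

The `pipeline("ner", ...)` call shown in How to Use returns one prediction per token, so a multi-token entity such as "New York" arrives as separate items. The sketch below shows one way to merge adjacent tokens into entity spans. It assumes the model emits BIO-style labels (`B-PER`, `I-PER`, and so on) and that each prediction carries the `entity`, `index`, `start`, and `end` keys. The `group_entities` helper and the `sample_predictions` list are hypothetical and written by hand for illustration; they are not real model output.

```python
from typing import Dict, List


def group_entities(text: str, predictions: List[Dict]) -> List[Dict]:
    """Merge adjacent B-/I- token predictions into entity spans (hypothetical helper)."""
    spans: List[Dict] = []
    current = None
    for pred in predictions:
        # Split "B-LOC" into prefix "B" and label "LOC"; "O" has no label.
        prefix, _, label = pred["entity"].partition("-")
        if not label:
            # Non-entity token: close any open span and move on.
            if current is not None:
                spans.append(current)
                current = None
            continue
        continues = (
            current is not None
            and prefix == "I"
            and label == current["label"]
            and pred["index"] == current["last_index"] + 1
        )
        if continues:
            # Extend the open span to cover this token.
            current["end"] = pred["end"]
            current["last_index"] = pred["index"]
        else:
            if current is not None:
                spans.append(current)
            current = {
                "label": label,
                "start": pred["start"],
                "end": pred["end"],
                "last_index": pred["index"],
            }
    if current is not None:
        spans.append(current)
    # Recover the surface text from the character offsets.
    return [
        {"label": s["label"], "text": text[s["start"]:s["end"]],
         "start": s["start"], "end": s["end"]}
        for s in spans
    ]


text = "John lives in New York and works for the United Nations."

# Hand-written, hypothetical predictions in the shape the pipeline returns.
sample_predictions = [
    {"entity": "B-PER", "index": 0, "start": 0, "end": 4},
    {"entity": "B-LOC", "index": 3, "start": 14, "end": 17},
    {"entity": "I-LOC", "index": 4, "start": 18, "end": 22},
    {"entity": "B-ORG", "index": 9, "start": 41, "end": 47},
    {"entity": "I-ORG", "index": 10, "start": 48, "end": 55},
]

entities = group_entities(text, sample_predictions)
for entity in entities:
    print(entity["label"], entity["text"])
```

The Transformers pipeline also accepts an `aggregation_strategy` argument (for example `aggregation_strategy="simple"`) that performs a similar grouping internally; a manual pass like the one above is mainly useful when custom merging rules are needed.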