---
language: en
license: mit
pipeline_tag: text-classification
tags:
- resume
- ats
- pii
- nlp
- huggingface
---

# 🤖 Resume PII Masking & ATS Optimizer

An NLP pipeline that automatically **detects and masks Personally Identifiable Information (PII)** in resumes and **evaluates resume quality with Applicant Tracking System (ATS) style scoring**.

Built on the Hugging Face Transformers ecosystem and fine-tuned on custom data, this project simulates real-world applications of Natural Language Processing in HR tech and recruitment automation.

---

## Key Features

| Feature                  | Description                                                               |
|--------------------------|---------------------------------------------------------------------------|
| PII Masking              | Detects and masks names, emails, phone numbers, and addresses using NER.  |
| Resume Parsing           | Handles large resumes (2,000+ words) with tokenizer support.              |
| ATS Resume Optimization  | Scores resumes based on keyword density, formatting, and clarity.         |
| Job Description Matching | Optional feature to match resumes with specific job descriptions.         |
| Hugging Face Integration | Fine-tune and deploy models directly on Hugging Face Hub.                 |
| Modular Architecture     | Well-organized, scalable, and production-ready codebase.                  |

---

## 📁 Folder Structure

```bash
resume_ats_project/
├── data/                 # Resume samples and PII-labeled training data
│   ├── resumes.json
│   └── pii_train.json
├── models/               # Directory to save fine-tuned models
│   └── ats_model/
├── resume_parser.py      # Tokenization, segmentation, and formatting
├── pii_trainer.py        # Script to fine-tune NER model
├── optimizer.py          # ATS scoring logic
├── infer.py              # Combines parsing, masking, and optimization
├── app.py                # (Optional) Flask or Gradio interface
├── requirements.txt
└── README.md
```

---

## Installation

```bash
git clone https://github.com/your-username/resume-ats-optimizer.git
cd resume-ats-optimizer
pip install -r requirements.txt
```

---

## Real-World Applications

This project mimics systems used by:

- LinkedIn Talent Solutions (resume scoring + redaction)
- Amazon HR Automation (internal resume screening tools)
- Google Cloud AutoML NER for internal document pipelines
- Infosys & TCS resume filtering portals

You can adapt it to:

- Job matching portals
- Candidate anonymization systems
- Large-scale recruitment automation tools

---

## License

Licensed under the MIT License.

---

## Author

Karthikeyan M C
karthikeyanmc1925@example.com
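
As a rough illustration of the PII masking step listed under Key Features, the sketch below masks emails and phone-like numbers with regular expressions. This is a simplified stand-in, not the project's method: the README describes NER-based detection, and names and addresses would need such a model. The names `EMAIL_RE`, `PHONE_RE`, and `mask_pii` are hypothetical and do not come from the repository.

```python
import re

# Hypothetical patterns for illustration only; the project describes
# NER-based detection, which this regex sketch does not reproduce.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Loose phone pattern: an optional "+", then at least nine characters that
# start and end with a digit and contain only digits, spaces, dots, hyphens,
# or parentheses. It can also match other digit runs such as "2019 - 2023".
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d\s().-]{7,}\d(?!\w)")


def mask_pii(text: str) -> str:
    """Replace emails and phone-like digit runs with placeholder tags."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


sample = "Contact: jane.doe@example.com, +1 (555) 010-9999. Skills: Python, NLP."
masked = mask_pii(sample)
print(masked)
```

Because the phone pattern is deliberately loose, a real pipeline would likely combine patterns like these with model predictions and validate matches before masking.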