devjas1 committed on
Commit
529bbd6
·
1 Parent(s): b028c2c

(DOCS): add HF Space README with usage, roadmap, contributors, citation and links

Files changed (1)
  1. README.md +39 -140
README.md CHANGED
@@ -2,174 +2,81 @@
2
  title: AI Polymer Classification
3
  emoji: 🔬
4
  colorFrom: indigo
5
- colorTo: green
6
  sdk: streamlit
7
  app_file: app.py
8
  pinned: false
9
  license: apache-2.0
10
  ---
 
11
 
12
- # 🔬 AI-Driven Polymer Aging Prediction and Classification System
13
 
14
- A research project developed as part of AIRE 2025. This system applies deep learning to spectral data to classify polymer aging a critical proxy for recyclability using a fully reproducible and modular ML pipeline.
15
 
16
- The broader research vision is a multi-modal evaluation platform, benchmarking not only Raman spectra but also image-based models and FTIR spectral data, ensuring reproducibility, extensibility, and scientific rigor.
17
-
18
- ---
19
-
20
- ## 🎯 Project Objective
21
-
22
- - Build a validated machine learning system for classifying polymer spectra (predict degradation levels as a proxy for recyclability)
23
- - Evaluate and compare multiple CNN architectures, beginning with Figure2CNN and ResNet variants, and expand to additional trained models.
24
- - Ensure scientific reproducibility through structured diaignostics and artifact control
25
- - Support sustainability and circular materials research through spectrum-based classification.
26
-
27
- **Reference (for Figure2CNN baseline):**
28
-
29
- > Neo, E.R.K., Low, J.S.C., Goodship, V., Debattista, K. (2023).
30
- > Deep learning for chemometric analysis of plastic spectral data from infrared and Raman databases.
31
- > Resources, Conservation & Recycling, 188, 106718.
32
- > https://doi.org/10.1016/j.resconrec.2022.106718
33
  ---
34
 
35
- ## 🧠 Model Architectures
36
 
37
- | Model| Description |
38
- |------|-------------|
39
- | `Figure2CNN` | Baseline model from literature |
40
- | `ResNet1D` | Deeper candidate model with skip connections |
41
- | `ResNet18Vision` | Image-focused CNN architecture, retrained on polymer dataset (roadmap) |
42
-
43
- Future expansions will add additional trained CNNs, supporting direct benchmarking and comparative reporting.
44
 
45
  ---
46
 
47
- ## 📁 Project Structure (Cleaned and Current)
48
-
49
- ```text
50
- ml-polymer-recycling/
51
- ├── datasets/
52
- ├── models/ # Model architectures
53
- ├── scripts/ # Training, inference, utilities
54
- ├── outputs/ # Artifacts: models, logs, plots
55
- ├── docs/ # Documentation & reports
56
- └── environment.yml # (local) Conda execution environment
57
- ```
58
 
59
- ![ml-polymer-gitdiagram-0](https://github.com/user-attachments/assets/bb5d93dc-7ab9-4259-8513-fb680ae59d64)
60
 
61
  ---
62
 
63
- ## ✅ Current Status
64
-
65
- | Track | Status | Test Accuracy |
66
- |-----------|----------------------|----------------|
67
- | **Raman** | ✅ Active & validated | **87.81% ± 7.59%** |
68
- | **Image** | 🚧 Planned Expansion | N/A |
69
- | **FTIR** | ⏸️ Deferred/Modularized | N/A |
70
-
71
- ## 🔬 Key Features
72
-
73
- - ✅ 10-Fold Stratified Cross-Validation
74
- - ✅ CLI Training: `train_model.py`
75
- - ✅ CLI Inference `run_inference.py`
76
- - ✅ Output artifact naming per model
77
- - ✅ Raman-only preprocessing with baseline correction, smoothing, normalization
78
- - ✅ Structured diagnostics JSON (accuracies, confusion matrices)
79
- - ✅ Canonical validation script (`validate_pipeline.sh`) confirms reproducibility of all core components
80
-
81
- ---
82
-
83
- **Environments:**
84
-
85
- ```bash
86
- # Local
87
- git checkout main
88
- conda env create -f environment.yml
89
- conda activate polymer_env
90
 
91
- # HPC
92
- git checkout hpc-main
93
- conda env create -f environment_hpc.yml
94
- conda activate polymer_env
95
- ```
96
 
97
- ## 📊 Sample Training & Inference
98
 
99
- ### Training (10-Fold CV)
100
-
101
- ```bash
102
- python scripts/train_model.py --model resnet --target-len 4000 --baseline --smooth --normalize
103
- ```
104
-
105
- ### Inference (Raman)
106
-
107
- ```bash
108
- python scripts/run_inference.py --target-len 4000
109
- --input datasets/rdwp/sample123.txt --model outputs/resnet_model.pth
110
- --output outputs/inference/prediction.txt
111
- ```
112
-
113
- ### Inference Output Example:
114
-
115
- ```bash
116
- Predicted Label: 1 True Label: 1
117
- Raw Logits: [[-569.544, 427.996]]
118
- ```
119
-
120
- ### Validation Script (Raman Pipeline)
121
-
122
- ```bash
123
- ./validate_pipeline.sh
124
- # Runs preprocessing, training, inference, and plotting checks
125
- # Confirms artifact integrity and logs test results
126
- ```
127
 
128
  ---
129
 
130
- ## 📚 Dataset Resources
131
 
132
- | Type | Dataset | Source |
133
- |-------|---------|--------|
134
- | Raman | RDWP | [A Raman database of microplastics weathered under natural environments](https://data.mendeley.com/datasets/kpygrf9fg6/1) |
135
 
136
- | Datasets should be downloaded separately and placed here:
137
 
138
- ```bash
139
- datasets/
140
- └── rdwp/
141
- β”œβ”€β”€ sample1.txt
142
- β”œβ”€β”€ sample2.txt
143
- └── ...
144
- ```
145
 
146
- These files are intentionally excluded from version control via `.gitignore`
147
 
148
  ---
149
 
150
- ## 🛠 Dependencies
151
 
152
- - `Python 3.10+`
153
- - `Conda, Git`
154
- - `PyTorch (CPU & CUDA)`
155
- - `Numpy, SciPy, Pandas`
156
- - `Scikit-learn`
157
- - `Matplotlib, Seaborn`
158
- - `ArgParse, JSON`
159
 
160
- ---
161
-
162
- ## 🧑‍🤝‍🧑 Contributors
163
-
164
- - **Dr. Sanmukh Kuppannagari**, Research Mentor
165
- - **Dr. Metin Karailyan**, Research Mentor
166
- - **Jaser H.**, AIRE 2025 Intern, Developer
167
-
168
- ---
169
 
170
  ## 🎯 Strategic Expansion Objectives (Roadmap)
171
 
172
- > The roadmap defines three major expansion paths designed to broaden the system’s capabilities and impact:
173
 
174
  1. **Model Expansion: Multi-Model Dashboard**
175
 
@@ -203,11 +110,3 @@ These files are intentionally excluded from version control via `.gitignore`
203
  - **Phased Development**: Implementation details to be refined during meetings to ensure scientific rigor.
204
 
205
  This guarantees FTIR becomes a supported modality without undermining the validated Raman foundation.
206
-
207
- ## 🔑 Guiding Principles
208
-
209
- - **Preserve the Raman baseline** as the reproducible ground truth
210
- - **Additive modularity**: Models, images, and FTIR added as clean, distinct layers rather than overwriting core functionality
211
- - **Transparency & reproducibility**: All expansions documented, tested, and logged with clear outputs.
212
- - **Future-oriented design**: Workflows structured to support ongoing collaboration and successor-safe research.
213
-
 
2
  title: AI Polymer Classification
3
  emoji: 🔬
4
  colorFrom: indigo
5
+ colorTo: teal
6
  sdk: streamlit
7
  app_file: app.py
8
  pinned: false
9
  license: apache-2.0
10
  ---
11
+ ## AI-Driven Polymer Aging Prediction and Classification (v0.1)
12
 
13
+ This web application classifies the degradation state of polymers using Raman spectroscopy and deep learning.
14
 
15
+ It was developed as part of the AIRE 2025 internship project at the Imageomics Institute and demonstrates a prototype pipeline for evaluating multiple convolutional neural networks (CNNs) on spectral data.
16
 
17
  ---
18
 
19
+ ## 🧪 Current Scope
20
 
21
+ - 🔬 **Modality**: Raman spectroscopy (.txt)
22
+ - 🧠 **Model**: Figure2CNN (baseline)
23
+ - 📊 **Task**: Binary classification: Stable vs Weathered polymers
24
+ - 🛠️ **Architecture**: PyTorch + Streamlit
25
 
26
  ---
27
 
28
+ ## 🚧 Roadmap
29
 
30
+ - [x] Inference from Raman `.txt` files
31
+ - [x] Model selection (Figure2CNN, ResNet1D)
32
+ - [ ] Add more trained CNNs for comparison
33
+ - [ ] FTIR support (modular integration planned)
34
+ - [ ] Image-based inference (future modality)
35
 
36
  ---
37
 
38
+ ## 🧭 How to Use
39
 
40
+ 1. Upload a Raman spectrum `.txt` file (or select a sample)
41
+ 2. Choose a model from the sidebar
42
+ 3. Run analysis
43
+ 4. View prediction, logits, and technical information
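Step 4 reports both a prediction and the raw logits. As a rough illustration of how the two relate, the sketch below applies a numerically stable softmax and an argmax to one pair of logits. The logit values are taken from the inference output example in the earlier README; the function names and the label order (0 = Stable, 1 = Weathered) are assumptions made for illustration, not the app's confirmed code.

```python
import math

# Assumed label order; the README does not state the app's actual mapping.
LABELS = ["Stable", "Weathered"]


def softmax(logits):
    """Convert raw logits to probabilities, subtracting the max for stability."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return (label_index, label_name, probabilities) for a single sample."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return index, LABELS[index], probs


# Logits copied from the earlier README's inference output example.
index, name, probs = predict([-569.544, 427.996])
print(index, name)
```

Logits this far apart give a probability that rounds to 1.0 for the winning class, so the logits themselves are more informative than the softmax output when comparing how confident two models are.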
 
44
 
45
+ Supported input:
46
 
47
+ - Plaintext `.txt` files with 1–2 columns
48
+ - Space- or comma-separated
49
+ - Comment lines (#) are ignored
50
+ - Automatically resampled to 500 points
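The input rules above can be sketched in plain Python. This is a minimal illustration, not the app's actual loader: `parse_spectrum` and `resample` are hypothetical names, single-column files are assumed to use the row index as the x-axis, and linear interpolation onto an evenly spaced grid is an assumed resampling method (the README only states that spectra are resampled to 500 points).

```python
import re


def parse_spectrum(text):
    """Parse 1- or 2-column spectrum text into (xs, ys) lists of floats."""
    xs, ys = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # Accept space-, tab-, or comma-separated values.
        parts = [p for p in re.split(r"[,\s]+", line) if p]
        values = [float(p) for p in parts[:2]]
        if len(values) == 1:
            xs.append(float(len(xs)))  # assumption: row index serves as x
            ys.append(values[0])
        else:
            xs.append(values[0])
            ys.append(values[1])
    return xs, ys


def resample(xs, ys, target_len=500):
    """Linearly interpolate (xs, ys) onto target_len evenly spaced x values."""
    if len(xs) < 2 or target_len < 2:
        raise ValueError("need at least two input points and two output points")
    pairs = sorted(zip(xs, ys))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    lo, hi = xs[0], xs[-1]
    step = (hi - lo) / (target_len - 1)
    out, j = [], 0
    for i in range(target_len):
        x = hi if i == target_len - 1 else lo + i * step
        # Advance to the input segment that contains x.
        while j < len(xs) - 2 and xs[j + 1] < x:
            j += 1
        x0, x1 = xs[j], xs[j + 1]
        t = 0.0 if x1 == x0 else (x - x0) / (x1 - x0)
        out.append(ys[j] + t * (ys[j + 1] - ys[j]))
    return out


sample = "# wavenumber intensity\n100 1.0\n200, 3.0\n300 5.0\n"
xs, ys = parse_spectrum(sample)
spectrum = resample(xs, ys)
print(len(spectrum))
```

A fixed output length lets every spectrum feed the same CNN input layer regardless of how many points the instrument recorded.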
51
 
52
  ---
53
 
54
+ ## Contributors
55
 
56
+ 👨‍🏫 Dr. Sanmukh Kuppannagari (Mentor)
57
+ 👨‍🏫 Dr. Metin Karailyan (Mentor)
58
+ 👨‍💻 Jaser Hasan (Author/Developer)
59
 
60
+ ## 🧠 Model Credit
61
 
62
+ Baseline model inspired by:
63
 
64
+ Neo, E.R.K., Low, J.S.C., Goodship, V., Debattista, K. (2023).
65
+ *Deep learning for chemometric analysis of plastic spectral data from infrared and Raman databases.*
66
+ _Resources, Conservation & Recycling_, **188**, 106718.
67
+ [https://doi.org/10.1016/j.resconrec.2022.106718](https://doi.org/10.1016/j.resconrec.2022.106718)
68
 
69
  ---
70
 
71
+ ## 🔗 Links
72
 
73
+ - 💻 **Live App**: [Hugging Face Space](https://huggingface.co/spaces/dev-jas/polymer-aging-ml)
74
+ - 📂 **GitHub Repo**: [ml-polymer-recycling](https://github.com/KLab-AI3/ml-polymer-recycling)
75
 
76
 
77
  ## 🎯 Strategic Expansion Objectives (Roadmap)
78
 
79
+ **The roadmap defines three major expansion paths designed to broaden the system’s capabilities and impact:**
80
 
81
  1. **Model Expansion: Multi-Model Dashboard**
82
 
 
110
  - **Phased Development**: Implementation details to be refined during meetings to ensure scientific rigor.
111
 
112
  This guarantees FTIR becomes a supported modality without undermining the validated Raman foundation.