# 📚 REPRODUCIBILITY.md

*AI-Driven Polymer Aging Prediction & Classification System*
*(Canonical Raman-only Pipeline)*

> **Purpose**
> A single document that lets any new user clone the repo, acquire the dataset, recreate the conda environment, and generate the validated Raman pipeline artifacts.

---

## 1. System Requirements

| Component | Minimum Version | Notes |
|-----------|-----------------|-------|
| Python | 3.10+  | Conda recommended |
| Git | 2.30+ | Any modern version |
| Conda | 23.1+ | Mamba also fine |
| OS | Linux / macOS / Windows | CPU-only run (no GPU needed) |
| Disk | ~1 GB | Dataset + artifacts |
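
As a quick check that the interpreter in your active environment meets the table above, the short Python snippet below (a convenience sketch, not part of the repo) prints the Python version and whether `git` and `conda` are on `PATH`:

```python
# Quick sanity check for the requirements above (illustrative, not part of the repo).
import shutil
import sys

ok_python = sys.version_info >= (3, 10)
print(f"Python {sys.version.split()[0]}: {'OK' if ok_python else 'needs 3.10+'}")

for tool in ("git", "conda"):
    status = "found" if shutil.which(tool) else "missing"
    print(f"{tool}: {status}")
```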

---

## 2. Clone Repository

```bash
git clone https://github.com/dev-jaser/ai-ml-polymer-aging-prediction.git
cd ai-ml-polymer-aging-prediction
git checkout main
```

---

## 3. Create & Activate Conda Environment

```bash
conda env create -f environment.yml
conda activate polymer_env
```

> **Tip:** If you already created `polymer_env`, just run `conda activate polymer_env`.

---

## 4. Download RDWP Raman Dataset

1. Visit https://data.mendeley.com/datasets/kpygrf9fg6/1
2. Download the archive (**RDWP.zip** or similar) by clicking **Download All (10.3 MB)**
3. Extract all `*.txt` Raman files into:

```bash
ai-ml-polymer-aging-prediction/datasets/rdwp
```

4. Quick sanity check:

```bash
ls datasets/rdwp | grep -c '\.txt$'   # -> 170+ files expected
```
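
If you prefer to verify from Python, the sketch below counts the spectra and peeks at the first row of one file. It assumes two-column (wavenumber, intensity) text files separated by whitespace or commas; adjust the parsing if your download differs:

```python
# Count RDWP spectra and peek at one file (illustrative sketch).
from pathlib import Path

data_dir = Path("datasets/rdwp")
files = sorted(data_dir.glob("*.txt")) if data_dir.is_dir() else []
print(f"{len(files)} .txt spectra found")  # 170+ expected

if files:
    with files[0].open() as fh:
        # Assumes whitespace- or comma-separated (wavenumber, intensity) rows.
        first_row = fh.readline().replace(",", " ").split()
    print(files[0].name, "first row:", first_row)
```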

---

## 5. Validate the Entire Pipeline

Run the canonical smoke-test harness:

```bash
./validate_pipeline.sh
```

Successful run prints:

```bash
[PASS] Preprocessing
[PASS] Training & artifacts
[PASS] Inference
[PASS] Plotting
All validation checks passed!
```

Artifacts created:

```bash
outputs/figure2_model.pth
outputs/logs/raman_figure2_diagnostics.json
outputs/inference/test_prediction.json
outputs/plots/validation_plot.png
```
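
A quick way to confirm the run produced everything is to test for the four paths directly. This snippet is just a convenience wrapper around the list above:

```python
# Confirm the validation artifacts listed above exist (convenience check).
from pathlib import Path

expected = [
    "outputs/figure2_model.pth",
    "outputs/logs/raman_figure2_diagnostics.json",
    "outputs/inference/test_prediction.json",
    "outputs/plots/validation_plot.png",
]
missing = [p for p in expected if not Path(p).exists()]
print("all artifacts present" if not missing else f"missing: {missing}")
```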

---

## 6. Optional: Train ResNet Variant

```bash
python scripts/train_model.py --model resnet --target-len 4000 --baseline --smooth --normalize
```

Check that these exist now:

```bash
outputs/resnet_model.pth
outputs/logs/raman_resnet_diagnostics.json
```

---

## 7. Clean-up & Re-Run

To re-run from a clean state:

```bash
rm -rf outputs/*
./validate_pipeline.sh
```

All artifacts will be regenerated.

---

## 8. Troubleshooting

| Symptom | Likely Cause | Fix |
|---------|--------------|-----|
| `ModuleNotFoundError` during scripts | `polymer_env` not activated | `conda activate polymer_env` |
| `CUDA not available` warning | Running on CPU | Safe to ignore |
| Fewer than 170 files in `datasets/rdwp` | Incomplete extraction | Re-download the archive |
| `validate_pipeline.sh: Permission denied` | Missing executable bit | `chmod +x validate_pipeline.sh` |

---

## 9. Contact

For issues or questions, open an issue in the GitHub repo or contact **@dev-jaser**.