Space: dev-jas/polymer-aging-ml

dev-jas committed on Sep 5

Commit

bcdc411

unverified ·

2 Parent(s): 1ffa0fe 0392c68

Merge pull request #3 from devjas1/new-space-deploy

Files changed (17)

.gitignore +1 -0
CODEBASE_INVENTORY.md +99 -214
README.md +54 -60
app.py +22 -6
core_logic.py +53 -2
models/registry.py +114 -11
modules/ui_components.py +478 -51
sample_data/ftir-stable-1.txt +75 -0
sample_data/ftir-weathered-1.txt +75 -0
sample_data/stable.sample.csv +22 -0
scripts/run_inference.py +364 -61
tests/test_ftir_preprocessing.py +179 -0
tests/test_multi_format.py +218 -0
utils/multifile.py +297 -56
utils/performance_tracker.py +404 -0
utils/preprocessing.py +133 -10
utils/results_manager.py +218 -2

.gitignore CHANGED

@@ -26,3 +26,4 @@ datasets/**
 # ---------------------------------------
 __pycache__.py

+outputs/performance_tracking.db

CODEBASE_INVENTORY.md CHANGED

@@ -2,40 +2,38 @@
 ## Executive Summary
-This audit provides a complete technical inventory of the `dev-jas/polymer-aging-ml` repository, a sophisticated machine learning platform for polymer degradation classification using Raman spectroscopy. The system demonstrates production-ready architecture with comprehensive error handling, batch processing capabilities, and an extensible model framework spanning **34 files across 7 directories**.[^1_1][^1_2]
 ## 🏗️ System Architecture
 ### Core Infrastructure
-The platform employs a **Streamlit-based web application** (`app.py` - 53.7 kB) as its primary interface, supported by a modular backend architecture. The system integrates **PyTorch for deep learning**, **Docker for deployment**, and implements a plugin-based model registry for extensibility.[^1_2][^1_3][^1_4]
-### Directory Structure Analysis
-The codebase maintains clean separation of concerns across seven primary directories:[^1_1]
-**Root Level Files:**
-- `app.py` (53.7 kB) - Main Streamlit application with two-column UI layout
-- `README.md` (4.8 kB) - Comprehensive project documentation
-- `Dockerfile` (421 Bytes) - Python 3.13-slim containerization
-- `requirements.txt` (132 Bytes) - Dependency management without version pinning
-**Core Directories:**
-- `models/` - Neural network architectures with registry pattern
-- `utils/` - Shared utility modules (43.2 kB total)
-- `scripts/` - CLI tools and automation workflows
-- `outputs/` - Pre-trained model weights storage
-- `sample_data/` - Demo spectrum files for testing
-- `tests/` - Unit testing infrastructure
-- `datasets/` - Data storage directory (content ignored)
 ## 🤖 Machine Learning Framework
-### Model Registry System
-The platform implements a **sophisticated factory pattern** for model management in `models/registry.py`. This design enables dynamic model selection and provides a unified interface for different architectures:[^1_5]
 ```python
 _REGISTRY: Dict[str, Callable[[int], object]] = {
@@ -47,35 +45,31 @@ _REGISTRY: Dict[str, Callable[[int], object]] = {
 ### Neural Network Architectures
-**1. Figure2CNN (Baseline Model)**[^1_6]
-- **Architecture**: 4 convolutional layers with progressive channel expansion (1→16→32→64→128)
-- **Classification Head**: 3 fully connected layers (256→128→2 neurons)
-- **Performance**: 94.80% accuracy, 94.30% F1-score
-- **Designation**: Validated exclusively for Raman spectra input
-- **Parameters**: Dynamic flattened size calculation for input flexibility
-**2. ResNet1D (Advanced Model)**[^1_7]
-- **Architecture**: 3 residual blocks with skip connections
-- **Innovation**: 1D residual connections for spectral feature learning
-- **Performance**: 96.20% accuracy, 95.90% F1-score
-- **Efficiency**: Global average pooling reduces parameter count
-- **Parameters**: Approximately 100K (more efficient than baseline)
-**3. ResNet18Vision (Deep Architecture)**[^1_8]
-- **Design**: 1D adaptation of ResNet-18 with BasicBlock1D modules
-- **Structure**: 4 residual layers with 2 blocks each
-- **Initialization**: Kaiming normal initialization for optimal training
-- **Status**: Under evaluation for spectral analysis applications
 ## 🔧 Data Processing Infrastructure
 ### Preprocessing Pipeline
-The system implements a **modular preprocessing pipeline** in `utils/preprocessing.py` with five configurable stages:[^1_9]
 **1. Input Validation Framework:**
 - File format verification (`.txt` files exclusively)
@@ -84,16 +78,16 @@ The system implements a **modular preprocessing pipeline** in `utils/preprocessi
 - Monotonic sequence verification for spectral consistency
 - NaN value detection and automatic rejection
-**2. Core Processing Steps:**[^1_9]
 - **Linear Resampling**: Uniform grid interpolation to 500 points using `scipy.interpolate.interp1d`
 - **Baseline Correction**: Polynomial detrending (configurable degree, default=2)
 - **Savitzky-Golay Smoothing**: Noise reduction (window=11, order=2, configurable)
-- **Min-Max Normalization**: Scaling to range with constant-signal protection[^1_1]
 ### Batch Processing Framework
-The `utils/multifile.py` module (12.5 kB) provides **enterprise-grade batch processing** capabilities:[^1_10]
 - **Multi-File Upload**: Streamlit widget supporting simultaneous file selection
 - **Error-Tolerant Processing**: Individual file failures don't interrupt batch operations
@@ -123,7 +117,7 @@ The main application implements a **sophisticated two-column layout** with compr
 ### State Management System
-The application employs **advanced session state management**:[^1_2]
 - Persistent state across Streamlit reruns using `st.session_state`
 - Intelligent caching with content-based hash keys for expensive operations
@@ -134,46 +128,24 @@ The application employs **advanced session state management**:[^1_2]
 ### Centralized Error Handling
-The `utils/errors.py` module (5.51 kB) implements **production-grade error management**:[^1_11]
-```python
-class ErrorHandler:
-    @staticmethod
-    def log_error(error: Exception, context: str = "", include_traceback: bool = False)
-    @staticmethod
-    def handle_file_error(filename: str, error: Exception) -> str
-    @staticmethod
-    def handle_inference_error(model_name: str, error: Exception) -> str
-```
-**Key Features:**
-- Context-aware error messages for different operation types
-- Graceful degradation with fallback modes
-- Structured logging with configurable verbosity
-- User-friendly error translation from technical exceptions
-### Confidence Analysis System
-The `utils/confidence.py` module provides **scientific confidence metrics**
-:
-**Softmax-Based Confidence:**
-- Normalized probability distributions from model logits
-- Three-tier confidence levels: HIGH (≥80%), MEDIUM (≥60%), LOW (<60%)
-- Color-coded visual indicators with emoji representations
-- Legacy compatibility with logit margin calculations
-### Session Results Management
-The `utils/results_manager.py` module (8.16 kB) enables **comprehensive session tracking**:
-- **In-Memory Storage**: Session-wide results persistence
-- **Export Capabilities**: CSV and JSON download with timestamp formatting
-- **Statistical Analysis**: Automatic accuracy calculation when ground truth available
-- **Data Integrity**: Results survive page refreshes within session boundaries
 ## 📜 Command-Line Interface
@@ -194,17 +166,6 @@ The `scripts/train_model.py` module (6.27 kB) implements **robust model training
 - Deterministic CUDA operations when GPU available
 - Standardized train/validation splitting methodology
-### Inference Pipeline
-The `scripts/run_inference.py` module (5.88 kB) provides **automated inference capabilities**:
-**CLI Features:**
-- Preprocessing parity with web interface ensuring consistent results
-- Multiple output formats with detailed metadata inclusion
-- Safe model loading across PyTorch versions with fallback mechanisms
-- Flexible architecture selection via command-line arguments
 ### Data Utilities
 **File Discovery System:**
@@ -213,17 +174,6 @@ The `scripts/run_inference.py` module (5.88 kB) provides **automated inference c
 - Filename-based labeling convention (`sta-*` = stable, `wea-*` = weathered)
 - Dataset inventory generation with statistical summaries
-## 🐳 Deployment Infrastructure
-### Docker Configuration
-The `Dockerfile` (421 Bytes) implements **optimized containerization**:[^1_12]
-- **Base Image**: Python 3.13-slim for minimal attack surface
-- **System Dependencies**: Essential build tools and scientific libraries
-- **Health Monitoring**: HTTP endpoint checking for container wellness
-- **Caching Strategy**: Layered builds with dependency caching for faster rebuilds
 ### Dependency Management
 The `requirements.txt` specifies **core dependencies without version pinning**:[^1_12]
@@ -234,6 +184,36 @@ The `requirements.txt` specifies **core dependencies without version pinning**:[
 - **Visualization**: `matplotlib` for spectrum plotting
 - **API Framework**: `fastapi`, `uvicorn` for potential REST API expansion
 ## 🧪 Testing Framework
 ### Test Infrastructure
@@ -244,12 +224,12 @@ The `tests/` directory implements **basic validation framework**:
 - **Preprocessing Tests**: Core pipeline functionality validation in `test_preprocessing.py`
 - **Limited Coverage**: Currently covers preprocessing functions only
-**Testing Gaps Identified:**
-- No model architecture unit tests
-- Missing integration tests for UI components
-- No performance benchmarking tests
-- Limited error handling validation
 ## 🔍 Security \& Quality Assessment
@@ -271,27 +251,11 @@ The `tests/` directory implements **basic validation framework**:
 - **Error Boundaries**: Multi-level exception handling with graceful degradation
 - **Logging**: Structured logging with appropriate severity levels
-### Security Considerations
-**Current Protections:**
-- Input sanitization through strict parsing rules
-- No arbitrary code execution paths
-- Containerized deployment limiting attack surface
-- Session-based storage preventing data persistence attacks
-**Areas Requiring Enhancement:**
-- No explicit security headers in web responses
-- Basic authentication/authorization framework absent
-- File upload size limits not explicitly configured
-- No rate limiting mechanisms implemented
 ## 🚀 Extensibility Analysis
 ### Model Architecture Extensibility
-The **registry pattern enables seamless model addition**:[^1_5]
 1. **Implementation**: Create new model class with standardized interface
 2. **Registration**: Add to `models/registry.py` with factory function
@@ -344,72 +308,15 @@ The **registry pattern enables seamless model addition**:[^1_5]
 - Session state pruning for long-running sessions
 - Caching with content-based invalidation
-## 🎯 Production Readiness Evaluation
-### Strengths
-**Architecture Excellence:**
-- Clean separation of concerns with modular design
-- Production-grade error handling and logging
-- Intuitive user experience with real-time feedback
-- Scalable batch processing with progress tracking
-- Well-documented, type-hinted codebase
-**Operational Readiness:**
-- Containerized deployment with health checks
-- Comprehensive preprocessing validation
-- Multiple export formats for integration
-- Session-based results management
-### Enhancement Opportunities
-**Testing Infrastructure:**
-- Expand unit test coverage beyond preprocessing
-- Implement integration tests for UI workflows
-- Add performance regression testing
-- Include security vulnerability scanning
-**Monitoring \& Observability:**
-- Application performance monitoring integration
-- User analytics and usage patterns tracking
-- Model performance drift detection
-- Resource utilization monitoring
-**Security Hardening:**
-- Implement proper authentication mechanisms
-- Add rate limiting for API endpoints
-- Configure security headers for web responses
-- Establish audit logging for sensitive operations
 ## 🔮 Strategic Development Roadmap
-Based on the documented roadmap in `README.md`, the platform targets three strategic expansion paths:[^1_13]
-**1. Multi-Model Dashboard Evolution**
-- Comparative model evaluation framework
-- Side-by-side performance reporting
-- Automated model retraining pipelines
-- Model versioning and rollback capabilities
-**2. Multi-Modal Input Support**
-- FTIR spectroscopy integration with dedicated preprocessing
-- Image-based polymer classification via computer vision
-- Cross-modal validation and ensemble methods
-- Unified preprocessing pipeline for multiple modalities
-**3. Enterprise Integration Features**
-- RESTful API development for programmatic access
-- Database integration for persistent storage
-- User authentication and authorization systems
-- Audit trails and compliance reporting
 ## 💼 Business Logic \& Scientific Workflow
@@ -424,7 +331,7 @@ Based on the documented roadmap in `README.md`, the platform targets three strat
 ### Scientific Applications
-**Research Use Cases:**[^1_13]
 - Material science polymer degradation studies
 - Recycling viability assessment for circular economy
@@ -434,7 +341,7 @@ Based on the documented roadmap in `README.md`, the platform targets three strat
 ### Data Workflow Architecture
-```
 Input Validation → Spectrum Preprocessing → Model Inference →
 Confidence Analysis → Results Visualization → Export Options
 ```
@@ -475,10 +382,7 @@ The platform successfully bridges academic research and practical application, p
 **Risk Assessment:** Low - The codebase demonstrates mature engineering practices with appropriate validation and error handling for production deployment.
-**Recommendation:** This platform is ready for production deployment with minimal additional hardening, representing a solid foundation for polymer classification research and industrial applications.
-<span style="display:none">[^1_14][^1_15][^1_16][^1_17][^1_18]</span>
-<div style="text-align: center">⁂</div>
 ### EXTRA
@@ -529,22 +433,3 @@ The platform successfully bridges academic research and practical application, p
     Column 1 (Input): Contains the main st.radio for mode selection and the conditional logic to display the single file uploader, batch uploader, or sample selector. It also holds the "Run Analysis" and "Reset All" buttons.
     Column 2 (Results): Contains all the logic for displaying either the batch results or the detailed, tabbed results for a single file (Details, Technical, Explanation).
 ```
-[^1_1]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/tree/main
-[^1_2]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/tree/main/datasets
-[^1_3]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml
-[^1_4]: https://github.com/KLab-AI3/ml-polymer-recycling
-[^1_5]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/raw/main/.gitignore
-[^1_6]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/blob/main/models/resnet_cnn.py
-[^1_7]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/raw/main/utils/multifile.py
-[^1_8]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/raw/main/utils/preprocessing.py
-[^1_9]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/raw/main/utils/audit.py
-[^1_10]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/raw/main/utils/results_manager.py
-[^1_11]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/blob/main/scripts/train_model.py
-[^1_12]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/raw/main/requirements.txt
-[^1_13]: https://doi.org/10.1016/j.resconrec.2022.106718
-[^1_14]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/raw/main/app.py
-[^1_15]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/raw/main/Dockerfile
-[^1_16]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/raw/main/utils/errors.py
-[^1_17]: https://huggingface.co/spaces/dev-jas/polymer-aging-ml/raw/main/utils/confidence.py
-[^1_18]: https://ppl-ai-code-interpreter-files.s3.amazonaws.com/web/direct-files/9fd1eb2028a28085942cb82c9241b5ae/a25e2c38-813f-4d8b-89b3-713f7d24f1fe/3e70b172.md

 ## Executive Summary
+This audit provides a technical inventory of the dev-jas/polymer-aging-ml repository—a modular machine learning platform for polymer degradation classification using Raman and FTIR spectroscopy. The system features robust error handling, multi-format batch processing, and persistent performance tracking, making it suitable for research, education, and industrial applications.
 ## 🏗️ System Architecture
 ### Core Infrastructure
+- **Streamlit-based web app** (`app.py`) as the main interface
+- **PyTorch** for deep learning
+- **Docker** for deployment
+- **SQLite** (`outputs/performance_tracking.db`) for performance metrics
+- **Plugin-based model registry** for extensibility
+### Directory Structure
+- **app.py**: Main Streamlit application
+- **README.md**: Project documentation
+- **Dockerfile**: Containerization (Python 3.13-slim)
+- **requirements.txt**: Dependency management
+- **models/**: Neural network architectures and registry
+- **utils/**: Shared utilities (preprocessing, batch, results, performance, errors, confidence)
+- **scripts/**: CLI tools for training, inference, data management
+- **outputs/**: Model weights, inference results, performance DB
+- **sample_data/**: Demo spectrum files
+- **tests/**: Unit tests (PyTest)
+- **datasets/**: Data storage
+- **pages/**: Streamlit dashboard pages
 ## 🤖 Machine Learning Framework
+### Model Registry
+Factory pattern in `models/registry.py` enables dynamic model selection:
 ```python
 _REGISTRY: Dict[str, Callable[[int], object]] = {
 ### Neural Network Architectures
+The platform supports three architectures, offering diverse options for spectral analysis:
+**Figure2CNN (Baseline Model):**
+- Architecture: 4 convolutional layers (1→16→32→64→128), 3 fully connected layers (256→128→2).
+- Performance: 94.80% accuracy, 94.30% F1-score (Raman-only).
+- Parameters: ~500K, supports dynamic input handling.
+**ResNet1D (Advanced Model):**
+- Architecture: 3 residual blocks with 1D skip connections.
+- Performance: 96.20% accuracy, 95.90% F1-score.
+- Parameters: ~100K, efficient via global average pooling.
+**ResNet18Vision (Experimental):**
+- Architecture: 1D-adapted ResNet-18 with 4 layers (2 blocks each).
+- Status: Under evaluation, ~11M parameters.
+- Opportunity: Expand validation for broader spectral applications.
 ## 🔧 Data Processing Infrastructure
 ### Preprocessing Pipeline
+The system implements a **modular preprocessing pipeline** in `utils/preprocessing.py` with five configurable stages:
 **1. Input Validation Framework:**
 - File format verification (`.txt` files exclusively)
 - Monotonic sequence verification for spectral consistency
 - NaN value detection and automatic rejection
+**2. Core Processing Steps:**
 - **Linear Resampling**: Uniform grid interpolation to 500 points using `scipy.interpolate.interp1d`
 - **Baseline Correction**: Polynomial detrending (configurable degree, default=2)
 - **Savitzky-Golay Smoothing**: Noise reduction (window=11, order=2, configurable)
+- **Min-Max Normalization**: Scaling to the [0, 1] range with constant-signal protection
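
The four processing steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `utils/preprocessing.py`: the function name `preprocess_spectrum`, its parameter names, and the `1e-12` constant-signal threshold are assumptions made for the example. Only the defaults (500 points, degree 2, window 11, order 2) and the use of `scipy.interpolate.interp1d` come from the inventory text.

```python
import numpy as np
from scipy.interpolate import interp1d
from scipy.signal import savgol_filter


def preprocess_spectrum(x, y, target_len=500, baseline_degree=2,
                        window=11, polyorder=2):
    """Resample, detrend, smooth, and min-max normalize one spectrum.

    Hypothetical helper; the real module may order or expose steps differently.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # 1. Linear resampling onto a uniform grid of target_len points.
    grid = np.linspace(x.min(), x.max(), target_len)
    y_resampled = interp1d(x, y, kind="linear")(grid)

    # 2. Baseline correction: subtract a fitted low-order polynomial.
    coeffs = np.polyfit(grid, y_resampled, deg=baseline_degree)
    y_detrended = y_resampled - np.polyval(coeffs, grid)

    # 3. Savitzky-Golay smoothing for noise reduction.
    y_smooth = savgol_filter(y_detrended, window_length=window,
                             polyorder=polyorder)

    # 4. Min-max normalization; a (near-)constant signal maps to zeros
    #    instead of dividing by zero. The threshold is an assumption.
    span = y_smooth.max() - y_smooth.min()
    if span < 1e-12:
        return grid, np.zeros_like(y_smooth)
    return grid, (y_smooth - y_smooth.min()) / span
```

Returning the resampled grid alongside the intensities keeps the x-axis available for plotting; whether the actual pipeline does this is not stated in the source.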
 ### Batch Processing Framework
+The `utils/multifile.py` module (12.5 kB) provides **enterprise-grade batch processing** capabilities:
 - **Multi-File Upload**: Streamlit widget supporting simultaneous file selection
 - **Error-Tolerant Processing**: Individual file failures don't interrupt batch operations
 ### State Management System
+The application employs **advanced session state management**:
 - Persistent state across Streamlit reruns using `st.session_state`
 - Intelligent caching with content-based hash keys for expensive operations
 ### Centralized Error Handling
+The `utils/errors.py` module provides **context-aware** logging and user-friendly error messages.
+### Performance Tracking System
+The `utils/performance_tracker.py` module provides a robust system for logging and analyzing performance metrics.
+- **Database Logging**: Persists metrics to a SQLite database.
+- **Automated Tracking**: Uses a context manager to automatically track inference time, preprocessing time, and memory usage.
+- **Dashboarding**: Includes functions to generate performance visualizations and summary statistics for the UI.
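
A context manager that times a block and writes the result to SQLite might look like the sketch below. It is only an illustration of the idea described above: the table layout, the names `init_db` and `track`, and the model label `"example-model"` are hypothetical, and memory tracking is left out for brevity. An in-memory database stands in for `outputs/performance_tracking.db`.

```python
import sqlite3
import time
from contextlib import contextmanager


def init_db(conn):
    """Create the (hypothetical) metrics table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metrics ("
        "model TEXT, stage TEXT, seconds REAL, recorded_at REAL)"
    )


@contextmanager
def track(conn, model, stage):
    """Time the enclosed block and persist the duration to SQLite."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the elapsed time even if the tracked block raised.
        elapsed = time.perf_counter() - start
        conn.execute(
            "INSERT INTO metrics VALUES (?, ?, ?, ?)",
            (model, stage, elapsed, time.time()),
        )
        conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the on-disk database
init_db(conn)
with track(conn, "example-model", "inference"):
    sum(range(10_000))  # placeholder for a real inference call
```

Writing the row in a `finally` clause means failed runs are logged too, which may or may not match the behavior of the real tracker.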
+### Enhanced Results Management
+The `utils/results_manager.py` module enables comprehensive session and persistent results tracking.
+- **In-Memory Storage**: Manages results for the current session.
+- **Multi-Model Handling**: Aggregates results from multiple models for comparison.
+- **Export Capabilities**: Exports results to CSV and JSON.
+- **Statistical Analysis**: Calculates accuracy, confidence, and other metrics.
 ## 📜 Command-Line Interface
 - Deterministic CUDA operations when GPU available
 - Standardized train/validation splitting methodology
 ### Data Utilities
 **File Discovery System:**
 - Filename-based labeling convention (`sta-*` = stable, `wea-*` = weathered)
 - Dataset inventory generation with statistical summaries
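
The filename convention can be illustrated with a small helper. The function name `label_from_filename` and the mapping of stable to `0` and weathered to `1` are assumptions for this sketch; the source only states the `sta-*` and `wea-*` prefixes.

```python
from pathlib import Path
from typing import Optional


def label_from_filename(path: str) -> Optional[int]:
    """Infer a class label from the filename prefix.

    Returns 0 for stable ("sta-"), 1 for weathered ("wea-"),
    and None when the name follows neither convention.
    """
    name = Path(path).name.lower()
    if name.startswith("sta-"):
        return 0
    if name.startswith("wea-"):
        return 1
    return None
```

Returning `None` for unrecognized names lets a caller skip or report unlabeled files rather than guessing a class.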
 ### Dependency Management
 The `requirements.txt` specifies **core dependencies without version pinning**:[^1_12]
 - **Visualization**: `matplotlib` for spectrum plotting
 - **API Framework**: `fastapi`, `uvicorn` for potential REST API expansion
+## 🐳 Deployment Infrastructure
+### Docker Configuration
+The Dockerfile uses Python 3.13-slim for efficient containerization:
+- Includes essential build tools and scientific libraries.
+- Supports health checks for container wellness.
+- **Roadmap**: Implement multi-stage builds and environment variables for streamlined deployments.
+### Confidence Analysis System
+The `utils/confidence.py` module provides **scientific confidence metrics**:
+**Softmax-Based Confidence:**
+- Normalized probability distributions from model logits
+- Three-tier confidence levels: HIGH (≥80%), MEDIUM (≥60%), LOW (<60%)
+- Color-coded visual indicators with emoji representations
+- Legacy compatibility with logit margin calculations
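
As a rough sketch of the softmax-based tiers described above, the following uses only the standard library. The function names are hypothetical and do not reflect the actual API of `utils/confidence.py`; the 80% and 60% thresholds are taken from the list.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_level(logits):
    """Return (predicted_index, confidence, tier) for one prediction."""
    probs = softmax(logits)
    confidence = max(probs)
    predicted = probs.index(confidence)
    if confidence >= 0.80:
        tier = "HIGH"
    elif confidence >= 0.60:
        tier = "MEDIUM"
    else:
        tier = "LOW"
    return predicted, confidence, tier
```

For a binary classifier the confidence is never below 50%, so the LOW tier effectively covers predictions between 50% and 60%.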
+### Session Results Management
+The `utils/results_manager.py` module (8.16 kB) enables **comprehensive session tracking**:
+- **In-Memory Storage**: Session-wide results persistence
+- **Export Capabilities**: CSV and JSON download with timestamp formatting
+- **Statistical Analysis**: Automatic accuracy calculation when ground truth available
+- **Data Integrity**: Results survive page refreshes within session boundaries
 ## 🧪 Testing Framework
 ### Test Infrastructure
 - **Preprocessing Tests**: Core pipeline functionality validation in `test_preprocessing.py`
 - **Limited Coverage**: Currently covers preprocessing functions only
+**Planned Testing Work:**
+- Model architecture unit tests
+- Integration tests for UI components
+- Performance benchmarking tests
+- Broader error handling validation
 ## 🔍 Security \& Quality Assessment
 - **Error Boundaries**: Multi-level exception handling with graceful degradation
 - **Logging**: Structured logging with appropriate severity levels
 ## 🚀 Extensibility Analysis
 ### Model Architecture Extensibility
+The **registry pattern enables seamless model addition**:
 1. **Implementation**: Create new model class with standardized interface
 2. **Registration**: Add to `models/registry.py` with factory function
 - Session state pruning for long-running sessions
 - Caching with content-based invalidation
 ## 🔮 Strategic Development Roadmap
+The project roadmap has been updated to reflect recent progress:
+- [x] **FTIR Support**: Modular integration of FTIR spectroscopy is complete.
+- [x] **Multi-Model Dashboard**: A model comparison tab has been implemented.
+- [ ] **Image-based Inference**: Future work to include image-based polymer classification.
+- [x] **Performance Tracking**: A performance tracking dashboard has been implemented.
+- [ ] **Enterprise Integration**: Future work to include a RESTful API and more advanced database integration.
 ## 💼 Business Logic \& Scientific Workflow
 ### Scientific Applications
+**Research Use Cases:**
 - Material science polymer degradation studies
 - Recycling viability assessment for circular economy
 ### Data Workflow Architecture
+```text
 Input Validation → Spectrum Preprocessing → Model Inference →
 Confidence Analysis → Results Visualization → Export Options
 ```
 **Risk Assessment:** Low - The codebase demonstrates mature engineering practices with appropriate validation and error handling for production deployment.
+**Recommendation:** This platform is ready for production deployment, representing a solid foundation for polymer classification research and industrial applications.
 ### EXTRA
     Column 1 (Input): Contains the main st.radio for mode selection and the conditional logic to display the single file uploader, batch uploader, or sample selector. It also holds the "Run Analysis" and "Reset All" buttons.
     Column 2 (Results): Contains all the logic for displaying either the batch results or the detailed, tabbed results for a single file (Details, Technical, Explanation).
 ```

README.md CHANGED

@@ -1,5 +1,5 @@
 ---
-title: AI Polymer Classification
 emoji: 🔬
 colorFrom: indigo
 colorTo: yellow
@@ -8,19 +8,21 @@ app_file: app.py
 pinned: false
 license: apache-2.0
 ---
-## AI-Driven Polymer Aging Prediction and Classification (v0.1)
-This web application classifies the degradation state of polymers using Raman spectroscopy and deep learning.
-It was developed as part of the AIRE 2025 internship project at the Imageomics Institute and demonstrates a prototype pipeline for evaluating multiple convolutional neural networks (CNNs) on spectral data.
 ---
 ## 🧪 Current Scope
-- 🔬 **Modality**: Raman spectroscopy (.txt)
-- 🧠 **Model**: Figure2CNN (baseline)
 - 📊 **Task**: Binary classification — Stable vs Weathered polymers
 - 🛠️ **Architecture**: PyTorch + Streamlit
 ---
@@ -29,84 +31,76 @@ It was developed as part of the AIRE 2025 internship project at the Imageomics I
 - [x] Inference from Raman `.txt` files
 - [x] Model selection (Figure2CNN, ResNet1D)
 - [ ] Add more trained CNNs for comparison
-- [ ] FTIR support (modular integration planned)
 - [ ] Image-based inference (future modality)
 ---
 ## 🧭 How to Use
-1. Upload a Raman spectrum `.txt` file (or select a sample)
-2. Choose a model from the sidebar
-3. Run analysis
-4. View prediction, logits, and technical information
-Supported input:
-- Plaintext `.txt` files with 1–2 columns
-- Space- or comma-separated
-- Comment lines (#) are ignored
-- Automatically resampled to 500 points
----
-## Contributors
-  👨‍🏫 Dr. Sanmukh Kuppannagari (Mentor)
-  👨‍🏫 Dr. Metin Karailyan (Mentor)
-  👨‍💻 Jaser Hasan (Author/Developer)
-## 🧠 Model Credit
-Baseline model inspired by:
-Neo, E.R.K., Low, J.S.C., Goodship, V., Debattista, K. (2023).
-*Deep learning for chemometric analysis of plastic spectral data from infrared and Raman databases.*
-_Resources, Conservation & Recycling_, **188**, 106718.
-[https://doi.org/10.1016/j.resconrec.2022.106718](https://doi.org/10.1016/j.resconrec.2022.106718)
 ---
-## 🔗 Links
-- 💻 **Live App**: [Hugging Face Space](https://huggingface.co/spaces/dev-jas/polymer-aging-ml)
-- 📂 **GitHub Repo**: [ml-polymer-recycling](https://github.com/KLab-AI3/ml-polymer-recycling)
-## 🎯 Strategic Expansion Objectives (Roadmap)
-**The roadmap defines three major expansion paths designed to broaden the system’s capabilities and impact:**
-1. **Model Expansion: Multi-Model Dashboard**
-    > The dashboard will evolve into a hub for multiple model architectures rather than being tied to a single baseline. Planned work includes:
-   - **Retraining & Fine-Tuning**: Incorporating publicly available vision models and retraining them with the polymer dataset.
-   - **Model Registry**: Automatically detecting available .pth weights and exposing them in the dashboard for easy selection.
-   - **Side-by-Side Reporting**: Running comparative experiments and reporting each model’s accuracy and diagnostics in a standardized format.
-   - **Reproducible Integration**: Maintaining modular scripts and pipelines so each model’s results can be replicated without conflict.
-   This ensures flexibility for future research and transparency in performance comparisons.
-2. **Image Input Modality**
-    > The system will support classification on images as an additional modality, extending beyond spectra. Key features will include:
-   - **Upload Support**: Users can upload single images or batches directly through the dashboard.
-   - **Multi-Model Execution**: Selected models from the registry can be applied to all uploaded images simultaneously.
-   - **Batch Results**: Output will be returned in a structured, accessible way, showing both individual predictions and aggregate statistics.
-   - **Enhanced Feedback**: Outputs will include predicted class, model confidence, and potentially annotated image previews.
-   This expands the system toward a multi-modal framework, supporting broader research workflows.
-3. **FTIR Dataset Integration**
-    > Although previously deferred, FTIR support will be added back in a modular, distinct fashion. Planned steps are:
-    - **Dedicated Preprocessing**: Tailored scripts to handle FTIR-specific signal characteristics (multi-layer handling, baseline correction, normalization).
-    - **Architecture Compatibility**: Ensuring existing and retrained models can process FTIR data without mixing it with Raman workflows.
-    - **UI Integration**: Introducing FTIR as a separate option in the modality selector, keeping Raman, Image, and FTIR workflows clearly delineated.
-    - **Phased Development**: Implementation details to be refined during meetings to ensure scientific rigor.
-    This guarantees FTIR becomes a supported modality without undermining the validated Raman foundation.

 ---
+title: AI Polymer Classification (Raman & FTIR)
 emoji: 🔬
 colorFrom: indigo
 colorTo: yellow
 pinned: false
 license: apache-2.0
 ---
+## AI-Driven Polymer Aging Prediction and Classification (v0.1)
+This web application classifies the degradation state of polymers using **Raman and FTIR spectroscopy** and deep learning.
+It is a prototype pipeline for evaluating multiple convolutional neural networks (CNNs) on spectral data.
 ---
 ## 🧪 Current Scope
+- 🔬 **Modalities**: Raman & FTIR spectroscopy
+- 💾 **Input Formats**: `.txt`, `.csv`, `.json` (with auto-detection)
+- 🧠 **Models**: Figure2CNN (baseline), ResNet1D, ResNet18Vision
 - 📊 **Task**: Binary classification — Stable vs Weathered polymers
+- 🚀 **Features**: Multi-model comparison, performance tracking dashboard
 - 🛠️ **Architecture**: PyTorch + Streamlit
 ---
 - [x] Inference from Raman `.txt` files
 - [x] Model selection (Figure2CNN, ResNet1D)
+- [x] **FTIR support** (modular integration complete)
+- [x] **Multi-model comparison dashboard**
+- [x] **Performance tracking dashboard**
 - [ ] Add more trained CNNs for comparison
 - [ ] Image-based inference (future modality)
+- [ ] RESTful API for programmatic access
 ---
 ## 🧭 How to Use
+The application provides three main analysis modes in a tabbed interface:
+1.  **Standard Analysis**:
+    - Upload a single spectrum file (`.txt`, `.csv`, `.json`) or a batch of files.
+    - Choose a model from the sidebar.
+    - Run analysis and view the prediction, confidence, and technical details.
+2.  **Model Comparison**:
+    - Upload a single spectrum file.
+    - The app runs inference with all available models.
+    - View a side-by-side comparison of the models' predictions and performance.
+3.  **Performance Tracking**:
+    - Explore a dashboard with visualizations of historical performance data.
+    - Compare model performance across different metrics.
+    - Export performance data in CSV or JSON format.
+### Supported Input
+- Plaintext `.txt`, `.csv`, or `.json` files.
+- Data can be space-, comma-, or tab-separated.
+- Comment lines (`#`, `%`) are ignored.
+- The app automatically detects the file format and resamples the data to a standard length.
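
The plaintext rules above (space, comma, or tab delimiters; `#` and `%` comments) could be handled along these lines. This is a sketch under stated assumptions, not the parser in `utils/multifile.py`: the name `parse_spectrum_text` is hypothetical, only two-column rows are kept, and non-numeric rows such as a CSV header are silently skipped.

```python
import re


def parse_spectrum_text(text):
    """Parse two-column spectrum text into (x_values, y_values) lists."""
    xs, ys = [], []
    for raw in text.splitlines():
        line = raw.strip()
        # Ignore blank lines and comment lines starting with '#' or '%'.
        if not line or line.startswith(("#", "%")):
            continue
        # Split on any mix of commas and whitespace (spaces or tabs).
        fields = [f for f in re.split(r"[,\s]+", line) if f]
        try:
            values = [float(f) for f in fields[:2]]
        except ValueError:
            continue  # e.g. a header row such as "wavenumber,intensity"
        if len(values) == 2:
            xs.append(values[0])
            ys.append(values[1])
    return xs, ys
```

JSON input and format auto-detection are outside the scope of this example.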
 ---
+## Contributors
+Dr. Sanmukh Kuppannagari (Mentor)
+Dr. Metin Karailyan (Mentor)
+Jaser Hasan (Author/Developer)
+## Model Credit
+Baseline model inspired by:
+Neo, E.R.K., Low, J.S.C., Goodship, V., Debattista, K. (2023).
+_Deep learning for chemometric analysis of plastic spectral data from infrared and Raman databases._
+_Resources, Conservation & Recycling_, **188**, 106718.
+[https://doi.org/10.1016/j.resconrec.2022.106718](https://doi.org/10.1016/j.resconrec.2022.106718)
+---
+## 🔗 Links
+- **Live App**: [Hugging Face Space](https://huggingface.co/spaces/dev-jas/polymer-aging-ml)
+- **GitHub Repo**: [ml-polymer-recycling](https://github.com/KLab-AI3/ml-polymer-recycling)
+## 🚀 Technical Architecture
+**The system is built on a modular architecture designed for scalability and maintainability.**
+- **Frontend**: A Streamlit-based web application (`app.py`) provides an interactive, multi-tab user interface.
+- **Backend**: PyTorch handles all deep learning operations, including model loading and inference.
+- **Model Management**: A registry pattern (`models/registry.py`) allows for dynamic model loading and easy integration of new architectures.
+- **Data Processing**: A robust, modality-aware preprocessing pipeline (`utils/preprocessing.py`) ensures data integrity and standardization for both Raman and FTIR data.
+- **Multi-Format Parsing**: The `utils/multifile.py` module handles parsing of `.txt`, `.csv`, and `.json` files.
+- **Results Management**: The `utils/results_manager.py` module manages session and persistent results, with support for multi-model comparison and data export.
+- **Performance Tracking**: The `utils/performance_tracker.py` module logs performance metrics to a SQLite database and provides a dashboard for visualization.
+- **Deployment**: The application is containerized using Docker (`Dockerfile`) for reproducible, cross-platform execution.
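
Note on the README's "Supported Input" section: the parsing and resampling behaviour it describes (space-, comma-, or tab-separated columns, `#`/`%` comment lines ignored, resampling to a standard length) can be illustrated with a small standalone sketch. This is not the code in `utils/multifile.py` or `utils/preprocessing.py`, which is not shown in this part of the diff. The function names `parse_two_column_text` and `resample_linear` are hypothetical, plain linear interpolation is an assumed resampling method, and the default length of 500 is taken from the `input_length` values in `models/registry.py`.

```python
from typing import List, Tuple


def parse_two_column_text(text: str) -> Tuple[List[float], List[float]]:
    """Parse wavenumber/intensity pairs from space-, comma-, or tab-separated text."""
    xs: List[float] = []
    ys: List[float] = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines starting with '#' or '%'.
        if not line or line.startswith(("#", "%")):
            continue
        # Treat commas as whitespace so all three separators share one code path.
        parts = line.replace(",", " ").split()
        if len(parts) < 2:
            continue
        try:
            x, y = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # non-numeric row, e.g. a CSV header
        xs.append(x)
        ys.append(y)
    return xs, ys


def resample_linear(
    xs: List[float], ys: List[float], target_len: int = 500
) -> Tuple[List[float], List[float]]:
    """Linearly interpolate (xs, ys) onto target_len evenly spaced points."""
    if len(xs) < 2 or target_len < 2:
        raise ValueError("need at least two data points and target_len >= 2")
    # Sort by wavenumber so descending-order files are handled too.
    pairs = sorted(zip(xs, ys))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    step = (xs[-1] - xs[0]) / (target_len - 1)
    grid = [xs[0] + i * step for i in range(target_len)]
    out: List[float] = []
    j = 0
    for g in grid:
        # Advance to the segment [xs[j], xs[j + 1]] that contains g.
        while j < len(xs) - 2 and xs[j + 1] < g:
            j += 1
        x0, x1 = xs[j], xs[j + 1]
        t = 0.0 if x1 == x0 else (g - x0) / (x1 - x0)
        t = min(max(t, 0.0), 1.0)  # guard against float round-off at the ends
        out.append(ys[j] + t * (ys[j + 1] - ys[j]))
    return grid, out
```

The sketch uses only the standard library so it can be run in isolation; a NumPy-based version would typically replace the interpolation loop with `numpy.interp`.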

app.py CHANGED Viewed

@@ -8,6 +8,8 @@ from modules.ui_components import (
     render_sidebar,
     render_results_column,
     render_input_column,
     load_css,
 )
@@ -27,14 +29,28 @@ def main():
     load_css("static/style.css")
     init_session_state()
-    # Render UI components
     render_sidebar()
-    col1, col2 = st.columns([1, 1.35], gap="small")
-    with col1:
-        render_input_column()
-    with col2:
-        render_results_column()
 if __name__ == "__main__":

     render_sidebar,
     render_results_column,
     render_input_column,
+    render_comparison_tab,
+    render_performance_tab,
     load_css,
 )
     load_css("static/style.css")
     init_session_state()
     render_sidebar()
+    # Create main tabs for the different analysis modes
+    tab1, tab2, tab3 = st.tabs(
+        ["Standard Analysis", "Model Comparison", "Performance Tracking"]
+    )
+    with tab1:
+        # Standard single-model analysis
+        col1, col2 = st.columns([1, 1.35], gap="small")
+        with col1:
+            render_input_column()
+        with col2:
+            render_results_column()
+    with tab2:
+        # Multi-model comparison interface
+        render_comparison_tab()
+    with tab3:
+        # Performance tracking interface
+        render_performance_tab()
 if __name__ == "__main__":

core_logic.py CHANGED Viewed

@@ -10,6 +10,7 @@ import numpy as np
 import streamlit as st
 from pathlib import Path
 from config import SAMPLE_DATA_DIR
 def label_file(filename: str) -> int:
@@ -89,16 +90,26 @@ def cleanup_memory():
 @st.cache_data
 def run_inference(y_resampled, model_choice, _cache_key=None):
-    """Run model inference and cache results"""
     model, model_loaded = load_model(model_choice)
     if not model_loaded:
         return None, None, None, None, None
     input_tensor = (
         torch.tensor(y_resampled, dtype=torch.float32).unsqueeze(0).unsqueeze(0)
     )
     start_time = time.time()
-    model.eval()
     with torch.no_grad():
         if model is None:
             raise ValueError(
@@ -108,11 +119,51 @@ def run_inference(y_resampled, model_choice, _cache_key=None):
         prediction = torch.argmax(logits, dim=1).item()
         logits_list = logits.detach().numpy().tolist()[0]
         probs = F.softmax(logits.detach(), dim=1).cpu().numpy().flatten()
     inference_time = time.time() - start_time
     cleanup_memory()
     return prediction, logits_list, probs, inference_time, logits
 @st.cache_data
 def get_sample_files():
     """Get list of sample files if available"""

 import streamlit as st
 from pathlib import Path
 from config import SAMPLE_DATA_DIR
+from datetime import datetime
 def label_file(filename: str) -> int:
 @st.cache_data
 def run_inference(y_resampled, model_choice, _cache_key=None):
+    """Run model inference and cache results with performance tracking"""
+    from utils.performance_tracker import get_performance_tracker, PerformanceMetrics
+    from datetime import datetime
     model, model_loaded = load_model(model_choice)
     if not model_loaded:
         return None, None, None, None, None
+    # Performance tracking setup
+    tracker = get_performance_tracker()
     input_tensor = (
         torch.tensor(y_resampled, dtype=torch.float32).unsqueeze(0).unsqueeze(0)
     )
+    # Track inference performance
     start_time = time.time()
+    start_memory = _get_memory_usage()
+    model.eval()  # type: ignore
     with torch.no_grad():
         if model is None:
             raise ValueError(
         prediction = torch.argmax(logits, dim=1).item()
         logits_list = logits.detach().numpy().tolist()[0]
         probs = F.softmax(logits.detach(), dim=1).cpu().numpy().flatten()
     inference_time = time.time() - start_time
+    end_memory = _get_memory_usage()
+    memory_usage = max(end_memory - start_memory, 0)
+    # Log performance metrics
+    try:
+        modality = st.session_state.get("modality_select", "raman")
+        confidence = float(max(probs)) if probs is not None and len(probs) > 0 else 0.0
+        metrics = PerformanceMetrics(
+            model_name=model_choice,
+            prediction_time=inference_time,
+            preprocessing_time=0.0,  # Will be updated by calling function if available
+            total_time=inference_time,
+            memory_usage_mb=memory_usage,
+            accuracy=None,  # Will be updated if ground truth is available
+            confidence=confidence,
+            timestamp=datetime.now().isoformat(),
+            input_size=(
+                len(y_resampled) if hasattr(y_resampled, "__len__") else TARGET_LEN
+            ),
+            modality=modality,
+        )
+        tracker.log_performance(metrics)
+    except (AttributeError, ValueError, KeyError) as e:
+        # Don't fail inference if performance tracking fails
+        print(f"Performance tracking failed: {e}")
     cleanup_memory()
     return prediction, logits_list, probs, inference_time, logits
+def _get_memory_usage() -> float:
+    """Get current memory usage in MB"""
+    try:
+        import psutil
+        process = psutil.Process()
+        return process.memory_info().rss / 1024 / 1024  # Convert to MB
+    except ImportError:
+        return 0.0  # psutil not available
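
Note on the tracking call in `run_inference` above: `utils/performance_tracker.py` is not shown in this part of the diff, so the following is only a sketch of how a metrics record with the same field names could be written to SQLite, which the README names as the storage backend. The `PerformanceMetrics` fields mirror the keyword arguments used in `run_inference`. The `SketchTracker` class, the table name `performance`, its column layout, and the `mean_prediction_time` helper are assumptions made for illustration.

```python
import sqlite3
from dataclasses import astuple, dataclass
from typing import Optional


@dataclass
class PerformanceMetrics:
    # Field names follow the keyword arguments passed in run_inference.
    model_name: str
    prediction_time: float
    preprocessing_time: float
    total_time: float
    memory_usage_mb: float
    accuracy: Optional[float]
    confidence: float
    timestamp: str
    input_size: int
    modality: str


class SketchTracker:
    """Hypothetical stand-in for the tracker returned by get_performance_tracker()."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(db_path)
        # Column order matches the dataclass field order so astuple() can be used.
        self.conn.execute(
            """
            CREATE TABLE IF NOT EXISTS performance (
                model_name TEXT, prediction_time REAL, preprocessing_time REAL,
                total_time REAL, memory_usage_mb REAL, accuracy REAL,
                confidence REAL, timestamp TEXT, input_size INTEGER, modality TEXT
            )
            """
        )

    def log_performance(self, metrics: PerformanceMetrics) -> None:
        """Insert one metrics record; None accuracy is stored as NULL."""
        self.conn.execute(
            "INSERT INTO performance VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
            astuple(metrics),
        )
        self.conn.commit()

    def mean_prediction_time(self, model_name: str) -> Optional[float]:
        """Average prediction time for one model, or None if nothing was logged."""
        row = self.conn.execute(
            "SELECT AVG(prediction_time) FROM performance WHERE model_name = ?",
            (model_name,),
        ).fetchone()
        return row[0]
```

Keeping the record as a flat dataclass makes each inference a single row, which is what per-model comparisons and CSV/JSON export in a dashboard would need.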
 @st.cache_data
 def get_sample_files():
     """Get list of sample files if available"""

models/registry.py CHANGED Viewed

@@ -1,35 +1,138 @@
 # models/registry.py
-from typing import Callable, Dict
 from models.figure2_cnn import Figure2CNN
 from models.resnet_cnn import ResNet1D
-from models.resnet18_vision import ResNet18Vision
 # Internal registry of model builders keyed by short name.
 _REGISTRY: Dict[str, Callable[[int], object]] = {
     "figure2": lambda L: Figure2CNN(input_length=L),
     "resnet": lambda L: ResNet1D(input_length=L),
-    "resnet18vision": lambda L: ResNet18Vision(input_length=L)
 }
 def choices():
     """Return the list of available model keys."""
     return list(_REGISTRY.keys())
 def build(name: str, input_length: int):
     """Instantiate a model by short name with the given input length."""
     if name not in _REGISTRY:
         raise ValueError(f"Unknown model '{name}'. Choices: {choices()}")
     return _REGISTRY[name](input_length)
 def spec(name: str):
     """Return expected input length and number of classes for a model key."""
-    if name == "figure2":
-        return {"input_length": 500, "num_classes": 2}
-    if name == "resnet":
-        return {"input_length": 500, "num_classes": 2}
-    if name == "resnet18vision":
-        return {"input_length": 500, "num_classes": 2}
-    raise KeyError(f"Unknown model '{name}'")
-__all__ = ["choices", "build"]

 # models/registry.py
+from typing import Callable, Dict, List, Any
 from models.figure2_cnn import Figure2CNN
 from models.resnet_cnn import ResNet1D
+from models.resnet18_vision import ResNet18Vision
 # Internal registry of model builders keyed by short name.
 _REGISTRY: Dict[str, Callable[[int], object]] = {
     "figure2": lambda L: Figure2CNN(input_length=L),
     "resnet": lambda L: ResNet1D(input_length=L),
+    "resnet18vision": lambda L: ResNet18Vision(input_length=L),
 }
+# Model specifications with metadata for enhanced features
+_MODEL_SPECS: Dict[str, Dict[str, Any]] = {
+    "figure2": {
+        "input_length": 500,
+        "num_classes": 2,
+        "description": "Figure 2 baseline custom implementation",
+        "modalities": ["raman", "ftir"],
+        "citation": "Neo et al., 2023, Resour. Conserv. Recycl., 188, 106718",
+    },
+    "resnet": {
+        "input_length": 500,
+        "num_classes": 2,
+        "description": "(Residual Network) uses skip connections to train much deeper networks",
+        "modalities": ["raman", "ftir"],
+        "citation": "Custom ResNet implementation",
+    },
+    "resnet18vision": {
+        "input_length": 500,
+        "num_classes": 2,
+        "description": "excels at image recognition tasks by using 'residual blocks' to train more efficiently",
+        "modalities": ["raman", "ftir"],
+        "citation": "ResNet18 Vision adaptation",
+    },
+}
+# Placeholder for future model expansions
+_FUTURE_MODELS = {
+    "densenet1d": {
+        "description": "DenseNet1D for spectroscopy (placeholder)",
+        "status": "planned",
+    },
+    "ensemble_cnn": {
+        "description": "Ensemble of CNN variants (placeholder)",
+        "status": "planned",
+    },
+}
 def choices():
     """Return the list of available model keys."""
     return list(_REGISTRY.keys())
+def planned_models():
+    """Return the list of planned future model keys."""
+    return list(_FUTURE_MODELS.keys())
 def build(name: str, input_length: int):
     """Instantiate a model by short name with the given input length."""
     if name not in _REGISTRY:
         raise ValueError(f"Unknown model '{name}'. Choices: {choices()}")
     return _REGISTRY[name](input_length)
+def build_multiple(names: List[str], input_length: int) -> Dict[str, Any]:
+    """Build multiple models for comparison."""
+    models = {}
+    for name in names:
+        if name in _REGISTRY:
+            models[name] = build(name, input_length)
+        else:
+            raise ValueError(f"Unknown model '{name}'. Available: {choices()}")
+    return models
+def register_model(
+    name: str, builder: Callable[[int], object], spec: Dict[str, Any]
+) -> None:
+    """Dynamically register a new model."""
+    if name in _REGISTRY:
+        raise ValueError(f"Model '{name}' already registered.")
+    if not callable(builder):
+        raise TypeError("Builder must be a callable that accepts an integer argument.")
+    _REGISTRY[name] = builder
+    _MODEL_SPECS[name] = spec
 def spec(name: str):
     """Return expected input length and number of classes for a model key."""
+    if name in _MODEL_SPECS:
+        return _MODEL_SPECS[name].copy()
+    raise KeyError(f"Unknown model '{name}'. Available: {choices()}")
+def get_model_info(name: str) -> Dict[str, Any]:
+    """Get comprehensive model information including metadata."""
+    if name in _MODEL_SPECS:
+        return _MODEL_SPECS[name].copy()
+    elif name in _FUTURE_MODELS:
+        return _FUTURE_MODELS[name].copy()
+    else:
+        raise KeyError(f"Unknown model '{name}'")
+def models_for_modality(modality: str) -> List[str]:
+    """Get list of models that support a specific modality."""
+    compatible = []
+    for name, spec_info in _MODEL_SPECS.items():
+        if modality in spec_info.get("modalities", []):
+            compatible.append(name)
+    return compatible
+def validate_model_list(names: List[str]) -> List[str]:
+    """Validate and return list of available models from input list."""
+    available = choices()
+    valid_models = []
+    for name in names:
+        if name in available:
+            valid_models.append(name)
+    return valid_models
+__all__ = [
+    "choices",
+    "build",
+    "spec",
+    "build_multiple",
+    "register_model",
+    "get_model_info",
+    "models_for_modality",
+    "validate_model_list",
+    "planned_models",
+]
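
Note on the registry above: a short usage sketch may help show how the registration and modality-filtering functions are meant to work together. Because the real module imports the project's model classes, the sketch re-creates a minimal registry with the same function names and a hypothetical `DummyModel` class and `"dummy1d"` key. It is an illustration of the pattern, not a copy of `models/registry.py`.

```python
from typing import Any, Callable, Dict, List

# Stand-in state; the real registry lives in models/registry.py.
_REGISTRY: Dict[str, Callable[[int], object]] = {}
_MODEL_SPECS: Dict[str, Dict[str, Any]] = {}


class DummyModel:
    """Hypothetical placeholder for a torch.nn.Module subclass."""

    def __init__(self, input_length: int) -> None:
        self.input_length = input_length


def register_model(
    name: str, builder: Callable[[int], object], spec: Dict[str, Any]
) -> None:
    """Add a builder and its metadata under a new key."""
    if name in _REGISTRY:
        raise ValueError(f"Model '{name}' already registered.")
    if not callable(builder):
        raise TypeError("Builder must be a callable that accepts an integer argument.")
    _REGISTRY[name] = builder
    _MODEL_SPECS[name] = spec


def build(name: str, input_length: int) -> object:
    """Instantiate a registered model by key."""
    if name not in _REGISTRY:
        raise ValueError(f"Unknown model '{name}'. Choices: {list(_REGISTRY)}")
    return _REGISTRY[name](input_length)


def models_for_modality(modality: str) -> List[str]:
    """Return the keys whose spec lists the given modality."""
    return [
        name
        for name, spec_info in _MODEL_SPECS.items()
        if modality in spec_info.get("modalities", [])
    ]


# Register a Raman-only model, build it, and filter by modality.
register_model(
    "dummy1d",
    DummyModel,
    {"input_length": 500, "num_classes": 2, "modalities": ["raman"]},
)
model = build("dummy1d", 500)
raman_models = models_for_modality("raman")
ftir_models = models_for_modality("ftir")
```

Storing builders as callables rather than instances means nothing is constructed until `build` is called, so registering many architectures stays cheap.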

modules/ui_components.py CHANGED Viewed

@@ -13,9 +13,9 @@ from modules.callbacks import (
     on_model_change,
     on_input_mode_change,
     on_sample_change,
     reset_ephemeral_state,
     log_message,
-    clear_batch_results,
 )
 from core_logic import (
     get_sample_files,
@@ -24,7 +24,6 @@ from core_logic import (
     parse_spectrum_data,
     label_file,
 )
-from modules.callbacks import reset_results
 from utils.results_manager import ResultsManager
 from utils.confidence import calculate_softmax_confidence
 from utils.multifile import process_multiple_files, display_batch_results
@@ -41,7 +40,7 @@ def create_spectrum_plot(x_raw, y_raw, x_resampled, y_resampled, _cache_key=None
     """Create spectrum visualization plot"""
     fig, ax = plt.subplots(1, 2, figsize=(13, 5), dpi=100)
-    # == Raw spectrum ==
     ax[0].plot(x_raw, y_raw, label="Raw", color="dimgray", linewidth=1)
     ax[0].set_title("Raw Input Spectrum")
     ax[0].set_xlabel("Wavenumber (cm⁻¹)")
@@ -49,7 +48,7 @@ def create_spectrum_plot(x_raw, y_raw, x_resampled, y_resampled, _cache_key=None
     ax[0].grid(True, alpha=0.3)
     ax[0].legend()
-    # == Resampled spectrum ==
     ax[1].plot(
         x_resampled, y_resampled, label="Resampled", color="steelblue", linewidth=1
     )
@@ -60,7 +59,7 @@ def create_spectrum_plot(x_raw, y_raw, x_resampled, y_resampled, _cache_key=None
     ax[1].legend()
     fig.tight_layout()
-    # == Convert to image ==
     buf = io.BytesIO()
     plt.savefig(buf, format="png", bbox_inches="tight", dpi=100)
     buf.seek(0)
@@ -69,6 +68,9 @@ def create_spectrum_plot(x_raw, y_raw, x_resampled, y_resampled, _cache_key=None
     return Image.open(buf)
 def render_confidence_progress(
     probs: np.ndarray,
     labels: list[str] = ["Stable", "Weathered"],
@@ -114,7 +116,10 @@ def render_confidence_progress(
                     st.markdown("")
-def render_kv_grid(d: dict = {}, ncols: int = 2):
     if d is None:
         d = {}
     if not d:
@@ -126,6 +131,9 @@ def render_kv_grid(d: dict = {}, ncols: int = 2):
             st.caption(f"**{k}:** {v}")
 def render_model_meta(model_choice: str):
     info = MODEL_CONFIG.get(model_choice, {})
     emoji = info.get("emoji", "")
@@ -143,6 +151,9 @@ def render_model_meta(model_choice: str):
         st.caption(desc)
 def get_confidence_description(logit_margin):
     """Get human-readable confidence description"""
     if logit_margin > 1000:
@@ -155,13 +166,35 @@ def get_confidence_description(logit_margin):
         return "LOW", "🔴"
 def render_sidebar():
     with st.sidebar:
         # Header
         st.header("AI-Driven Polymer Classification")
         st.caption(
-            "Predict polymer degradation (Stable vs Weathered) from Raman spectra using validated CNN models. — v0.1"
         )
         model_labels = [
             f"{MODEL_CONFIG[name]['emoji']} {name}" for name in MODEL_CONFIG.keys()
         ]
@@ -173,10 +206,10 @@ def render_sidebar():
         )
         model_choice = selected_label.split(" ", 1)[1]
-        # ===Compact metadata directly under dropdown===
         render_model_meta(model_choice)
-        # ===Collapsed info to reduce clutter===
         with st.expander("About This App", icon=":material/info:", expanded=False):
             st.markdown(
                 """
@@ -184,8 +217,9 @@ def render_sidebar():
             **Purpose**: Classify polymer degradation using AI<br>
             **Input**: Raman spectroscopy .txt files<br>
-            **Models**: CNN architectures for binary classification<br>
-            **Next**: More trained CNNs in evaluation pipeline<br>
             **Contributors**<br>
@@ -207,11 +241,7 @@ def render_sidebar():
             )
-# col1 goes here
-# In modules/ui_components.py
 def render_input_column():
     st.markdown("##### Data Input")
@@ -224,22 +254,20 @@ def render_input_column():
     )
     # == Input Mode Logic ==
-    # ... (The if/elif/else block for Upload, Batch, and Sample modes remains exactly the same) ...
-    # ==Upload tab==
     if mode == "Upload File":
         upload_key = st.session_state["current_upload_key"]
         up = st.file_uploader(
-            "Upload Raman spectrum (.txt)",
-            type="txt",
-            help="Upload a text file with wavenumber and intensity columns",
             key=upload_key,  # ← versioned key
         )
-        # ==Process change immediately (no on_change; simpler & reliable)==
         if up is not None:
             raw = up.read()
             text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
-            # == only reparse if its a different file|source ==
             if (
                 st.session_state.get("filename") != getattr(up, "name", None)
                 or st.session_state.get("input_source") != "upload"
@@ -255,23 +283,20 @@ def render_input_column():
                 st.session_state["status_type"] = "success"
                 reset_results("New file uploaded")
-    # ==Batch Upload tab==
     elif mode == "Batch Upload":
         st.session_state["batch_mode"] = True
-        # --- START: BUG 1 & 3 FIX ---
         # Use a versioned key to ensure the file uploader resets properly.
         batch_upload_key = f"batch_upload_{st.session_state['uploader_version']}"
         uploaded_files = st.file_uploader(
-            "Upload multiple Raman spectrum files (.txt)",
-            type="txt",
             accept_multiple_files=True,
-            help="Upload one or more text files with wavenumber and intensity columns.",
             key=batch_upload_key,
         )
-        # --- END: BUG 1 & 3 FIX ---
         if uploaded_files:
-            # --- START: Bug 1 Fix ---
             # Use a dictionary to keep only unique files based on name and size
             unique_files = {(file.name, file.size): file for file in uploaded_files}
             unique_file_list = list(unique_files.values())
@@ -281,9 +306,7 @@ def render_input_column():
             # Optionally, inform the user that duplicates were removed
             if num_uploaded > num_unique:
-                st.info(
-                    f"ℹ️ {num_uploaded - num_unique} duplicate file(s) were removed."
-                )
             # Use the unique list
             st.session_state["batch_files"] = unique_file_list
@@ -291,7 +314,6 @@ def render_input_column():
                 f"{num_unique} ready for batch analysis"
             )
             st.session_state["status_type"] = "success"
-            # --- END: Bug 1 Fix ---
         else:
             st.session_state["batch_files"] = []
             # This check prevents resetting the status if files are already staged
@@ -301,7 +323,7 @@ def render_input_column():
                 )
                 st.session_state["status_type"] = "info"
-    # ==Sample tab==
     elif mode == "Sample Data":
         st.session_state["batch_mode"] = False
         sample_files = get_sample_files()
@@ -330,9 +352,6 @@ def render_input_column():
     else:
         st.info(msg)
-    # --- DE-NESTED LOGIC STARTS HERE ---
-    # This code now runs on EVERY execution, guaranteeing the buttons will appear.
     # Safely get model choice from session state
     model_choice = st.session_state.get("model_select", " ").split(" ", 1)[1]
     model = load_model(model_choice)
@@ -388,7 +407,7 @@ def render_input_column():
                 st.error(f"Error processing spectrum data: {e}")
-# col2 goes here
 def render_results_column():
@@ -410,7 +429,7 @@ def render_results_column():
         filename = st.session_state.get("filename", "Unknown")
         if all(v is not None for v in [x_raw, y_raw, y_resampled]):
-            # ===Run inference===
             if y_resampled is None:
                 raise ValueError(
                     "y_resampled is None. Ensure spectrum data is properly resampled before proceeding."
@@ -437,14 +456,14 @@ def render_results_column():
                 f"Inference completed in {inference_time:.2f}s, prediction: {prediction}"
             )
-            # ===Get ground truth===
             true_label_idx = label_file(filename)
             true_label_str = (
                 LABEL_MAP.get(true_label_idx, "Unknown")
                 if true_label_idx is not None
                 else "Unknown"
             )
-            # ===Get prediction===
             predicted_class = LABEL_MAP.get(int(prediction), f"Class {int(prediction)}")
             # Enhanced confidence calculation
@@ -455,7 +474,7 @@ def render_results_column():
                 )
                 confidence_desc = confidence_level
             else:
-                # Fallback to legace method
                 logit_margin = abs(
                     (logits_list[0] - logits_list[1])
                     if logits_list is not None and len(logits_list) >= 2
@@ -487,7 +506,7 @@ def render_results_column():
                 },
             )
-            # ===Precompute Stats===
             model_choice = (
                 st.session_state.get("model_select", "").split(" ", 1)[1]
                 if "model_select" in st.session_state
@@ -505,7 +524,6 @@ def render_results_column():
                 if os.path.exists(model_path)
                 else "N/A"
             )
-            # Removed unused variable 'input_tensor'
             start_render = time.time()
@@ -590,17 +608,13 @@ def render_results_column():
                     """,
                         unsafe_allow_html=True,
                     )
-                    # --- END: CONSOLIDATED CONFIDENCE ANALYSIS ---
                     st.divider()
-                    # --- START: CLEAN METADATA FOOTER ---
-                    # Secondary info is now a clean, single-line caption
                     st.caption(
                         f"Analyzed with **{st.session_state.get('model_select', 'Unknown')}** in **{inference_time:.2f}s**."
                     )
-                    # --- END: CLEAN METADATA FOOTER ---
                 st.markdown("</div>", unsafe_allow_html=True)
             elif active_tab == "Technical":
@@ -918,7 +932,7 @@ def render_results_column():
             """
             )
     else:
-        # ===Getting Started===
         st.markdown(
             """
         ##### How to Get Started
@@ -948,3 +962,416 @@ def render_results_column():
         - 🏭 Quality control in manufacturing
         """
         )

     on_model_change,
     on_input_mode_change,
     on_sample_change,
+    reset_results,
     reset_ephemeral_state,
     log_message,
 )
 from core_logic import (
     get_sample_files,
     parse_spectrum_data,
     label_file,
 )
 from utils.results_manager import ResultsManager
 from utils.confidence import calculate_softmax_confidence
 from utils.multifile import process_multiple_files, display_batch_results
     """Create spectrum visualization plot"""
     fig, ax = plt.subplots(1, 2, figsize=(13, 5), dpi=100)
+    # Raw spectrum
     ax[0].plot(x_raw, y_raw, label="Raw", color="dimgray", linewidth=1)
     ax[0].set_title("Raw Input Spectrum")
     ax[0].set_xlabel("Wavenumber (cm⁻¹)")
     ax[0].grid(True, alpha=0.3)
     ax[0].legend()
+    # Resampled spectrum
     ax[1].plot(
         x_resampled, y_resampled, label="Resampled", color="steelblue", linewidth=1
     )
     ax[1].legend()
     fig.tight_layout()
+    # Convert to image
     buf = io.BytesIO()
     plt.savefig(buf, format="png", bbox_inches="tight", dpi=100)
     buf.seek(0)
     return Image.open(buf)
+# //////////////////////////////////////////
 def render_confidence_progress(
     probs: np.ndarray,
     labels: list[str] = ["Stable", "Weathered"],
                     st.markdown("")
+from typing import Optional
+def render_kv_grid(d: Optional[dict] = None, ncols: int = 2):
     if d is None:
         d = {}
     if not d:
             st.caption(f"**{k}:** {v}")
+# //////////////////////////////////////////
 def render_model_meta(model_choice: str):
     info = MODEL_CONFIG.get(model_choice, {})
     emoji = info.get("emoji", "")
         st.caption(desc)
+# //////////////////////////////////////////
 def get_confidence_description(logit_margin):
     """Get human-readable confidence description"""
     if logit_margin > 1000:
         return "LOW", "🔴"
+# //////////////////////////////////////////
 def render_sidebar():
     with st.sidebar:
         # Header
         st.header("AI-Driven Polymer Classification")
         st.caption(
+            "Predict polymer degradation (Stable vs Weathered) from Raman/FTIR spectra using validated CNN models. — v0.1"
+        )
+        # Modality Selection
+        st.markdown("##### Spectroscopy Modality")
+        modality = st.selectbox(
+            "Choose Modality",
+            ["raman", "ftir"],
+            index=0,
+            key="modality_select",
+            format_func=lambda x: f"{'Raman' if x == 'raman' else 'FTIR'}",
         )
+        # Display modality info
+        if modality == "ftir":
+            st.info("FTIR mode: 400-4000 cm-1 range with atmospheric correction")
+        else:
+            st.info("Raman mode: 200-4000 cm-1 range with standard preprocessing")
+        # Model selection
+        st.markdown("##### AI Model Selection")
         model_labels = [
             f"{MODEL_CONFIG[name]['emoji']} {name}" for name in MODEL_CONFIG.keys()
         ]
         )
         model_choice = selected_label.split(" ", 1)[1]
+        # Compact metadata directly under dropdown
         render_model_meta(model_choice)
+        # Collapsed info to reduce clutter
         with st.expander("About This App", icon=":material/info:", expanded=False):
             st.markdown(
                 """
             **Purpose**: Classify polymer degradation using AI<br>
             **Input**: Raman spectroscopy .txt files<br>
+            **Models**: CNN architectures for classification<br>
+            **Modalities**: Raman and FTIR spectroscopy support<br>
+            **Features**: Multi-model comparison and analysis<br>
             **Contributors**<br>
             )
+# //////////////////////////////////////////
 def render_input_column():
     st.markdown("##### Data Input")
     )
     # == Input Mode Logic ==
     if mode == "Upload File":
         upload_key = st.session_state["current_upload_key"]
         up = st.file_uploader(
+            "Upload spectrum file (.txt, .csv, .json)",
+            type=["txt", "csv", "json"],
+            help="Upload spectroscopy data: TXT (2-column), CSV (with headers), or JSON format",
             key=upload_key,  # ← versioned key
         )
+        # Process change immediately
         if up is not None:
             raw = up.read()
             text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
+            # Only reparse if it's a different file or source
             if (
                 st.session_state.get("filename") != getattr(up, "name", None)
                 or st.session_state.get("input_source") != "upload"
                 st.session_state["status_type"] = "success"
                 reset_results("New file uploaded")
+    # Batch Upload tab
     elif mode == "Batch Upload":
         st.session_state["batch_mode"] = True
         # Use a versioned key to ensure the file uploader resets properly.
         batch_upload_key = f"batch_upload_{st.session_state['uploader_version']}"
         uploaded_files = st.file_uploader(
+            "Upload multiple spectrum files (.txt, .csv, .json)",
+            type=["txt", "csv", "json"],
             accept_multiple_files=True,
+            help="Upload spectroscopy files in TXT, CSV, or JSON format.",
             key=batch_upload_key,
         )
         if uploaded_files:
             # Use a dictionary to keep only unique files based on name and size
             unique_files = {(file.name, file.size): file for file in uploaded_files}
             unique_file_list = list(unique_files.values())
             # Optionally, inform the user that duplicates were removed
             if num_uploaded > num_unique:
+                st.info(f"{num_uploaded - num_unique} duplicate file(s) were removed.")
             # Use the unique list
             st.session_state["batch_files"] = unique_file_list
                 f"{num_unique} ready for batch analysis"
             )
             st.session_state["status_type"] = "success"
         else:
             st.session_state["batch_files"] = []
             # This check prevents resetting the status if files are already staged
                 )
                 st.session_state["status_type"] = "info"
+    # Sample tab
     elif mode == "Sample Data":
         st.session_state["batch_mode"] = False
         sample_files = get_sample_files()
     else:
         st.info(msg)
     # Safely get model choice from session state
     model_choice = st.session_state.get("model_select", " ").split(" ", 1)[1]
     model = load_model(model_choice)
                 st.error(f"Error processing spectrum data: {e}")
+# //////////////////////////////////////////
 def render_results_column():
         filename = st.session_state.get("filename", "Unknown")
         if all(v is not None for v in [x_raw, y_raw, y_resampled]):
+            # Run inference
             if y_resampled is None:
                 raise ValueError(
                     "y_resampled is None. Ensure spectrum data is properly resampled before proceeding."
                 f"Inference completed in {inference_time:.2f}s, prediction: {prediction}"
             )
+            # Get ground truth
             true_label_idx = label_file(filename)
             true_label_str = (
                 LABEL_MAP.get(true_label_idx, "Unknown")
                 if true_label_idx is not None
                 else "Unknown"
             )
+            # Get prediction
             predicted_class = LABEL_MAP.get(int(prediction), f"Class {int(prediction)}")
             # Enhanced confidence calculation
                 )
                 confidence_desc = confidence_level
             else:
+                # Fallback to legacy method
                 logit_margin = abs(
                     (logits_list[0] - logits_list[1])
                     if logits_list is not None and len(logits_list) >= 2
                 },
             )
+            # Precompute Stats
             model_choice = (
                 st.session_state.get("model_select", "").split(" ", 1)[1]
                 if "model_select" in st.session_state
                 if os.path.exists(model_path)
                 else "N/A"
             )
             start_render = time.time()
                     """,
                         unsafe_allow_html=True,
                     )
                     st.divider()
+                    # METADATA FOOTER
                     st.caption(
                         f"Analyzed with **{st.session_state.get('model_select', 'Unknown')}** in **{inference_time:.2f}s**."
                     )
                 st.markdown("</div>", unsafe_allow_html=True)
             elif active_tab == "Technical":
             """
             )
     else:
+        # Getting Started
         st.markdown(
             """
         ##### How to Get Started
         - 🏭 Quality control in manufacturing
         """
         )
+# //////////////////////////////////////////
+def render_comparison_tab():
+    """Render the multi-model comparison interface"""
+    import streamlit as st
+    import matplotlib.pyplot as plt
+    from models.registry import choices, validate_model_list
+    from utils.results_manager import ResultsManager
+    from core_logic import get_sample_files, run_inference
+    from utils.preprocessing import preprocess_spectrum
+    from utils.multifile import parse_spectrum_data
+    import numpy as np
+    import time
+    st.markdown("### Multi-Model Comparison Analysis")
+    st.markdown(
+        "Compare predictions across different AI models for comprehensive analysis."
+    )
+    # Model selection for comparison
+    st.markdown("##### Select Models for Comparison")
+    available_models = choices()
+    selected_models = st.multiselect(
+        "Choose models to compare",
+        available_models,
+        default=(
+            available_models[:2] if len(available_models) >= 2 else available_models
+        ),
+        help="Select 2 or more models to compare their predictions side-by-side",
+    )
+    if len(selected_models) < 2:
+        st.warning("⚠️ Please select at least 2 models for comparison.")
+    # Input selection for comparison
+    col1, col2 = st.columns([1, 1.5])
+    with col1:
+        st.markdown("###### Input Data")
+        # File upload for comparison
+        comparison_file = st.file_uploader(
+            "Upload spectrum for comparison",
+            type=["txt", "csv", "json"],
+            key="comparison_file_upload",
+            help="Upload a spectrum file to test across all selected models",
+        )
+        # Or select sample data
+        selected_sample = None  # Initialize with a default value
+        sample_files = get_sample_files()
+        if sample_files:
+            sample_options = ["-- Select Sample --"] + [p.name for p in sample_files]
+            selected_sample = st.selectbox(
+                "Or choose sample data", sample_options, key="comparison_sample_select"
+            )
+        # Get modality from session state
+        modality = st.session_state.get("modality_select", "raman")
+        st.info(f"Using {modality.upper()} preprocessing parameters")
+        # Run comparison button
+        run_comparison = st.button(
+            "Run Multi-Model Comparison",
+            type="primary",
+            disabled=not (
+                comparison_file
+                or (sample_files and selected_sample != "-- Select Sample --")
+            ),
+        )
+    with col2:
+        st.markdown("###### Comparison Results")
+        if run_comparison:
+            # Determine input source
+            input_text = None
+            filename = "unknown"
+            if comparison_file:
+                raw = comparison_file.read()
+                input_text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
+                filename = comparison_file.name
+            elif sample_files and selected_sample != "-- Select Sample --":
+                sample_path = next(p for p in sample_files if p.name == selected_sample)
+                with open(sample_path, "r", encoding="utf-8") as f:
+                    input_text = f.read()
+                filename = selected_sample
+            if input_text:
+                try:
+                    # Parse spectrum data
+                    x_raw, y_raw = parse_spectrum_data(
+                        str(input_text), filename or "unknown_filename"
+                    )
+                    # Store results
+                    comparison_results = {}
+                    processing_times = {}
+                    progress_bar = st.progress(0)
+                    status_text = st.empty()
+                    for i, model_name in enumerate(selected_models):
+                        status_text.text(f"Running inference with {model_name}...")
+                        start_time = time.time()
+                        # Preprocess spectrum with modality-specific parameters
+                        _, y_processed = preprocess_spectrum(
+                            x_raw, y_raw, modality=modality, target_len=500
+                        )
+                        # Run inference
+                        prediction, logits_list, probs, inference_time, logits = (
+                            run_inference(y_processed, model_name)
+                        )
+                        processing_time = time.time() - start_time
+                        if prediction is not None:
+                            # Map prediction to class name
+                            class_names = ["Stable", "Weathered"]
+                            predicted_class = (
+                                class_names[int(prediction)]
+                                if prediction < len(class_names)
+                                else f"Class_{prediction}"
+                            )
+                            confidence = (
+                                max(probs)
+                                if probs is not None and len(probs) > 0
+                                else 0.0
+                            )
+                            comparison_results[model_name] = {
+                                "prediction": prediction,
+                                "predicted_class": predicted_class,
+                                "confidence": confidence,
+                                "probs": probs if probs is not None else [],
+                                "logits": (
+                                    logits_list if logits_list is not None else []
+                                ),
+                                "processing_time": processing_time,
+                            }
+                            processing_times[model_name] = processing_time
+                        progress_bar.progress((i + 1) / len(selected_models))
+                    status_text.text("Comparison complete!")
+                    # Display results
+                    if comparison_results:
+                        st.markdown("###### Model Predictions")
+                        # Create comparison table
+                        import pandas as pd
+                        table_data = []
+                        for model_name, result in comparison_results.items():
+                            row = {
+                                "Model": model_name,
+                                "Prediction": result["predicted_class"],
+                                "Confidence": f"{result['confidence']:.3f}",
+                                "Processing Time (s)": f"{result['processing_time']:.3f}",
+                            }
+                            table_data.append(row)
+                        df = pd.DataFrame(table_data)
+                        st.dataframe(df, use_container_width=True)
+                        # Show confidence comparison
+                        st.markdown("##### Confidence Comparison")
+                        conf_col1, conf_col2 = st.columns(2)
+                        with conf_col1:
+                            # Bar chart of confidences
+                            models = list(comparison_results.keys())
+                            confidences = [
+                                comparison_results[m]["confidence"] for m in models
+                            ]
+                            fig, ax = plt.subplots(figsize=(8, 5))
+                            bars = ax.bar(
+                                models,
+                                confidences,
+                                alpha=0.7,
+                                color=["steelblue", "orange", "green", "red"][
+                                    : len(models)
+                                ],
+                            )
+                            ax.set_ylabel("Confidence")
+                            ax.set_title("Model Confidence Comparison")
+                            ax.set_ylim(0, 1)
+                            plt.xticks(rotation=45)
+                            # Add value labels on bars
+                            for bar, conf in zip(bars, confidences):
+                                height = bar.get_height()
+                                ax.text(
+                                    bar.get_x() + bar.get_width() / 2.0,
+                                    height + 0.01,
+                                    f"{conf:.3f}",
+                                    ha="center",
+                                    va="bottom",
+                                )
+                            plt.tight_layout()
+                            st.pyplot(fig)
+                        with conf_col2:
+                            # Agreement analysis
+                            predictions = [
+                                comparison_results[m]["prediction"] for m in models
+                            ]
+                            unique_predictions = set(predictions)
+                            if len(unique_predictions) == 1:
+                                st.success("✅ All models agree on the prediction!")
+                            else:
+                                st.warning("⚠️ Models disagree on the prediction")
+                                # Show prediction distribution
+                                from collections import Counter
+                                pred_counts = Counter(predictions)
+                                st.markdown("**Prediction Distribution:**")
+                                for pred, count in pred_counts.items():
+                                    class_name = (
+                                        ["Stable", "Weathered"][pred]
+                                        if pred < 2
+                                        else f"Class_{pred}"
+                                    )
+                                    percentage = (count / len(predictions)) * 100
+                                    st.write(
+                                        f"- {class_name}: {count}/{len(predictions)} models ({percentage:.1f}%)"
+                                    )
+                        # Performance metrics
+                        st.markdown("##### Performance Metrics")
+                        perf_col1, perf_col2 = st.columns(2)
+                        with perf_col1:
+                            avg_time = np.mean(list(processing_times.values()))
+                            fastest_model = min(
+                                processing_times.keys(),
+                                key=lambda k: processing_times[k],
+                            )
+                            slowest_model = max(
+                                processing_times.keys(),
+                                key=lambda k: processing_times[k],
+                            )
+                            st.metric("Average Processing Time", f"{avg_time:.3f}s")
+                            st.metric(
+                                "Fastest Model",
+                                f"{fastest_model}",
+                                f"{processing_times[fastest_model]:.3f}s",
+                            )
+                            st.metric(
+                                "Slowest Model",
+                                f"{slowest_model}",
+                                f"{processing_times[slowest_model]:.3f}s",
+                            )
+                        with perf_col2:
+                            most_confident = max(
+                                comparison_results.keys(),
+                                key=lambda k: comparison_results[k]["confidence"],
+                            )
+                            least_confident = min(
+                                comparison_results.keys(),
+                                key=lambda k: comparison_results[k]["confidence"],
+                            )
+                            st.metric(
+                                "Most Confident",
+                                f"{most_confident}",
+                                f"{comparison_results[most_confident]['confidence']:.3f}",
+                            )
+                            st.metric(
+                                "Least Confident",
+                                f"{least_confident}",
+                                f"{comparison_results[least_confident]['confidence']:.3f}",
+                            )
+                        # Store results in session state for potential export
+                        st.session_state["last_comparison_results"] = {
+                            "filename": filename,
+                            "modality": modality,
+                            "models": comparison_results,
+                            "summary": {
+                                "agreement": len(unique_predictions) == 1,
+                                "avg_processing_time": avg_time,
+                                "fastest_model": fastest_model,
+                                "most_confident": most_confident,
+                            },
+                        }
+                except Exception as e:
+                    st.error(f"Error during comparison: {str(e)}")
+        # Show recent comparison results if available
+        elif "last_comparison_results" in st.session_state:
+            st.info(
+                "Previous comparison results are available. Upload a new file or select a sample to run a new comparison."
+            )
+    # Show comparison history
+    comparison_stats = ResultsManager.get_comparison_stats()
+    if comparison_stats:
+        st.markdown("#### Comparison History")
+        with st.expander("View detailed comparison statistics", expanded=False):
+            # Show model statistics table
+            stats_data = []
+            for model_name, stats in comparison_stats.items():
+                row = {
+                    "Model": model_name,
+                    "Total Predictions": stats["total_predictions"],
+                    "Avg Confidence": f"{stats['avg_confidence']:.3f}",
+                    "Avg Processing Time": f"{stats['avg_processing_time']:.3f}s",
+                    "Accuracy": (
+                        f"{stats['accuracy']:.3f}"
+                        if stats["accuracy"] is not None
+                        else "N/A"
+                    ),
+                }
+                stats_data.append(row)
+            if stats_data:
+                import pandas as pd
+                stats_df = pd.DataFrame(stats_data)
+                st.dataframe(stats_df, use_container_width=True)
+                # Show agreement matrix if multiple models
+                agreement_matrix = ResultsManager.get_agreement_matrix()
+                if not agreement_matrix.empty and len(agreement_matrix) > 1:
+                    st.markdown("**Model Agreement Matrix**")
+                    st.dataframe(agreement_matrix.round(3), use_container_width=True)
+                    # Plot agreement heatmap
+                    fig, ax = plt.subplots(figsize=(8, 6))
+                    im = ax.imshow(
+                        agreement_matrix.values, cmap="RdYlGn", vmin=0, vmax=1
+                    )
+                    # Add text annotations
+                    for i in range(len(agreement_matrix)):
+                        for j in range(len(agreement_matrix.columns)):
+                            ax.text(
+                                j,
+                                i,
+                                f"{agreement_matrix.iloc[i, j]:.2f}",
+                                ha="center",
+                                va="center",
+                                color="black",
+                            )
+                    ax.set_xticks(range(len(agreement_matrix.columns)))
+                    ax.set_yticks(range(len(agreement_matrix)))
+                    ax.set_xticklabels(agreement_matrix.columns, rotation=45)
+                    ax.set_yticklabels(agreement_matrix.index)
+                    ax.set_title("Model Agreement Matrix")
+                    plt.colorbar(im, ax=ax, label="Agreement Rate")
+                    plt.tight_layout()
+                    st.pyplot(fig)
+        # Export functionality
+        if "last_comparison_results" in st.session_state:
+            st.markdown("##### Export Results")
+            export_col1, export_col2 = st.columns(2)
+            with export_col1:
+                if st.button("📥 Export Comparison (JSON)"):
+                    import json
+                    results = st.session_state["last_comparison_results"]
+                    json_str = json.dumps(results, indent=2, default=str)
+                    st.download_button(
+                        label="Download JSON",
+                        data=json_str,
+                        file_name=f"comparison_{results['filename'].split('.')[0]}.json",
+                        mime="application/json",
+                    )
+            with export_col2:
+                if st.button("📊 Export Full Report"):
+                    report = ResultsManager.export_comparison_report()
+                    st.download_button(
+                        label="Download Full Report",
+                        data=report,
+                        file_name="model_comparison_report.json",
+                        mime="application/json",
+                    )
+# //////////////////////////////////////////
+def render_performance_tab():
+    """Render the performance tracking and analysis tab."""
+    from utils.performance_tracker import display_performance_dashboard
+    display_performance_dashboard()
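
Note on the agreement analysis in `render_comparison_tab`: the logic that decides whether models agree and how predictions are distributed can be separated from the Streamlit calls, which makes it easier to test. The following is a minimal sketch under that assumption. `summarize_agreement` and `CLASS_NAMES` are hypothetical names, not part of this commit; the two class labels and the `Class_{n}` fallback are taken from the diff.

```python
from collections import Counter

# Same two labels the diff hard-codes; anything else falls back to "Class_<n>".
CLASS_NAMES = ["Stable", "Weathered"]


def summarize_agreement(predictions):
    """Return (all_agree, distribution) for a dict of model name -> class index.

    distribution maps a class name to the fraction of models that predicted it.
    """
    counts = Counter(predictions.values())
    total = len(predictions)
    distribution = {}
    for pred, count in counts.items():
        if 0 <= pred < len(CLASS_NAMES):
            name = CLASS_NAMES[pred]
        else:
            name = f"Class_{pred}"
        distribution[name] = count / total
    # Agreement means exactly one distinct prediction across all models.
    return len(counts) == 1, distribution


all_agree, dist = summarize_agreement(
    {"figure2": 0, "resnet": 1, "resnet18vision": 1}
)
print(all_agree, dist)
```

An empty dictionary yields `(False, {})` rather than a division error, since the loop body never runs.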

sample_data/ftir-stable-1.txt ADDED Viewed

	@@ -0,0 +1,75 @@

+# Sample FTIR spectrum data - Stable polymer
+# Wavenumber (cm^-1)  Absorbance
+400.0  0.045
+450.0  0.048
+500.0  0.052
+550.0  0.056
+600.0  0.061
+650.0  0.065
+700.0  0.070
+750.0  0.075
+800.0  0.082
+850.0  0.089
+900.0  0.096
+950.0  0.104
+1000.0  0.112
+1050.0  0.121
+1100.0  0.130
+1150.0  0.140
+1200.0  0.151
+1250.0  0.162
+1300.0  0.174
+1350.0  0.187
+1400.0  0.200
+1450.0  0.215
+1500.0  0.230
+1550.0  0.246
+1600.0  0.263
+1650.0  0.281
+1700.0  0.300
+1750.0  0.320
+1800.0  0.341
+1850.0  0.363
+1900.0  0.386
+1950.0  0.410
+2000.0  0.435
+2050.0  0.461
+2100.0  0.488
+2150.0  0.516
+2200.0  0.545
+2250.0  0.575
+2300.0  0.606
+2350.0  0.638
+2400.0  0.671
+2450.0  0.705
+2500.0  0.740
+2550.0  0.776
+2600.0  0.813
+2650.0  0.851
+2700.0  0.890
+2750.0  0.930
+2800.0  0.971
+2850.0  1.013
+2900.0  1.056
+2950.0  1.100
+3000.0  1.145
+3050.0  1.191
+3100.0  1.238
+3150.0  1.286
+3200.0  1.335
+3250.0  1.385
+3300.0  1.436
+3350.0  1.488
+3400.0  1.541
+3450.0  1.595
+3500.0  1.650
+3550.0  1.706
+3600.0  1.763
+3650.0  1.821
+3700.0  1.880
+3750.0  1.940
+3800.0  2.001
+3850.0  2.063
+3900.0  2.126
+3950.0  2.190
+4000.0  2.255
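
Note on the file layout: the sample above is a plain two-column text file with `#` comment lines, whitespace-separated wavenumber and absorbance values, and no header row. A minimal sketch of reading that layout follows. `parse_two_column` is a hypothetical helper for illustration only and is not the repository's `parse_spectrum_data`; the leading `+` characters shown in the diff are diff markers and are not present in the file itself.

```python
def parse_two_column(text):
    """Parse 'x  y' rows, skipping blank lines and '#' comment lines."""
    xs, ys = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()  # splits on any run of whitespace
        if len(parts) < 2:
            raise ValueError(f"Expected two columns, got: {line!r}")
        xs.append(float(parts[0]))
        ys.append(float(parts[1]))
    return xs, ys


# First rows of sample_data/ftir-stable-1.txt, without the diff markers.
sample = """# Sample FTIR spectrum data - Stable polymer
# Wavenumber (cm^-1)  Absorbance
400.0  0.045
450.0  0.048
500.0  0.052
"""

x, y = parse_two_column(sample)
print(len(x), x[0], y[-1])
```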

sample_data/ftir-weathered-1.txt ADDED Viewed

	@@ -0,0 +1,75 @@

+# Sample FTIR spectrum data - Weathered polymer
+# Wavenumber (cm^-1)  Absorbance
+400.0  0.062
+450.0  0.069
+500.0  0.077
+550.0  0.086
+600.0  0.095
+650.0  0.105
+700.0  0.116
+750.0  0.128
+800.0  0.141
+850.0  0.155
+900.0  0.170
+950.0  0.186
+1000.0  0.203
+1050.0  0.221
+1100.0  0.240
+1150.0  0.260
+1200.0  0.281
+1250.0  0.303
+1300.0  0.326
+1350.0  0.350
+1400.0  0.375
+1450.0  0.401
+1500.0  0.428
+1550.0  0.456
+1600.0  0.485
+1650.0  0.515
+1700.0  0.546
+1750.0  0.578
+1800.0  0.611
+1850.0  0.645
+1900.0  0.680
+1950.0  0.716
+2000.0  0.753
+2050.0  0.791
+2100.0  0.830
+2150.0  0.870
+2200.0  0.911
+2250.0  0.953
+2300.0  0.996
+2350.0  1.040
+2400.0  1.085
+2450.0  1.131
+2500.0  1.178
+2550.0  1.226
+2600.0  1.275
+2650.0  1.325
+2700.0  1.376
+2750.0  1.428
+2800.0  1.481
+2850.0  1.535
+2900.0  1.590
+2950.0  1.646
+3000.0  1.703
+3050.0  1.761
+3100.0  1.820
+3150.0  1.880
+3200.0  1.941
+3250.0  2.003
+3300.0  2.066
+3350.0  2.130
+3400.0  2.195
+3450.0  2.261
+3500.0  2.328
+3550.0  2.396
+3600.0  2.465
+3650.0  2.535
+3700.0  2.606
+3750.0  2.678
+3800.0  2.751
+3850.0  2.825
+3900.0  2.900
+3950.0  2.976
+4000.0  3.053
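
Note on resampling: both FTIR samples contain 73 points on a 50 cm^-1 grid, while the inference code in this commit passes `target_len=500` to `preprocess_spectrum`. How that function resamples is not shown in this part of the diff. The sketch below only illustrates the general idea with plain linear interpolation onto an evenly spaced grid; `resample_linear` is a hypothetical name and should not be read as the repository's implementation.

```python
def resample_linear(x, y, target_len):
    """Linearly interpolate (x, y) onto target_len evenly spaced x values.

    Assumes x is strictly increasing and target_len >= 2.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two (x, y) points of equal length")
    if target_len < 2:
        raise ValueError("target_len must be at least 2")
    step = (x[-1] - x[0]) / (target_len - 1)
    new_x = [x[0] + i * step for i in range(target_len)]
    new_y = []
    j = 0  # index of the left end of the current source segment
    for xv in new_x:
        # Advance to the segment [x[j], x[j + 1]] that contains xv.
        while j < len(x) - 2 and x[j + 1] < xv:
            j += 1
        t = (xv - x[j]) / (x[j + 1] - x[j])
        new_y.append(y[j] + t * (y[j + 1] - y[j]))
    return new_x, new_y


# First three rows of sample_data/ftir-weathered-1.txt, resampled to 5 points.
grid, values = resample_linear([400.0, 450.0, 500.0], [0.062, 0.069, 0.077], 5)
print(grid)
print(values)
```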

sample_data/stable.sample.csv ADDED Viewed

	@@ -0,0 +1,22 @@

+wavenumber,intensity
+200.0,1542.3
+205.0,1543.1
+210.0,1544.8
+215.0,1546.2
+220.0,1547.9
+225.0,1549.1
+230.0,1550.4
+235.0,1551.8
+240.0,1553.2
+245.0,1554.6
+250.0,1556.1
+255.0,1557.6
+260.0,1559.1
+265.0,1560.7
+270.0,1562.3
+275.0,1563.9
+280.0,1565.6
+285.0,1567.3
+290.0,1569.0
+295.0,1570.8
+300.0,1572.6
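
Note on the CSV layout: unlike the two FTIR text files, this sample has a header row (`wavenumber,intensity`) and comma-separated values. A minimal sketch of reading it with the standard library follows. `read_spectrum_csv` is a hypothetical helper, and the column names are taken from the header in the sample; the repository's own loader may accept other headers.

```python
import csv
import io


def read_spectrum_csv(text):
    """Read 'wavenumber,intensity' rows into two lists of floats."""
    reader = csv.DictReader(io.StringIO(text))
    xs, ys = [], []
    for row in reader:
        xs.append(float(row["wavenumber"]))
        ys.append(float(row["intensity"]))
    return xs, ys


# First rows of sample_data/stable.sample.csv, without the diff markers.
sample_csv = """wavenumber,intensity
200.0,1542.3
205.0,1543.1
210.0,1544.8
"""

wavenumbers, intensities = read_spectrum_csv(sample_csv)
print(wavenumbers, intensities)
```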

scripts/run_inference.py CHANGED Viewed

@@ -17,144 +17,447 @@ python scripts/run_inference.py --input ... --arch resnet --weights ... --disabl
 import os
 import sys
 sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))
 import argparse
 import json
 import logging
 from pathlib import Path
-from typing import cast
 from torch import nn
 import numpy as np
 import torch
 import torch.nn.functional as F
-from models.registry import build, choices
 from utils.preprocessing import preprocess_spectrum, TARGET_LENGTH
 from scripts.plot_spectrum import load_spectrum
 from scripts.discover_raman_files import label_file
 def parse_args():
-    p = argparse.ArgumentParser(description="Raman spectrum inference (parity with CLI preprocessing).")
-    p.add_argument("--input", required=True, help="Path to a single Raman .txt file (2 columns: x, y).")
-    p.add_argument("--arch", required=True, choices=choices(), help="Model architecture key.")
-    p.add_argument("--weights", required=True, help="Path to model weights (.pth).")
-    p.add_argument("--target-len", type=int, default=TARGET_LENGTH, help="Resample length (default: 500).")
     # Default = ON; use disable- flags to turn steps off explicitly.
-    p.add_argument("--disable-baseline", action="store_true", help="Disable baseline correction.")
-    p.add_argument("--disable-smooth", action="store_true", help="Disable Savitzky–Golay smoothing.")
-    p.add_argument("--disable-normalize", action="store_true", help="Disable min-max normalization.")
-    p.add_argument("--output", default=None, help="Optional output JSON path (defaults to outputs/inference/<name>.json).")
-    p.add_argument("--device", default="cpu", choices=["cpu", "cuda"], help="Compute device (default: cpu).")
     return p.parse_args()
 def _load_state_dict_safe(path: str):
     """Load a state dict safely across torch versions & checkpoint formats."""
     try:
         obj = torch.load(path, map_location="cpu", weights_only=True)  # newer torch
     except TypeError:
         obj = torch.load(path, map_location="cpu")  # fallback for older torch
     # Accept either a plain state_dict or a checkpoint dict that contains one
     if isinstance(obj, dict):
         for k in ("state_dict", "model_state_dict", "model"):
             if k in obj and isinstance(obj[k], dict):
                 obj = obj[k]
                 break
     if not isinstance(obj, dict):
         raise ValueError(
             "Loaded object is not a state_dict or checkpoint with a state_dict. "
             f"Type={type(obj)} from file={path}"
         )
     # Strip DataParallel 'module.' prefixes if present
     if any(key.startswith("module.") for key in obj.keys()):
         obj = {key.replace("module.", "", 1): val for key, val in obj.items()}
     return obj
-def main():
-    logging.basicConfig(level=logging.INFO, format="INFO: %(message)s")
-    args = parse_args()
-    in_path = Path(args.input)
-    if not in_path.exists():
-        raise FileNotFoundError(f"Input file not found: {in_path}")
-    # --- Load raw spectrum
-    x_raw, y_raw = load_spectrum(str(in_path))
-    if len(x_raw) < 10:
-        raise ValueError("Input spectrum has too few points (<10).")
-    # --- Preprocess (single source of truth)
     _, y_proc = preprocess_spectrum(
-        np.array(x_raw),
-        np.array(y_raw),
         target_len=args.target_len,
         do_baseline=not args.disable_baseline,
         do_smooth=not args.disable_smooth,
         do_normalize=not args.disable_normalize,
         out_dtype="float32",
     )
-    # --- Build model & load weights (safe)
-    device = torch.device(args.device if (args.device == "cuda" and torch.cuda.is_available()) else "cpu")
-    model = cast(nn.Module, build(args.arch, args.target_len)).to(device)
-    state = _load_state_dict_safe(args.weights)
     missing, unexpected = model.load_state_dict(state, strict=False)
     if missing or unexpected:
-        logging.info("Loaded with non-strict keys. missing=%d unexpected=%d", len(missing), len(unexpected))
     model.eval()
-    # Shape: (B, C, L) = (1, 1, target_len)
     x_tensor = torch.from_numpy(y_proc[None, None, :]).to(device)
     with torch.no_grad():
-        logits = model(x_tensor).float().cpu()  # shape (1, num_classes)
         probs = F.softmax(logits, dim=1)
     probs_np = probs.numpy().ravel().tolist()
     logits_np = logits.numpy().ravel().tolist()
     pred_label = int(np.argmax(probs_np))
-    # Optional ground-truth from filename (if encoded)
-    true_label = label_file(str(in_path))
-    # --- Prepare output
-    out_dir = Path("outputs") / "inference"
-    out_dir.mkdir(parents=True, exist_ok=True)
-    out_path = Path(args.output) if args.output else (out_dir / f"{in_path.stem}_{args.arch}.json")
-    result = {
-        "input_file": str(in_path),
-        "arch": args.arch,
-        "weights": str(args.weights),
-        "target_len": args.target_len,
-        "preprocessing": {
-            "baseline": not args.disable_baseline,
-            "smooth": not args.disable_smooth,
-            "normalize": not args.disable_normalize,
-        },
-        "predicted_label": pred_label,
-        "true_label": true_label,
         "probs": probs_np,
         "logits": logits_np,
     }
-    with open(out_path, "w", encoding="utf-8") as f:
-        json.dump(result, f, indent=2)
-    logging.info("Predicted Label: %d  True Label: %s", pred_label, true_label)
-    logging.info("Raw Logits: %s", logits_np)
-    logging.info("Result saved to %s", out_path)
 if __name__ == "__main__":

 import os
 import sys
 sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))
 import argparse
 import json
+import csv
 import logging
 from pathlib import Path
+from typing import cast, Dict, List, Any
 from torch import nn
+import time
 import numpy as np
 import torch
 import torch.nn.functional as F
+from models.registry import build, choices, build_multiple, validate_model_list
 from utils.preprocessing import preprocess_spectrum, TARGET_LENGTH
+from utils.multifile import parse_spectrum_data, detect_file_format
 from scripts.plot_spectrum import load_spectrum
 from scripts.discover_raman_files import label_file
 def parse_args():
+    p = argparse.ArgumentParser(
+        description="Raman/FTIR spectrum inference with multi-model support."
+    )
+    p.add_argument(
+        "--input",
+        required=True,
+        help="Path to spectrum file (.txt, .csv, .json) or directory for batch processing.",
+    )
+    # Model selection - either single or multiple
+    group = p.add_mutually_exclusive_group(required=True)
+    group.add_argument(
+        "--arch", choices=choices(), help="Single model architecture key."
+    )
+    group.add_argument(
+        "--models",
+        help="Comma-separated list of models for comparison (e.g., 'figure2,resnet,resnet18vision').",
+    )
+    p.add_argument(
+        "--weights",
+        help="Path to model weights (.pth). For multi-model, use pattern with {model} placeholder.",
+    )
+    p.add_argument(
+        "--target-len",
+        type=int,
+        default=TARGET_LENGTH,
+        help="Resample length (default: 500).",
+    )
+    # Modality support
+    p.add_argument(
+        "--modality",
+        choices=["raman", "ftir"],
+        default="raman",
+        help="Spectroscopy modality for preprocessing (default: raman).",
+    )
     # Default = ON; use disable- flags to turn steps off explicitly.
+    p.add_argument(
+        "--disable-baseline", action="store_true", help="Disable baseline correction."
+    )
+    p.add_argument(
+        "--disable-smooth",
+        action="store_true",
+        help="Disable Savitzky–Golay smoothing.",
+    )
+    p.add_argument(
+        "--disable-normalize",
+        action="store_true",
+        help="Disable min-max normalization.",
+    )
+    p.add_argument(
+        "--output",
+        default=None,
+        help="Output path - JSON for single file, CSV for multi-model comparison.",
+    )
+    p.add_argument(
+        "--output-format",
+        choices=["json", "csv"],
+        default="json",
+        help="Output format for results.",
+    )
+    p.add_argument(
+        "--device",
+        default="cpu",
+        choices=["cpu", "cuda"],
+        help="Compute device (default: cpu).",
+    )
+    # File format options
+    p.add_argument(
+        "--file-format",
+        choices=["auto", "txt", "csv", "json"],
+        default="auto",
+        help="Input file format (auto-detect by default).",
+    )
     return p.parse_args()
+# /////////////////////////////////////////////////////////
 def _load_state_dict_safe(path: str):
     """Load a state dict safely across torch versions & checkpoint formats."""
     try:
         obj = torch.load(path, map_location="cpu", weights_only=True)  # newer torch
     except TypeError:
         obj = torch.load(path, map_location="cpu")  # fallback for older torch
     # Accept either a plain state_dict or a checkpoint dict that contains one
     if isinstance(obj, dict):
         for k in ("state_dict", "model_state_dict", "model"):
             if k in obj and isinstance(obj[k], dict):
                 obj = obj[k]
                 break
     if not isinstance(obj, dict):
         raise ValueError(
             "Loaded object is not a state_dict or checkpoint with a state_dict. "
             f"Type={type(obj)} from file={path}"
         )
     # Strip DataParallel 'module.' prefixes if present
     if any(key.startswith("module.") for key in obj.keys()):
         obj = {key.replace("module.", "", 1): val for key, val in obj.items()}
     return obj
+# /////////////////////////////////////////////////////////
+def run_single_model_inference(
+    x_raw: np.ndarray,
+    y_raw: np.ndarray,
+    model_name: str,
+    weights_path: str,
+    args: argparse.Namespace,
+    device: torch.device,
+) -> Dict[str, Any]:
+    """Run inference with a single model."""
+    start_time = time.time()
+    # Preprocess spectrum
     _, y_proc = preprocess_spectrum(
+        x_raw,
+        y_raw,
         target_len=args.target_len,
+        modality=args.modality,
         do_baseline=not args.disable_baseline,
         do_smooth=not args.disable_smooth,
         do_normalize=not args.disable_normalize,
         out_dtype="float32",
     )
+    # Build model & load weights
+    model = cast(nn.Module, build(model_name, args.target_len)).to(device)
+    state = _load_state_dict_safe(weights_path)
     missing, unexpected = model.load_state_dict(state, strict=False)
     if missing or unexpected:
+        logging.info(
+            f"Model {model_name}: Loaded with non-strict keys. missing={len(missing)} unexpected={len(unexpected)}"
+        )
     model.eval()
+    # Run inference
     x_tensor = torch.from_numpy(y_proc[None, None, :]).to(device)
     with torch.no_grad():
+        logits = model(x_tensor).float().cpu()
         probs = F.softmax(logits, dim=1)
+    processing_time = time.time() - start_time
     probs_np = probs.numpy().ravel().tolist()
     logits_np = logits.numpy().ravel().tolist()
     pred_label = int(np.argmax(probs_np))
+    # Map prediction to class name
+    class_names = ["Stable", "Weathered"]
+    predicted_class = (
+        class_names[pred_label]
+        if pred_label < len(class_names)
+        else f"Class_{pred_label}"
+    )
+    return {
+        "model": model_name,
+        "prediction": pred_label,
+        "predicted_class": predicted_class,
+        "confidence": max(probs_np),
         "probs": probs_np,
         "logits": logits_np,
+        "processing_time": processing_time,
     }
+# /////////////////////////////////////////////////////////
+def run_multi_model_inference(
+    x_raw: np.ndarray,
+    y_raw: np.ndarray,
+    model_names: List[str],
+    args: argparse.Namespace,
+    device: torch.device,
+) -> Dict[str, Dict[str, Any]]:
+    """Run inference with multiple models for comparison."""
+    results = {}
+    for model_name in model_names:
+        try:
+            # Generate weights path - either use pattern or assume same weights for all
+            if args.weights and "{model}" in args.weights:
+                weights_path = args.weights.format(model=model_name)
+            elif args.weights:
+                weights_path = args.weights
+            else:
+                # Default weights path pattern
+                weights_path = f"outputs/{model_name}_model.pth"
+            if not Path(weights_path).exists():
+                logging.warning(f"Weights not found for {model_name}: {weights_path}")
+                continue
+            result = run_single_model_inference(
+                x_raw, y_raw, model_name, weights_path, args, device
+            )
+            results[model_name] = result
+        except Exception as e:
+            logging.error(f"Failed to run inference with {model_name}: {str(e)}")
+            continue
+    return results
+# /////////////////////////////////////////////////////////
+def save_results(
+    results: Dict[str, Any], output_path: Path, format: str = "json"
+) -> None:
+    """Save results to file in specified format"""
+    output_path.parent.mkdir(parents=True, exist_ok=True)
+    if format == "json":
+        with open(output_path, "w", encoding="utf-8") as f:
+            json.dump(results, f, indent=2)
+    elif format == "csv":
+        # Convert to tabular format for CSV
+        if "models" in results:  # Multi-model results
+            rows = []
+            for model_name, model_result in results["models"].items():
+                row = {
+                    "model": model_name,
+                    "prediction": model_result["prediction"],
+                    "predicted_class": model_result["predicted_class"],
+                    "confidence": model_result["confidence"],
+                    "processing_time": model_result["processing_time"],
+                }
+                # Add individual class probabilities
+                if "probs" in model_result:
+                    for i, prob in enumerate(model_result["probs"]):
+                        row[f"prob_class_{i}"] = prob
+                rows.append(row)
+            # Write CSV
+            with open(output_path, "w", newline="", encoding="utf-8") as f:
+                if rows:
+                    writer = csv.DictWriter(f, fieldnames=rows[0].keys())
+                    writer.writeheader()
+                    writer.writerows(rows)
+        else:  # Single model result
+            with open(output_path, "w", newline="", encoding="utf-8") as f:
+                writer = csv.DictWriter(f, fieldnames=results.keys())
+                writer.writeheader()
+                writer.writerow(results)
+def main():
+    logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
+    args = parse_args()
+    # Input validation
+    in_path = Path(args.input)
+    if not in_path.exists():
+        raise FileNotFoundError(f"Input file not found: {in_path}")
+    # Determine if this is single or multi-model inference
+    if args.models:
+        model_names = [m.strip() for m in args.models.split(",")]
+        model_names = validate_model_list(model_names)
+        if not model_names:
+            raise ValueError(f"No valid models found in: {args.models}")
+        multi_model = True
+    else:
+        model_names = [args.arch]
+        multi_model = False
+    # Load and parse spectrum data
+    if args.file_format == "auto":
+        file_format = None  # Auto-detect
+    else:
+        file_format = args.file_format
+    try:
+        # Read file content
+        with open(in_path, "r", encoding="utf-8") as f:
+            content = f.read()
+        # Parse spectrum data with format detection
+        x_raw, y_raw = parse_spectrum_data(
+            content, str(in_path), file_format=file_format
+        )
+        x_raw = np.array(x_raw, dtype=np.float32)
+        y_raw = np.array(y_raw, dtype=np.float32)
+    except Exception as e:
+        # Fall back to the original loader if the multi-format parser fails
+        logging.warning(
+            f"Failed to parse with new parser, falling back to original: {e}"
+        )
+        x_raw, y_raw = load_spectrum(str(in_path))
+        x_raw = np.array(x_raw, dtype=np.float32)
+        y_raw = np.array(y_raw, dtype=np.float32)
+    if len(x_raw) < 10:
+        raise ValueError("Input spectrum has too few points (<10).")
+    # Setup device
+    device = torch.device(
+        args.device if (args.device == "cuda" and torch.cuda.is_available()) else "cpu"
+    )
+    # Run inference
+    model_results = {}  # Initialize to avoid unbound variable error
+    if multi_model:
+        model_results = run_multi_model_inference(
+            np.array(x_raw, dtype=np.float32),
+            np.array(y_raw, dtype=np.float32),
+            model_names,
+            args,
+            device,
+        )
+        # Get ground truth if available
+        true_label = label_file(str(in_path))
+        # Prepare combined results
+        results = {
+            "input_file": str(in_path),
+            "modality": args.modality,
+            "models": model_results,
+            "true_label": true_label,
+            "preprocessing": {
+                "baseline": not args.disable_baseline,
+                "smooth": not args.disable_smooth,
+                "normalize": not args.disable_normalize,
+                "target_len": args.target_len,
+            },
+            "comparison": {
+                "total_models": len(model_results),
+                "agreements": (
+                    sum(
+                        1
+                        for i, (_, r1) in enumerate(model_results.items())
+                        for j, (_, r2) in enumerate(
+                            list(model_results.items())[i + 1 :]
+                        )
+                        if r1["prediction"] == r2["prediction"]
+                    )
+                    if len(model_results) > 1
+                    else 0
+                ),
+            },
+        }
+        # Default output path for multi-model
+        default_output = (
+            Path("outputs")
+            / "inference"
+            / f"{in_path.stem}_comparison.{args.output_format}"
+        )
+    else:
+        # Single model inference
+        model_result = run_single_model_inference(
+            x_raw, y_raw, model_names[0], args.weights, args, device
+        )
+        true_label = label_file(str(in_path))
+        results = {
+            "input_file": str(in_path),
+            "modality": args.modality,
+            "arch": model_names[0],
+            "weights": str(args.weights),
+            "target_len": args.target_len,
+            "preprocessing": {
+                "baseline": not args.disable_baseline,
+                "smooth": not args.disable_smooth,
+                "normalize": not args.disable_normalize,
+            },
+            "predicted_label": model_result["prediction"],
+            "predicted_class": model_result["predicted_class"],
+            "true_label": true_label,
+            "confidence": model_result["confidence"],
+            "probs": model_result["probs"],
+            "logits": model_result["logits"],
+            "processing_time": model_result["processing_time"],
+        }
+        # Default output path for single model
+        default_output = (
+            Path("outputs")
+            / "inference"
+            / f"{in_path.stem}_{model_names[0]}.{args.output_format}"
+        )
+    # Save results
+    output_path = Path(args.output) if args.output else default_output
+    save_results(results, output_path, args.output_format)
+    # Log summary
+    if multi_model:
+        logging.info(
+            f"Multi-model inference completed with {len(model_results)} models"
+        )
+        for model_name, result in model_results.items():
+            logging.info(
+                f"{model_name}: {result['predicted_class']} (confidence: {result['confidence']:.3f})"
+            )
+        logging.info(f"Results saved to {output_path}")
+    else:
+        logging.info(
+            f"Predicted Label: {results['predicted_label']} ({results['predicted_class']})"
+        )
+        logging.info(f"Confidence: {results['confidence']:.3f}")
+        logging.info(f"True Label: {results['true_label']}")
+        logging.info(f"Result saved to {output_path}")
 if __name__ == "__main__":
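
The `agreements` entry in the multi-model `comparison` block counts unordered pairs of models that predict the same label. A minimal sketch of the same count using `itertools.combinations` is shown below; the helper name `count_pairwise_agreements` and the model names in the example are hypothetical and are not part of the script.

```python
from itertools import combinations
from typing import Any, Dict


def count_pairwise_agreements(model_results: Dict[str, Dict[str, Any]]) -> int:
    """Count unordered model pairs whose predicted labels match."""
    predictions = [r["prediction"] for r in model_results.values()]
    # combinations(..., 2) yields each pair (i < j) exactly once
    return sum(1 for a, b in combinations(predictions, 2) if a == b)


# Hypothetical results for three models (names are placeholders)
example = {
    "model_a": {"prediction": 1},
    "model_b": {"prediction": 1},
    "model_c": {"prediction": 0},
}
agreements = count_pairwise_agreements(example)
print(agreements)
```

With fewer than two models there are no pairs, so the count is zero without a separate length check.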

tests/test_ftir_preprocessing.py ADDED

	@@ -0,0 +1,179 @@

+"""Tests for FTIR preprocessing functionality."""
+import pytest
+import numpy as np
+from utils.preprocessing import (
+    preprocess_spectrum,
+    validate_spectrum_range,
+    get_modality_info,
+    MODALITY_RANGES,
+    MODALITY_PARAMS,
+)
+def test_modality_ranges():
+    """Test that modality ranges are correctly defined."""
+    assert "raman" in MODALITY_RANGES
+    assert "ftir" in MODALITY_RANGES
+    raman_range = MODALITY_RANGES["raman"]
+    ftir_range = MODALITY_RANGES["ftir"]
+    assert raman_range[0] < raman_range[1]  # Valid range
+    assert ftir_range[0] < ftir_range[1]  # Valid range
+    assert ftir_range[0] >= 400  # FTIR starts at 400 cm⁻¹
+    assert ftir_range[1] <= 4000  # FTIR ends at 4000 cm⁻¹
+def test_validate_spectrum_range():
+    """Test spectrum range validation for different modalities."""
+    # Test Raman range validation
+    raman_x = np.linspace(300, 3500, 100)  # Typical Raman range
+    assert validate_spectrum_range(raman_x, "raman") == True
+    # Test FTIR range validation
+    ftir_x = np.linspace(500, 3800, 100)  # Typical FTIR range
+    assert validate_spectrum_range(ftir_x, "ftir") == True
+    # Test out-of-range data
+    out_of_range_x = np.linspace(50, 150, 100)  # Too low for either
+    assert validate_spectrum_range(out_of_range_x, "raman") == False
+    assert validate_spectrum_range(out_of_range_x, "ftir") == False
+def test_ftir_preprocessing():
+    """Test FTIR-specific preprocessing parameters."""
+    # Generate synthetic FTIR spectrum
+    x = np.linspace(400, 4000, 200)  # FTIR range
+    y = np.sin(x / 500) + 0.1 * np.random.randn(len(x)) + 2.0  # Synthetic absorbance
+    # Test FTIR preprocessing
+    x_proc, y_proc = preprocess_spectrum(x, y, modality="ftir", target_len=500)
+    assert x_proc.shape == (500,)
+    assert y_proc.shape == (500,)
+    assert np.all(np.diff(x_proc) > 0)  # Monotonic increasing
+    assert np.min(y_proc) >= 0.0  # Normalized to [0, 1]
+    assert np.max(y_proc) <= 1.0
+def test_raman_preprocessing():
+    """Test Raman-specific preprocessing parameters."""
+    # Generate synthetic Raman spectrum
+    x = np.linspace(200, 3500, 200)  # Raman range
+    y = np.exp(-(((x - 1500) / 200) ** 2)) + 0.05 * np.random.randn(
+        len(x)
+    )  # Gaussian peak
+    # Test Raman preprocessing
+    x_proc, y_proc = preprocess_spectrum(x, y, modality="raman", target_len=500)
+    assert x_proc.shape == (500,)
+    assert y_proc.shape == (500,)
+    assert np.all(np.diff(x_proc) > 0)  # Monotonic increasing
+    assert np.min(y_proc) >= 0.0  # Normalized to [0, 1]
+    assert np.max(y_proc) <= 1.0
+def test_modality_specific_parameters():
+    """Test that different modalities use different default parameters."""
+    x = np.linspace(400, 4000, 200)
+    y = np.sin(x / 500) + 1.0
+    # Test that FTIR uses different window length than Raman
+    ftir_params = MODALITY_PARAMS["ftir"]
+    raman_params = MODALITY_PARAMS["raman"]
+    assert ftir_params["smooth_window"] != raman_params["smooth_window"]
+    # Preprocess with both modalities (should use different parameters)
+    x_raman, y_raman = preprocess_spectrum(x, y, modality="raman")
+    x_ftir, y_ftir = preprocess_spectrum(x, y, modality="ftir")
+    # Results should be slightly different due to different parameters
+    assert not np.allclose(y_raman, y_ftir, rtol=1e-10)
+def test_get_modality_info():
+    """Test modality information retrieval."""
+    raman_info = get_modality_info("raman")
+    ftir_info = get_modality_info("ftir")
+    assert "range" in raman_info
+    assert "params" in raman_info
+    assert "range" in ftir_info
+    assert "params" in ftir_info
+    # Check that ranges match expected values
+    assert raman_info["range"] == MODALITY_RANGES["raman"]
+    assert ftir_info["range"] == MODALITY_RANGES["ftir"]
+    # Check that parameters are present
+    assert "baseline_degree" in raman_info["params"]
+    assert "smooth_window" in ftir_info["params"]
+def test_invalid_modality():
+    """Test handling of invalid modality."""
+    x = np.linspace(1000, 2000, 100)
+    y = np.sin(x / 100)
+    with pytest.raises(ValueError, match="Unsupported modality"):
+        preprocess_spectrum(x, y, modality="invalid")
+    with pytest.raises(ValueError, match="Unknown modality"):
+        validate_spectrum_range(x, "invalid")
+    with pytest.raises(ValueError, match="Unknown modality"):
+        get_modality_info("invalid")
+def test_modality_parameter_override():
+    """Test that modality defaults can be overridden."""
+    x = np.linspace(400, 4000, 100)
+    y = np.sin(x / 500) + 1.0
+    # Override FTIR default window length
+    custom_window = 21  # Different from FTIR default (13)
+    x_proc, y_proc = preprocess_spectrum(
+        x, y, modality="ftir", window_length=custom_window
+    )
+    assert x_proc.shape[0] > 0
+    assert y_proc.shape[0] > 0
+def test_range_validation_warning():
+    """Test that range validation warnings work correctly."""
+    # Create spectrum outside typical FTIR range
+    x_bad = np.linspace(100, 300, 50)  # Too low for FTIR
+    y_bad = np.ones_like(x_bad)
+    # Should still process but with validation disabled
+    x_proc, y_proc = preprocess_spectrum(
+        x_bad, y_bad, modality="ftir", validate_range=False  # Disable validation
+    )
+    assert len(x_proc) > 0
+    assert len(y_proc) > 0
+def test_backwards_compatibility():
+    """Test that old preprocessing calls still work (defaults to Raman)."""
+    x = np.linspace(1000, 2000, 100)
+    y = np.sin(x / 100)
+    # Old style call (should default to Raman)
+    x_old, y_old = preprocess_spectrum(x, y)
+    # New style call with explicit Raman
+    x_new, y_new = preprocess_spectrum(x, y, modality="raman")
+    # Should be identical
+    np.testing.assert_array_equal(x_old, x_new)
+    np.testing.assert_array_equal(y_old, y_new)
+if __name__ == "__main__":
+    pytest.main([__file__])
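
The FTIR and Raman tests above assert that the processed intensities fall within [0, 1]. The normalization code in `utils/preprocessing.py` is not shown in this part of the diff, so the following is only a sketch of one common way to get that property, min-max scaling. The function `minmax_normalize` and its `eps` guard are assumptions for illustration, not the project's implementation.

```python
import numpy as np


def minmax_normalize(y: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale y into [0, 1]; a (near-)flat spectrum maps to all zeros."""
    y = np.asarray(y, dtype=np.float64)
    span = y.max() - y.min()
    if span < eps:
        # Avoid dividing by ~0 when the spectrum has no dynamic range
        return np.zeros_like(y)
    return (y - y.min()) / span


# Synthetic absorbance curve similar to the one in the FTIR test,
# with a seeded generator so the example is reproducible
rng = np.random.default_rng(0)
x = np.linspace(400, 4000, 200)
y = np.sin(x / 500) + 0.1 * rng.standard_normal(x.size) + 2.0
y_norm = minmax_normalize(y)
```

The `ones_like` input in `test_range_validation_warning` is a flat spectrum, which is the case the `eps` guard is meant to cover.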

tests/test_multi_format.py ADDED

	@@ -0,0 +1,218 @@

+"""Tests for multi-format file parsing functionality."""
+import pytest
+import numpy as np
+from utils.multifile import (
+    parse_spectrum_data,
+    detect_file_format,
+    parse_json_spectrum,
+    parse_csv_spectrum,
+    parse_txt_spectrum,
+)
+def test_detect_file_format():
+    """Test automatic file format detection."""
+    # JSON detection
+    json_content = '{"wavenumbers": [1, 2, 3], "intensities": [0.1, 0.2, 0.3]}'
+    assert detect_file_format("test.json", json_content) == "json"
+    # CSV detection
+    csv_content = "wavenumber,intensity\n1000,0.5\n1001,0.6"
+    assert detect_file_format("test.csv", csv_content) == "csv"
+    # TXT detection (default)
+    txt_content = "1000 0.5\n1001 0.6"
+    assert detect_file_format("test.txt", txt_content) == "txt"
+def test_parse_json_spectrum():
+    """Test JSON spectrum parsing."""
+    # Test object format
+    json_content = '{"wavenumbers": [1000, 1001, 1002], "intensities": [0.1, 0.2, 0.3]}'
+    x, y = parse_json_spectrum(json_content)
+    expected_x = np.array([1000, 1001, 1002])
+    expected_y = np.array([0.1, 0.2, 0.3])
+    np.testing.assert_array_equal(x, expected_x)
+    np.testing.assert_array_equal(y, expected_y)
+    # Test alternative key names
+    json_content_alt = '{"x": [1000, 1001, 1002], "y": [0.1, 0.2, 0.3]}'
+    x_alt, y_alt = parse_json_spectrum(json_content_alt)
+    np.testing.assert_array_equal(x_alt, expected_x)
+    np.testing.assert_array_equal(y_alt, expected_y)
+    # Test array of objects format
+    json_array = """[
+        {"wavenumber": 1000, "intensity": 0.1},
+        {"wavenumber": 1001, "intensity": 0.2},
+        {"wavenumber": 1002, "intensity": 0.3}
+    ]"""
+    x_arr, y_arr = parse_json_spectrum(json_array)
+    np.testing.assert_array_equal(x_arr, expected_x)
+    np.testing.assert_array_equal(y_arr, expected_y)
+def test_parse_csv_spectrum():
+    """Test CSV spectrum parsing."""
+    # Test with headers
+    csv_with_headers = """wavenumber,intensity
+1000,0.1
+1001,0.2
+1002,0.3
+1003,0.4
+1004,0.5
+1005,0.6
+1006,0.7
+1007,0.8
+1008,0.9
+1009,1.0
+1010,1.1
+1011,1.2"""
+    x, y = parse_csv_spectrum(csv_with_headers)
+    expected_x = np.array(
+        [1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011]
+    )
+    expected_y = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2])
+    np.testing.assert_array_equal(x, expected_x)
+    np.testing.assert_array_equal(y, expected_y)
+    # Test without headers
+    csv_no_headers = """1000,0.1
+1001,0.2
+1002,0.3
+1003,0.4
+1004,0.5
+1005,0.6
+1006,0.7
+1007,0.8
+1008,0.9
+1009,1.0
+1010,1.1
+1011,1.2"""
+    x_no_h, y_no_h = parse_csv_spectrum(csv_no_headers)
+    np.testing.assert_array_equal(x_no_h, expected_x)
+    np.testing.assert_array_equal(y_no_h, expected_y)
+    # Test semicolon delimiter
+    csv_semicolon = """1000;0.1
+1001;0.2
+1002;0.3
+1003;0.4
+1004;0.5
+1005;0.6
+1006;0.7
+1007;0.8
+1008;0.9
+1009;1.0
+1010;1.1
+1011;1.2"""
+    x_semi, y_semi = parse_csv_spectrum(csv_semicolon)
+    np.testing.assert_array_equal(x_semi, expected_x)
+    np.testing.assert_array_equal(y_semi, expected_y)
+def test_parse_txt_spectrum():
+    """Test TXT spectrum parsing."""
+    txt_content = """# Comment line
+1000 0.1
+1001 0.2
+1002 0.3
+1003 0.4
+1004 0.5
+1005 0.6
+1006 0.7
+1007 0.8
+1008 0.9
+1009 1.0
+1010 1.1
+1011 1.2"""
+    x, y = parse_txt_spectrum(txt_content)
+    expected_x = np.array(
+        [1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011]
+    )
+    expected_y = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2])
+    np.testing.assert_array_equal(x, expected_x)
+    np.testing.assert_array_equal(y, expected_y)
+    # Test comma-separated
+    txt_comma = """1000,0.1
+1001,0.2
+1002,0.3
+1003,0.4
+1004,0.5
+1005,0.6
+1006,0.7
+1007,0.8
+1008,0.9
+1009,1.0
+1010,1.1
+1011,1.2"""
+    x_comma, y_comma = parse_txt_spectrum(txt_comma)
+    np.testing.assert_array_equal(x_comma, expected_x)
+    np.testing.assert_array_equal(y_comma, expected_y)
+def test_parse_spectrum_data_integration():
+    """Test integrated spectrum data parsing with format detection."""
+    # Test automatic format detection and parsing
+    test_cases = [
+        (
+            '{"wavenumbers": [1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011], "intensities": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]}',
+            "test.json",
+        ),
+        (
+            "wavenumber,intensity\n1000,0.1\n1001,0.2\n1002,0.3\n1003,0.4\n1004,0.5\n1005,0.6\n1006,0.7\n1007,0.8\n1008,0.9\n1009,1.0\n1010,1.1\n1011,1.2",
+            "test.csv",
+        ),
+        (
+            "1000 0.1\n1001 0.2\n1002 0.3\n1003 0.4\n1004 0.5\n1005 0.6\n1006 0.7\n1007 0.8\n1008 0.9\n1009 1.0\n1010 1.1\n1011 1.2",
+            "test.txt",
+        ),
+    ]
+    for content, filename in test_cases:
+        x, y = parse_spectrum_data(content, filename)
+        assert len(x) >= 10
+        assert len(y) >= 10
+        assert len(x) == len(y)
+def test_insufficient_data_points():
+    """Test handling of insufficient data points."""
+    # Test with too few points
+    insufficient_data = "1000 0.1\n1001 0.2"  # Only 2 points, need at least 10
+    with pytest.raises(ValueError, match="Insufficient data points"):
+        parse_txt_spectrum(insufficient_data, "test.txt")
+def test_invalid_json():
+    """Test handling of invalid JSON."""
+    invalid_json = (
+        '{"wavenumbers": [1000, 1001], "intensities": [0.1}'  # Missing closing bracket
+    )
+    with pytest.raises(ValueError, match="Invalid JSON format"):
+        parse_json_spectrum(invalid_json)
+def test_empty_file():
+    """Test handling of empty files."""
+    empty_content = ""
+    with pytest.raises(ValueError, match="No data lines found"):
+        parse_txt_spectrum(empty_content, "empty.txt")
+if __name__ == "__main__":
+    pytest.main([__file__])
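
The semicolon-delimited case in `test_parse_csv_spectrum` depends on the delimiter heuristic in `parse_csv_spectrum`, which compares character counts in the first 1024 characters. The sketch below isolates that heuristic; the helper name `guess_delimiter` is hypothetical, and the counting logic is copied from the diff.

```python
import csv
import io


def guess_delimiter(content: str, sample_size: int = 1024) -> str:
    """Pick ';' or tab if it outnumbers ',' in the sample, else ','."""
    sample = content[:sample_size]
    if sample.count(";") > sample.count(","):
        return ";"
    if sample.count("\t") > sample.count(","):
        return "\t"
    return ","


text = "1000;0.1\n1001;0.2\n1002;0.3"
delimiter = guess_delimiter(text)
rows = [
    [float(value) for value in row]
    for row in csv.reader(io.StringIO(text), delimiter=delimiter)
]
```

One limitation of a pure counting rule: files that use decimal commas together with semicolon separators (for example `1000,5;0,1`) contain more commas than semicolons, so they would be read as comma-delimited.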

utils/multifile.py CHANGED

@@ -1,11 +1,16 @@
-"""Multi-file processing utiltities for batch inference.
-Handles multiple file uploads and iterative processing."""
-from typing import List, Dict, Any, Tuple, Optional
 import time
 import streamlit as st
 import numpy as np
 import pandas as pd
 from .preprocessing import resample_spectrum
 from .errors import ErrorHandler, safe_execute
@@ -13,83 +18,230 @@ from .results_manager import ResultsManager
 from .confidence import calculate_softmax_confidence
-def parse_spectrum_data(
-    text_content: str, filename: str = "unknown"
-) -> Tuple[np.ndarray, np.ndarray]:
-    """
-    Parse spectrum data from text content
     Args:
-        text_content: Raw text content of the spectrum file
-        filename: Name of the file for error reporting
     Returns:
-        Tuple of (x_values, y_values) as numpy arrays
-    Raises:
-        ValueError: If the data cannot be parsed
     """
-    try:
-        lines = text_content.strip().split("\n")
-        # ==Remove empty lines and comments==
-        data_lines = []
-        for line in lines:
-            line = line.strip()
-            if line and not line.startswith("#") and not line.startswith("%"):
-                data_lines.append(line)
-        if not data_lines:
-            raise ValueError("No data lines found in file")
-        # ==Try to parse==
-        x_vals, y_vals = [], []
-        for i, line in enumerate(data_lines):
-            try:
-                # Handle different separators
-                parts = line.replace(",", " ").split()
-                numbers = [
-                    p
-                    for p in parts
-                    if p.replace(".", "", 1)
-                    .replace("-", "", 1)
-                    .replace("+", "", 1)
-                    .isdigit()
-                ]
-                if len(numbers) >= 2:
-                    x_val = float(numbers[0])
-                    y_val = float(numbers[1])
                     x_vals.append(x_val)
                     y_vals.append(y_val)
             except ValueError:
                 ErrorHandler.log_warning(
-                    f"Could not parse line {i+1}: {line}", f"Parsing {filename}"
                 )
                 continue
-        if len(x_vals) < 10:  # ==Need minimum points for interpolation==
             raise ValueError(
                 f"Insufficient data points ({len(x_vals)}). Need at least 10 points."
             )
-        x = np.array(x_vals)
-        y = np.array(y_vals)
-        # Check for NaNs
-        if np.any(np.isnan(x)) or np.any(np.isnan(y)):
-            raise ValueError("Input data contains NaN values")
-        # Check monotonic increasing x
-        if not np.all(np.diff(x) > 0):
-            raise ValueError("Wavenumbers must be strictly increasing")
-        # Check reasonable range for Raman spectroscopy
-        if min(x) < 0 or max(x) > 10000 or (max(x) - min(x)) < 100:
-            raise ValueError(
-                f"Invalid wavenumber range: {min(x)} - {max(x)}. Expected ~400-4000 cm⁻¹ with span >100"
-            )
         return x, y
@@ -97,6 +249,95 @@ def parse_spectrum_data(
         raise ValueError(f"Failed to parse spectrum data: {str(e)}")
 def process_single_file(
     filename: str,
     text_content: str,

+"""Multi-file processing utilities for batch inference.
+Handles multiple file uploads and iterative processing.
+Supports TXT, CSV, and JSON file formats with automatic detection."""
+from typing import List, Dict, Any, Tuple, Optional, Union
 import time
 import streamlit as st
 import numpy as np
 import pandas as pd
+import json
+import csv
+import io
+from pathlib import Path
 from .preprocessing import resample_spectrum
 from .errors import ErrorHandler, safe_execute
 from .confidence import calculate_softmax_confidence
+def detect_file_format(filename: str, content: str) -> str:
+    """Automatically detect file format based on extension and content.
     Args:
+        filename: Name of the file
+        content: Content of the file
     Returns:
+        File format: 'txt', 'csv', or 'json'
     """
+    # First try by extension
+    suffix = Path(filename).suffix.lower()
+    if suffix == ".json":
+        try:
+            json.loads(content)
+            return "json"
+        except json.JSONDecodeError:
+            pass
+    elif suffix == ".csv":
+        return "csv"
+    elif suffix == ".txt":
+        return "txt"
+    # If extension doesn't match or is unclear, try content detection
+    content_stripped = content.strip()
+    # Try JSON
+    if content_stripped.startswith(("{", "[")):
+        try:
+            json.loads(content)
+            return "json"
+        except json.JSONDecodeError:
+            pass
+    # Try CSV (look for commas in first few lines)
+    lines = content_stripped.split("\n")[:5]
+    comma_count = sum(line.count(",") for line in lines)
+    if comma_count > len(lines):  # More commas than lines suggests CSV
+        return "csv"
+    # Default to TXT
+    return "txt"
+# /////////////////////////////////////////////////////
+def parse_json_spectrum(
+    content: str, filename: str = "unknown"
+) -> Tuple[np.ndarray, np.ndarray]:
+    """
+    Parse spectrum data from JSON format.
+    Expected formats:
+    - {"wavenumbers": [...], "intensities": [...]}
+    - {"x": [...], "y": [...]}
+    - [{"wavenumber": val, "intensity": val}, ...]
+    """
+    try:
+        data = json.loads(content)  # content is a str, so loads() rather than load()
+        # Format 1: Object with arrays
+        if isinstance(data, dict):
+            x_key = None
+            y_key = None
+            # Try common key names for x-axis
+            for key in ["wavenumbers", "wavenumber", "x", "freq", "frequency"]:
+                if key in data:
+                    x_key = key
+                    break
+            # Try common key names for y-axis
+            for key in ["intensities", "intensity", "y", "counts", "absorbance"]:
+                if key in data:
+                    y_key = key
+                    break
+            if x_key and y_key:
+                x_vals = np.array(data[x_key], dtype=float)
+                y_vals = np.array(data[y_key], dtype=float)
+                return x_vals, y_vals
+        # Format 2: Array of objects
+        elif isinstance(data, list) and len(data) > 0 and isinstance(data[0], dict):
+            x_vals = []
+            y_vals = []
+            for item in data:
+                # Try to find x and y values
+                x_val = None
+                y_val = None
+                for x_key in ["wavenumber", "wavenumbers", "x", "freq"]:
+                    if x_key in item:
+                        x_val = float(item[x_key])
+                        break
+                for y_key in ["intensity", "intensities", "y", "counts"]:
+                    if y_key in item:
+                        y_val = float(item[y_key])
+                        break
+                if x_val is not None and y_val is not None:
                     x_vals.append(x_val)
                     y_vals.append(y_val)
+            if x_vals and y_vals:
+                return np.array(x_vals), np.array(y_vals)
+        raise ValueError(
+            "JSON format not recognized. Expected wavenumber/intensity pairs."
+        )
+    except json.JSONDecodeError as e:
+        raise ValueError(f"Invalid JSON format: {str(e)}")
+    except Exception as e:
+        raise ValueError(f"Failed to parse JSON spectrum: {str(e)}")
+# /////////////////////////////////////////////////////
+def parse_csv_spectrum(
+    content: str, filename: str = "unknown"
+) -> Tuple[np.ndarray, np.ndarray]:
+    """
+    Parse spectrum data from CSV format.
+    Handles various CSV formats with headers or without.
+    """
+    try:
+        # Use StringIO to treat string as file-like object
+        csv_file = io.StringIO(content)
+        # Try to detect delimiter
+        sample = content[:1024]
+        delimiter = ","
+        if sample.count(";") > sample.count(","):
+            delimiter = ";"
+        elif sample.count("\t") > sample.count(","):
+            delimiter = "\t"
+        # Read CSV
+        csv_reader = csv.reader(csv_file, delimiter=delimiter)
+        rows = list(csv_reader)
+        if not rows:
+            raise ValueError("Empty CSV file")
+        # Check if first row is header
+        has_header = False
+        try:
+            # If first row contains non-numeric data, it's likely a header
+            float(rows[0][0])
+            float(rows[0][1])
+        except (ValueError, IndexError):
+            has_header = True
+        data_rows = rows[1:] if has_header else rows
+        # Extract x and y values
+        x_vals = []
+        y_vals = []
+        for i, row in enumerate(data_rows):
+            if len(row) < 2:
+                continue
+            try:
+                x_val = float(row[0])
+                y_val = float(row[1])
+                x_vals.append(x_val)
+                y_vals.append(y_val)
             except ValueError:
                 ErrorHandler.log_warning(
+                    f"Could not parse CSV row {i+1}: {row}", f"Parsing {filename}"
                 )
                 continue
+        if len(x_vals) < 10:
             raise ValueError(
                 f"Insufficient data points ({len(x_vals)}). Need at least 10 points."
             )
+        return np.array(x_vals), np.array(y_vals)
+    except Exception as e:
+        raise ValueError(f"Failed to parse CSV spectrum: {str(e)}")
+# /////////////////////////////////////////////////////
+def parse_spectrum_data(
+    text_content: str, filename: str = "unknown", file_format: Optional[str] = None
+) -> Tuple[np.ndarray, np.ndarray]:
+    """
+    Parse spectrum data from text content with automatic format detection.
+    Args:
+        text_content: Raw text content of the spectrum file
+        filename: Name of the file for error reporting
+        file_format: Force specific format ('txt', 'csv', 'json') or None for auto-detection
+    Returns:
+        Tuple of (x_values, y_values) as numpy arrays
+    Raises:
+        ValueError: If the data cannot be parsed
+    """
+    try:
+        # Detect format if not specified
+        if file_format is None:
+            file_format = detect_file_format(filename, text_content)
+        # Parse based on detected/specified format
+        if file_format == "json":
+            x, y = parse_json_spectrum(text_content, filename)
+        elif file_format == "csv":
+            x, y = parse_csv_spectrum(text_content, filename)
+        else:  # Default to TXT format
+            x, y = parse_txt_spectrum(text_content, filename)
+        # Common validation for all formats
+        validate_spectrum_data(x, y, filename)
         return x, y
     except Exception as e:
         raise ValueError(f"Failed to parse spectrum data: {str(e)}")
+# /////////////////////////////////////////////////////
+def parse_txt_spectrum(
+    content: str, filename: str = "unknown"
+) -> Tuple[np.ndarray, np.ndarray]:
+    """
+    Parse spectrum data from TXT format (original implementation).
+    """
+    lines = content.strip().split("\n")
+    # ==Remove empty lines and comments==
+    data_lines = []
+    for line in lines:
+        line = line.strip()
+        if line and not line.startswith("#") and not line.startswith("%"):
+            data_lines.append(line)
+    if not data_lines:
+        raise ValueError("No data lines found in file")
+    # ==Try to parse==
+    x_vals, y_vals = [], []
+    for i, line in enumerate(data_lines):
+        try:
+            # Handle different separators
+            parts = line.replace(",", " ").split()
+            numbers = [
+                p
+                for p in parts
+                if p.replace(".", "", 1)
+                .replace("-", "", 1)
+                .replace("+", "", 1)
+                .isdigit()
+            ]
+            if len(numbers) >= 2:
+                x_val = float(numbers[0])
+                y_val = float(numbers[1])
+                x_vals.append(x_val)
+                y_vals.append(y_val)
+        except ValueError:
+            ErrorHandler.log_warning(
+                f"Could not parse line {i+1}: {line}", f"Parsing {filename}"
+            )
+            continue
+    if len(x_vals) < 10:  # ==Need minimum points for interpolation==
+        raise ValueError(
+            f"Insufficient data points ({len(x_vals)}). Need at least 10 points."
+        )
+    return np.array(x_vals), np.array(y_vals)
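Review note: the digit-based token filter in `parse_txt_spectrum` only admits plain decimals, so a value written in scientific notation such as `1.5e-3` is silently dropped from its line. A minimal sketch of an alternative, assuming it is acceptable to let `float()` decide what counts as a number. The helper name `parse_numeric_tokens` is hypothetical and not part of this PR.

```python
from typing import List


def parse_numeric_tokens(line: str) -> List[float]:
    """Return every token in `line` that float() accepts, in order."""
    values: List[float] = []
    for token in line.replace(",", " ").split():
        try:
            values.append(float(token))
        except ValueError:
            # Non-numeric token (label, unit, stray text): skip it.
            continue
    return values
```

Note that `float()` also accepts tokens such as `nan` and `inf`, so a finiteness check is still needed downstream.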
+# /////////////////////////////////////////////////////
+def validate_spectrum_data(x: np.ndarray, y: np.ndarray, filename: str) -> None:
+    """
+    Validate parsed spectrum data for common issues.
+    """
+    # Check for NaNs
+    if np.any(np.isnan(x)) or np.any(np.isnan(y)):
+        raise ValueError("Input data contains NaN values")
+    # Check monotonic increasing x (sort if needed)
+    if not np.all(np.diff(x) >= 0):
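+        # Sort by x values if not monotonic. NOTE: this rebinds the local names only;
+        # the caller's arrays stay unsorted because this function returns None.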
+        sort_idx = np.argsort(x)
+        x = x[sort_idx]
+        y = y[sort_idx]
+        ErrorHandler.log_warning(
+            "Wavenumbers were not monotonic - data has been sorted",
+            f"Parsing {filename}",
+        )
+    # Check reasonable range for spectroscopy
+    if min(x) < 0 or max(x) > 10000 or (max(x) - min(x)) < 100:
+        ErrorHandler.log_warning(
+            f"Unusual wavenumber range: {min(x):.1f} - {max(x):.1f} cm⁻¹",
+            f"Parsing {filename}",
+        )
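Review note: `validate_spectrum_data` assigns sorted copies to its local `x` and `y` but returns `None`, so `parse_spectrum_data` still returns the unsorted arrays even though the warning says the data has been sorted. One possible fix is to return the sorted pair and have the caller reassign. A dependency-free sketch of the idea, using plain lists in place of NumPy arrays; `sort_by_wavenumber` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def sort_by_wavenumber(
    x: Sequence[float], y: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return copies of x and y ordered by ascending x."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # Compute the permutation once and apply it to both sequences.
    order = sorted(range(len(x)), key=lambda i: x[i])
    return [x[i] for i in order], [y[i] for i in order]
```

With NumPy arrays the same effect comes from `idx = np.argsort(x)` followed by returning `x[idx], y[idx]` to the caller.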
+# /////////////////////////////////////////////////////
 def process_single_file(
     filename: str,
     text_content: str,

utils/performance_tracker.py ADDED

	@@ -0,0 +1,404 @@

+"""Performance tracking and logging utilities for POLYMEROS platform."""
+import time
+import json
+import sqlite3
+from datetime import datetime
+from pathlib import Path
+from typing import Dict, List, Any, Optional
+import numpy as np
+import matplotlib.pyplot as plt
+import streamlit as st
+from dataclasses import dataclass, asdict
+from contextlib import contextmanager
+@dataclass
+class PerformanceMetrics:
+    """Data class for performance metrics."""
+    model_name: str
+    prediction_time: float
+    preprocessing_time: float
+    total_time: float
+    memory_usage_mb: float
+    accuracy: Optional[float]
+    confidence: float
+    timestamp: str
+    input_size: int
+    modality: str
+    def to_dict(self) -> Dict[str, Any]:
+        return asdict(self)
+class PerformanceTracker:
+    """Automatic performance tracking and logging system."""
+    def __init__(self, db_path: str = "outputs/performance_tracking.db"):
+        self.db_path = Path(db_path)
+        self.db_path.parent.mkdir(parents=True, exist_ok=True)
+        self._init_database()
+    def _init_database(self):
+        """Initialize SQLite database for performance tracking."""
+        with sqlite3.connect(self.db_path) as conn:
+            conn.execute(
+                """
+                CREATE TABLE IF NOT EXISTS performance_metrics (
+                    id INTEGER PRIMARY KEY AUTOINCREMENT,
+                    model_name TEXT NOT NULL,
+                    prediction_time REAL NOT NULL,
+                    preprocessing_time REAL NOT NULL,
+                    total_time REAL NOT NULL,
+                    memory_usage_mb REAL,
+                    accuracy REAL,
+                    confidence REAL NOT NULL,
+                    timestamp TEXT NOT NULL,
+                    input_size INTEGER NOT NULL,
+                    modality TEXT NOT NULL
+                )
+            """
+            )
+            conn.commit()
+    def log_performance(self, metrics: PerformanceMetrics):
+        """Log performance metrics to database."""
+        with sqlite3.connect(self.db_path) as conn:
+            conn.execute(
+                """
+                INSERT INTO performance_metrics
+                (model_name, prediction_time, preprocessing_time, total_time,
+                 memory_usage_mb, accuracy, confidence, timestamp, input_size, modality)
+                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+            """,
+                (
+                    metrics.model_name,
+                    metrics.prediction_time,
+                    metrics.preprocessing_time,
+                    metrics.total_time,
+                    metrics.memory_usage_mb,
+                    metrics.accuracy,
+                    metrics.confidence,
+                    metrics.timestamp,
+                    metrics.input_size,
+                    metrics.modality,
+                ),
+            )
+            conn.commit()
+    @contextmanager
+    def track_inference(self, model_name: str, modality: str = "raman"):
+        """Context manager for automatic performance tracking."""
+        start_time = time.time()
+        start_memory = self._get_memory_usage()
+        tracking_data = {
+            "model_name": model_name,
+            "modality": modality,
+            "start_time": start_time,
+            "start_memory": start_memory,
+            "preprocessing_time": 0.0,
+        }
+        try:
+            yield tracking_data
+        finally:
+            end_time = time.time()
+            end_memory = self._get_memory_usage()
+            total_time = end_time - start_time
+            memory_usage = max(end_memory - start_memory, 0)
+            # Create metrics object if not provided
+            if "metrics" not in tracking_data:
+                metrics = PerformanceMetrics(
+                    model_name=model_name,
+                    prediction_time=tracking_data.get("prediction_time", total_time),
+                    preprocessing_time=tracking_data.get("preprocessing_time", 0.0),
+                    total_time=total_time,
+                    memory_usage_mb=memory_usage,
+                    accuracy=tracking_data.get("accuracy"),
+                    confidence=tracking_data.get("confidence", 0.0),
+                    timestamp=datetime.now().isoformat(),
+                    input_size=tracking_data.get("input_size", 0),
+                    modality=modality,
+                )
+                self.log_performance(metrics)
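Usage sketch: `track_inference` yields a plain dict, and whatever the caller writes into it before the block exits ends up in the logged row. The stand-in below mirrors that pattern without the database so it runs on its own. `track` and `records` are hypothetical names, and `time.perf_counter()` is used in place of `time.time()` because it is monotonic and better suited to measuring short durations.

```python
import time
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List

records: List[Dict[str, Any]] = []  # stand-in for the SQLite table


@contextmanager
def track(model_name: str, modality: str = "raman") -> Iterator[Dict[str, Any]]:
    """Yield a mutable dict and record timing when the block exits."""
    data: Dict[str, Any] = {"model_name": model_name, "modality": modality}
    start = time.perf_counter()
    try:
        yield data
    finally:
        # Runs even if the body raises, so failed inferences are recorded too.
        data["total_time"] = time.perf_counter() - start
        data.setdefault("confidence", 0.0)
        records.append(data)


with track("demo_model") as t:
    t["input_size"] = 500  # filled in by the caller during inference
    t["confidence"] = 0.93
```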
+    def _get_memory_usage(self) -> float:
+        """Get current memory usage in MB."""
+        try:
+            import psutil
+            process = psutil.Process()
+            return process.memory_info().rss / 1024 / 1024  # Convert to MB
+        except ImportError:
+            return 0.0  # psutil not available
+    def get_recent_metrics(self, limit: int = 100) -> List[Dict[str, Any]]:
+        """Get recent performance metrics."""
+        with sqlite3.connect(self.db_path) as conn:
+            conn.row_factory = sqlite3.Row  # Enable column access by name
+            cursor = conn.execute(
+                """
+                SELECT * FROM performance_metrics
+                ORDER BY timestamp DESC
+                LIMIT ?
+            """,
+                (limit,),
+            )
+            return [dict(row) for row in cursor.fetchall()]
+    def get_model_statistics(self, model_name: Optional[str] = None) -> Dict[str, Any]:
+        """Get statistical summary of model performance."""
+        where_clause = "WHERE model_name = ?" if model_name else ""
+        params = (model_name,) if model_name else ()
+        with sqlite3.connect(self.db_path) as conn:
+            cursor = conn.execute(
+                f"""
+                SELECT
+                    model_name,
+                    COUNT(*) as total_inferences,
+                    AVG(prediction_time) as avg_prediction_time,
+                    AVG(preprocessing_time) as avg_preprocessing_time,
+                    AVG(total_time) as avg_total_time,
+                    AVG(memory_usage_mb) as avg_memory_usage,
+                    AVG(confidence) as avg_confidence,
+                    MIN(total_time) as fastest_inference,
+                    MAX(total_time) as slowest_inference
+                FROM performance_metrics
+                {where_clause}
+                GROUP BY model_name
+            """,
+                params,
+            )
+            results = cursor.fetchall()
+            if model_name and results:
+                # Return single model stats as dict
+                row = results[0]
+                return {
+                    "model_name": row[0],
+                    "total_inferences": row[1],
+                    "avg_prediction_time": row[2],
+                    "avg_preprocessing_time": row[3],
+                    "avg_total_time": row[4],
+                    "avg_memory_usage": row[5],
+                    "avg_confidence": row[6],
+                    "fastest_inference": row[7],
+                    "slowest_inference": row[8],
+                }
+            elif not model_name:
+                # Return all models stats as dict of dicts
+                return {
+                    row[0]: {
+                        "model_name": row[0],
+                        "total_inferences": row[1],
+                        "avg_prediction_time": row[2],
+                        "avg_preprocessing_time": row[3],
+                        "avg_total_time": row[4],
+                        "avg_memory_usage": row[5],
+                        "avg_confidence": row[6],
+                        "fastest_inference": row[7],
+                        "slowest_inference": row[8],
+                    }
+                    for row in results
+                }
+            else:
+                return {}
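Review note: the two branches of `get_model_statistics` build the same nine-key dict by positional index, which is easy to break if the SELECT list changes. A minimal sketch of one alternative, assuming the column aliases are the desired keys: set `row_factory = sqlite3.Row` (as `get_recent_metrics` already does) and convert each row with `dict(row)`. The in-memory table and the reduced column set are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become addressable by column name
conn.execute("CREATE TABLE performance_metrics (model_name TEXT, total_time REAL)")
conn.executemany(
    "INSERT INTO performance_metrics VALUES (?, ?)",
    [("a", 0.25), ("a", 0.75), ("b", 0.5)],
)

rows = conn.execute(
    """
    SELECT model_name,
           COUNT(*) AS total_inferences,
           AVG(total_time) AS avg_total_time
    FROM performance_metrics
    GROUP BY model_name
    """
).fetchall()

# One dict per model, keyed by model name.
stats = {row["model_name"]: dict(row) for row in rows}
conn.close()
```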
+    def create_performance_visualization(self) -> Optional[plt.Figure]:
+        """Create performance visualization charts."""
+        metrics = self.get_recent_metrics(50)
+        if not metrics:
+            return None
+        fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(12, 8))
+        # Convert to convenient format
+        models = [m["model_name"] for m in metrics]
+        times = [m["total_time"] for m in metrics]
+        confidences = [m["confidence"] for m in metrics]
+        timestamps = [datetime.fromisoformat(m["timestamp"]) for m in metrics]
+        # 1. Inference Time Over Time
+        ax1.plot(timestamps, times, "o-", alpha=0.7)
+        ax1.set_title("Inference Time Over Time")
+        ax1.set_ylabel("Time (seconds)")
+        ax1.tick_params(axis="x", rotation=45)
+        # 2. Performance by Model
+        model_stats = self.get_model_statistics()
+        if model_stats:
+            model_names = list(model_stats.keys())
+            avg_times = [model_stats[m]["avg_total_time"] for m in model_names]
+            ax2.bar(model_names, avg_times, alpha=0.7)
+            ax2.set_title("Average Inference Time by Model")
+            ax2.set_ylabel("Time (seconds)")
+            ax2.tick_params(axis="x", rotation=45)
+        # 3. Confidence Distribution
+        ax3.hist(confidences, bins=20, alpha=0.7)
+        ax3.set_title("Confidence Score Distribution")
+        ax3.set_xlabel("Confidence")
+        ax3.set_ylabel("Frequency")
+        # 4. Memory Usage if available
+        memory_usage = [
+            m["memory_usage_mb"] for m in metrics if m["memory_usage_mb"] is not None
+        ]
+        if memory_usage:
+            ax4.plot(range(len(memory_usage)), memory_usage, "o-", alpha=0.7)
+            ax4.set_title("Memory Usage")
+            ax4.set_xlabel("Inference Number")
+            ax4.set_ylabel("Memory (MB)")
+        else:
+            ax4.text(
+                0.5,
+                0.5,
+                "Memory tracking\nnot available",
+                ha="center",
+                va="center",
+                transform=ax4.transAxes,
+            )
+            ax4.set_title("Memory Usage")
+        plt.tight_layout()
+        return fig
+    def export_metrics(self, format: str = "json") -> str:
+        """Export performance metrics in specified format."""
+        metrics = self.get_recent_metrics(1000)  # Get more for export
+        if format == "json":
+            return json.dumps(metrics, indent=2, default=str)
+        elif format == "csv":
+            import pandas as pd
+            df = pd.DataFrame(metrics)
+            return df.to_csv(index=False)
+        else:
+            raise ValueError(f"Unsupported format: {format}")
+# Global tracker instance
+_tracker = None
+def get_performance_tracker() -> PerformanceTracker:
+    """Get global performance tracker instance."""
+    global _tracker
+    if _tracker is None:
+        _tracker = PerformanceTracker()
+    return _tracker
+def display_performance_dashboard():
+    """Display performance tracking dashboard in Streamlit."""
+    tracker = get_performance_tracker()
+    st.markdown("### 📈 Performance Dashboard")
+    # Recent metrics summary
+    recent_metrics = tracker.get_recent_metrics(20)
+    if not recent_metrics:
+        st.info(
+            "No performance data available yet. Run some inferences to see metrics."
+        )
+        return
+    # Summary statistics
+    col1, col2, col3, col4 = st.columns(4)
+    total_inferences = len(recent_metrics)
+    avg_time = np.mean([m["total_time"] for m in recent_metrics])
+    avg_confidence = np.mean([m["confidence"] for m in recent_metrics])
+    unique_models = len(set(m["model_name"] for m in recent_metrics))
+    with col1:
+        st.metric("Total Inferences", total_inferences)
+    with col2:
+        st.metric("Avg Time", f"{avg_time:.3f}s")
+    with col3:
+        st.metric("Avg Confidence", f"{avg_confidence:.3f}")
+    with col4:
+        st.metric("Models Used", unique_models)
+    # Performance visualization
+    fig = tracker.create_performance_visualization()
+    if fig:
+        st.pyplot(fig)
+    # Model comparison table
+    st.markdown("#### Model Performance Comparison")
+    model_stats = tracker.get_model_statistics()
+    if model_stats:
+        import pandas as pd
+        stats_data = []
+        for model_name, stats in model_stats.items():
+            stats_data.append(
+                {
+                    "Model": model_name,
+                    "Total Inferences": stats["total_inferences"],
+                    "Avg Time (s)": f"{stats['avg_total_time']:.3f}",
+                    "Avg Confidence": f"{stats['avg_confidence']:.3f}",
+                    "Fastest (s)": f"{stats['fastest_inference']:.3f}",
+                    "Slowest (s)": f"{stats['slowest_inference']:.3f}",
+                }
+            )
+        df = pd.DataFrame(stats_data)
+        st.dataframe(df, use_container_width=True)
+    # Export options
+    with st.expander("📥 Export Performance Data"):
+        col1, col2 = st.columns(2)
+        with col1:
+            if st.button("Export JSON"):
+                json_data = tracker.export_metrics("json")
+                st.download_button(
+                    "Download JSON",
+                    json_data,
+                    "performance_metrics.json",
+                    "application/json",
+                )
+        with col2:
+            if st.button("Export CSV"):
+                csv_data = tracker.export_metrics("csv")
+                st.download_button(
+                    "Download CSV", csv_data, "performance_metrics.csv", "text/csv"
+                )
+if __name__ == "__main__":
+    # Test the performance tracker
+    tracker = PerformanceTracker()
+    # Simulate some metrics
+    for i in range(5):
+        metrics = PerformanceMetrics(
+            model_name=f"test_model_{i%2}",
+            prediction_time=0.1 + i * 0.01,
+            preprocessing_time=0.05,
+            total_time=0.15 + i * 0.01,
+            memory_usage_mb=100 + i * 10,
+            accuracy=0.8 + i * 0.02,
+            confidence=0.7 + i * 0.05,
+            timestamp=datetime.now().isoformat(),
+            input_size=500,
+            modality="raman",
+        )
+        tracker.log_performance(metrics)
+    print("Performance tracking test completed!")
+    print(f"Recent metrics: {len(tracker.get_recent_metrics())}")
+    print(f"Model stats: {tracker.get_model_statistics()}")

utils/preprocessing.py CHANGED

@@ -1,6 +1,7 @@
 """
 Preprocessing utilities for polymer classification app.
 Adapted from the original scripts/preprocess_dataset.py for Hugging Face Spaces deployment.
 """
 from __future__ import annotations
@@ -9,8 +10,33 @@ from numpy.typing import DTypeLike
 from scipy.interpolate import interp1d
 from scipy.signal import savgol_filter
 from scipy.interpolate import interp1d
-TARGET_LENGTH = 500     # Frozen default per PREPROCESSING_BASELINE
 def _ensure_1d_equal(x: np.ndarray, y: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
     x = np.asarray(x, dtype=float)
@@ -19,7 +45,10 @@ def _ensure_1d_equal(x: np.ndarray, y: np.ndarray) -> tuple[np.ndarray, np.ndarr
         raise ValueError("x and y must be 1D arrays of equal length >= 2")
     return x, y
-def resample_spectrum(x: np.ndarray, y: np.ndarray, target_len: int = TARGET_LENGTH) -> tuple[np.ndarray, np.ndarray]:
     """Linear re-sampling onto a uniform grid of length target_len."""
     x, y = _ensure_1d_equal(x, y)
     order = np.argsort(x)
@@ -29,6 +58,7 @@ def resample_spectrum(x: np.ndarray, y: np.ndarray, target_len: int = TARGET_LEN
     y_new = f(x_new)
     return x_new, y_new
 def remove_baseline(y: np.ndarray, degree: int = 2) -> np.ndarray:
     """Polynomial baseline subtraction (degree=2 default)"""
     y = np.asarray(y, dtype=float)
@@ -37,19 +67,25 @@ def remove_baseline(y: np.ndarray, degree: int = 2) -> np.ndarray:
     baseline = np.polyval(coeffs, x_idx)
     return y - baseline
-def smooth_spectrum(y: np.ndarray, window_length: int = 11, polyorder: int = 2) -> np.ndarray:
     """Savitzky-Golay smoothing with safe/odd window enforcement"""
     y = np.asarray(y, dtype=float)
     window_length = int(window_length)
     polyorder = int(polyorder)
     # === window must be odd and >= polyorder+1 ===
     if window_length % 2 == 0:
-        window_length += 1
     min_win = polyorder + 1
     if min_win % 2 == 0:
         min_win += 1
     window_length = max(window_length, min_win)
-    return savgol_filter(y, window_length=window_length, polyorder=polyorder, mode="interp")
 def normalize_spectrum(y: np.ndarray) -> np.ndarray:
     """Min-max normalization to [0, 1] with constant-signal guard."""
@@ -60,27 +96,114 @@ def normalize_spectrum(y: np.ndarray) -> np.ndarray:
         return np.zeros_like(y)
     return (y - y_min) / (y_max - y_min)
 def preprocess_spectrum(
     x: np.ndarray,
     y: np.ndarray,
     *,
     target_len: int = TARGET_LENGTH,
     do_baseline: bool = True,
-    degree: int = 2,
     do_smooth: bool = True,
-    window_length: int = 11,
-    polyorder: int = 2,
     do_normalize: bool = True,
     out_dtype: DTypeLike = np.float32,
 ) -> tuple[np.ndarray, np.ndarray]:
-    """Exact CLI baseline: resample -> baseline -> smooth -> normalize"""
     x_rs, y_rs = resample_spectrum(x, y, target_len=target_len)
     if do_baseline:
         y_rs = remove_baseline(y_rs, degree=degree)
     if do_smooth:
         y_rs = smooth_spectrum(y_rs, window_length=window_length, polyorder=polyorder)
     if do_normalize:
         y_rs = normalize_spectrum(y_rs)
     # === Coerce to a real dtype to satisfy static checkers & runtime ===
     out_dt = np.dtype(out_dtype)
-    return x_rs.astype(out_dt, copy=False), y_rs.astype(out_dt, copy=False)

 """
 Preprocessing utilities for polymer classification app.
 Adapted from the original scripts/preprocess_dataset.py for Hugging Face Spaces deployment.
+Supports both Raman and FTIR spectroscopy modalities.
 """
 from __future__ import annotations
 from scipy.interpolate import interp1d
 from scipy.signal import savgol_filter
 from scipy.interpolate import interp1d
+from typing import Tuple, Literal
+TARGET_LENGTH = 500  # Frozen default per PREPROCESSING_BASELINE
+# Modality-specific validation ranges (cm⁻¹)
+MODALITY_RANGES = {
+    "raman": (200, 4000),  # Typical Raman range
+    "ftir": (400, 4000),  # FTIR wavenumber range
+}
+# Modality-specific preprocessing parameters
+MODALITY_PARAMS = {
+    "raman": {
+        "baseline_degree": 2,
+        "smooth_window": 11,
+        "smooth_polyorder": 2,
+        "cosmic_ray_removal": False,
+    },
+    "ftir": {
+        "baseline_degree": 2,
+        "smooth_window": 13,  # Slightly larger window for FTIR
+        "smooth_polyorder": 2,
+        "cosmic_ray_removal": False,  # Could add atmospheric correction
+        "atmospheric_correction": False,  # Placeholder for future implementation
+    },
+}
 def _ensure_1d_equal(x: np.ndarray, y: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
     x = np.asarray(x, dtype=float)
         raise ValueError("x and y must be 1D arrays of equal length >= 2")
     return x, y
+def resample_spectrum(
+    x: np.ndarray, y: np.ndarray, target_len: int = TARGET_LENGTH
+) -> tuple[np.ndarray, np.ndarray]:
     """Linear re-sampling onto a uniform grid of length target_len."""
     x, y = _ensure_1d_equal(x, y)
     order = np.argsort(x)
     y_new = f(x_new)
     return x_new, y_new
 def remove_baseline(y: np.ndarray, degree: int = 2) -> np.ndarray:
     """Polynomial baseline subtraction (degree=2 default)"""
     y = np.asarray(y, dtype=float)
     baseline = np.polyval(coeffs, x_idx)
     return y - baseline
+def smooth_spectrum(
+    y: np.ndarray, window_length: int = 11, polyorder: int = 2
+) -> np.ndarray:
     """Savitzky-Golay smoothing with safe/odd window enforcement"""
     y = np.asarray(y, dtype=float)
     window_length = int(window_length)
     polyorder = int(polyorder)
     # === window must be odd and >= polyorder+1 ===
     if window_length % 2 == 0:
+        window_length += 1
     min_win = polyorder + 1
     if min_win % 2 == 0:
         min_win += 1
     window_length = max(window_length, min_win)
+    return savgol_filter(
+        y, window_length=window_length, polyorder=polyorder, mode="interp"
+    )
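Review note: the enforcement in `smooth_spectrum` guarantees an odd window larger than `polyorder`, but nothing caps it at the signal length, and `savgol_filter` with `mode="interp"` rejects a window longer than its input. Inside `preprocess_spectrum` the input has already been resampled to `target_len` points, so this only matters for direct callers passing short arrays. A sketch of the extra clamp, with the hypothetical helper name `safe_window`:

```python
def safe_window(window_length: int, polyorder: int, n_points: int) -> int:
    """Return an odd window with polyorder < window <= n_points."""
    window = int(window_length)
    if window % 2 == 0:
        window += 1
    min_win = polyorder + 1
    if min_win % 2 == 0:
        min_win += 1
    window = max(window, min_win)
    # Largest odd value that still fits inside the signal.
    max_win = n_points if n_points % 2 == 1 else n_points - 1
    if max_win < min_win:
        raise ValueError(
            f"Signal of length {n_points} is too short for polyorder={polyorder}"
        )
    return min(window, max_win)
```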
 def normalize_spectrum(y: np.ndarray) -> np.ndarray:
     """Min-max normalization to [0, 1] with constant-signal guard."""
         return np.zeros_like(y)
     return (y - y_min) / (y_max - y_min)
+def validate_spectrum_range(x: np.ndarray, modality: str = "raman") -> bool:
+    """Validate that spectrum wavenumbers are within expected range for modality."""
+    if modality not in MODALITY_RANGES:
+        raise ValueError(
+            f"Unknown modality '{modality}'. Supported: {list(MODALITY_RANGES.keys())}"
+        )
+    min_range, max_range = MODALITY_RANGES[modality]
+    x_min, x_max = np.min(x), np.max(x)
+    # Check if majority of data points are within range
+    in_range = np.sum((x >= min_range) & (x <= max_range))
+    total_points = len(x)
+    return (in_range / total_points) >= 0.7  # At least 70% should be in range
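Worked example: `validate_spectrum_range` counts points rather than measuring spectral span, so the outcome depends on how densely each region is sampled. A small dependency-free sketch of the same 70% rule; `fraction_in_range` is a hypothetical helper and the bounds are copied from `MODALITY_RANGES["raman"]`.

```python
from typing import Sequence, Tuple

RAMAN_RANGE: Tuple[float, float] = (200.0, 4000.0)  # copied from MODALITY_RANGES


def fraction_in_range(x: Sequence[float], bounds: Tuple[float, float]) -> float:
    """Share of points whose value lies inside the inclusive bounds."""
    lo, hi = bounds
    return sum(1 for v in x if lo <= v <= hi) / len(x)


# Ten evenly spaced points: 0, 100, 200, ..., 900 cm^-1.
x = [100.0 * i for i in range(10)]
frac = fraction_in_range(x, RAMAN_RANGE)  # 8 of 10 points are >= 200
passes = frac >= 0.7
```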
 def preprocess_spectrum(
     x: np.ndarray,
     y: np.ndarray,
     *,
     target_len: int = TARGET_LENGTH,
+    modality: str = "raman",  # New parameter for modality-specific processing
     do_baseline: bool = True,
+    degree: int | None = None,  # Will use modality default if None
     do_smooth: bool = True,
+    window_length: int | None = None,  # Will use modality default if None
+    polyorder: int | None = None,  # Will use modality default if None
     do_normalize: bool = True,
     out_dtype: DTypeLike = np.float32,
+    validate_range: bool = True,
 ) -> tuple[np.ndarray, np.ndarray]:
+    """
+    Modality-aware preprocessing: resample -> baseline -> smooth -> normalize
+    Args:
+        x, y: Input spectrum data
+        target_len: Target length for resampling
+        modality: 'raman' or 'ftir' for modality-specific processing
+        do_baseline: Enable baseline correction
+        degree: Polynomial degree for baseline (uses modality default if None)
+        do_smooth: Enable smoothing
+        window_length: Smoothing window length (uses modality default if None)
+        polyorder: Polynomial order for smoothing (uses modality default if None)
+        do_normalize: Enable normalization
+        out_dtype: Output data type
+        validate_range: Check if wavenumbers are in expected range for modality
+    Returns:
+        Tuple of (resampled_x, processed_y)
+    """
+    # Validate modality
+    if modality not in MODALITY_PARAMS:
+        raise ValueError(
+            f"Unsupported modality '{modality}'. Supported: {list(MODALITY_PARAMS.keys())}"
+        )
+    # Get modality-specific parameters
+    modality_config = MODALITY_PARAMS[modality]
+    # Use modality defaults if parameters not specified
+    if degree is None:
+        degree = modality_config["baseline_degree"]
+    if window_length is None:
+        window_length = modality_config["smooth_window"]
+    if polyorder is None:
+        polyorder = modality_config["smooth_polyorder"]
+    # Validate spectrum range if requested
+    if validate_range:
+        if not validate_spectrum_range(x, modality):
+            print(
+                f"Warning: Spectrum wavenumbers may not be optimal for {modality.upper()} analysis"
+            )
+    # Standard preprocessing pipeline
     x_rs, y_rs = resample_spectrum(x, y, target_len=target_len)
     if do_baseline:
         y_rs = remove_baseline(y_rs, degree=degree)
     if do_smooth:
         y_rs = smooth_spectrum(y_rs, window_length=window_length, polyorder=polyorder)
+    # FTIR-specific processing (placeholder for future enhancements)
+    if modality == "ftir":
+        if modality_config.get("atmospheric_correction", False):
+            # Placeholder for atmospheric correction
+            pass
+        if modality_config.get("cosmic_ray_removal", False):
+            # Placeholder for cosmic ray removal
+            pass
     if do_normalize:
         y_rs = normalize_spectrum(y_rs)
     # === Coerce to a real dtype to satisfy static checkers & runtime ===
     out_dt = np.dtype(out_dtype)
+    return x_rs.astype(out_dt, copy=False), y_rs.astype(out_dt, copy=False)
+def get_modality_info(modality: str) -> dict:
+    """Get processing parameters and validation ranges for a modality."""
+    if modality not in MODALITY_PARAMS:
+        raise ValueError(f"Unknown modality '{modality}'")
+    return {
+        "range": MODALITY_RANGES[modality],
+        "params": MODALITY_PARAMS[modality].copy(),
+    }

utils/results_manager.py CHANGED

@@ -1,14 +1,17 @@
 """Session results management for multi-file inference.
-Handles in-memory results table and export functionality"""
 import streamlit as st
 import pandas as pd
 import json
 from datetime import datetime
-from typing import Dict, List, Any, Optional
 import numpy as np
 from pathlib import Path
 import io
 def local_css(file_name):
@@ -199,6 +202,219 @@ class ResultsManager:
         return len(st.session_state[ResultsManager.RESULTS_KEY]) < original_length
     @staticmethod
     # ==UTILITY FUNCTIONS==
     def init_session_state():

 """Session results management for multi-file inference.
+Handles in-memory results table and export functionality.
+Supports multi-model comparison and statistical analysis."""
 import streamlit as st
 import pandas as pd
 import json
 from datetime import datetime
+from typing import Dict, List, Any, Optional, Tuple
 import numpy as np
 from pathlib import Path
 import io
+from collections import defaultdict
+import matplotlib.pyplot as plt
 def local_css(file_name):
         return len(st.session_state[ResultsManager.RESULTS_KEY]) < original_length
+    @staticmethod
+    def add_multi_model_results(
+        filename: str,
+        model_results: Dict[str, Dict[str, Any]],
+        ground_truth: Optional[int] = None,
+        metadata: Optional[Dict[str, Any]] = None,
+    ) -> None:
+        """
+        Add results from multiple models for the same file.
+        Args:
+            filename: Name of the processed file
+            model_results: Dict with model_name -> result dict
+            ground_truth: True label if available
+            metadata: Additional file metadata
+        """
+        for model_name, result in model_results.items():
+            ResultsManager.add_results(
+                filename=filename,
+                model_name=model_name,
+                prediction=result["prediction"],
+                predicted_class=result["predicted_class"],
+                confidence=result["confidence"],
+                logits=result["logits"],
+                ground_truth=ground_truth,
+                processing_time=result.get("processing_time", 0.0),
+                metadata=metadata,
+            )
+    @staticmethod
+    def get_comparison_stats() -> Dict[str, Any]:
+        """Get comparative statistics across all models."""
+        results = ResultsManager.get_results()
+        if not results:
+            return {}
+        # Group results by model
+        model_stats = defaultdict(list)
+        for result in results:
+            model_stats[result["model"]].append(result)
+        comparison = {}
+        for model_name, model_results in model_stats.items():
+            stats = {
+                "total_predictions": len(model_results),
+                "avg_confidence": np.mean([r["confidence"] for r in model_results]),
+                "std_confidence": np.std([r["confidence"] for r in model_results]),
+                "avg_processing_time": np.mean(
+                    [r["processing_time"] for r in model_results]
+                ),
+                "stable_predictions": sum(
+                    1 for r in model_results if r["prediction"] == 0
+                ),
+                "weathered_predictions": sum(
+                    1 for r in model_results if r["prediction"] == 1
+                ),
+            }
+            # Calculate accuracy if ground truth available
+            with_gt = [r for r in model_results if r["ground_truth"] is not None]
+            if with_gt:
+                correct = sum(
+                    1 for r in with_gt if r["prediction"] == r["ground_truth"]
+                )
+                stats["accuracy"] = correct / len(with_gt)
+                stats["num_with_ground_truth"] = len(with_gt)
+            else:
+                stats["accuracy"] = None
+                stats["num_with_ground_truth"] = 0
+            comparison[model_name] = stats
+        return comparison
+    @staticmethod
+    def get_agreement_matrix() -> pd.DataFrame:
+        """
+        Calculate agreement matrix between models for the same files.
+        Returns:
+            DataFrame showing model agreement rates
+        """
+        results = ResultsManager.get_results()
+        if not results:
+            return pd.DataFrame()
+        # Group by filename
+        file_results = defaultdict(dict)
+        for result in results:
+            file_results[result["filename"]][result["model"]] = result["prediction"]
+        # Get unique models (sorted so row/column order is deterministic)
+        all_models = sorted({r["model"] for r in results})
+        if len(all_models) < 2:
+            return pd.DataFrame()
+        # Calculate agreement matrix
+        agreement_matrix = np.zeros((len(all_models), len(all_models)))
+        for i, model1 in enumerate(all_models):
+            for j, model2 in enumerate(all_models):
+                if i == j:
+                    agreement_matrix[i, j] = 1.0  # Perfect self-agreement
+                else:
+                    agreements = 0
+                    comparisons = 0
+                    for predictions in file_results.values():
+                        if model1 in predictions and model2 in predictions:
+                            comparisons += 1
+                            if predictions[model1] == predictions[model2]:
+                                agreements += 1
+                    if comparisons > 0:
+                        agreement_matrix[i, j] = agreements / comparisons
+        return pd.DataFrame(agreement_matrix, index=all_models, columns=all_models)
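Review note: the agreement rate computed in `get_agreement_matrix` is the fraction of shared files on which two models give the same prediction. A minimal sketch of that calculation follows, assuming plain dict records. `pairwise_agreement` and the sample data are hypothetical, and unlike the method in the diff it returns `None` (not `0.0`) for a pair with no files in common.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_agreement(results):
    """Hypothetical stand-alone helper; not part of the repository."""
    # Map each filename to {model_name: prediction}.
    by_file = defaultdict(dict)
    for r in results:
        by_file[r["filename"]][r["model"]] = r["prediction"]

    models = sorted({r["model"] for r in results})
    rates = {}
    for a, b in combinations(models, 2):
        # Only files scored by both models count toward the rate.
        shared = [p for p in by_file.values() if a in p and b in p]
        if shared:
            rates[(a, b)] = sum(p[a] == p[b] for p in shared) / len(shared)
        else:
            rates[(a, b)] = None  # assumption: no overlap means "undefined"
    return rates


# Made-up sample records, for illustration only.
sample = [
    {"filename": "s1.txt", "model": "model_a", "prediction": 0},
    {"filename": "s1.txt", "model": "model_b", "prediction": 0},
    {"filename": "s2.txt", "model": "model_a", "prediction": 1},
    {"filename": "s2.txt", "model": "model_b", "prediction": 0},
    {"filename": "s3.txt", "model": "model_a", "prediction": 1},
]
rates = pairwise_agreement(sample)
```

Here `s3.txt` is ignored because only one model scored it, so the rate is taken over two shared files.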
+    @staticmethod
+    def create_comparison_visualization() -> plt.Figure:
+        """Create visualization comparing model performance (returns None if there are no results)."""
+        comparison_stats = ResultsManager.get_comparison_stats()
+        if not comparison_stats:
+            return None
+        fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(12, 8))
+        models = list(comparison_stats.keys())
+        # 1. Average Confidence
+        confidences = [comparison_stats[m]["avg_confidence"] for m in models]
+        conf_stds = [comparison_stats[m]["std_confidence"] for m in models]
+        ax1.bar(models, confidences, yerr=conf_stds, capsize=5)
+        ax1.set_title("Average Confidence by Model")
+        ax1.set_ylabel("Confidence")
+        ax1.tick_params(axis="x", rotation=45)
+        # 2. Processing Time
+        proc_times = [comparison_stats[m]["avg_processing_time"] for m in models]
+        ax2.bar(models, proc_times)
+        ax2.set_title("Average Processing Time")
+        ax2.set_ylabel("Time (seconds)")
+        ax2.tick_params(axis="x", rotation=45)
+        # 3. Prediction Distribution
+        stable_counts = [comparison_stats[m]["stable_predictions"] for m in models]
+        weathered_counts = [
+            comparison_stats[m]["weathered_predictions"] for m in models
+        ]
+        x = np.arange(len(models))
+        width = 0.35
+        ax3.bar(x - width / 2, stable_counts, width, label="Stable", alpha=0.8)
+        ax3.bar(x + width / 2, weathered_counts, width, label="Weathered", alpha=0.8)
+        ax3.set_title("Prediction Distribution")
+        ax3.set_ylabel("Count")
+        ax3.set_xticks(x)
+        ax3.set_xticklabels(models, rotation=45)
+        ax3.legend()
+        # 4. Accuracy (if available)
+        accuracies = []
+        models_with_acc = []
+        for model in models:
+            if comparison_stats[model]["accuracy"] is not None:
+                accuracies.append(comparison_stats[model]["accuracy"])
+                models_with_acc.append(model)
+        if accuracies:
+            ax4.bar(models_with_acc, accuracies)
+            ax4.set_title("Model Accuracy (where ground truth available)")
+            ax4.set_ylabel("Accuracy")
+            ax4.set_ylim(0, 1)
+            ax4.tick_params(axis="x", rotation=45)
+        else:
+            ax4.text(
+                0.5,
+                0.5,
+                "No ground truth\navailable",
+                ha="center",
+                va="center",
+                transform=ax4.transAxes,
+            )
+            ax4.set_title("Model Accuracy")
+        fig.tight_layout()
+        return fig
+    @staticmethod
+    def export_comparison_report() -> str:
+        """Export comprehensive comparison report as JSON."""
+        comparison_stats = ResultsManager.get_comparison_stats()
+        agreement_matrix = ResultsManager.get_agreement_matrix()
+        report = {
+            "timestamp": datetime.now().isoformat(),
+            "model_comparison": comparison_stats,
+            "agreement_matrix": (
+                agreement_matrix.to_dict() if not agreement_matrix.empty else {}
+            ),
+            "summary": {
+                "total_models_compared": len(comparison_stats),
+                "total_files_processed": len(
+                    set(r["filename"] for r in ResultsManager.get_results())
+                ),
+                "overall_statistics": ResultsManager.get_summary_stats(),
+            },
+        }
+        return json.dumps(report, indent=2, default=str)
     @staticmethod
     # ==UTILITY FUNCTIONS==
     def init_session_state():