metadata

license: mit
datasets:
  - financial-fraud-detection
language:
  - en
metrics:
  - auc
  - accuracy
  - f1
  - precision
  - recall
base_model:
  - None
library_name: onnx
pipeline_tag: fraud-detection
tags:
  - fraud-detection
  - ensemble
  - financial-security
  - onnx
  - xgboost
  - lightgbm
  - catboost
  - random-forest
  - production
  - cybersecurity
  - mlops
  - real-time-inference
  - deployed
model-index:
  - name: Fraud Detection Ensemble ONNX
    results:
      - task:
          name: Fraud Detection
          type: fraud-detection
        dataset:
          name: CREDIT CARD fraud detection credit card.csv
          type: tabular
        metrics:
          - type: auc
            value: 0.9998
          - type: accuracy
            value: 0.9942
          - type: f1
            value: 0.9756
          - type: precision
            value: 0.9813
          - type: recall
            value: 0.9701
new_version: 'true'

🛡️ Fraud Detection Ensemble Suite - ONNX Format

Author: darkknight25
Models: XGBoost, LightGBM, CatBoost, Random Forest, Meta Learner
Format: ONNX for production-ready deployment
Tags: fraud-detection, onnx, ensemble, real-world, ml, lightweight, financial-security

🔍 Overview

This repository provides a high-performance fraud detection ensemble trained on real-world financial datasets and exported in ONNX format for low-latency inference.

Each base model is optimized for different fraud signals, and a meta-model then blends their outputs to improve generalization.

🎯 Real-World Use Cases

✅ Credit card fraud detection
✅ Transaction monitoring systems
✅ Risk scoring engines
✅ Insurance fraud
✅ Online payment gateways
✅ Embedded or edge deployments using ONNX

🧠 Models Included

Model	Format	Status	Notes
XGBoost	ONNX	✅ Ready	Best for handling imbalanced data
LightGBM	ONNX	✅ Ready	Fast, efficient gradient boosting
CatBoost	ONNX	✅ Ready	Handles categorical features well
RandomForest	ONNX	✅ Ready	Stable classical ensemble
Meta Model	ONNX	✅ Ready	Trained on outputs of above models
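
The meta model is described as trained on the outputs of the four base models. The sketch below shows one way that stacking input could be assembled, assuming each base model yields one fraud probability per transaction and the meta model expects them as columns in a fixed order. The function name, the column order, and the sample probabilities are illustrative assumptions, not details confirmed by this repository.

```python
import numpy as np

# Assumed column order; the order used at training time is not documented here.
BASE_MODELS = ("xgb", "lgb", "cat", "rf")


def build_meta_features(base_probs):
    """Stack per-model fraud probabilities into an (n_samples, n_models) float32 matrix."""
    columns = []
    for name in BASE_MODELS:
        probs = np.asarray(base_probs[name], dtype=np.float32).reshape(-1)
        columns.append(probs)
    lengths = {col.shape[0] for col in columns}
    if len(lengths) != 1:
        raise ValueError("every base model must score the same number of rows")
    return np.column_stack(columns)


# Made-up probabilities for two transactions, for illustration only.
meta_X = build_meta_features({
    "xgb": [0.02, 0.91],
    "lgb": [0.05, 0.88],
    "cat": [0.01, 0.95],
    "rf": [0.10, 0.80],
})
print(meta_X.shape)
```

Before feeding such a matrix to meta_model.onnx, it is worth comparing its shape against `session.get_inputs()[0].shape` for that file.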

🧾 Feature Schema

feature_names.json contains the exact input features expected by all models.

You must preprocess data to match this schema before ONNX inference.

["amount", "time", "is_foreign", "txn_type", ..., "ratio_to_median_purchase_price"]

Shape: (None, 29)

Dtype: float32
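
A minimal sketch of aligning one transaction to the schema, assuming each record arrives as a dict keyed by feature name and is already numerically encoded and scaled. The helper name `to_feature_row` and the three-feature schema in the usage lines are hypothetical stand-ins; in practice `feature_names` would be loaded from feature_names.json.

```python
import numpy as np


def to_feature_row(record, feature_names):
    """Order a dict of numeric features by the schema and return a (1, n) float32 array."""
    missing = [name for name in feature_names if name not in record]
    if missing:
        raise KeyError(f"missing features: {missing}")
    # Extra keys in the record are ignored; order follows feature_names exactly.
    values = [float(record[name]) for name in feature_names]
    return np.asarray(values, dtype=np.float32).reshape(1, -1)


# Hypothetical three-feature schema, for illustration only.
demo_schema = ["amount", "time", "is_foreign"]
demo_row = to_feature_row({"time": 3600, "amount": 42.5, "is_foreign": 0}, demo_schema)
print(demo_row.shape, demo_row.dtype)
```

Failing loudly on a missing feature is a deliberate choice here: silently filling a default would shift every later column's meaning for the model.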

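import json

import numpy as np
import onnxruntime as ort

# Load feature schema (column order must match the order used in training)
with open("feature_names.json") as f:
    feature_names = json.load(f)

# Dummy input (replace with your real preprocessed data)
X = np.random.rand(1, len(feature_names)).astype(np.float32)

# Load ONNX model
session = ort.InferenceSession("xgb_model.onnx", providers=["CPUExecutionProvider"])

# Inference
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: X})

# Converted classifiers often return [label, probabilities]; print the output
# names to confirm which entry holds the fraud probability for this file.
print("Output names:", [o.name for o in session.get_outputs()])
print("Fraud probability:", output[0])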

Example Inference Code:

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("meta_model.onnx", providers=["CPUExecutionProvider"])
# Placeholder row; replace with real preprocessed data of shape (1, 29)
input_data = np.zeros((1, 29), dtype=np.float32)
inputs = {session.get_inputs()[0].name: input_data}
outputs = session.run(None, inputs)
print("Fraud Probability:", outputs[0])
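
The snippets above print the first output as-is. Classifier converters commonly emit probabilities either as a list of `{label: probability}` dicts or as an `(n_samples, n_classes)` array, and which layout these files use is not stated here. The helper below is a sketch that normalizes both layouts, assuming the fraud class is labeled 1; the function name and the sample values are hypothetical.

```python
import numpy as np


def positive_class_probability(prob_output, positive_label=1):
    """Return a 1-D float32 array of positive-class probabilities.

    Handles two layouts commonly produced by classifier converters:
    a list of {label: probability} dicts, or an (n_samples, n_classes) array.
    """
    if isinstance(prob_output, list) and prob_output and isinstance(prob_output[0], dict):
        return np.array([row[positive_label] for row in prob_output], dtype=np.float32)
    arr = np.asarray(prob_output, dtype=np.float32)
    if arr.ndim == 2 and arr.shape[1] >= 2:
        # Assumes columns are ordered by class label, so column 1 is the fraud class.
        return arr[:, positive_label]
    # Single score per row: treated as the positive-class probability already.
    return arr.reshape(-1)


# Made-up outputs in both layouts, for illustration only.
from_dicts = positive_class_probability([{0: 0.97, 1: 0.03}, {0: 0.12, 1: 0.88}])
from_array = positive_class_probability(np.array([[0.97, 0.03], [0.12, 0.88]]))
print(from_dicts, from_array)
```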

🧪 Training Pipeline

All models were trained using the following:

✅ Stratified train/test split

✅ StandardScaler normalization

✅ Log loss and AUC optimization

✅ Early stopping and feature importance

✅ Lightweight autoencoder anomaly filter (not included here)
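
The first two steps of the list can be sketched with scikit-learn as follows. This is not the author's training script: the synthetic data, the 80/20 split, and the random seed are assumptions chosen only to make the example runnable.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 1000 rows, 29 features, 2% positives.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 29))
y = np.zeros(1000, dtype=int)
y[:20] = 1

# Stratify on the label so both splits keep the same fraud rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training split only, then reuse it on the test split.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train).astype(np.float32)
X_test_scaled = scaler.transform(X_test).astype(np.float32)

print(y_train.mean(), y_test.mean())
```

The same fitted scaler has to be applied at inference time, since the ONNX models expect inputs on the scale they were trained with.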

🔐 Security Focus

Ensemble modeling helps reduce false positives and the effect of model drift.

Models are designed to be robust to outliers and data shifts.

An optional TFLite autoencoder (not included in this repository) can flag unknown fraud patterns.

📁 Files

models/
├── xgb_model.onnx
├── lgb_model.onnx
├── cat_model.onnx
├── rf_model.onnx
├── meta_model.onnx
└── feature_names.json

🛠️ Advanced Users

Easily convert ONNX to TFLite, TensorRT, or CoreML.

Deploy via FastAPI, Flask, Streamlit, or ONNX Runtime on edge devices.

🤝 License

MIT License. You are free to use, modify, and deploy with attribution.

🙌 Author

Made with ❤️ by darkknight25, SUNNYTHAKUR. Contact the author for enterprise deployments, smart contract forensics, or advanced ML pipelines.