Fraud Detection Ensemble Suite - ONNX Format
Author: darkknight25
Models: XGBoost, LightGBM, CatBoost, Random Forest, Meta Learner
Format: ONNX for production-ready deployment
Tags: fraud-detection, onnx, ensemble, real-world, ml, lightweight, financial-security
Overview
This repository provides a high-performance fraud detection ensemble trained on real-world financial datasets and exported in ONNX format for lightning-fast inference.
Each model is optimized for different fraud signals and then blended via a meta-model for enhanced generalization.
Real-World Use Cases
- Credit card fraud detection
- Transaction monitoring systems
- Risk scoring engines
- Insurance fraud
- Online payment gateways
- Embedded or edge deployments using ONNX
Models Included
| Model | Format | Status | Notes |
|---|---|---|---|
| XGBoost | ONNX | Ready | Best for handling imbalanced data |
| LightGBM | ONNX | Ready | Fast, efficient gradient boosting |
| CatBoost | ONNX | Ready | Handles categorical features well |
| RandomForest | ONNX | Ready | Stable classical ensemble |
| Meta Model | ONNX | Ready | Trained on outputs of above models |
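The table describes the meta model as trained on the outputs of the four base models. A minimal sketch of that stacking idea follows, assuming each base model yields one fraud probability per row. The helper `stack_base_scores`, the column order, and the scores are hypothetical; the actual meta-model input layout is not documented here, so check it with `session.get_inputs()` before relying on this.

```python
import numpy as np

def stack_base_scores(base_scores, model_order):
    """Build meta-model features: one column of fraud probabilities per base model."""
    columns = [
        np.asarray(base_scores[name], dtype=np.float32).reshape(-1)
        for name in model_order
    ]
    if len({col.shape[0] for col in columns}) != 1:
        raise ValueError("all base models must score the same number of rows")
    # Result has shape (n_rows, n_base_models).
    return np.column_stack(columns)

# Hypothetical column order and made-up scores for two transactions.
model_order = ["xgb", "lgb", "cat", "rf"]
made_up_scores = {
    "xgb": [0.91, 0.02],
    "lgb": [0.88, 0.05],
    "cat": [0.93, 0.01],
    "rf": [0.80, 0.10],
}
meta_features = stack_base_scores(made_up_scores, model_order)
print(meta_features.shape)
```

Fixing the column order explicitly matters: a stacked model only works if inference uses the same order as training.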
Feature Schema
feature_names.json contains the exact input features expected by all models.
You must preprocess data to match this schema before ONNX inference.
["amount", "time", "is_foreign", "txn_type", ..., "ratio_to_median_purchase_price"]
- Shape: (None, 29)
- Dtype: float32
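Matching the schema mostly means putting the features in the order given by feature_names.json and casting to float32. The sketch below shows one way to do that for a single transaction, assuming all values are already numeric (categorical fields such as txn_type already encoded). The helper `to_model_input` and the three-feature demo schema are hypothetical.

```python
import numpy as np

def to_model_input(record, feature_names):
    """Order one transaction's features to match feature_names and cast to float32."""
    missing = [name for name in feature_names if name not in record]
    if missing:
        raise KeyError(f"record is missing features: {missing}")
    row = [float(record[name]) for name in feature_names]
    # Shape (1, n_features): a batch of one row.
    return np.asarray([row], dtype=np.float32)

# Hypothetical three-feature schema, for illustration only; the real
# schema in feature_names.json has 29 entries.
demo_features = ["amount", "time", "is_foreign"]
demo_record = {"is_foreign": 1, "amount": 42.5, "time": 3600}
X_demo = to_model_input(demo_record, demo_features)
print(X_demo.shape, X_demo.dtype)
```

Failing loudly on a missing feature is deliberate: silently filling a default would produce a score that looks valid but is not.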
```python
import json

import numpy as np
import onnxruntime as ort

# Load the feature schema (adjust the path if the files live under models/)
with open("feature_names.json") as f:
    feature_names = json.load(f)

# Dummy input (replace with your real preprocessed data)
X = np.random.rand(1, len(feature_names)).astype(np.float32)

# Load the ONNX model
session = ort.InferenceSession("xgb_model.onnx", providers=["CPUExecutionProvider"])

# Run inference
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: X})

# output[0] is the model's first output. Depending on how the model was
# exported, this may be the predicted label, with probabilities in output[1].
print("Fraud probability:", output[0])
```
Example Inference Code:
```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("meta_model.onnx", providers=["CPUExecutionProvider"])

# Placeholder: replace ... with your preprocessed feature values, shape (1, 29)
input_data = np.array([[...]], dtype=np.float32)

inputs = {session.get_inputs()[0].name: input_data}
outputs = session.run(None, inputs)
print("Fraud Probability:", outputs[0])
```
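What `session.run` returns depends on how each model was exported. Classifier exports often return two outputs, predicted labels followed by per-class probabilities, and the probabilities may be a plain array or a list of dictionaries. The sketch below is a hypothetical helper, `fraud_probability`, that handles those common layouts. It assumes the fraud class is labelled 1, which this README does not state.

```python
import numpy as np

def fraud_probability(outputs, positive_class=1):
    """Return P(fraud) per row from the list returned by session.run(None, ...)."""
    # Two outputs: assume [labels, probabilities]. One output: assume scores.
    scores = outputs[1] if len(outputs) > 1 else outputs[0]
    # Dictionary layout: one {class_label: probability} dict per row.
    if isinstance(scores, list) and scores and isinstance(scores[0], dict):
        return np.array([row[positive_class] for row in scores], dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    # Two-column array: one probability column per class.
    if scores.ndim == 2 and scores.shape[1] == 2:
        return scores[:, positive_class]
    # Otherwise assume the values are already fraud probabilities.
    return scores.reshape(-1)

# Made-up outputs mimicking the dictionary layout for two transactions.
fake_outputs = [np.array([0, 1]), [{0: 0.9, 1: 0.1}, {0: 0.2, 1: 0.8}]]
print(fraud_probability(fake_outputs))
```

Printing `[o.name for o in session.get_outputs()]` on the real models shows which layout applies.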
Training Pipeline
All models were trained using the following:
- Stratified train/test split
- StandardScaler normalization
- Log loss and AUC optimization
- Early stopping and feature importance
- Lightweight autoencoder anomaly filter (not included here)
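The first two steps above can be illustrated with NumPy alone. This is a sketch on synthetic data, not the author's training code. In practice scikit-learn's `train_test_split(..., stratify=y)` and `StandardScaler` are the usual tools. The helper names and the toy data are assumptions.

```python
import numpy as np

def stratified_split(y, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so each class keeps roughly its original share."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        n_test = max(1, int(round(test_fraction * len(idx))))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

def fit_standardizer(X_train):
    """Compute per-feature mean and population standard deviation on training rows only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0.0] = 1.0  # leave constant features unscaled
    return mean, std

# Synthetic, imbalanced toy data (5% positives); not the real dataset.
rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, scale=3.0, size=(200, 4))
y = np.zeros(200, dtype=int)
y[:10] = 1

train_idx, test_idx = stratified_split(y, test_fraction=0.2)
mean, std = fit_standardizer(X[train_idx])
# Apply the training statistics to both splits to avoid leaking test data.
X_train = ((X[train_idx] - mean) / std).astype(np.float32)
X_test = ((X[test_idx] - mean) / std).astype(np.float32)
print(y[train_idx].mean(), y[test_idx].mean())
```

The same mean and scale used in training must be applied to new data before ONNX inference, so the scaler parameters need to be stored alongside the models.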
Security Focus
Ensemble modeling is intended to reduce false positives and to limit the impact of drift in any single model.
Models are designed to be robust against outliers and data shifts.
An optional TFLite autoencoder (not included here) can flag unknown fraud patterns.
Files
models/
├── xgb_model.onnx
├── lgb_model.onnx
├── cat_model.onnx
├── rf_model.onnx
├── meta_model.onnx
└── feature_names.json
Advanced Users
The ONNX models can be converted to TFLite, TensorRT, or CoreML, subject to operator support in the target converter.
Deploy via FastAPI, Flask, Streamlit, or ONNX Runtime on edge devices.
License
MIT License. You are free to use, modify, and deploy with attribution.
Author
Made with ❤️ by darkknight25, SUNNYTHAKUR
Contact for enterprise deployments, smart contract forensics, or advanced ML pipelines.
Evaluation results
- AUC on CREDIT CARD fraud detection credit card.csv (self-reported): 1.000
- Accuracy on CREDIT CARD fraud detection credit card.csv (self-reported): 0.994
- F1 on CREDIT CARD fraud detection credit card.csv (self-reported): 0.976
- Precision on CREDIT CARD fraud detection credit card.csv (self-reported): 0.981
- Recall on CREDIT CARD fraud detection credit card.csv (self-reported): 0.970
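As a quick sanity check, F1 is the harmonic mean of precision and recall, so the reported figures can be cross-checked against each other. The snippet below uses only the self-reported numbers above. The result agrees with the reported F1 to within the rounding of the three-decimal inputs.

```python
# Self-reported precision and recall from the list above.
precision = 0.981
recall = 0.970

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))
```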