
Comparative Analysis of Probabilistic AI Models

My role on this project: AI / ML Research Engineer

Project Overview

A comparative research study of Discrete Bayesian Networks and Gaussian Processes for uncertainty quantification in high-stakes classification tasks. Evaluated multiple structure learning algorithms and kernel functions across fraud detection and medical diagnosis datasets to determine optimal trade-offs between accuracy, calibration, and computational efficiency.

Python, Bayesian Networks, Gaussian Processes, Hill Climbing, PC-Stable, RBF Kernel, Matern Kernel, MacKay's Approximation, pgmpy, GPyTorch
Project Link: Comparative Analysis of Probabilistic AI Models

Achievements

  • Achieved 99% balanced accuracy with Discrete Bayesian Networks using Hill Climbing on fraud detection dataset (50,000 transactions, 21 features)
  • Achieved 99.6% balanced accuracy with Gaussian Processes using Matern kernel on heart disease dataset (1,025 patients, 13 features)
  • Implemented and compared two structure learning algorithms for Bayesian Networks: Hill Climbing (score-based) trained 900x faster than PC-Stable (constraint-based) on the heart disease data while producing more stable structures
  • Compared RBF and Matern kernels for Gaussian Processes, with Matern achieving better calibration (ECE=0.026) on fraud detection due to effective modeling of irregular behavioral patterns
  • Implemented Bayesian discretization for preprocessing continuous features, optimizing bin boundaries through posterior probability maximization
  • Used Bayesian Information Criterion (BIC) as scoring function for structure learning, generating reliable network structures with lower variance across 5-fold cross-validation
  • Achieved perfect discrimination (AUC=1.0) with DBN-Hill Climbing on fraud detection, demonstrating clear separation of discretized feature patterns
  • Implemented Variable Elimination inference algorithm for DBN, achieving roughly 2.4x faster inference than Gaussian Processes (1.11 s vs 2.7 s on the fraud dataset)
  • Applied MacKay's approximation for converting GP latent functions to calibrated probability estimates in classification tasks
  • Demonstrated computational trade-offs: GP training took roughly 400x longer than DBN training (6-7 minutes vs seconds), but provided superior calibration for high-stakes decision-making
  • Implemented 5-fold stratified cross-validation maintaining class distribution across folds, critical for the imbalanced fraud dataset (67/33 split)
  • Evaluated models across discrimination metrics (balanced accuracy, AUC, F1), calibration metrics (Brier score, ECE, KL divergence), and computational efficiency (training/inference time)
  • Hill Climbing structure learning consistently outperformed PC-Stable, with 84.8% vs 77.9% accuracy on heart disease and 16x faster training on fraud detection
  • Identified that GP excels when accuracy and calibration are critical, while DBN provides interpretable network structures and faster deployment for near-real-time prediction scenarios
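The BIC scoring used with Hill Climbing above can be sketched in pure Python. This is a minimal illustration of scoring one node against a candidate parent set; the `bic_node` helper and the row-of-dicts data layout are assumptions for the sketch, not the project's actual pgmpy implementation:

```python
from collections import Counter
from math import log

def bic_node(rows, child, parents):
    """BIC score contribution of one discrete node given a candidate parent set.

    rows: list of dicts mapping variable name -> discrete value.
    Score = maximum log-likelihood under MLE counts, minus the penalty
    0.5 * q * (r - 1) * log(N), where r is the child's cardinality and
    q the number of observed parent configurations.
    """
    n = len(rows)
    # Joint counts of (parent configuration, child value), and marginal parent counts.
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in rows)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in rows)
    # Log-likelihood: sum over cells of count * log(conditional MLE probability).
    loglik = sum(c * log(c / parent_counts[pa]) for (pa, _), c in joint.items())
    r = len({row[child] for row in rows})
    q = max(len(parent_counts), 1)
    return loglik - 0.5 * q * (r - 1) * log(n)
```

A greedy Hill Climbing search would repeatedly apply the edge addition, removal, or reversal that most improves the sum of these per-node scores; the penalty term is what keeps the learned structures sparse and stable across folds.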
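The calibration comparisons above report Expected Calibration Error (ECE). A minimal numpy sketch using equal-width confidence bins follows; the bin count and binning scheme are illustrative assumptions, not the project's exact configuration:

```python
import numpy as np

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """ECE: weighted average gap between mean confidence and empirical accuracy per bin."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (y_prob > lo) & (y_prob <= hi)
        if not mask.any():
            continue
        conf = y_prob[mask].mean()   # mean predicted probability in this bin
        acc = y_true[mask].mean()    # empirical positive rate in this bin
        ece += mask.mean() * abs(acc - conf)
    return ece
```

A well-calibrated model (e.g. the Matern GP's ECE of 0.026 above) has predicted probabilities that closely track observed frequencies in every bin, which is exactly what matters for high-stakes thresholding decisions.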
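MacKay's approximation, used above to turn GP latent outputs into class probabilities, replaces the intractable integral of a sigmoid against a Gaussian with a rescaled sigmoid. A sketch under the standard form of the approximation (the function name is mine):

```python
import numpy as np

def mackay_approx(mu, var):
    """Approximate E[sigmoid(f)] for f ~ N(mu, var).

    MacKay's approximation: sigmoid(kappa * mu) with
    kappa = (1 + pi * var / 8) ** -0.5, so larger latent variance
    pulls the predicted probability toward 0.5.
    """
    kappa = 1.0 / np.sqrt(1.0 + np.pi * np.asarray(var) / 8.0)
    return 1.0 / (1.0 + np.exp(-kappa * np.asarray(mu)))
```

With zero latent variance this reduces to the plain sigmoid of the mean; growing variance shrinks confidence toward 0.5, which is what makes the resulting probabilities usable for calibration-sensitive decisions.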
John Olatubosun