Distance-based robust linear discriminant analysis using MOM-Qn and WMOM-Qn estimators

Authors

  • Yerima, T. B. Department of Mathematical Sciences, Gombe State University, Gombe State, Nigeria
  • Rasheed, B. A. Department of Mathematical Sciences, Gombe State University, Gombe State, Nigeria
  • Adamu, A. Department of Mathematical Sciences, Gombe State University, Gombe State, Nigeria
  • Buba, C. P. Department of Mathematical Sciences, Gombe State University, Gombe State, Nigeria

Keywords:

Diabetes, LDA, LOOCV, Robust estimators, Robust Mahalanobis distance

Abstract

Linear Discriminant Analysis (LDA) is a cornerstone of multivariate classification, but its performance degrades significantly when the assumptions of multivariate normality and homogeneity of group covariance matrices are violated. The situation becomes worse when outliers are present in the data, so a more reliable classifier is needed under such conditions. This research addresses these limitations by adapting a distance-based robust LDA framework designed to maintain high predictive accuracy and stability. The study integrates the Modified One-Step M-estimator (MOM) and its Winsorized variant (WMOM) with the high-breakdown-point Qn scale estimator, and employs the robust Mahalanobis distance to handle multivariate outliers. Performance was evaluated using simulation and a real-life clinical dataset in which the task is to distinguish between Type 1 and Type 2 diabetes. Validation was performed using leave-one-out cross-validation (LOOCV) and the PRESS-Q statistic. Simulation results under both balanced and unbalanced sample conditions showed that the MOM-Qn model consistently yielded the lowest mean misclassification rates, while classical LDA recorded the highest. On the real-life dataset, the MOM-Qn estimator achieved a classification accuracy of 98.72%, corresponding to a misclassification error of only 1.28%. These findings indicate that the adapted robust estimators are significantly less sensitive than classical LDA to sample balance, increases in dimension, contamination, shifts in location and shifts in shape. By incorporating high-breakdown estimators, the framework remains resilient even when standard statistical assumptions are violated. In conclusion, the MOM-Qn and WMOM-Qn frameworks provide a more stable and efficient alternative to classical LDA. The adapted approach is highly recommended for application in medical diagnostics, where reliability in the presence of data anomalies is critical for accurate patient classification.
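
To illustrate the kind of procedure the abstract describes, the following is a minimal Python sketch under stated assumptions, not the authors' implementation. The abstract does not specify how the estimators are combined, so several choices here are assumptions made for illustration only: MOM is applied coordinate-wise with Qn as the trimming scale and the conventional cutoff of 2.24, the Qn estimate omits finite-sample correction factors, and the pooled covariance is computed from coordinate-wise Winsorized data as a simple stand-in for a robust scatter matrix. All function names (qn_scale, robust_center_and_data, fit, predict) are hypothetical.

```python
import numpy as np

def qn_scale(x):
    """Qn scale: 2.2219 times the k-th smallest pairwise distance,
    with h = n//2 + 1 and k = h(h-1)/2 (no finite-sample correction)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    h = n // 2 + 1
    k = h * (h - 1) // 2
    i, j = np.triu_indices(n, k=1)          # all pairs with i < j
    diffs = np.sort(np.abs(x[i] - x[j]))
    return 2.2219 * diffs[k - 1]

def robust_center_and_data(X, winsorize=False, cutoff=2.24):
    """Coordinate-wise MOM (or WMOM) center plus Winsorized data.
    Assumption: points with |x - median| > cutoff * Qn are flagged."""
    X = np.asarray(X, dtype=float)
    center = np.empty(X.shape[1])
    W = X.copy()
    for j in range(X.shape[1]):
        col = X[:, j]
        keep = np.abs(col - np.median(col)) <= cutoff * qn_scale(col)
        kept = col[keep] if keep.any() else col   # guard against empty set
        # Winsorize: pull flagged values back to the nearest retained value
        W[:, j] = np.clip(col, kept.min(), kept.max())
        # MOM averages retained values; WMOM averages Winsorized values
        center[j] = W[:, j].mean() if winsorize else kept.mean()
    return center, W

def fit(X, y, winsorize=False):
    """Robust group centers and the inverse pooled covariance.
    Assumption: covariance comes from the Winsorized data."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    labels = np.unique(y)
    centers, pooled = {}, 0.0
    for g in labels:
        c, W = robust_center_and_data(X[y == g], winsorize)
        centers[g] = c
        pooled = pooled + (len(W) - 1) * np.cov(W, rowvar=False)
    pooled = pooled / (len(y) - len(labels))
    return labels, centers, np.linalg.inv(pooled)

def predict(model, x):
    """Assign x to the group with the smallest squared robust
    Mahalanobis distance to its center."""
    labels, centers, s_inv = model
    d2 = [(x - centers[g]) @ s_inv @ (x - centers[g]) for g in labels]
    return labels[int(np.argmin(d2))]
```

Calling fit(X, y) gives the MOM-based variant and fit(X, y, winsorize=True) the WMOM-based variant. A leave-one-out evaluation in this sketch would refit the model on all observations except one, call predict on the held-out observation, and repeat for every observation.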

Published

2026-01-31