Abstract

Diabetes, which results from inadequate insulin production or utilization, causes extensive harm to the body. Existing diagnostic methods are often invasive and have drawbacks such as high cost. Machine learning models such as Classwise k Nearest Neighbor (CkNN) and General Regression Neural Network (GRNN) exist, but they struggle with imbalanced data and underperform as a result. Leveraging advances in sensor technology and machine learning, we propose a non-invasive diabetes diagnosis method based on a Back Propagation Neural Network (BPNN) with batch normalization, combined with data re-sampling for class balancing and normalization for feature scaling. Our method addresses the limited performance of traditional machine learning approaches. Experimental results on three datasets show significant improvements in overall accuracy, sensitivity, and specificity compared to traditional methods. Notably, we achieve accuracies of 89.81% on the Pima diabetes dataset, 75.49% on the CDC BRFSS2015 dataset, and 95.28% on the Mesra Diabetes dataset. This underscores the potential of deep learning models for robust diabetes diagnosis.

Methodology

 


Workflow of the proposed method. The pipeline first undersamples the data to address class imbalance in the dataset, then scales the features for effective normalization. Its backbone is a Back Propagation Neural Network (BPNN) architecture with batch normalization, which performs automatic diabetes diagnosis. Together, these steps target accurate, automated diabetes classification.
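As a rough illustration of the undersampling step, the sketch below randomly drops samples until every class matches the size of the smallest one. The function name `random_undersample`, the seed, and the toy data are assumptions made for this example; the exact sampling strategy used in the paper may differ.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class size."""
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # The smallest class sets the number of samples kept per class.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            balanced.append((row, label))

    # Shuffle so the classes are interleaved rather than grouped.
    rng.shuffle(balanced)
    new_rows = [row for row, _ in balanced]
    new_labels = [label for _, label in balanced]
    return new_rows, new_labels


# Toy data (illustrative only): 6 negative (0) and 2 positive (1) samples.
rows = [[1.0, 85.0], [2.0, 90.0], [0.0, 100.0], [3.0, 95.0],
        [1.0, 110.0], [4.0, 105.0], [8.0, 180.0], [6.0, 170.0]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_undersample(rows, labels)
print(Counter(balanced_labels))
```

Undersampling discards majority-class samples, so it trades some training data for balanced classes; that trade-off is worth keeping in mind on small datasets.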


 


The Back Propagation Neural Network (BPNN) model is evaluated with five-fold cross-validation to assess its performance and keep the evaluation robust.
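The five-fold protocol can be sketched in plain Python as follows. This is a minimal sketch, not the authors' evaluation code: `k_fold_indices`, `cross_validate`, and the `train_and_score` callable are hypothetical names, and the folds here are shuffled but not stratified by class.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


def cross_validate(n_samples, train_and_score, k=5):
    """Average a score over k folds.

    train_and_score is a caller-supplied (hypothetical) callable that trains a
    model on the training indices and returns its score on the validation indices.
    """
    scores = [train_and_score(train, val) for train, val in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(23, k=5))
print([len(val) for _, val in folds])  # prints [5, 5, 5, 4, 4]
```

Each sample lands in the validation set exactly once, so the averaged score reflects performance on all of the data rather than on a single held-out split.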


Visualization

 


The figure shows the feature distributions for diabetes diagnosis in the dataset before (top sub-figure) and after (bottom sub-figure) scaling by standardization. Standardization brings the features to a comparable magnitude, which facilitates training and improves the performance of the BPNN diabetes diagnosis model.
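Standardization here refers to the usual z-score transform, z = (x - mean) / standard deviation, applied to each feature separately. The sketch below shows it on two toy features with very different scales; the function name `standardize` and the feature values are assumptions for illustration, and the population standard deviation is used.

```python
from statistics import fmean, pstdev


def standardize(columns):
    """Z-score each feature column: (x - mean) / standard deviation."""
    scaled = []
    for col in columns:
        mu = fmean(col)
        sigma = pstdev(col)
        # A constant column has zero spread; map it to zeros instead of dividing by 0.
        scaled.append([(x - mu) / sigma if sigma > 0 else 0.0 for x in col])
    return scaled


# Two toy features on very different scales (values are illustrative only).
glucose = [85.0, 110.0, 140.0, 180.0]
pedigree = [0.2, 0.35, 0.6, 1.1]

z_glucose, z_pedigree = standardize([glucose, pedigree])
print([round(z, 3) for z in z_glucose])
print([round(z, 3) for z in z_pedigree])
```

After the transform, both features have zero mean and unit standard deviation. In a train/validation setting, the mean and standard deviation would normally be computed on the training split only and then reused for the validation split.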


 


The plot shows the distribution of positive and negative samples in the Pima dataset under two projections: PCA (linear dimensionality reduction) and t-SNE (nonlinear dimensionality reduction).


Experiments

 


Comparative results of various models on different datasets. Cells marked '-' indicate that the corresponding comparative study did not evaluate its model on that dataset.
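The accuracy, sensitivity, and specificity reported for this work follow from the confusion matrix of a binary classifier. The sketch below uses the standard definitions; the function name `binary_metrics` and the toy labels are assumptions for illustration and are not results from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels (1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: fraction of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of actual negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity, "specificity": specificity}


# Toy labels (illustrative only): 4 positives and 4 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced data, accuracy alone can look good while sensitivity is poor, which is why all three numbers are worth reading together.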


BibTeX

@article{zhang2024deep,
  title={A Deep Learning Approach to Diabetes Diagnosis},
  author={Zhang, Zeyu and Ahmed, Khandaker Asif and Hasan, Md Rakibul and Gedeon, Tom and Hossain, Md Zakir},
  journal={arXiv preprint arXiv:2403.07483},
  year={2024}
}