Recent Advances in Electrical & Electronic Engineering

ISSN (Print): 2352-0965
ISSN (Online): 2352-0973

Review Article

Computational Analysis of Diabetes Kidney Diseases Using Machine Learning

Author(s): Ganesh Chandra, Namita Tiwari, Urmila Mahor, Parashu Ram Pal, Vikash Yadav* and Deepak Kumar Mishra

Volume 18, Issue 10, 2025

Published on: 06 January, 2025

Article ID: e23520965316924 Pages: 12

DOI: 10.2174/0123520965316924241007052621



Abstract

The increasing complexity of healthcare, coupled with an ageing population, poses significant challenges for decision-making in healthcare delivery. Implementing smart decision support systems can indeed alleviate some of these challenges by providing clinicians with timely and personalized insights. These systems can leverage vast amounts of patient data, advanced analytics, and predictive modeling to offer clinicians a comprehensive view of individual patient needs and potential outcomes.

Researchers and clinicians increasingly need faster solutions for various diseases in healthcare, and have therefore turned to Machine Learning (ML) algorithms. ML is a subfield of Artificial Intelligence (AI) that provides useful tools for data analysis, process automation, and other healthcare tasks. Its use in healthcare systems continues to grow owing to its learning capability.

In this paper, the following algorithms are applied to the diagnosis of diabetes and kidney disease: Gradient Boosting Classifier (GBC), Random Forest Classifier (RFC), Extra Trees Classifier (ETC), Support Vector Classifier (SVC), Multilayer Perceptron (MLP) neural network, and Decision Tree Classifier (DTC). In our model, the Gradient Boosting Classifier is combined with repeated cross-validation to obtain better results. The experimental analysis was performed on both the unbalanced and the balanced dataset. The accuracies achieved on the unbalanced and balanced datasets are 75.7 & 92.2 for GBC, 75.7 & 90.1 for ETC, 74.4 & 80.0 for RFC, 62.5 & 66.4 for SVC, 58.3 & 63.0 for MLP, and 59.4 & 74.5 for DTC, respectively. Comparing these results, we found that GBC outperforms the other algorithms.
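The evaluation protocol described above — a Gradient Boosting Classifier scored with repeated stratified cross-validation on both an imbalanced dataset and a balanced copy — can be sketched as follows. This is an illustrative sketch only: the synthetic dataset, class ratio, and hyperparameters are assumptions, not the paper's actual data or settings.

```python
# Sketch: GBC with repeated cross-validation, on an imbalanced dataset
# and on a naively oversampled ("balanced") copy of it.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.utils import resample

# Synthetic stand-in for a diabetes/kidney-disease table (roughly 9:1 classes)
X, y = make_classification(n_samples=1000, n_features=8,
                           weights=[0.9, 0.1], random_state=42)

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=42)
clf = GradientBoostingClassifier(random_state=42)

# Mean accuracy on the imbalanced data
acc_imbalanced = cross_val_score(clf, X, y, scoring="accuracy", cv=cv).mean()

# Naive balancing: oversample the minority class to the majority-class size
X_min, y_min = X[y == 1], y[y == 1]
X_maj, y_maj = X[y == 0], y[y == 0]
X_min_up, y_min_up = resample(X_min, y_min, replace=True,
                              n_samples=len(y_maj), random_state=42)
X_bal = np.vstack([X_maj, X_min_up])
y_bal = np.concatenate([y_maj, y_min_up])

# Mean accuracy on the balanced data
acc_balanced = cross_val_score(clf, X_bal, y_bal, scoring="accuracy", cv=cv).mean()
print(f"imbalanced: {acc_imbalanced:.3f}  balanced: {acc_balanced:.3f}")
```

Note one caveat with this simple setup: oversampling before cross-validation can place duplicates of the same minority sample in both training and validation folds, which tends to inflate the balanced-dataset accuracy; a stricter protocol oversamples inside each training fold only.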

Keywords: Random forest classifier, gradient boosting classifier, machine learning, support vector classifier, multilayer perceptron.


