Analyzing Diabetes Datasets using Data Mining


 Data mining, Classification, Algorithm, Diabetes Mellitus Type II.

How to Cite

Saman Hina, Anita Shaikh, & Sohail Abul Sattar. (2017). Analyzing Diabetes Datasets using Data Mining. Journal of Basic & Applied Sciences, 13, 466–471.


Data mining techniques extract critical information in various domains, for example CRM (Customer Relationship Management), HR (Human Resources), and GIS (Geographic Information Systems), but most importantly in the medical domain. There, data mining can help reduce the risk of developing common diseases such as cancer, heart disease, and diabetes. This paper focuses on data from diabetic patients. A diabetic patient's body cannot regulate the blood glucose level, which can affect other body mechanisms. This can lead to dysfunction in other physiological and psychological parameters, such as reduced weight and skin folding. These parameters may be a valuable data source for research. Diabetes mellitus ranks fourth among noncommunicable diseases (NCDs) and causes 1.5 million deaths worldwide each year [1]. The growth of digital information has raised numerous challenges, especially for automated content analysis and for using machine learning techniques to predict noncommunicable diseases such as diabetes. In this research, different classification algorithms, namely Naïve Bayes, MLP, J48, ZeroR, Random Forest, and Regression, were applied to obtain the results. The research aims to extract knowledge from the given data set and to generate comprehensive and intelligent results.
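As a rough illustration of how two of the listed classifiers differ, the following minimal Python sketch compares a ZeroR baseline with a Gaussian Naïve Bayes classifier. It is not the authors' Weka workflow: the records, the two features (glucose and BMI), and all function names are hypothetical and chosen only for illustration, and the Gaussian form of Naïve Bayes is an assumption.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy records: (glucose, BMI) -> label (1 = diabetic).
# These values are invented for illustration; they are NOT taken from the
# paper or from the Pima Indians Diabetes Data Set.
TRAIN = [
    ((85.0, 22.0), 0), ((90.0, 24.5), 0), ((100.0, 26.0), 0), ((95.0, 23.0), 0),
    ((150.0, 33.0), 1), ((165.0, 35.5), 1), ((140.0, 31.0), 1),
]
TEST = [((88.0, 23.5), 0), ((160.0, 34.0), 1), ((98.0, 25.0), 0)]


def zero_r(train):
    """ZeroR: ignore all features and always predict the majority class."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: majority


def gaussian_naive_bayes(train):
    """Gaussian Naive Bayes: per-class mean and variance for each feature."""
    by_class = defaultdict(list)
    for features, label in train:
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((x - mean) ** 2 for x in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (math.log(len(rows) / len(train)), stats)

    def predict(features):
        def log_posterior(label):
            log_prior, stats = model[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
                for x, (mean, var) in zip(features, stats)
            )
        # Choose the class with the highest (unnormalized) log posterior.
        return max(model, key=log_posterior)

    return predict


def accuracy(classifier, data):
    """Fraction of records whose predicted label matches the true label."""
    return sum(classifier(f) == y for f, y in data) / len(data)


baseline = zero_r(TRAIN)
bayes = gaussian_naive_bayes(TRAIN)
print("ZeroR accuracy:", accuracy(baseline, TEST))
print("Naive Bayes accuracy:", accuracy(bayes, TEST))
```

ZeroR serves as a floor here: any classifier that uses the features should beat the majority-class accuracy to be considered useful.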


World Health Organization, Diabetes Programme.

Machine Learning Group at the University of Waikato. Weka 3: Data Mining Software in Java. Retrieved September 4, 2016, from

Sanofi, Diabetes Pakistan, Statistics. http://www.

The News International. 73051-seven-million-pakistanis-suffering-from-type-2-diabetes

Pima Indians Diabetes Data Set. ml/datasets/Pima+Indians+Diabetes

Logistic Regression. regression

Singh S, Kaur K. A Review on Diagnosis of Diabetes in Data Mining. International Journal of Science and Research (IJSR) ISSN (Online): 2319-7064 Index Copernicus Value 2013; 6.14 | Impact Factor (2013): 4.438.

Witten IH. Department of Computer Science, University of Waikato, New Zealand. “Simple neural networks”, More Data Mining with Weka, Simple Neural Network, 5

Sathees Kumar B, Gayathri P. Department of Computer Science, Bishop Heber College, Analysis of Adult-Onset Diabetes Using Data Mining Classification Algorithms, International Journal of Modern Computer Science (IJMCS) ISSN: 2320-7868 (Online) Volume No.-2, Issue No.-3, June, 2014 Conference proceeding.

Radha P, Srinivasan B. Predicting Diabetes by cosequencing the various Data Mining Classification Techniques. IJISET - International Journal of Innovative Science, Engineering & Technology 2014; 1(6).

Iyer A, Jeyalatha S, Sumbaly R. Diagnosis of Diabetes Using Classification Mining Techniques. International Journal of Data Mining & Knowledge Management Process (IJDKP) 2015; 5(1):

Satyanandam N, Satyanarayana Ch, Riyazuddin Md, Amjan S. Data Mining Machine Learning Approaches and Medical Diagnose Systems. International Journal of Computer & Organization Trends 2012; 2(3).

Sa’di S, Maleki A, Hashemi R, Panbechi Z, Chalabi K. Comparison Of Data Mining Algorithms In The Diagnosis Of Type II diabetes. International Journal on Computational Science & Applications (IJCSA) 2015; 5(5).

Ezaz Ahmed D, Mathur YK, Kumar V. Knowledge Discovery in Health Care Datasets Using Data Mining Tools. (IJACSA) International Journal of Advanced Computer Science and Applications 2012; 3(4): 117.

Daghistani T, Alshammari R. Diagnosis of Diabetes by Applying Data Mining Classification Techniques: Comparison of Three Data Mining Algorithms. (IJACSA) International Journal of Advanced Computer Science and Applications 2016; 7(7): 16.

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Copyright (c) 2017 Saman Hina, Anita Shaikh, Sohail Abul Sattar