Collaborative Computing-Based K-Nearest Neighbour Algorithm and Mutual Information to Classify Gene Expressions for Type 2 Diabetes

Sura Zaki Al Rashid
Copyright: © 2022 | Volume: 18 | Issue: 2 | Pages: 12
ISSN: 1548-3673 | EISSN: 1548-3681 | EISBN13: 9781799893875 | DOI: 10.4018/IJeC.304044

MLA

Al Rashid, Sura Zaki. "Collaborative Computing-Based K-Nearest Neighbour Algorithm and Mutual Information to Classify Gene Expressions for Type 2 Diabetes." IJEC vol.18, no.2 2022: pp.1-12. http://doi.org/10.4018/IJeC.304044

APA

Al Rashid, S. Z. (2022). Collaborative Computing-Based K-Nearest Neighbour Algorithm and Mutual Information to Classify Gene Expressions for Type 2 Diabetes. International Journal of e-Collaboration (IJeC), 18(2), 1-12. http://doi.org/10.4018/IJeC.304044

Chicago

Al Rashid, Sura Zaki. "Collaborative Computing-Based K-Nearest Neighbour Algorithm and Mutual Information to Classify Gene Expressions for Type 2 Diabetes," International Journal of e-Collaboration (IJeC) 18, no.2: 1-12. http://doi.org/10.4018/IJeC.304044

Abstract

Classification is applied to gene expression data from human umbilical vein endothelial cells to shed light on insulin regulation, using dynamic gene expression data for two classes: control and insulin-exposed. The mutual information statistical feature selection method is applied to all available datasets to select the significant genes. The reduced data are divided into training and testing sets and passed to the KNN classifier for diabetes classification. The results show that, with the 10,000 highest-ranked genes selected by mutual information, KNN reaches a test classification accuracy of 100%. Pathway analysis and gene ontology enrichment are used to evaluate the targeted genes. The results show the importance of finding the most informative genes in the database with a statistical gene selection technique, which reduces time and cost and increases the efficiency of the classifier. The method yields significant results and can be applied to other data and diseases.
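
The pipeline described in the abstract (rank genes by mutual information with the class label, keep the top-ranked ones, classify with KNN) can be illustrated with a minimal sketch. This is not the author's implementation. The data are synthetic, and the function names, the equal-width two-bin discretization, the Euclidean distance, the 40/20 split, and every parameter value are assumptions made for illustration. The abstract applies gene selection to all available data before splitting, whereas this sketch ranks genes on the training split only, which is a common precaution rather than the paper's stated procedure.

```python
import math
import random
from collections import Counter


def mutual_information(values, labels, bins=2):
    """Estimate MI (in nats) between one gene and the class labels.

    Expression values are discretized into equal-width bins first
    (an assumption of this sketch, not taken from the paper).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant genes
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    joint = Counter(zip(binned, labels))
    p_bin, p_label = Counter(binned), Counter(labels)
    # Sum of p(b, l) * log(p(b, l) / (p(b) * p(l))) over observed pairs.
    return sum(
        (c / n) * math.log(c * n / (p_bin[b] * p_label[l]))
        for (b, l), c in joint.items()
    )


def select_top_genes(X, y, n_top):
    """Return the column indices of the n_top genes with the highest MI."""
    n_genes = len(X[0])
    scores = [mutual_information([row[g] for row in X], y) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:n_top]


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training samples closest to x (Euclidean)."""
    nearest = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]


# Synthetic stand-in for expression data: 60 samples, 50 genes, of which
# the first 5 are shifted upward in class 1 ("exposed") versus class 0 ("control").
random.seed(0)
n_samples, n_genes, n_informative = 60, 50, 5
y = [i % 2 for i in range(n_samples)]
X = [
    [random.gauss(4.0 * label if g < n_informative else 0.0, 1.0) for g in range(n_genes)]
    for label in y
]
X_train, y_train, X_test, y_test = X[:40], y[:40], X[40:], y[40:]

# Rank genes on the training split, then reduce both splits to those genes.
top = select_top_genes(X_train, y_train, n_top=5)
train_reduced = [[row[g] for g in top] for row in X_train]
test_reduced = [[row[g] for g in top] for row in X_test]

predictions = [knn_predict(train_reduced, y_train, x, k=3) for x in test_reduced]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print("selected genes:", sorted(top))
print("test accuracy:", accuracy)
```

On real expression data, the number of retained genes and the value of k would need to be tuned, and a very high test accuracy on a small sample should be checked with cross-validation before it is trusted.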
