Abstract:
Autism is a neurological disorder that occurs in children. It has many causes, one of which is genetic. To find effective treatments, we therefore need to identify the genes related to autism. In this paper, we use a computational approach to train a model that predicts new autism-related candidate genes. The methodology combines several data sources, including protein-protein interaction networks, a microRNA (miRNA)-target network, and known autism-related genes, into an integrated network. The structural properties of this network are represented as a vector dataset, on which a binary classification problem is formulated. Because the number of known autism-related genes is very small, however, the classification problem is imbalanced. To address this, a clustering-based under-sampling algorithm for data balancing is proposed. When training classifiers such as SVMs, k-NN, and RFs, we obtained G-mean scores 1-3% higher than without any data balancing strategy. These results suggest that the proposed model may contribute to finding new autism-related gene candidates.
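As a rough illustration of the two technical ingredients named in the abstract, the sketch below shows one common form of clustering-based under-sampling together with the standard G-mean metric (the square root of sensitivity times specificity). It is not the paper's algorithm, whose details are not given here. It assumes the majority class is clustered with a plain k-means into as many clusters as there are minority samples, and that the majority point nearest each center is kept. The names `g_mean`, `kmeans_centers`, and `cluster_undersample` are hypothetical.

```python
import math
import random


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (label 1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_centers(points, k, iters=20, seed=0):
    """Plain k-means (assumed clustering step); returns k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: _sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # Move each center to its cluster mean; keep it if the cluster is empty.
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers


def cluster_undersample(majority, minority, seed=0):
    """Keep one majority sample per cluster so both classes end up equal in size.

    Assumption: requires len(majority) >= len(minority).
    """
    centers = kmeans_centers(majority, len(minority), seed=seed)
    remaining = list(majority)
    kept = []
    for c in centers:
        nearest = min(remaining, key=lambda p: _sq_dist(p, c))
        kept.append(nearest)
        remaining.remove(nearest)
    return kept


# Toy feature vectors standing in for network-derived gene features (made up).
majority = [(float(i % 4), float(i // 4)) for i in range(12)]
minority = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
balanced_majority = cluster_undersample(majority, minority)
```

A classifier would then be trained on `balanced_majority` plus `minority` and scored with `g_mean` on held-out data. Unlike accuracy, G-mean falls to zero when either class is entirely misclassified, which is why it suits imbalanced problems.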
Date of Conference: 24-26 October 2019
Date Added to IEEE Xplore: 05 December 2019
ISBN Information:
Print on Demand (PoD) ISSN: 2164-2508