Abstract
Disease gene prioritization aims to identify potential disease-causing genes for a given phenotype; it can help reveal the inherited basis of human diseases and facilitate drug development. Our work is motivated by the label propagation algorithm and by the false positive protein-protein interactions (PPIs) present in the dataset. To the best of our knowledge, false positive PPIs have not previously been considered in disease gene prioritization. Label propagation has been applied successfully to prioritize disease-causing genes in previous network-based methods, which use basic label propagation, i.e. random walk, on networks in different ways. However, none of these methods can handle a dataset that contains many false positive PPIs, because they treat the PPI network as a fixed input. This characteristic of the data source may cause large deviations in the results. We conduct extensive experiments on OMIM datasets, and our proposed method, IDLP, demonstrates its effectiveness compared with eight state-of-the-art approaches.
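To make the baseline concrete, the following is a minimal sketch of basic label propagation (random walk with restart) on a fixed network. It is not the IDLP method. It assumes the commonly used iteration f <- alpha * S f + (1 - alpha) * y with a symmetrically normalized adjacency matrix S. The function names, the value of alpha, and the five-gene toy network are illustrative assumptions and do not come from the paper.

```python
import math


def normalize(adj):
    """Symmetric normalization S = D^-1/2 W D^-1/2 of a non-negative adjacency matrix."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def propagate(adj, seeds, alpha=0.8, tol=1e-9, max_iter=1000):
    """Iterate f <- alpha * S f + (1 - alpha) * y until the scores stop changing.

    adj   : symmetric adjacency matrix (list of lists); used as a FIXED input
    seeds : initial labels y, 1.0 for known disease genes and 0.0 otherwise
    alpha : assumed trade-off between network smoothing and the seed labels
    """
    s = normalize(adj)
    n = len(adj)
    f = list(seeds)
    for _ in range(max_iter):
        nxt = [
            alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * seeds[i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(nxt, f)) < tol:
            return nxt
        f = nxt
    return f


# Hypothetical toy PPI network: a chain of five genes, 0 - 1 - 2 - 3 - 4.
adj = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
seeds = [1.0, 0.0, 0.0, 0.0, 0.0]  # gene 0 is the only known disease gene

scores = propagate(adj, seeds)
ranking = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
print(ranking)
```

In this sketch every edge in adj is trusted equally and is never revised, so a spurious edge would pass score to an unrelated gene. That is the fixed-input limitation the abstract describes.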
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 61702367) and the Research Project of Tianjin Municipal Commission of Education (No. 2017KJ033).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Zhang, Y. et al. (2018). IDLP: A Novel Label Propagation Framework for Disease Gene Prioritization. In: Phung, D., Tseng, V., Webb, G., Ho, B., Ganji, M., Rashidi, L. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2018. Lecture Notes in Computer Science, vol 10937. Springer, Cham. https://doi.org/10.1007/978-3-319-93034-3_21
DOI: https://doi.org/10.1007/978-3-319-93034-3_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93033-6
Online ISBN: 978-3-319-93034-3
eBook Packages: Computer Science, Computer Science (R0)