Abstract
The Human Phenotype Ontology (HPO) is a standardized vocabulary of terms describing phenotypic abnormalities encountered in human disease. The importance and specificity of HPO terms are estimated using their Information Content (IC). The analysis of annotated data is therefore a critical challenge for bioinformatics. Several approaches exist to support ontology curators in maintaining and analysing data. Among these, Association Rules (AR) can improve the quality of annotations. In this paper, we present an algorithm for the parallel extraction of Weighted Association Rules (WAR) from HPO terms and annotations that is able to handle high-dimensional data. Experiments performed on real and synthetic datasets show good speed-up and scalability.
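To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the algorithm presented in the paper. It assumes an annotation-frequency IC, IC(t) = -log p(t), and a hypothetical weighting in which the plain support of a term pair is scaled by the mean IC of its terms. The function names, the toy annotations, and the weighting scheme are all assumptions made for illustration. Threads are used only to show the partition-and-merge structure of parallel counting; a real implementation would more likely use processes or distributed workers.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def information_content(transactions):
    """IC(t) = -log(p(t)), with p(t) the fraction of annotated entities carrying term t."""
    n = len(transactions)
    counts = Counter(term for terms in transactions for term in set(terms))
    return {term: -math.log(c / n) for term, c in counts.items()}


def count_pairs(chunk):
    """Count co-occurring term pairs within one partition of the data."""
    counts = Counter()
    for terms in chunk:
        counts.update(combinations(sorted(set(terms)), 2))
    return counts


def weighted_pair_support(transactions, workers=2):
    """Plain support of each pair scaled by the mean IC of its two terms (assumed weighting)."""
    ic = information_content(transactions)
    # Split the transactions into one partition per worker.
    chunks = [transactions[i::workers] for i in range(workers)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker counts pairs locally; partial counts are merged here.
        for partial in pool.map(count_pairs, chunks):
            total.update(partial)
    n = len(transactions)
    return {
        pair: (count / n) * (ic[pair[0]] + ic[pair[1]]) / 2
        for pair, count in total.items()
    }


# Hypothetical toy annotations: one set of HPO term IDs per disease.
annotations = [
    {"HP:0001250", "HP:0001263"},
    {"HP:0001250", "HP:0001263", "HP:0000252"},
    {"HP:0001250"},
    {"HP:0000252", "HP:0001263"},
]
scores = weighted_pair_support(annotations)
```

In this toy data, two pairs co-occur equally often, but the pair that includes the rarer term scores higher because its mean IC is larger. This is the intended effect of weighting: frequent but unspecific terms contribute less to a rule's score.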
Acknowledgments
This work has been partially funded by the following research project of the Calabrian Region: “Smart Electronic Invoices Accounting-SELINA CUP:J28C1700016006”.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Agapito, G., Cannataro, M., Guzzi, P.H., Milano, M. (2020). Parallel Learning of Weighted Association Rules in Human Phenotype Ontology. In: Schwardmann, U., et al. Euro-Par 2019: Parallel Processing Workshops. Euro-Par 2019. Lecture Notes in Computer Science, vol 11997. Springer, Cham. https://doi.org/10.1007/978-3-030-48340-1_42
DOI: https://doi.org/10.1007/978-3-030-48340-1_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-48339-5
Online ISBN: 978-3-030-48340-1
eBook Packages: Computer Science, Computer Science (R0)