Abstract
In pattern-matching-based protein-protein interaction extraction systems, patterns generated manually or automatically contain errors and redundancy, which greatly degrade system performance. This paper proposes a pattern optimization algorithm based on minimum description length (MDL) that filters out erroneous and redundant patterns. Experiments show that the algorithm improves system performance while greatly reducing the number of patterns. It also generalizes well, which is important for building practical systems.
This work was supported by the Natural Science Foundation, grants No. 60272019 and No. 60321002.
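The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a generic two-part MDL criterion can prune a pattern set; it is not the authors' method. It assumes each pattern is described by the set of candidate sentences it matches, that a gold set of true interactions is available, and that a pattern and an extraction error each cost a fixed number of bits. All names (`description_length`, `mdl_prune`, `pattern_bits`, `exception_bits`) and all numbers are hypothetical.

```python
def description_length(selected, matches, gold, pattern_bits, exception_bits):
    """Two-part code length in bits: L(model) + L(data | model).

    Assumption: the model cost is the sum of per-pattern costs, and the data
    cost charges a fixed number of bits for every false positive and every
    missed interaction.
    """
    model_bits = sum(pattern_bits[p] for p in selected)
    predicted = set().union(*(matches[p] for p in selected))
    errors = predicted ^ gold  # false positives plus missed interactions
    return model_bits + exception_bits * len(errors)


def mdl_prune(matches, gold, pattern_bits, exception_bits):
    """Greedy backward elimination: repeatedly drop the pattern whose removal
    shortens the total description the most; stop when no removal helps."""
    selected = set(matches)
    best = description_length(selected, matches, gold, pattern_bits, exception_bits)
    while selected:
        candidates = [
            (description_length(selected - {p}, matches, gold,
                                pattern_bits, exception_bits), p)
            for p in sorted(selected)
        ]
        new_best, victim = min(candidates)
        if new_best >= best:
            break
        selected.remove(victim)
        best = new_best
    return selected, best


# Toy data (illustrative only): sentence ids matched by each pattern.
matches = {
    "A": {1, 2, 3},   # useful pattern
    "B": {1, 2},      # redundant: covered by A
    "C": {5, 6, 7},   # erroneous: matches only non-interactions
    "D": {4, 8},      # useful pattern
}
gold = {1, 2, 3, 4, 8}                      # sentences with a true interaction
pattern_bits = {p: 4.0 for p in matches}    # assumed cost of storing a pattern
kept, bits = mdl_prune(matches, gold, pattern_bits, exception_bits=3.0)
print(sorted(kept), bits)
```

In this toy setup the erroneous pattern is dropped because it only adds exceptions, and the redundant one is dropped because it adds model cost without explaining any extra data. Whether a low-coverage pattern survives depends entirely on the assumed ratio between `pattern_bits` and `exception_bits`.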
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hao, Y., Zhu, X., Li, M. (2005). A New Algorithm for Pattern Optimization in Protein-Protein Interaction Extraction System. In: Marques, J.S., Pérez de la Blanca, N., Pina, P. (eds) Pattern Recognition and Image Analysis. IbPRIA 2005. Lecture Notes in Computer Science, vol 3523. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11492542_49
DOI: https://doi.org/10.1007/11492542_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26154-4
Online ISBN: 978-3-540-32238-2
eBook Packages: Computer Science