An Incremental Hypersphere Learning Framework for Protein Membership Prediction

Lopes, Noel; Correia, Daniel; Pereira, Carlos; Ribeiro, Bernardete; Dourado, António

doi:10.1007/978-3-642-28942-2_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7208)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

With the recent rise of fast-growing biological databases, it is essential to develop incremental learning algorithms that extract information efficiently, in particular for constructing protein prediction models. Traditional inductive learning models such as Support Vector Machines (SVM) perform well when all the data is available, but they are not suited to coping with the dynamic change of these databases. Recently, a new Incremental Hypersphere Classifier (IHC) algorithm, which performs instance selection, has been shown to have an impact in online learning settings. In this paper we propose a two-step approach that first uses IHC to select a reduced data set (and also for immediate prediction) and then applies SVM for protein detection. By retaining the samples that play the most significant role in the construction of the decision surface while removing those that have little or no impact on the model, IHC can efficiently select a reduced data set. Under some conditions, the proposed IHC-SVM approach improves accuracy over the baseline SVM for the problem of peptidase detection.
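
The instance-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' IHC implementation, and the abstract does not give the algorithm's details. The sketch assumes that each sample is assigned a hypersphere whose radius is half the distance to its nearest sample of a different class, and that the samples with the smallest radii (those closest to the class boundary) are the ones retained. It works in batch rather than incrementally, and the names `radius`, `select_boundary_samples` and `per_class` are hypothetical. The second step, training an SVM on the retained subset, is omitted.

```python
import math
from collections import defaultdict


def radius(i, X, y):
    """Half the distance from sample i to its nearest sample of another class.

    Assumption for this sketch: a small radius means the sample lies close
    to the class boundary. Requires at least two classes in y.
    """
    return min(
        math.dist(X[i], X[j]) for j in range(len(X)) if y[j] != y[i]
    ) / 2.0


def select_boundary_samples(X, y, per_class):
    """Return sorted indices of the `per_class` smallest-radius samples per class."""
    by_class = defaultdict(list)
    for i in range(len(X)):
        by_class[y[i]].append((radius(i, X, y), i))

    kept = []
    for items in by_class.values():
        items.sort()  # smallest radius first, i.e. nearest the boundary
        kept.extend(index for _, index in items[:per_class])
    return sorted(kept)


# Toy one-dimensional data embedded in 2-D: class 0 on the left, class 1 on the right.
X = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0), (6.0, 0.0), (10.0, 0.0)]
y = [0, 0, 0, 1, 1, 1]

kept = select_boundary_samples(X, y, per_class=2)
reduced_X = [X[i] for i in kept]
reduced_y = [y[i] for i in kept]
print(kept)
```

In this toy data the two outermost points, which are farthest from the opposite class, are dropped; the reduced set `reduced_X`, `reduced_y` is what would be handed to an SVM trainer in the second step.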

Author information

Authors and Affiliations

CISUC - Center for Informatics and Systems of University of Coimbra, Portugal
Noel Lopes, Daniel Correia, Carlos Pereira, Bernardete Ribeiro & António Dourado
Department of Informatics Engineering, University of Coimbra, Portugal
Bernardete Ribeiro & António Dourado
ISEC - Coimbra Institute of Engineering, Portugal
Daniel Correia & Carlos Pereira
UDI/IPG - Research Unit, Polytechnic Institute of Guarda, Portugal
Noel Lopes

Editor information

Editors and Affiliations

Universidad de Salamanca, Plaza de la Merced S/N, 37008, Salamanca, Spain
Emilio Corchado
VŠB-TU Ostrava 17, Listopadu 15, 70833, Ostrava, Czech Republic
Václav Snášel
Machine Intelligence Research Labs(MIR Labs), Scientific Network for Innovation and Research Excellence, P.O. Box 2259, 98071, Auburn, Washington, USA
Ajith Abraham
Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370, Wroclaw, Poland
Michał Woźniak
University of the Basque Country, Pº Manuel Lardizabal 1, 20018, San Sebastian, Spain
Manuel Graña
Yonsei University, 134 Shinchon-dong, Sudaemoon-ku, 120-749, Seoul, Korea
Sung-Bae Cho

About this paper

Cite this paper

Lopes, N., Correia, D., Pereira, C., Ribeiro, B., Dourado, A. (2012). An Incremental Hypersphere Learning Framework for Protein Membership Prediction. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28942-2_39

DOI: https://doi.org/10.1007/978-3-642-28942-2_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28941-5
Online ISBN: 978-3-642-28942-2
eBook Packages: Computer Science, Computer Science (R0)
