Probability based voting extreme learning machine for multiclass XML documents classification

World Wide Web

Abstract

This paper presents a novel solution based on the Extreme Learning Machine (ELM) for multiclass XML document classification. ELM is a generalized Single-hidden Layer Feedforward Network (SLFN) with extremely fast learning capacity. An improved vector model, DSVM (Distribution based Structured Vector Model), is proposed to represent XML documents with more structural information and more precise semantic information. The XML document classifiers are built on PV-ELM (Probability based Voting ELM), with a postprocessing method, ε-RCC (ε-Revoting of Confusing Classes), that refines the voting results. To evaluate the overall performance of this solution, a series of experiments is conducted on two real datasets of online news feeds. The experimental results show that DSVM represents XML documents more effectively and that PV-ELM with ε-RCC achieves higher accuracy than the original ELM algorithm for multiclass classification.
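
For readers unfamiliar with ELM, the following is a minimal sketch of the basic algorithm the abstract builds on: hidden-layer weights are drawn at random and left untrained, and only the output weights are solved for in closed form by least squares. The abstract does not specify how PV-ELM, ε-RCC, or DSVM work, so none of them is reproduced here; the `majority_vote` helper is plain majority voting over independently seeded ELMs, used only as a stand-in to illustrate the general idea of voting. All names (`ELM`, `make_toy_data`, `majority_vote`), the sigmoid activation, the weight ranges, and the synthetic two-dimensional data are assumptions made for illustration, not taken from the paper.

```python
import numpy as np


class ELM:
    """Basic single-hidden-layer ELM classifier (illustrative, not the paper's PV-ELM)."""

    def __init__(self, n_hidden=40, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the randomly weighted hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One column per class, coded +1 for the true class and -1 otherwise.
        T = np.where(y[:, None] == self.classes_[None, :], 1.0, -1.0)
        # Hidden-layer parameters are random and never trained (assumed ranges).
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: minimum-norm least-squares solution via the pseudo-inverse.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def decision_function(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta

    def predict(self, X):
        # The class whose output node has the largest value wins.
        return self.classes_[np.argmax(self.decision_function(X), axis=1)]


def majority_vote(models, X):
    """Plain majority voting over several trained ELMs (a stand-in, not PV-ELM)."""
    votes = np.stack([m.predict(X) for m in models], axis=1)  # (n_samples, n_models)
    winners = []
    for row in votes:
        labels, counts = np.unique(row, return_counts=True)
        winners.append(labels[np.argmax(counts)])
    return np.array(winners)


def make_toy_data(n_per_class=50, seed=0):
    """Three synthetic 2-D clusters standing in for document feature vectors."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [3.0, 3.0], [-3.0, 3.0]])
    X = np.vstack([c + 0.3 * rng.standard_normal((n_per_class, 2)) for c in centers])
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    single = ELM(seed=0).fit(X, y)
    ensemble = [ELM(seed=s).fit(X, y) for s in range(5)]
    print("single ELM training accuracy:", (single.predict(X) == y).mean())
    print("voting ensemble training accuracy:", (majority_vote(ensemble, X) == y).mean())
```

Because each ELM differs only in its random hidden layer, training several of them and combining their predictions is cheap, which is one reason voting schemes pair naturally with ELM.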

Author information

Corresponding author

Correspondence to Xiangguo Zhao.

About this article

Cite this article

Zhao, X., Bi, X. & Qiao, B. Probability based voting extreme learning machine for multiclass XML documents classification. World Wide Web 17, 1217–1231 (2014). https://doi.org/10.1007/s11280-013-0230-8

  • DOI: https://doi.org/10.1007/s11280-013-0230-8
