An Efficient Classification of Fuzzy XML Documents Based on Kernel ELM

Zhao, Zhen; Ma, Zongmin; Yan, Li

doi:10.1007/s10796-019-09973-3

Published: 05 December 2019

Volume 23, pages 515–530, (2021)

Information Systems Frontiers

Abstract

Data classification for distributed and heterogeneous XML data sources remains an open challenge. A considerable number of algorithms for classifying XML documents have been proposed in the literature, yet the existing approaches fall short in their ability to classify fuzzy XML documents. In this paper, we provide a KPCA-KELM classification framework for fuzzy XML documents based on the Kernel Extreme Learning Machine (KELM). First, we propose a novel fuzzy XML document tree model to represent fuzzy XML documents. Second, we employ an effective vector space model, built on the proposed tree model, to represent the semantic structure of fuzzy XML documents. Third, we classify the fuzzy XML documents using KELM after feature extraction with Kernel Principal Component Analysis (KPCA). The experimental results demonstrate that the proposed KPCA-KELM approach shortens the training time while maintaining the same level of accuracy as the state-of-the-art baseline models.
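The classification step named in the abstract can be illustrated with a minimal sketch of a kernel ELM in its standard closed form, where the output weights solve (K + I/C) β = T for a kernel matrix K and a target matrix T. Everything concrete here is an assumption made for illustration: the RBF kernel, the values of `C` and `gamma`, the function names, and the toy two-dimensional vectors that stand in for the vector space representation of fuzzy XML documents. The KPCA feature extraction step is omitted, and the sketch is not the authors' implementation.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix [A | B]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

def kelm_fit(X, labels, n_classes, C=100.0, gamma=1.0):
    """Return output weights beta = (K + I/C)^-1 T (C and gamma are assumed values)."""
    n = len(X)
    # One target column per class, coded +1 for the true class and -1 otherwise.
    T = [[1.0 if labels[i] == c else -1.0 for c in range(n_classes)]
         for i in range(n)]
    # Kernel matrix with the ridge term 1/C added on the diagonal.
    K = [[rbf_kernel(X[i], X[j], gamma) + (1.0 / C if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, T)

def kelm_predict(X_train, beta, x, gamma=1.0):
    """Predict the class whose output score k(x)^T beta is largest."""
    k = [rbf_kernel(x, xi, gamma) for xi in X_train]
    n_classes = len(beta[0])
    scores = [sum(k[i] * beta[i][c] for i in range(len(X_train)))
              for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: scores[c])

# Hypothetical toy feature vectors standing in for document vectors.
X_train = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
y_train = [0, 0, 1, 1]
beta = kelm_fit(X_train, y_train, n_classes=2)
print(kelm_predict(X_train, beta, [0.05, 0.05]))
print(kelm_predict(X_train, beta, [0.95, 0.95]))
```

Training reduces to a single linear solve with no iterative weight tuning, which is the usual reason given for the short training times of ELM-style classifiers. In a full pipeline, the vectors passed to `kelm_fit` would be the KPCA-reduced features rather than raw toy values.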

Acknowledgements

The authors wish to thank the anonymous referees for their valuable comments and suggestions, which improved the technical content and the presentation of the paper. This work was supported by the National Natural Science Foundation of China (61772269, 61370075 & 61976027) and the Scientific Research Projects of Liaoning Educational Committee (LQ2017003).

Author information

Authors and Affiliations

College of Information Science and Technology, Bohai University, Jinzhou, 121013, Liaoning, China
Zhen Zhao
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, Jiangsu, China
Zongmin Ma & Li Yan
Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing, 210023, Jiangsu, China
Zongmin Ma

Corresponding author

Correspondence to Zongmin Ma.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhao, Z., Ma, Z. & Yan, L. An Efficient Classification of Fuzzy XML Documents Based on Kernel ELM. Inf Syst Front 23, 515–530 (2021). https://doi.org/10.1007/s10796-019-09973-3

Published: 05 December 2019
Issue Date: June 2021
DOI: https://doi.org/10.1007/s10796-019-09973-3