Object-stack: An object-oriented approach for top-k keyword querying over fuzzy XML

Li, Ting; Ma, Zongmin

doi:10.1007/s10796-017-9748-0

Published: 24 March 2017

Volume 19, pages 669–697, (2017)

Information Systems Frontiers


Abstract

Keyword search is the most popular technique for searching information in XML (eXtensible Markup Language) documents. It lets users access XML data without learning a structured query language or studying complex data schemas. Existing keyword query methods are mainly based on LCA (lowest common ancestor) semantics, in which the returned results match all keywords at the granularity of elements. In many practical applications, however, information is uncertain and vague, so identifying useful information in fuzzy data is becoming an important research topic. In this paper, we focus on keyword querying over fuzzy XML data at the granularity of objects. By introducing the concept of an “object tree”, we propose query semantics for keyword queries at the object level. We find the minimum whole-matching result object trees, which contain all keywords, and the partial-matching result object trees, which contain only some of the keywords, and return the root nodes of these result object trees as query results. To identify the top-K answers with the highest scores effectively and accurately, we propose a scoring mechanism that considers tf*idf document relevance, users’ preferences, and the possibilities of results. We propose a stack-based algorithm named object-stack to obtain these top-K answers. Experimental results show that the object-stack algorithm significantly outperforms traditional XML keyword query algorithms, and that it achieves high-quality query results with high search efficiency on fuzzy XML documents.
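
The ranking step described in the abstract can be illustrated with a small sketch. The abstract names the three factors (tf*idf relevance, user preference, and the possibility of a result) but does not state how they are combined, so the product used below is an assumption made for illustration only and is not the paper's formula. The `Candidate` type, its field names, and the sample values are likewise hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate result: the root of a result object tree (hypothetical type)."""
    root: str           # label of the root node of the result object tree
    relevance: float    # tf*idf relevance of the object tree to the query
    preference: float   # user preference weight, assumed to lie in [0, 1]
    possibility: float  # possibility (membership degree) of the result, in [0, 1]


def score(c: Candidate) -> float:
    # Assumption: the three factors are multiplied. The abstract does not
    # give the actual combination used by the authors.
    return c.relevance * c.preference * c.possibility


def top_k(candidates, k):
    # Return the k highest-scoring candidates, best first.
    return heapq.nlargest(k, candidates, key=score)


# Made-up sample data, for illustration only.
candidates = [
    Candidate("paper[1]", relevance=2.0, preference=1.0, possibility=0.5),
    Candidate("paper[2]", relevance=1.5, preference=1.0, possibility=1.0),
    Candidate("author[7]", relevance=3.0, preference=0.5, possibility=0.5),
]

for c in top_k(candidates, 2):
    print(c.root, score(c))
```

With a product, a result that is highly relevant but has low possibility can rank below a less relevant result that is certain, which is the kind of trade-off a fuzzy-aware ranking has to make.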

Acknowledgements

The authors thank the anonymous referees for their valuable comments and suggestions, which improved the technical content and the presentation of the paper. The work was supported in part by the National Natural Science Foundation of China (61370075 and 61572118).

Author information

Authors and Affiliations

School of Computer Science and Engineering, Northeastern University, Shenyang, 110819, China
Ting Li
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
Zongmin Ma

Corresponding author

Correspondence to Zongmin Ma.

About this article

Cite this article

Li, T., Ma, Z. Object-stack: An object-oriented approach for top-k keyword querying over fuzzy XML. Inf Syst Front 19, 669–697 (2017). https://doi.org/10.1007/s10796-017-9748-0

Published: 24 March 2017
Issue Date: June 2017
DOI: https://doi.org/10.1007/s10796-017-9748-0
