Abstract
In this paper, we address the problem of recognizing domain-specific queries in the query log of a general search engine. Unlike most previous work on query classification, which relies on external resources or annotated training queries, we take the query log as the only resource for recognizing domain-specific queries. In the proposed approach, we represent the query log as a heterogeneous graph and formulate domain-specific query recognition as a graph-based transductive learning task. To reduce the impact of noisy or insufficient initial annotations, we further introduce an active learning strategy into the learning process, so that fewer manual annotations are needed and the recognition results can be continuously refined through interactive human supervision. Experimental results demonstrate that the proposed approach is able to recognize high-quality domain-specific queries from only a small number of manually annotated queries.
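To make the idea concrete, the following is a minimal sketch of graph-based transductive learning combined with one active learning step. It is not the method of this paper: it assumes a simplified bipartite query-URL click graph in place of the heterogeneous graph, uses standard label spreading (iterating F = alpha * S * F + (1 - alpha) * y over a symmetrically normalized adjacency matrix S), and selects the next query to annotate by a plain uncertainty criterion. The click data, the names `build_graph`, `propagate_labels` and `most_uncertain`, and the value `alpha=0.9` are all hypothetical choices made for illustration.

```python
import numpy as np

# Toy click log (made-up data): (query index, URL index, click count).
CLICKS = [(0, 0, 1), (1, 0, 1), (1, 1, 1), (2, 1, 1),
          (2, 2, 1), (3, 2, 1), (4, 1, 1)]
N_QUERIES, N_URLS = 5, 3


def build_graph(clicks, n_queries, n_urls):
    """Build a symmetric adjacency matrix; queries come first, then URLs."""
    n = n_queries + n_urls
    W = np.zeros((n, n))
    for q, u, count in clicks:
        W[q, n_queries + u] = count
        W[n_queries + u, q] = count
    return W


def propagate_labels(W, y, alpha=0.9, iters=200):
    """Label spreading: repeat F <- alpha * S @ F + (1 - alpha) * y."""
    degree = W.sum(axis=1)
    degree[degree == 0] = 1.0  # guard against isolated nodes
    inv_sqrt = 1.0 / np.sqrt(degree)
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]  # D^(-1/2) W D^(-1/2)
    F = y.astype(float).copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * y
    return F


def most_uncertain(F, labeled, n_queries):
    """Return the unlabeled query whose score is closest to zero."""
    pool = [i for i in range(n_queries) if i not in labeled]
    return min(pool, key=lambda i: abs(F[i]))


W = build_graph(CLICKS, N_QUERIES, N_URLS)

# Seed annotations: +1 = in-domain, -1 = out-of-domain, 0 = unknown.
y = np.zeros(N_QUERIES + N_URLS)
y[0], y[3] = 1.0, -1.0
labeled = {0, 3}

scores = propagate_labels(W, y)
next_query = most_uncertain(scores, labeled, N_QUERIES)
print("query scores:", np.round(scores[:N_QUERIES], 3))
print("ask the annotator about query", next_query)
```

In this toy graph, query 1 shares a URL with the positive seed and query 2 shares one with the negative seed, so their scores take opposite signs. Query 4 is connected only to a URL that sits midway between the two seeds, so its score stays near zero and it is the one handed to the annotator. In an interactive loop, the returned label would be written into `y` and propagation rerun.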
Acknowledgement
This work is partially supported by the Chinese Natural Science Foundation (61602278), the Shandong Province Higher Educational Science and Technology Program (J14LN33), and the China Postdoctoral Science Foundation (2014M561949).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Ni, W., Liu, T., Sun, H., Wei, Z. (2017). An Active Learning Approach to Recognizing Domain-Specific Queries From Query Log. In: Chen, L., Jensen, C., Shahabi, C., Yang, X., Lian, X. (eds) Web and Big Data. APWeb-WAIM 2017. Lecture Notes in Computer Science, vol 10367. Springer, Cham. https://doi.org/10.1007/978-3-319-63564-4_2
DOI: https://doi.org/10.1007/978-3-319-63564-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63563-7
Online ISBN: 978-3-319-63564-4
eBook Packages: Computer Science, Computer Science (R0)