An Active Learning Approach to Recognizing Domain-Specific Queries From Query Log

  • Conference paper
Web and Big Data (APWeb-WAIM 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10367)

Abstract

In this paper, we address the problem of recognizing domain-specific queries from a general search engine's query log. Unlike most previous work on query classification, which relies on external resources or annotated training queries, we take the query log as the only resource for recognizing domain-specific queries. In the proposed approach, we represent the query log as a heterogeneous graph and then formulate the task of domain-specific query recognition as graph-based transductive learning. To reduce the impact of noisy and insufficient initial annotated queries, we further introduce an active learning strategy into the learning process, so that fewer manual annotations are needed and the recognition results can be continuously refined through interactive human supervision. Experimental results demonstrate that the proposed approach is capable of recognizing a certain amount of high-quality domain-specific queries with only a small number of manually annotated queries.
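To make the idea of graph-based transductive learning with active querying concrete, the following is a minimal sketch, not the authors' method. The abstract does not specify the node types, the propagation rule, or the selection criterion, so everything here is an assumption: the graph is a simple query-URL click graph, labels are spread with the local-and-global-consistency iteration F <- alpha * S * F + (1 - alpha) * Y, and the query sent to the annotator is the unlabeled one whose score is closest to zero. The function names, the example queries, and the `.example` URLs are all hypothetical.

```python
import math
from collections import defaultdict


def build_graph(clicks):
    """Build a symmetric weighted adjacency map over query and URL nodes.

    `clicks` is an iterable of (query, url, click_count) triples; this
    log format is an assumption made for illustration.
    """
    adj = defaultdict(dict)
    for query, url, count in clicks:
        q, u = ("q", query), ("u", url)
        adj[q][u] = adj[q].get(u, 0.0) + count
        adj[u][q] = adj[u].get(q, 0.0) + count
    return adj


def propagate(adj, seeds, alpha=0.8, iterations=50):
    """Spread seed labels over the graph.

    Iterates F <- alpha * S * F + (1 - alpha) * Y, where S is the
    symmetrically normalised adjacency D^-1/2 W D^-1/2 and Y holds
    +1 (in domain), -1 (out of domain) or 0 (unlabeled).
    """
    degree = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    y = {n: float(seeds.get(n, 0.0)) for n in adj}
    f = dict(y)
    for _ in range(iterations):
        f = {
            n: alpha * sum(w / math.sqrt(degree[n] * degree[m]) * f[m]
                           for m, w in adj[n].items())
            + (1 - alpha) * y[n]
            for n in adj
        }
    return f


def most_uncertain_query(scores, seeds):
    """Pick the unlabeled query whose score is closest to zero."""
    candidates = [n for n in scores if n[0] == "q" and n not in seeds]
    return min(candidates, key=lambda n: abs(scores[n]))


# Hypothetical toy log: two health queries, two travel queries, one mixed.
clicks = [
    ("flu symptoms", "health.example/flu", 5),
    ("cold remedy", "health.example/flu", 3),
    ("cheap flights", "travel.example/deals", 4),
    ("hotel booking", "travel.example/deals", 2),
    ("travel vaccine", "health.example/flu", 1),
    ("travel vaccine", "travel.example/deals", 1),
]
# Two manually annotated seeds for a "health" domain.
seeds = {("q", "flu symptoms"): 1.0, ("q", "cheap flights"): -1.0}

adj = build_graph(clicks)
scores = propagate(adj, seeds)
next_to_annotate = most_uncertain_query(scores, seeds)
```

In an interactive loop, the annotator's answer for `next_to_annotate` would be added to `seeds` and `propagate` would be run again, so each new label refines the scores of the remaining queries.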

Notes

  1. http://www.sogou.com/labs/dl/q.html

  2. https://www.dmoz.org/

Acknowledgement

This work is partially supported by the Chinese Natural Science Foundation (61602278), the Shandong Province Higher Educational Science and Technology Program (J14LN33) and the China Postdoctoral Science Foundation (2014M561949).

Author information

Corresponding author

Correspondence to Tong Liu.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Ni, W., Liu, T., Sun, H., Wei, Z. (2017). An Active Learning Approach to Recognizing Domain-Specific Queries From Query Log. In: Chen, L., Jensen, C., Shahabi, C., Yang, X., Lian, X. (eds) Web and Big Data. APWeb-WAIM 2017. Lecture Notes in Computer Science, vol 10367. Springer, Cham. https://doi.org/10.1007/978-3-319-63564-4_2

  • DOI: https://doi.org/10.1007/978-3-319-63564-4_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63563-7

  • Online ISBN: 978-3-319-63564-4

  • eBook Packages: Computer Science, Computer Science (R0)
