ABSTRACT
Query feeds recommendation is a new recommendation paradigm in mobile search applications, in which a stream of queries is recommended to users to improve engagement. It requires a large supply of attractive queries. A conventional solution is to retrieve queries from the past queries recorded in user search logs. However, these queries usually have poor readability and limited coverage of article content, and are thus unsuitable for the query feeds recommendation scenario. Furthermore, before generated queries can be deployed for recommendation, human validation, which is costly in practice, is required to filter out unsuitable ones. In this paper, we propose TitIE, a query mining system that generates valuable queries from document titles. We employ both an extractive text generator and an abstractive text generator to generate queries from titles. To improve the acceptance rate during human validation, we further propose a model-based scoring strategy that pre-selects the queries more likely to be accepted. Finally, we propose a novel dual learning approach that jointly learns the generation model and the selection model by making full use of unlabeled corpora under a semi-supervised scheme, thereby improving the performance of both models simultaneously. Results from both offline and online evaluations demonstrate the superiority of our approach.
Index Terms
- Dual Learning for Query Generation and Query Selection in Query Feeds Recommendation