short-paper

Stochastic Optimization of Text Set Generation for Learning Multiple Query Intent Representations

Authors:
Helia Hashemi, University of Massachusetts Amherst, Amherst, MA, USA
Hamed Zamani, University of Massachusetts Amherst, Amherst, MA, USA
W. Bruce Croft, University of Massachusetts Amherst, Amherst, MA, USA

CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management, October 2022, Pages 4003–4008
https://doi.org/10.1145/3511808.3557666

Published: 17 October 2022

ABSTRACT

Learning multiple intent representations for queries has potential applications in facet generation, document ranking, search result diversification, and search explanation. The state-of-the-art model for this task assumes that the intent representations form a sequence. In this paper, we argue that the model should not be penalized for the order of its outputs as long as it generates an accurate and complete set of intent representations. Based on this intuition, we propose a stochastic permutation-invariant approach for optimizing such networks. We extrinsically evaluate the proposed approach on a facet generation task and demonstrate significant improvements compared to competitive baselines. Our analysis shows that the proposed permutation-invariant approach has the highest impact on queries with more potential intents.
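
The ordering argument can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a hypothetical matrix `cost[i][j]` giving the loss of emitting target intent `j` at output position `i`, and all function names and the sampling scheme are illustrative assumptions. It compares a sequence-style loss (fixed order), an exact permutation-invariant loss (minimum over all orderings), and a stochastic estimate that only examines a few sampled orderings.

```python
import itertools
import random


def ordered_loss(cost):
    """Sequence-style loss: output position i must match target i."""
    return sum(cost[i][i] for i in range(len(cost)))


def permutation_loss(cost, perm):
    """Loss when output position i is matched to target perm[i]."""
    return sum(cost[i][perm[i]] for i in range(len(cost)))


def exact_set_loss(cost):
    """Permutation-invariant loss: best matching over all n! orderings."""
    n = len(cost)
    return min(permutation_loss(cost, p)
               for p in itertools.permutations(range(n)))


def sampled_set_loss(cost, num_samples, rng):
    """Stochastic estimate: best of the identity and sampled orderings."""
    n = len(cost)
    best = ordered_loss(cost)
    for _ in range(num_samples):
        perm = list(range(n))
        rng.shuffle(perm)  # draw one random ordering of the targets
        best = min(best, permutation_loss(cost, perm))
    return best


# Hypothetical costs: the right intents were generated, but in another order.
cost = [
    [0.9, 0.1, 0.8],
    [0.7, 0.9, 0.2],
    [0.1, 0.8, 0.9],
]

rng = random.Random(0)
print("ordered:", ordered_loss(cost))
print("exact set:", exact_set_loss(cost))
print("sampled set:", sampled_set_loss(cost, 20, rng))
```

In this toy case the fixed-order loss is high even though every intent is covered, while the set-based losses are low. The sampled estimate always lies between the exact set loss and the fixed-order loss. For larger sets, enumerating all orderings is infeasible, which is one reason to sample; an exact assignment solver such as `scipy.optimize.linear_sum_assignment` is another option.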

Index Terms

Stochastic Optimization of Text Set Generation for Learning Multiple Query Intent Representations
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Natural language generation
2. Information systems
  1. Information retrieval
    1. Information retrieval query processing
      1. Query representation

Published in
CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
October 2022
5274 pages
ISBN:9781450392365
DOI:10.1145/3511808
General Chairs:
Mohammad Al Hasan, Indiana University Purdue University, Indianapolis, USA
Li Xiong, Emory University, Atlanta, USA
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
facet generation
query representation learning
text set generation
Qualifiers
- short-paper
Acceptance Rates
CIKM '22 Paper Acceptance Rate: 621 of 2,257 submissions, 28%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%