Abstract
Social microblogging services have attracted significant interest over the past decade. These Web 2.0 platforms have produced large volumes of data by allowing users to produce, share, and exchange varied content. Twitter is one of the most popular microblogging sites, used by people to find relevant posts that satisfy their information needs (e.g., breaking news, popular trends, information about people of interest). However, Twitter queries and messages are short, and the variety and sheer volume of published content make it difficult for users to find relevant information. This work is set in the context of social information retrieval (SIR) and aims to improve the quality of tweet retrieval. We propose a query expansion method based on Formal Concept Analysis that extracts patterns from the documents retrieved by the search system. The method also uses word embeddings to enrich these patterns with similar words. The final query is obtained by merging the initial query with the expanded query. We evaluate the proposed method on the TREC 2011 dataset, which contains approximately 16 million tweets and 49 queries. The results show the effectiveness of the proposed approach and the benefit of combining patterns and word embeddings for enhanced microblog retrieval.
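The pipeline summarized in the abstract (closed patterns from retrieved documents, enrichment with similar words, merging with the initial query) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy feedback tweets, the hand-written two-dimensional vectors, and the `min_support`, `max_generator_size`, and `threshold` parameters are all assumptions made for illustration. Closed patterns are obtained here with the standard FCA closure operator (the intersection of all documents that contain a term set), and trained word embeddings are replaced by a small dictionary.

```python
import math
from itertools import combinations


def closed_patterns(docs, min_support=2, max_generator_size=2):
    """Return {closed term set: support} using the FCA closure operator."""
    transactions = [frozenset(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*transactions))
    closed = {}
    for size in range(1, max_generator_size + 1):
        for generator in combinations(vocab, size):
            # Extent: the documents that contain every term of the generator.
            extent = [t for t in transactions if set(generator) <= t]
            if len(extent) >= min_support:
                # Intent: terms shared by all documents in the extent.
                # The closure of any term set is closed by construction.
                intent = frozenset.intersection(*extent)
                closed[intent] = len(extent)
    return closed


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similar_words(term, embeddings, top_n=1, threshold=0.8):
    """Return up to top_n neighbours of term whose similarity is >= threshold."""
    if term not in embeddings:
        return []
    scored = [(cosine(embeddings[term], vec), word)
              for word, vec in embeddings.items() if word != term]
    return [word for score, word in sorted(scored, reverse=True)[:top_n]
            if score >= threshold]


def expand_query(query, feedback_docs, embeddings, min_support=2):
    """Merge the initial query with pattern terms and their similar words."""
    query_terms = query.lower().split()
    candidates = []
    for intent in closed_patterns(feedback_docs, min_support):
        # Assumption: keep only patterns that share a term with the query.
        if intent & set(query_terms):
            candidates.extend(sorted(intent))
    enriched = list(candidates)
    for term in candidates:
        enriched.extend(similar_words(term, embeddings))
    expanded = list(query_terms)
    for term in enriched:
        if term not in expanded:
            expanded.append(term)
    return expanded


# Toy data (hypothetical): three "retrieved" tweets and tiny 2-d vectors.
feedback = ["egypt protest cairo", "egypt protest tahrir", "superbowl ads"]
vectors = {
    "protest": [1.0, 0.0],
    "demonstration": [0.9, 0.1],
    "football": [0.0, 1.0],
}
expanded = expand_query("egypt", feedback, vectors)
print(expanded)  # ['egypt', 'protest', 'demonstration']
```

In this toy run the only closed pattern with support 2 is {egypt, protest}, so "protest" is added from the pattern and "demonstration" from the embedding neighbourhood. Enumerating generators by brute force is only workable for a handful of feedback documents; a dedicated closed-itemset miner would be needed at realistic scale.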
Notes
- 2. Terrier is an effective open-source search engine (information retrieval system), readily deployable on large-scale collections of documents.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Bendella, M., Quafafou, M. (2022). Leveraging Closed Patterns and Formal Concept Analysis for Enhanced Microblogs Retrieval. In: Missaoui, R., Kwuida, L., Abdessalem, T. (eds) Complex Data Analytics with Formal Concept Analysis. Springer, Cham. https://doi.org/10.1007/978-3-030-93278-7_7
DOI: https://doi.org/10.1007/978-3-030-93278-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93277-0
Online ISBN: 978-3-030-93278-7
eBook Packages: Computer Science, Computer Science (R0)