IntelliSMART: Intelligent Semantic Machine-Assisted Research Tool

Khatri, Aadyant; Egierski, Nicolas; Pochamreddy, Ashutosh; Alhamadani, Abdulaziz; Sarkar, Shailik; Lu, Chang-Tien

doi:10.1007/978-3-031-78554-2_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15214)

Included in the following conference series:

International Conference on Advances in Social Networks Analysis and Mining

Abstract

The exponential growth of academic literature presents significant challenges for researchers attempting to find relevant information. Traditional keyword-based retrieval systems often fail to address issues such as synonyms, homonyms, and semantic nuances, leading to suboptimal search results. This paper introduces a novel system called IntelliSMART (Intelligent Semantic Machine-Assisted Research Tool), which leverages large language models (LLMs) and advanced semantic processing techniques to improve the retrieval of academic literature. Our approach integrates query rewriting, embedding generation, efficient indexing, and complex article retrieval mechanisms to provide highly accurate and contextually relevant results that align with the user’s intent. The IntelliSMART system features a user-friendly front end that facilitates intuitive query input, along with a robust back end for handling user queries, generating embeddings, indexing extensive collections of academic papers, and efficiently retrieving the most relevant documents. The proposed system shows significant improvements over conventional methods, highlighting its potential to transform the search experience in academic research.
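
The four stages named in the abstract (query rewriting, embedding generation, indexing, and retrieval) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the synonym table stands in for an LLM-based rewriter, the hashed bag-of-words vector stands in for a learned embedding model, and the brute-force cosine search stands in for an efficient index. All names (`rewrite_query`, `embed`, `VectorIndex`, `SYNONYMS`, `DIM`) are hypothetical.

```python
import hashlib
import math
from typing import Dict, List, Tuple

DIM = 64  # toy embedding size (assumption, not from the paper)

# Hypothetical lookup table standing in for an LLM-based query rewriter.
SYNONYMS: Dict[str, str] = {"papers": "articles", "nn": "neural network"}


def rewrite_query(query: str) -> str:
    """Toy rewrite: lowercase the query and expand known terms."""
    return " ".join(SYNONYMS.get(tok, tok) for tok in query.lower().split())


def embed(text: str) -> List[float]:
    """Toy embedding: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for tok in text.lower().split():
        # md5 gives a bucket that is stable across runs, unlike built-in hash().
        bucket = int(hashlib.md5(tok.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class VectorIndex:
    """Brute-force index: stores (doc_id, vector) pairs in insertion order."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, List[float]]] = []

    def add(self, doc_id: str, text: str) -> None:
        self._items.append((doc_id, embed(text)))

    def search(self, query: str, k: int = 3) -> List[Tuple[str, float]]:
        """Rewrite the query, embed it, and return the top-k (id, cosine) pairs."""
        q = embed(rewrite_query(query))
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [
            (doc_id, sum(a * b for a, b in zip(q, vec)))
            for doc_id, vec in self._items
        ]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    index = VectorIndex()
    index.add("a", "graph neural network survey")
    index.add("b", "protein folding dynamics")
    print(index.search("NN survey", k=2))
```

In a realistic system each stand-in would be replaced by a heavier component, for example an LLM prompt for rewriting, a sentence-embedding model, and an approximate nearest-neighbor index. The sketch only shows how the stages connect.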

Author information

Authors and Affiliations

Department of Computer Science, Virginia Tech, Falls Church, VA, 22043, USA
Aadyant Khatri, Nicolas Egierski, Ashutosh Pochamreddy, Shailik Sarkar & Chang-Tien Lu
Department of Data Science and Business Analytics, Florida Polytechnic University, Lakeland, FL, USA
Abdulaziz Alhamadani

Corresponding author

Correspondence to Aadyant Khatri.

Editor information

Editors and Affiliations

IT University of Copenhagen, Copenhagen, Denmark
Luca Maria Aiello
Indian Institute of Technology Delhi, New Delhi, Delhi, India
Tanmoy Chakraborty
Università degli Studi di Milano, Milan, Italy
Sabrina Gaito

About this paper

Cite this paper

Khatri, A., Egierski, N., Pochamreddy, A., Alhamadani, A., Sarkar, S., Lu, CT. (2025). IntelliSMART: Intelligent Semantic Machine-Assisted Research Tool. In: Aiello, L.M., Chakraborty, T., Gaito, S. (eds) Social Networks Analysis and Mining. ASONAM 2024. Lecture Notes in Computer Science, vol 15214. Springer, Cham. https://doi.org/10.1007/978-3-031-78554-2_12

DOI: https://doi.org/10.1007/978-3-031-78554-2_12
Published: 25 January 2025
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78553-5
Online ISBN: 978-3-031-78554-2
eBook Packages: Computer Science, Computer Science (R0)
