DOI: 10.1145/2766462.2767759

Leveraging User Reviews to Improve Accuracy for Mobile App Retrieval

Published: 09 August 2015

Abstract

Smartphones and tablets, together with their apps, have pervaded our everyday lives, creating a new demand for search tools that help users find the right apps to satisfy their immediate needs. While a few commercial mobile app search engines are available, the new task of mobile app retrieval has not yet been rigorously studied. Indeed, no test collection yet exists for quantitatively evaluating this new retrieval task. In this paper, we first study the effectiveness of state-of-the-art retrieval models for the app retrieval task using a new app retrieval test data set that we created. We then propose and study a novel approach that generates a new representation for each app. Our key idea is to leverage user reviews to identify important features of apps and to bridge the vocabulary gap between app developers and users. Specifically, we jointly model app descriptions and user reviews using a topic model in order to generate app representations while excluding noise in reviews. Experimental results indicate that the proposed approach is effective and outperforms state-of-the-art retrieval models for app retrieval.
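To make the vocabulary-gap idea concrete, the following is a minimal sketch and not the method proposed in the paper: it swaps the joint topic model for a much simpler stand-in, a weighted bag of words that mixes description and review terms, scored with a query-likelihood language model using Dirichlet prior smoothing. The app names, texts, and the parameters `review_weight` and `mu` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive whitespace tokenizer; a real system would also stem and remove stop words.
    return text.lower().split()

def build_representation(description, reviews, review_weight=0.5):
    # Weighted bag of words: description terms count 1.0, review terms count review_weight.
    # This is a simple stand-in, NOT the joint topic model described in the abstract.
    counts = Counter()
    for term in tokenize(description):
        counts[term] += 1.0
    for review in reviews:
        for term in tokenize(review):
            counts[term] += review_weight
    return counts

def query_likelihood(query, doc_counts, collection_counts, mu=10.0):
    # Log query likelihood with Dirichlet prior smoothing.
    doc_len = sum(doc_counts.values())
    coll_len = sum(collection_counts.values())
    score = 0.0
    for term in tokenize(query):
        p_coll = collection_counts.get(term, 0.0) / coll_len
        if p_coll == 0.0:
            continue  # term occurs nowhere in the collection, so it cannot discriminate
        score += math.log((doc_counts.get(term, 0.0) + mu * p_coll) / (doc_len + mu))
    return score

def rank(query, apps, review_weight=0.5, mu=10.0):
    # apps maps an app name to a (description, list_of_reviews) pair.
    reps = {name: build_representation(desc, revs, review_weight)
            for name, (desc, revs) in apps.items()}
    collection = Counter()
    for rep in reps.values():
        collection.update(rep)  # adds the weighted counts term by term
    scored = [(name, query_likelihood(query, rep, collection, mu))
              for name, rep in reps.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: the word "jogging" appears only in a user review.
apps = {
    "PhotoFix": ("edit images with filters and crop tools",
                 ["great app to touch up selfies", "love the selfie filters"]),
    "RunTrack": ("track your runs with gps and pace charts",
                 ["good for jogging", "helps my marathon training"]),
}

with_reviews = rank("jogging", apps)
without_reviews = rank("jogging", apps, review_weight=0.0)
```

In this toy setup, the query "jogging" matches nothing in either description, so with `review_weight=0.0` both apps receive the same score, while including reviews lets the hypothetical RunTrack app rank first. The sketch makes no attempt to exclude noise in reviews, which the abstract names as a goal of the joint topic model.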


    Published In

    SIGIR '15: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval
    August 2015
    1198 pages
    ISBN:9781450336215
    DOI:10.1145/2766462
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. app search
    2. mobile app retrieval
    3. test collections

    Qualifiers

    • Research-article

    Conference

    SIGIR '15

    Acceptance Rates

    SIGIR '15 Paper Acceptance Rate 70 of 351 submissions, 20%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%
