DOI: 10.1145/3338906.3341459
short-paper

Helping developers search and locate task-relevant information in natural language documents

Published: 12 August 2019

ABSTRACT

While performing a task, software developers interact with a myriad of natural language documents. Not all information in these documents is relevant to a developer's task, which forces the developer to filter relevant information from large amounts of irrelevant information. If a developer misses some of the information necessary for her task, she will have an incomplete or incorrect basis from which to complete it. Many approaches mine relevant text fragments from natural language artifacts. However, existing approaches mine information for pre-defined tasks and from a restricted set of artifacts. I hypothesize that it is possible to design a more generalizable approach that can identify, for a particular task, relevant text across different artifact types, establishing relationships between them and facilitating how developers search and locate task-relevant information. To investigate this hypothesis, I propose to match a developer's task to text fragments in natural language artifacts according to their semantics. By semantically matching textual pieces to a developer's task, I aim to identify fragments relevant to the task more precisely. To help developers navigate the identified fragments thoroughly, I also propose to synthesize and group them. Ultimately, this research aims to help developers make more informed decisions regarding their software development tasks. Dr. Gail C. Murphy supervises this work.


      Published in

      ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
      August 2019
      1264 pages
      ISBN: 9781450355728
      DOI: 10.1145/3338906

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 112 of 543 submissions, 21%
