DOI: 10.1145/3239235.3268927
research-article
Best Industry Paper

Automatic topic classification of test cases using text mining at an Android smartphone vendor

Published: 11 October 2018

ABSTRACT

Background: An Android smartphone is an ecosystem of applications, drivers, operating system components, and assets. The volume of software is large, and the number of test cases needed to cover the functionality of an Android system is substantial. Considerable effort has already gone into answering the question "what features and apps were tested and verified?". This insight is provided by dashboards that summarize test coverage and results per feature. One way to achieve this is to manually tag or label test cases with the topic or function they cover, much like function points. At the studied Android smartphone vendor, tests are labelled with manually defined tags, so-called "feature labels" (FLs), which categorize hundreds to thousands of test cases into 10 to 50 groups.

Aim: Unfortunately for developers, manually assigning FLs to thousands of test cases is a time-consuming task that leads to inaccurately labeled test cases, which in turn render the dashboards useless. We created an automated system that suggests tags/labels to developers for their test cases, so that they do not have to label them manually.

Method: We use machine learning models to predict and label the functionality tested by 10,000 test cases developed at the company.
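The abstract does not say which models or features were used, so the following is only a minimal sketch of the general idea: learn from test cases that already carry a feature label, then suggest a label for an unlabeled one. It assumes a multinomial naive Bayes classifier over word tokens, written with the Python standard library. The class `NaiveBayesLabeler`, the `tokenize` helper, the test paths, and the label names are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on any non-alphanumeric character."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


class NaiveBayesLabeler:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)        # training items per label
        self.token_counts = defaultdict(Counter)   # token frequencies per label
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def suggest(self, text):
        """Return the label with the highest log-posterior for this text."""
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocab)
        tokens = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.token_counts[label]
            total_tokens = sum(counts.values())
            score = math.log(doc_count / total_docs)  # log prior
            for token in tokens:
                # Add-one smoothing keeps unseen tokens from zeroing the score.
                score += math.log((counts[token] + 1) / (total_tokens + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: (test path, manually assigned feature label).
train = [
    ("tests/camera/test_capture_photo", "Camera"),
    ("tests/camera/test_record_video", "Camera"),
    ("tests/telephony/test_dial_number", "Telephony"),
    ("tests/telephony/test_receive_call", "Telephony"),
]
model = NaiveBayesLabeler().fit([t for t, _ in train], [l for _, l in train])
suggestion = model.suggest("tests/camera/test_zoom_photo")
```

In a setting like the one described, such a suggestion would be shown to a developer for confirmation rather than applied automatically.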

Results: In quantitative experiments, our models achieved acceptable F1 scores ranging from 0.3 to 0.88. In qualitative studies with expert teams, we also found that the hierarchy and path of a test were good predictors of its feature label.
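For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall, which simplifies to 2·TP / (2·TP + FP + FN). The sketch below computes it separately for each label. The abstract does not state how scores were averaged or broken down, so the per-label view, the function name `f1_per_label`, and the example labels are assumptions made for illustration.

```python
def f1_per_label(true_labels, predicted_labels):
    """Compute the F1 score of each label from paired true/predicted labels."""
    pairs = list(zip(true_labels, predicted_labels))
    scores = {}
    for label in set(true_labels) | set(predicted_labels):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 if the label never occurs.
        scores[label] = 2 * tp / denominator if denominator else 0.0
    return scores


# Hypothetical labels for five test cases (not data from the paper).
true_labels = ["Camera", "Camera", "Telephony", "Telephony", "Audio"]
predicted = ["Camera", "Telephony", "Telephony", "Telephony", "Camera"]
scores = f1_per_label(true_labels, predicted)
```

A spread like the one in this toy example, where some labels score well and others poorly, is one way a reported range of scores can arise.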

Conclusions: We find that this method can reduce the tedious manual effort that software developers spend classifying test cases, while providing more accurate classification results.


Published in

ESEM '18: Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
October 2018
487 pages
ISBN: 9781450358231
DOI: 10.1145/3239235

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall acceptance rate: 130 of 594 submissions, 22%
