
Successes, challenges, and rethinking – an industrial investigation on crowdsourced mobile application testing

  • Experience Report
Empirical Software Engineering

Abstract

The term crowdsourcing – a compound of crowd and outsourcing – refers to a paradigm that harnesses the power of crowds of people to accomplish large-scale tasks that would be costly or time consuming with traditional methods. This paradigm offers mobile application companies the possibility of outsourcing their testing activities to crowdsourced testers (crowdtesters) who have access to a variety of testing facilities and environments and bring different levels of skill and expertise. With this so-called Crowdsourced Mobile Application Testing (CMAT), some well-recognized issues in testing mobile applications – the multitude of mobile devices, the fragmentation of device models, the variety of OS versions, and the wide range of testing scenarios – could be mitigated. However, how effective is CMAT in practice? What challenges and issues arise when applying CMAT? How can these issues and challenges be overcome and CMAT be improved? Although CMAT has attracted attention from both academia and industry, these questions have not been investigated in depth through a large-scale, real-life industrial study. Since June 2015, we have worked with Mooctest, Inc., a CMAT intermediary, on testing five real-life Android applications using their CMAT platform, Kikbug. Throughout the process, we have collected 1013 bug reports from 258 crowdtesters and found 247 bugs in total. This paper presents our industrial study in detail and provides an in-depth analysis of the successes and challenges of applying CMAT.


Notes

  1. In this paper, we use “bug” and “fault” interchangeably.

  2. The functionalities of different CMAT platforms may vary. However, the impact on the general CMAT workflow is insignificant.

  3. You may visit http://mooctest.net/wiki for a test trial with more detailed instructions.

  4. For the rest of the paper, a “detected bug” is one that is reported and also approved by the customers. A “reported bug” is not necessarily a “detected bug” unless the customers approve it.

  5. STQA is sponsored by NSF I/UCRC (Industry/University Cooperative Research Centers Program). You may visit http://paris.utdallas.edu/stqa for more detailed information about STQA.


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant No. 61690201).

Author information

Corresponding authors

Correspondence to Zhenyu Chen or W. Eric Wong.

Additional information

Communicated by: Antonia Bertolino

About this article

Cite this article

Gao, R., Wang, Y., Feng, Y. et al. Successes, challenges, and rethinking – an industrial investigation on crowdsourced mobile application testing. Empir Software Eng 24, 537–561 (2019). https://doi.org/10.1007/s10664-018-9618-5

