The AIQ Meta-Testbed: Pragmatically Bridging Academic AI Testing and Industrial Q Needs

  • Conference paper

Software Quality: Future Perspectives on Software Engineering Quality (SWQD 2021)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 404)

Abstract

AI solutions seem to appear in any and all application domains. As AI becomes more pervasive, the importance of quality assurance increases. Unfortunately, there is no consensus on what artificial intelligence means, and interpretations range from simple statistical analysis to sentient humanoid robots. On top of that, quality is a notoriously hard concept to pinpoint. What does this mean for AI quality? In this paper, we share our working definition and a pragmatic approach to address the corresponding quality assurance with a focus on testing. Finally, we present our ongoing work on establishing the AIQ Meta-Testbed.
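
The abstract only outlines the approach; as a purely illustrative sketch (not taken from the paper), the pytest-style Python example below shows two ingredients commonly associated with the kind of ML-focused testing the abstract alludes to: an accuracy threshold on held-out data and a simple metamorphic robustness check. The dataset, model, function names, and thresholds are all assumptions chosen for illustration.

```python
# Hypothetical sketch, not from the paper: two pytest-style checks that
# illustrate ML quality assurance expressed as ordinary software tests.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def _train_model():
    # Small, self-contained example model on the scikit-learn digits data.
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0
    )
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return model, X_test, y_test


def test_accuracy_threshold():
    # Quality gate: the model must reach a minimum accuracy on unseen data.
    model, X_test, y_test = _train_model()
    assert model.score(X_test, y_test) >= 0.90


def test_metamorphic_noise_robustness():
    # Metamorphic relation: small input perturbations should rarely flip predictions.
    model, X_test, _ = _train_model()
    rng = np.random.default_rng(0)
    noisy = X_test + rng.normal(0.0, 0.1, size=X_test.shape)
    agreement = np.mean(model.predict(X_test) == model.predict(noisy))
    assert agreement >= 0.95
```

Running pytest on such a file turns model-level quality criteria into repeatable regression tests, one pragmatic way to connect academic ML-testing ideas with industrial quality-assurance pipelines.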

Notes

  1. bit.ly/3dKeUEH.

  2. https://github.com/ckaestne/seaibib.

  3. https://github.com/SE-ML/awesome-seml.

  4. Well aware of the two previous “AI winters”, periods with less interest and funding due to inflated expectations.

  5. metatest.ai.

  6. ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai.

Acknowledgements

This work was funded by Plattformen at Campus Helsingborg, Lund University.

Author information

Corresponding author

Correspondence to Markus Borg.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Borg, M. (2021). The AIQ Meta-Testbed: Pragmatically Bridging Academic AI Testing and Industrial Q Needs. In: Winkler, D., Biffl, S., Mendez, D., Wimmer, M., Bergsmann, J. (eds) Software Quality: Future Perspectives on Software Engineering Quality. SWQD 2021. Lecture Notes in Business Information Processing, vol 404. Springer, Cham. https://doi.org/10.1007/978-3-030-65854-0_6

  • DOI: https://doi.org/10.1007/978-3-030-65854-0_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-65853-3

  • Online ISBN: 978-3-030-65854-0

  • eBook Packages: Computer Science, Computer Science (R0)
