Abstract
Building successful and meaningful interoperation with external software APIs requires satisfying their conceptual interoperability constraints. These constraints, which we call the COINs, include structure, dynamic, and quality specifications that if missed they lead to costly implications of unexpected mismatches and running-late projects. However, for software architects and analysts, manual analysis of unstructured text in API documents to identify conceptual interoperability constraints is a tedious and time-consuming task that requires knowledge about constraint types. In this paper, we present our empirically-based research in addressing the aforementioned issues by utilizing machine learning techniques. We started with a multiple-case study through which we contributed a ground truth dataset. Then, we built a model for this dataset and tested its robustness through experiments using different machine learning text-classification algorithms. The results show that our model enables achieving \(70.4\,\%\) precision and \(70.2\,\%\) recall in identifying seven classes of constraints (i.e., Syntax, Semantic, Structure, Dynamic, Context, Quality, and Not-COIN). This achievement increases to \(81.9\,\%\) precision and \(82.0\,\%\) recall when identifying two classes (i.e., COIN, Not-COIN). Finally, we implemented a tool prototype to demonstrate the value of our findings for architects in a practical context.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
Programmable web: http://www.programmableweb.com/apis/directory.
- 2.
Simple HTML DOM: http://simplehtmldom.sourceforge.net/.
- 3.
References
Abukwaik, H., Abujayyab, M., Humayoun, S.R., Rombach, D.: Extracting conceptual interoperability constraints from API documentation using machine learning. In: ICSE 2016 (2016)
Abukwaik, H., Naab, M., Rombach, D.: A proactive support for conceptual interoperability analysis in software systems. In: WICSA 2015 (2015)
Anvaari, M., Zimmermann, O.: Semi-automated design guidance enhancer (SADGE): a framework for architectural guidance development. In: Avgeriou, P., Zdun, U. (eds.) ECSA 2014. LNCS, vol. 8627, pp. 41–49. Springer, Heidelberg (2014). doi:10.1007/978-3-319-09970-5_4
Banko, M., Brill, E.: Scaling to very very large corpora for natural language disambiguation. In: Proceedings of 39th Annual Meeting of the Association for Computational Linguistics (2001)
Caldiera, V., Rombach, H.D.: The goal question metric approach. Encyclopedia Softw. Eng. 2, 528–532 (1994)
Chu, W., Lin, T.Y.: Foundations and advances in data mining (2005)
Figueiredo, A.M., Dos Reis, J.C., Rodrigues, M.A.: Improving access to software architecture knowledge an ontology-based search approach (2012)
Garlan, D., Allen, R., Ockerbloom, J.: Architectural mismatch or why it’s hard to build systems out of existing parts. In: ICSE 1995 (1995)
Hallé, S., Bultan, T., Hughes, G., Alkhalaf, M., Villemaire, R.: Runtime verification of web service interface contracts. Computer 43(3), 59–66 (2010)
John, G.H., Langley, P.: Estimating continuous distributions in Bayesian classifiers. In: Conference on Uncertainty in Artificial Intelligence (1995)
Kohavi, R., et al.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Ijcai, vol. 14 (1995)
López, C., Codocedo, V., Astudillo, H., Cysneiros, L.M.: Bridging the gap between software architecture rationale formalisms and actual architecture documents: an ontology-driven approach. Sci. Comput. Program. 77, 66–80 (2012)
Pandita, R., Xiao, X., Zhong, H., Xie, T., Oney, S., Paradkar, A.: Inferring method specifications from natural language API descriptions. In: ICSE 2012 (2012)
Powers, D.M.: Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation (2011)
Robertson, S.: Understanding inverse document frequency: on theoretical arguments for IDF. J. Documentation 60(5), 503–520 (2004)
Oakes, M.P., Ji, M. (eds.): Quantitative Methods in Corpus-Based Translation Studies: A Practical Guide to Descriptive Translation Research, vol. 51. John Benjamins Publishing, Amsterdam (2012)
Runeson, P., Höst, M.: Guidelines for conducting and reporting case study research in software engineering. Empirical Softw. Eng. 14(2), 131–164 (2009)
Tong, S., Koller, D.: Support vector machine active learning with applications to text classification. J. Mach. Learn. Res. 2, 45–66 (2001)
Wu, Q., Wu, L., Liang, G., Wang, Q., Xie, T., Mei, H.: Inferring dependency constraints on parameters for web services. In: WWW 2013 (2013)
Zhong, H., Zhang, L., Xie, T., Mei, H.: Inferring resource specifications from natural language API documentation. In: ASE 2009 (2009)
Acknowledgment
This work is supervised by Prof. Dieter Rombach and funded by the Ph.D. Program of the CS Department of Kaiserslautern University. We thank Mohammed Abufouda and the anonymous reviewers for the valuable comments and feedback.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Abukwaik, H., Abujayyab, M., Rombach, D. (2016). Towards Seamless Analysis of Software Interoperability: Automatic Identification of Conceptual Constraints in API Documentation. In: Tekinerdogan, B., Zdun, U., Babar, A. (eds) Software Architecture. ECSA 2016. Lecture Notes in Computer Science(), vol 9839. Springer, Cham. https://doi.org/10.1007/978-3-319-48992-6_5
Download citation
DOI: https://doi.org/10.1007/978-3-319-48992-6_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48991-9
Online ISBN: 978-3-319-48992-6
eBook Packages: Computer ScienceComputer Science (R0)