An empirical framework for evaluating interoperability of data exchange standards based on their actual usage: A case study on XLIFF
Introduction
The most widely used definition of interoperability is that of the IEEE [1]:
“Interoperability is the ability of two or more systems or components to exchange information and to use the information that has been exchanged.”
Interoperability is becoming increasingly important in heterogeneous environments as it facilitates the integration of different entities such as tools, businesses, technologies and processes. Data exchange formats play a prominent role in facilitating interoperability by providing agreed or standardized notations for the storage and exchange of data. Data exchange formats based on the Extensible Markup Language (XML) are becoming ever more pervasive [2], [3], and they can be categorized as either open or proprietary. Examples of popular XML-based open data exchange standards (also known as open file formats) include XHTML, DocBook, and Office Open XML (OOXML). However, the definition of such standards is an arduous, time-consuming process due to the constantly evolving nature of the technologies, businesses, and tools in question [4]. That is, standards need to be continually reviewed and updated to cope with changing requirements, typically across multiple organizations.
In this paper, we propose a novel empirical framework that can be used as a tool to evaluate the usage of data-exchange, XML-based standards and thus inform on the development, maintenance and evolution of those standards. The utility of this framework is illustrated by applying it to the XML Localization Interchange File Format (XLIFF), an open standard for the exchange of localization data.
The XLIFF standard has been developed and is maintained by a Technical Committee (TC) of OASIS and is an important standard for enabling interoperability in the localization domain (see Section 2.2.1 for more details on the XLIFF standard). It aims to enable the lossless exchange of localization-relevant data and metadata between different tools and technologies across the localization process. XLIFF is gaining increased acceptance within the localization community [5], and is increasingly being used not just as an exchange format, but also as a container for the storage and transport of data [6], [7].
XLIFF version 2 was released on 5 August 2014 to address various significant issues in the previous version of the standard (version 1.2). However, problems remain with respect to adoption, as confirmed by a study conducted in 2011, which revealed that lack of interoperability could cost language service providers more than 20% of their total translation budget. According to this study, the main cause of this lack of interoperability is the “lack of compliance to interchange format standards” [8], a finding that suggests the standard may still be immature with respect to adopters' needs.
We aim to evaluate this potential immaturity issue by reporting on experiments in which the usage of the XLIFF schema is assessed by our analytical framework. The framework provides empirical evidence and statistics on the actual, in vivo usage of the different elements, attributes and attribute values of this standard.
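The core measurement involved, usage frequencies of elements, attributes and attribute values across a corpus of documents, can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function name, the namespace handling and the sample XLIFF document are our own assumptions, and only the Python standard library is used.

```python
# Hypothetical sketch of corpus-wide usage statistics (not the paper's code).
import xml.etree.ElementTree as ET
from collections import Counter
from io import StringIO

def usage_stats(xml_sources):
    """Count element names, attribute names and (attribute, value) pairs."""
    elements, attributes, attr_values = Counter(), Counter(), Counter()
    for src in xml_sources:
        for _, elem in ET.iterparse(src, events=("start",)):
            # Strip any namespace URI so counts are keyed by local names.
            elements[elem.tag.split("}")[-1]] += 1
            for attr, value in elem.attrib.items():
                attr_name = attr.split("}")[-1]
                attributes[attr_name] += 1
                attr_values[(attr_name, value)] += 1
    return elements, attributes, attr_values

# A minimal, hand-written XLIFF 1.2-style sample document.
sample = """<xliff version="1.2"><file source-language="en" target-language="fr">
<body><trans-unit id="1"><source>Hello</source><target>Bonjour</target>
</trans-unit></body></file></xliff>"""
els, attrs, vals = usage_stats([StringIO(sample)])
print(els.most_common(3))
```

In a real study the input would be the document corpus rather than an in-memory string, but the aggregation step is the same.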
More generally, this illustration demonstrates that the proposed XML-DIUE framework can also serve to address similar issues in XML-based file format standards in other domains. The empirical results generated by the framework seem useful for identifying important criteria for standard development, such as the most frequently used features, the least frequently used features, usage patterns and formats. The findings will also be helpful in identifying compliance issues associated with implementations supporting the standard under study. Furthermore, the results will be helpful for the development of interoperability test suites that target prevalent compliance issues from a usage perspective. Thus, we believe that this framework will ultimately contribute to improved interoperability among tools and technologies in many verticals.
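As a rough illustration of how usage data can surface compliance issues, the sketch below flags element names that fall outside a core vocabulary. The whitelist here is a deliberately abridged, hand-written subset we chose for illustration; a real check would be driven by the full XLIFF schema, and would also validate structure and attributes, not just names.

```python
# Illustrative compliance probe (our sketch, not the framework's checker):
# flag element names outside an (abridged) XLIFF 1.2 core vocabulary.
import xml.etree.ElementTree as ET

XLIFF_12_CORE = {  # abridged by hand; a real check would derive this from the schema
    "xliff", "file", "header", "body", "trans-unit", "source", "target",
    "note", "group", "alt-trans",
}

def non_core_elements(xml_text):
    """Return sorted element local names that are outside the core vocabulary."""
    root = ET.fromstring(xml_text)
    used = {el.tag.split("}")[-1] for el in root.iter()}
    return sorted(used - XLIFF_12_CORE)

doc = ("<xliff><file><body><trans-unit><source>Hi</source>"
       "<tool-specific-markup/></trans-unit></body></file></xliff>")
print(non_core_elements(doc))  # -> ['tool-specific-markup']
```

Aggregated over a corpus, such flags point at tool-specific extensions or outright misuse, which is exactly the input an interoperability test suite would want to prioritize.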
The remainder of the paper is organized as follows: Section 2 discusses related work in standards and localization, culminating in a section devoted to XLIFF. This provides a context for the XLIFF running example used in this paper. Section 3 describes the methodology underlying our framework, illustrating it by detailing the data collection performed in our XLIFF study, and the data analysis performed. Section 4 presents the experimental results derived from our illustration which are then discussed in Section 5. This section also outlines some more general limitations of evaluating standards in this fashion. Finally, the paper concludes with a summary and pointers to future work in Section 6.
Section snippets
Related work
Standards are crucial to achieving significant aspects of interoperability among diverse systems [9]. In this review, we focus on the evaluation of data-exchange standards. Specifically, the review first considers research on the end-user usage of standards, as this is a core consideration for XML-DIUE. Subsequently, the review targets research evaluating XLIFF, as it is the subject standard for our illustrative case study.
The XML-DIUE framework
This section presents the XML Data Interoperability Usage Evaluation (XML-DIUE) framework.
The Parser-Repository-Analysis architecture proposed in the 1990s for reengineering toolkits [25], [26] is appropriate here, given the similarity of the concerns. Both involve static analysis (parsing) of structured documents, followed by viewing of interesting information derived from parse-based analysis. The Parser-Repository-Analysis architecture (see Fig. 1) isolates the parsing components from the
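The separation of concerns behind Parser-Repository-Analysis can be sketched as below. This is our reading of the idea rather than the cited toolkits: a parser stage writes extracted facts into a repository (SQLite here, as an assumed stand-in), and analyses are then plain queries over that repository, decoupled from how parsing was done.

```python
# Minimal Parser-Repository-Analysis sketch (illustrative assumption:
# SQLite as the repository; the cited toolkits used their own stores).
import sqlite3
import xml.etree.ElementTree as ET

def parse_into_repository(conn, doc_id, xml_text):
    """Parser stage: extract element-usage facts and store them."""
    conn.execute("CREATE TABLE IF NOT EXISTS usage (doc TEXT, element TEXT)")
    root = ET.fromstring(xml_text)
    rows = [(doc_id, el.tag.split("}")[-1]) for el in root.iter()]
    conn.executemany("INSERT INTO usage VALUES (?, ?)", rows)

def element_frequencies(conn):
    """Analysis stage: a query over the repository, independent of parsing."""
    return conn.execute(
        "SELECT element, COUNT(*) FROM usage GROUP BY element ORDER BY 2 DESC"
    ).fetchall()

conn = sqlite3.connect(":memory:")
parse_into_repository(conn, "a.xlf",
    "<xliff><file><body><trans-unit><source>Hi</source></trans-unit>"
    "<trans-unit><source>Bye</source></trans-unit></body></file></xliff>")
print(element_frequencies(conn))
```

Because the repository schema is the only contract between the two stages, new analyses (or new parsers for other standards) can be added without touching the other side.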
An XLIFF study based on XML-DIUE
In this section we expand upon the case study used to illustrate the XML-DIUE framework. This study focuses on XLIFF, a data-exchange standard from the localization domain. Section 3 described how the data was captured for this case study. In this section we discuss the specific research questions for the case study, in the context of the representative research questions suggested for the XML-DIUE framework, and present the associated results and discussion.
Table 3 is a re-print
Limitations of the current framework
This paper has illustrated the potential of the XML-DIUE framework to help discover ‘bloat’ in standards' implementations and in the standards' specifications themselves. It has also shown its utility in identifying important features of standards that are core to successful interoperability.
However, there are several limitations of the current framework and the associated illustration using XLIFF. The external validity of our XLIFF study may be considered low, mainly due to the lack of representativeness of our
Conclusion and future work
This paper describes a novel empirical framework for the usage analysis of a corpus of standard XML documents. The paper illustrates the utility of the framework by focusing on usage of the XLIFF standard, and its effect on interoperability.
The research has shown the potential utility of XML-DIUE. It has illustrated how the framework can help discover some of the limitations of XML-based data-exchange standards as used, identifying possible improvements with regard to data-exchange
Acknowledgements
This research is supported by the Science Foundation Ireland (Grant 07/CE/I1142) as part of the Centre for Next Generation Localisation www.cngl.ie at the Localisation Research Centre (Department of Computer Science and Information Systems), University of Limerick, Limerick, Ireland. It was also supported, in part, by Science Foundation Ireland grant 10/CE/I1855 and Science Foundation Grant 12/IP/1351 to Lero - the Irish Software Engineering Research Centre www.lero.ie. Thanks to Dr. Ian
References (39)
- et al., An evaluation and selection framework for interoperability standards, Inf. Softw. Technol. (2008)
- et al., IEEE standard computer dictionary: compilation of IEEE standard computer glossaries (1991)
- et al., Improve the semantic interoperability of information
- XML internationalization and localization, Sams White Book Series, Sams
- Healthcare interoperability – lessons learned from the manufacturing standards sector
- Focus: standards and interoperability – the localization standards ecosystem, Multilingual (2012)
- et al., A view of future technologies and challenges for the automation of localisation processes: visions and scenarios
- et al., Towards an open source localisation orchestration framework, Tradumàtica: traducció i tecnologies de la informació i la comunicació (2011)
- Lack of interoperability costs the translation industry a fortune
- et al., Why standards are not enough to guarantee end-to-end interoperability
- Casting the standards play – which are the roles?
- Standardizing retail payment instruments, Information technology standards and standardization: a global perspective
- Interoperability challenges for open standards: ODF and OOXML as examples
- Evaluating the interoperability of document formats: ODF and OOXML
- Google, Web authoring statistics
- XLIFF-TC, XLIFF 1.2 Specification
- XML in localisation: a practical analysis
- XLIFF: theory and reality: lessons learned by Medtronic in 4 years of everyday XLIFF use
- XLIFF-TC, XLIFF Version 2.0
Cited by (8)
- A novel business context-based approach for improved standards-based systems integration—a feasibility study. Journal of Industrial Information Integration (2022). Citation excerpt: “The profiling process was performed in a manual, arbitrary, nonstandard way, without a possibility to support reliable and repeatable identification of message schema and component profiles for their reuse in well-defined integration use cases. Such process has a negative impact on integration and interoperability between business partners [15,16]. Traditionally, the DESes were syntax-dependent which made the profiling process even more challenging.”
- XML interoperability standards for seamless communication: An analysis of industry-neutral and domain-specific initiatives. Computers in Industry (2017). Citation excerpt: “Recent works, although aiming to review e-business interoperability frameworks [10], did not actually tackle domain-specific interoperability frameworks. Other works focus on a specific case study [11], or implementations following a certain standard specifications, as in: [12,13]. An up-to-date review or analysis of current advances of domain-specific initiatives for seamless communication is not available, although highly relevant.”
- A SOA Based E-Health Services Framework. 2022 10th E-Health and Bioengineering Conference, EHB (2022).
- Advancing Data Exchange Standards for Interoperable Enterprise Networks. CEUR Workshop Proceedings (2022).
- Towards an e-Government semantic interoperability assessment framework. ACM International Conference Proceeding Series (2020).
- Interoperability framework for integrated e-health services. Bulletin of Electrical Engineering and Informatics (2020).
Dr. Asanka Wasala. Asanka Wasala is a Postdoctoral Researcher and the Lead Developer at the Localisation Research Centre at University of Limerick, Ireland. He is also a voting member of the OASIS XML Localization Interchange File Format (XLIFF) Standard Technical Committee. He has published in many different areas including Natural Language Processing, Software Localization, Data Exchange Standards, and Speech Processing. In 2004, Asanka graduated from the prestigious University of Colombo, Sri Lanka, receiving the best student award (gold medal) and being the only person to obtain a First Class qualification that year. After completing his BSc in Physical Science, he worked in the PAN Localization project as a Senior Research Assistant at the University of Colombo School of Computing, Sri Lanka. He received a full scholarship by Microsoft Ireland to pursue his Masters degree in University of Limerick, before transferring to the PhD program in 2010. He completed his PhD thesis on identification of limitations of localization data exchange standards, in 2013. His thesis won the LRC Best Thesis Award (2013) and a prize from the Microsoft Ireland.
Dr. Jim Buckley. Jim Buckley obtained a honours BSc degree in Biochemistry from the University of Galway in 1989. In 1994 he was awarded an MSc degree in Computer Science from the University of Limerick and he followed this with a PhD in Computer Science from the same University in 2002. He currently works as a lecturer in the Computer Science and Information Systems Department at the University of Limerick, Ireland. His main research interests are in theories of information seeking, software reengineering and software maintenance. In this context, he has published actively in many peer-reviewed journals/conferences/workshops. His work has involved extensive collaboration with companies in the Financial services sector, the flood mapping sector and with IBM, with whom he was a Visiting Scientist from 2006-2010. He was general Chair of WCRE 2011 and currently coordinates 2 research projects at the University: one in the area of software feature location and the other in architecture-centric re-engineering and evolution.
Mr. Reinhard Schäler. Reinhard Schäler has been involved in the localisation industry in a variety of roles since 1987. He is the founder and editor of Localisation Focus – The International Journal of Localisation, a founding editor of the Journal of Specialised Translation (JosTrans), a former member of the editorial board of Multilingual Computing (Oct 97 to Jan 07, covering 70 issues), a founder and CEO of The Institute of Localisation Professionals (TILP), and a member of OASIS. He has attracted more than €5.5 m euro in research funding and has published more than 50 articles, book chapters and conference papers on language technologies and localisation. He has been an invited speaker at EU and international government-organised conferences in Africa, the Middle East, South America and Asia. He is a Principal Investigator in the Centre for Next Generation Localisation (CNGL), a lecturer at the Department of Computer Science and Information Systems (CSIS), University of Limerick, and the founder and director of the Localisation Research Centre (LRC) at UL, established in 1995. In 2009, he established The Rosetta Foundation and the Dynamic Coalition for a Global Localization Platform: Localization4all under the umbrella of the UN's Internet Governance Forum.
Dr. Chris Exton. Chris Exton is currently a lecturer in the department of Computer Science and Information Systems at University of Limerick and is the Research Director of the Localisation Research Centre. He holds a B.Sc. in Psychology and a Ph.D. in Computer Science from Monash University, Melbourne. He has worked extensively in the commercial software development field in a variety of different industries and countries included Software Engineering positions in Australia, Ireland and the UK, where he has worked in a number of diverse companies from electronic manufacturing to banking. In addition his academic positions includes a number of Schools and Departments including Monash University Australia, University College Dublin and Uppsala University, Sweden. He has worked on a number of research projects in the area of crowdsourcing, programmer psychology and software tools and more recently in the areas of software localisation and medical decision support systems. He has researched and published in these areas for over 15 years, as well as taking an active role as a reviewer for the International Journal of Software Maintenance and Evolution and The Software Quality Journal.