
Computer Standards & Interfaces

Volume 42, November 2015, Pages 157-170

An empirical framework for evaluating interoperability of data exchange standards based on their actual usage: A case study on XLIFF

https://doi.org/10.1016/j.csi.2015.05.006

Highlights

  • A framework that can evaluate practitioners' usage of data-exchange standards.

  • Specific usage-analyses to identify possible refinements of such standards.

  • An illustrative application of the framework on the XLIFF data-exchange standard.

  • Results indicating core features/candidates for deprecation/modularization in XLIFF.

Abstract

Data exchange formats play a prominent role in facilitating interoperability, so their standardization is extremely important. In this paper, we present two contributions: an empirical framework, called XML-DIUE, for evaluating data exchange format standards in terms of their usage, and an illustration of this framework, demonstrating its ability to inform on these standards from their usage in practice. This illustration is drawn from the localization domain and focuses on identifying interoperability issues associated with the usage of the XML Localization Interchange File Format (XLIFF), an open-standard data exchange format.

The initial results from this illustrative XLIFF study suggest the utility of the XML-DIUE approach. Specifically, they suggest that ambiguity is prevalent in the standard's usage, and that validation errors occur across 85% of the XLIFF files studied. The study also identifies several features of the standard as candidates for deprecation or modularization, in line with the XLIFF Technical Committee's deliberations, and successfully identifies the core features of XLIFF.

Introduction

The most widely used definition for interoperability is the definition by the IEEE [1]:

“Interoperability is the ability of two or more systems or components to exchange information and to use the information that has been exchanged.”

Interoperability is becoming increasingly important in heterogeneous environments as it facilitates the integration of different entities such as tools, businesses, technologies and processes. Data exchange formats play a prominent role in facilitating interoperability by providing agreed or standardized notations for the storage and exchange of data. Data exchange formats based on the Extensible Markup Language (XML) are becoming ever-more pervasive [2], [3], and they can be categorized as either open or proprietary. Examples of popular XML-based open data exchange standards (also known as open file formats) include XHTML, DocBook, and Office Open XML (OOXML). However, the definition of such standards is an arduous, time-consuming process due to the constantly evolving nature of the technologies, businesses, and tools in question [4]. That is, standards need to be constantly reviewed and updated to cope with changing requirements, typically across multiple organizations.

In this paper, we propose a novel empirical framework that can be used as a tool to evaluate the usage of data-exchange, XML-based standards and thus inform on the development, maintenance and evolution of those standards. The utility of this framework is illustrated by applying it to the XML Localization Interchange File Format (XLIFF), an open standard for the exchange of localization data.

The XLIFF standard has been developed and is being maintained by a Technical Committee (TC) of OASIS and is an important standard for enabling interoperability in the localization domain (see Section 2.2.1 for more details on the XLIFF standard). It aims to enable the lossless exchange of localization-relevant data and metadata between different tools and technologies across the localization process. XLIFF is gaining increased acceptance within the localization community [5], and is increasingly being used not just as an exchange format, but also as a container for the storage and transport of data [6], [7].

XLIFF version 2 was released on the 5th of August 2014 to provide solutions to various significant issues in the previous version of the XLIFF standard (version 1.2). However, problems remain with respect to adoption, as confirmed by a study conducted in 2011, which revealed that a lack of interoperability could cost language service providers more than 20% of their total translation budget. According to this study, the main cause of the lack of interoperability is the “lack of compliance to interchange format standards” [8], a finding that suggests the standard may still be immature with respect to adopters' needs.

We aim to evaluate this potential immaturity issue by reporting on experiments in which the usage of the XLIFF schema is assessed by our analytical framework. The framework provides empirical evidence and statistics on the actual usage of the different elements, attributes and attribute values of this standard in vivo.
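The kind of element- and attribute-usage census described above can be sketched as follows. This is a minimal illustration of the idea, not the authors' implementation; the function name and the in-memory corpus are ours, assumed for the example:

```python
from collections import Counter
from xml.etree import ElementTree


def usage_census(xml_sources):
    """Count element names, attribute names, and (element, attribute, value)
    triples across a corpus of XML documents (e.g. XLIFF files)."""
    elements, attributes, values = Counter(), Counter(), Counter()
    for source in xml_sources:
        root = ElementTree.fromstring(source)
        for node in root.iter():
            # Strip any namespace prefix: '{urn:...}trans-unit' -> 'trans-unit'
            name = node.tag.split('}')[-1]
            elements[name] += 1
            for attr, value in node.attrib.items():
                attr = attr.split('}')[-1]
                attributes[attr] += 1
                values[(name, attr, value)] += 1
    return elements, attributes, values


# Tiny illustrative corpus (not real study data)
corpus = [
    '<xliff version="1.2"><file source-language="en" target-language="de">'
    '<body><trans-unit id="1"><source>Hello</source>'
    '<target>Hallo</target></trans-unit></body></file></xliff>',
]
elements, attributes, values = usage_census(corpus)
```

Run over a real corpus, the three counters yield exactly the kind of frequency tables the framework reports: how often each element appears, which attributes accompany it, and which attribute values dominate in practice.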

More generically, this illustration demonstrates that the proposed XML-DIUE framework can also serve to address similar issues in XML-based file format standards in other domains. The empirical results generated by the framework seem useful for identifying important criteria for standard development, such as the most frequently used features, the least frequently used features, usage patterns and formats. The findings will also be helpful in identifying compliance issues associated with implementations supporting the standard under study. Furthermore, the results will be helpful for the development of interoperability test suites that target prevalent compliance issues from a usage perspective. Thus, we believe that this framework will ultimately contribute to improved interoperability among tools and technologies in many verticals.

The remainder of the paper is organized as follows: Section 2 discusses related work in standards and localization, culminating in a section devoted to XLIFF. This provides a context for the XLIFF running example used in this paper. Section 3 describes the methodology underlying our framework, illustrating it by detailing the data collection performed in our XLIFF study, and the data analysis performed. Section 4 presents the experimental results derived from our illustration which are then discussed in Section 5. This section also outlines some more general limitations of evaluating standards in this fashion. Finally, the paper concludes with a summary and pointers to future work in Section 6.

Section snippets

Related work

Standards are crucial to achieving significant aspects of interoperability among diverse systems [9]. In this review, we focus on the evaluation of data-exchange standards. Specifically, the review briefly focuses on research that has considered the end-user usage of standards, as this is a core consideration for XML-DIUE. Subsequently, the review targets research evaluating XLIFF, as it is the subject standard for our illustrative case study.

The XML-DIUE framework

This section presents the XML Data Interoperability Usage Evaluation (XML-DIUE) framework.

The Parser-Repository-Analysis architecture proposed in the 1990s for reengineering toolkits [25], [26] is appropriate here, given the similarity of concerns: both involve static analysis (parsing) of structured documents, followed by the viewing of interesting information derived from parse-based analysis. The Parser-Repository-Analysis architecture (see Fig. 1) isolates the parsing components from the
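A Parser-Repository-Analysis pipeline of this kind might be sketched as follows. This is a hypothetical skeleton for illustration only; the class names, the SQLite fact table, and the query are our assumptions, not part of the XML-DIUE implementation:

```python
import sqlite3
from xml.etree import ElementTree


class Parser:
    """Parsing stage: turns an XML document into (element, attribute, value) facts."""

    def parse(self, source):
        root = ElementTree.fromstring(source)
        for node in root.iter():
            tag = node.tag.split('}')[-1]  # drop any namespace prefix
            yield (tag, None, None)        # one fact per element occurrence
            for attr, value in node.attrib.items():
                yield (tag, attr.split('}')[-1], value)


class Repository:
    """Storage stage: holds parse facts, decoupling parsing from later analyses."""

    def __init__(self):
        self.db = sqlite3.connect(':memory:')
        self.db.execute('CREATE TABLE facts (element TEXT, attribute TEXT, value TEXT)')

    def store(self, facts):
        self.db.executemany('INSERT INTO facts VALUES (?, ?, ?)', facts)


class Analysis:
    """Analysis stage: derives usage views without re-parsing any document."""

    def __init__(self, repo):
        self.repo = repo

    def element_frequencies(self):
        rows = self.repo.db.execute(
            'SELECT element, COUNT(*) FROM facts WHERE attribute IS NULL '
            'GROUP BY element ORDER BY COUNT(*) DESC')
        return dict(rows.fetchall())


repo = Repository()
repo.store(Parser().parse('<body><trans-unit id="1"/><trans-unit id="2"/></body>'))
freqs = Analysis(repo).element_frequencies()
```

The point of the separation is that new analyses (attribute-value distributions, co-occurrence patterns, and so on) can be added as further queries over the repository, while the corpus is parsed only once.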

An XLIFF study based on XML-DIUE

In this section we expand upon the case study used to illustrate the XML-DIUE framework. This study focuses on XLIFF, a data-exchange standard from the localization domain. Section 3 described how the data were captured for this case study. In this section we discuss the specific research questions for the case study, in the context of the representative research questions suggested for the XML-DIUE framework, and present the associated results and discussion.

Table 3 is a re-print

Limitations of the current framework

This paper has illustrated the potential of the XML-DIUE framework to help discover ‘bloat’ in standards' implementations and specifications. It has also shown its utility in identifying important features of standards that are core to successful interoperability.

However, there are several limitations of the current framework and the associated illustration using XLIFF. The external validity of our XLIFF study may be considered low, mainly due to the lack of representativeness of our

Conclusion and future work

This paper describes a novel empirical framework for the usage analysis of a corpus of standard XML documents. The paper illustrates the utility of the framework by focusing on usage of the XLIFF standard, and its effect on interoperability.

The research has shown the potential utility of XML-DIUE. It has illustrated how the framework can help discover some of the limitations of XML-based data-exchange standards as used, identifying possible improvements with regard to data-exchange

Acknowledgements

This research is supported by the Science Foundation Ireland (Grant 07/CE/I1142) as part of the Centre for Next Generation Localisation www.cngl.ie at the Localisation Research Centre (Department of Computer Science and Information Systems), University of Limerick, Limerick, Ireland. It was also supported, in part, by Science Foundation Ireland grant 10/CE/I1855 and Science Foundation Grant 12/IP/1351 to Lero - the Irish Software Engineering Research Centre www.lero.ie. Thanks to Dr. Ian

References (39)

  • J.A. Mykkänen et al.

    An evaluation and selection framework for interoperability standards

    Inf. Softw. Technol.

    (2008)
  • A. Geraci et al.

    IEEE standard computer dictionary: Compilation of IEEE standard computer glossaries

    (1991)
  • L. Weihua et al.

    Improve the semantic interoperability of information

  • Y. Savourel

    XML internationalization and localization, Sams White Book Series, Sams

  • S. Ray

    Healthcare interoperability – lessons learned from the manufacturing standards sector

  • D. Filip

    Focus: Standards and interoperability-the localization standards ecosystem

    Multilingual

    (2012)
  • L. Aouad et al.

    A view of future technologies and challenges for the automation of localisation processes: Visions and scenarios

  • A. Wasala et al.

    Towards an open source localisation orchestration framework

Tradumàtica: traducció i tecnologies de la informació i la comunicació

    (2011)
  • TAUS

    Lack of interoperability costs the translation industry a fortune

  • G. Lewis et al.

    Why standards are not enough to guarantee end-to-end interoperability

  • E. Soderstrom

    Casting the standards play – which are the roles?

  • S.L. Lelieveldt

    Standardizing retail payment instruments, Information technology standards and standardization: a global perspective

  • R. Shah et al.

    Interoperability challenges for open standards: ODF and OOXML as examples

  • R. Shah et al.

    Evaluating the interoperability of document formats: ODF and OOXML

  • Google, Web authoring statistics

  • XLIFF-TC, XLIFF 1.2 specification

  • R. Raya

    XML in localisation: A practical analysis

  • M. Bly

    XLIFF: Theory and reality: Lessons learned by Medtronic in 4 years of everyday XLIFF use

  • XLIFF-TC, XLIFF version 2.0


Dr. Asanka Wasala. Asanka Wasala is a Postdoctoral Researcher and the Lead Developer at the Localisation Research Centre at the University of Limerick, Ireland. He is also a voting member of the OASIS XML Localization Interchange File Format (XLIFF) Standard Technical Committee. He has published in many different areas, including Natural Language Processing, Software Localization, Data Exchange Standards, and Speech Processing. In 2004, Asanka graduated from the prestigious University of Colombo, Sri Lanka, receiving the best student award (gold medal) and being the only person to obtain a First Class qualification that year. After completing his BSc in Physical Science, he worked on the PAN Localization project as a Senior Research Assistant at the University of Colombo School of Computing, Sri Lanka. He received a full scholarship from Microsoft Ireland to pursue his Masters degree at the University of Limerick, before transferring to the PhD program in 2010. He completed his PhD thesis, on identifying the limitations of localization data exchange standards, in 2013. His thesis won the LRC Best Thesis Award (2013) and a prize from Microsoft Ireland.

Dr. Jim Buckley. Jim Buckley obtained an honours BSc degree in Biochemistry from the University of Galway in 1989. In 1994 he was awarded an MSc degree in Computer Science from the University of Limerick, and he followed this with a PhD in Computer Science from the same university in 2002. He currently works as a lecturer in the Computer Science and Information Systems Department at the University of Limerick, Ireland. His main research interests are in theories of information seeking, software reengineering and software maintenance. In this context, he has published actively in many peer-reviewed journals, conferences and workshops. His work has involved extensive collaboration with companies in the financial services sector and the flood-mapping sector, and with IBM, with whom he was a Visiting Scientist from 2006 to 2010. He was General Chair of WCRE 2011 and currently coordinates two research projects at the University: one in the area of software feature location and the other in architecture-centric re-engineering and evolution.

Mr. Reinhard Schäler. Reinhard Schäler has been involved in the localisation industry in a variety of roles since 1987. He is the founder and editor of Localisation Focus – The International Journal of Localisation, a founding editor of the Journal of Specialised Translation (JosTrans), a former member of the editorial board of Multilingual Computing (Oct 97 to Jan 07, covering 70 issues), a founder and CEO of The Institute of Localisation Professionals (TILP), and a member of OASIS. He has attracted more than €5.5 m in research funding and has published more than 50 articles, book chapters and conference papers on language technologies and localisation. He has been an invited speaker at EU and international government-organised conferences in Africa, the Middle East, South America and Asia. He is a Principal Investigator in the Centre for Next Generation Localisation (CNGL), a lecturer at the Department of Computer Science and Information Systems (CSIS), University of Limerick, and the founder and director of the Localisation Research Centre (LRC) at UL, established in 1995. In 2009, he established The Rosetta Foundation and the Dynamic Coalition for a Global Localization Platform: Localization4all, under the umbrella of the UN's Internet Governance Forum.

Dr. Chris Exton. Chris Exton is currently a lecturer in the Department of Computer Science and Information Systems at the University of Limerick and is the Research Director of the Localisation Research Centre. He holds a B.Sc. in Psychology and a Ph.D. in Computer Science from Monash University, Melbourne. He has worked extensively in commercial software development across a variety of industries and countries, including Software Engineering positions in Australia, Ireland and the UK, where he has worked in a number of diverse companies, from electronic manufacturing to banking. In addition, his academic positions include posts at a number of schools and departments, including Monash University, Australia; University College Dublin; and Uppsala University, Sweden. He has worked on a number of research projects in the areas of crowdsourcing, programmer psychology and software tools, and more recently in the areas of software localisation and medical decision support systems. He has researched and published in these areas for over 15 years, as well as taking an active role as a reviewer for the International Journal of Software Maintenance and Evolution and The Software Quality Journal.
