Abstract
Data standards are a powerful, real-world tool for enterprise interoperability, yet there exists no well-grounded methodology for selecting among alternative standards approaches. We focus on a specific sub-problem within a community’s data sharing challenge and identify four major standards-based approaches to that task. We present characteristics of a data sharing community that one should consider in selecting a standards approach—such as relative power, motivation level, and technical sophistication of different participants—and illustrate with real-world examples. These characteristics and other factors are then analyzed to develop decision rules for selecting among the four approaches. Independent of the data exchange problem, we suggest two general practices in choosing a standards approach: (1) vertical decomposition of interoperability issues, in order to define a narrow, formal, tractable problem, and (2) option-exclusion rules, as they are much simpler than stating optimal-choice rules.
Notes
The list is not exhaustive, but seems to cover the cases we have seen in practice.
It is not sufficient that the source collects information that is vaguely similar, and given the same name. For example, doctors might report DrugDosage or Exercise as quantity prescribed, while safety researchers want to know the actual amount consumed or performed. As another example, car manufacturers advertise list price, but economists and shopping sites want to know actual sale prices.
A data exchange standard might make certain data mandatory, but this compels sources that currently lack that information to reject the standard, losing its benefits for the data they do have. Incentives and decisions to capture more information should be separate from the exchange standard.
http://niem.gov. NIEM is widely used for data exchanges across agencies of the U.S. government.
For example, for coding a patient’s problems, some ICD-9 values lack distinctions needed for ICD-10 (was an arm injury to the left or right arm?), and vice versa. As another example, transforms may be impossible for certain financial data that public companies report in the eXtensible Business Reporting Language (XBRL) using the Generally Accepted Accounting Principles (GAAP) Taxonomy (Zhu and Wu 2011a).
The Enriched idea also makes sense with flexible encoding, but is omitted in order to avoid obscuring the main arguments.
The story is symmetric for the sub-transform from message schema to consumer. Note that matchings are directional; in addition to “same as”, one can have “is usable for”, a kind of generalization.
The tool limitations described in the previous section still apply. In fact, mature tools comparable to InfoSphere seem some distance off; the ontology community has not adapted the subtle transform semantics pioneered in (Fagin et al. 2003) when multiple producer objects map to multiple consumer objects.
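The directional matchings mentioned in the notes above ("same as" versus "is usable for") can be illustrated with a minimal Python sketch. It is not taken from the article or from any matching tool: the `Match` class, the `can_populate` helper, and element names such as `AdvertisedPrice` and `LeftArmInjury` are hypothetical, chosen to echo the list-price and arm-injury examples in the notes.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    SAME_AS = "same as"              # symmetric: usable in both directions
    USABLE_FOR = "is usable for"     # directional: a kind of generalization


@dataclass(frozen=True)
class Match:
    source: str        # element that supplies the data
    target: str        # element that receives the data
    relation: Relation


def can_populate(matches, source, target):
    """Return True if data held in `source` may be used to fill `target`."""
    for m in matches:
        # A match stated in this direction always applies.
        if m.source == source and m.target == target:
            return True
        # Only "same as" may also be read in the reverse direction.
        if (m.relation is Relation.SAME_AS
                and m.source == target and m.target == source):
            return True
    return False


# Hypothetical matchings between a producer and a message schema.
matches = [
    Match("ListPrice", "AdvertisedPrice", Relation.SAME_AS),
    Match("LeftArmInjury", "ArmInjury", Relation.USABLE_FOR),
]
```

Under these assumptions, a specific value such as `LeftArmInjury` can fill the more general `ArmInjury`, but the reverse lookup is refused, because the general value lacks the left/right distinction the specific element needs.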
References
Bernstein PA, Haas LM (2008) Information integration in the enterprise. Commun ACM 51(9):72–79
Bernstein PA, Melnik S, Quix C, Petropoulos M (2004) Industrial-strength schema matching. ACM SIGMOD Rec 33(4):38–43
Boh WF, Soh C, Yeo S (2007) Standards development and diffusion: a case study of RosettaNet. Commun ACM 50(12):57–62
Doan A, Domingos P, Halevy A (2003) Learning to match schemas of data sources: a multistrategy approach. Mach Learn 50(3):279–301
Fagin R, Kolaitis P, Miller RJ, Popa L (2003) Data exchange: semantics and query answering. In: 9th International conference on database theory. Springer, Siena, Italy, pp 207–224
Folmer EJA (2012) Quality of semantic standards. PhD thesis, University of Twente. CTIT PhD-thesis series No 13-251. ISBN 978-90-365-3323-2
Folmer E, Luttighuis PO, van Hillegersberg J (2011) Do semantic standards lack quality? A survey among 34 semantic standards. Electron Mark 21(2):99–111
Goh CH, Bressan S, Madnick S, Siegel M (1999) Context interchange: new features and formalisms for the intelligent integration of information. ACM TOIS 17(3):270–293
Hart R, Saunders C (1997) Power and trust: critical factors in the adoption and use of electronic data interchange. Organ Sci 8(1):23–42
Helmer KG, Ambite JL, Ames J, Ananthakrishnan R, Burns G, Chervenak AL, Foster I, Liming L, Keator D, Macciardi F, Madduri R, Navarro JP, Potkin S, Rosen B, Ruffins S, Schuler R, Turner JA, Toga A, Williams C, Kesselman C (2011) Enabling collaborative research using the biomedical informatics research network (BIRN). J Am Med Inform Assoc 18(4):416–422
Madnick SE, Zhu H (2006) Improving data quality with effective use of data semantics. Data Knowl Eng 59(2):460–475
Markus ML (1983) Power, politics, and MIS implementation. Commun ACM 26(6):430–444
Markus ML, Steinfield CW, Wigand RT, Minton G (2006) Industry-wide information systems standardization as collective action: the case of the U.S. residential mortgage industry. MIS Q 30(Special Issue):439–465
Martinelly CD, Riane F, Guinet A (2009) A Porter-SCOR modelling approach for the hospital supply chain. Int J Logist Syst Manag 5(3–4):436–456
Miller R, Hernandez M, Haas L, Yan L, Ho C, Fagin R, Popa L (2001) The Clio project: managing heterogeneity. SIGMOD Rec 30(1):78–83
Osborn CS, Madnick SE, Wang RY (1990) Motivating strategic alliance for composite information systems: the case of a major regional hospital. J Manag Inf Syst 6(3):99–117
Polites GL, Karahanna E (2012) Shackled to the status quo: the inhibiting effects of incumbent system habit, switching costs, and inertia on new system acceptance. MIS Q 36(1):21–42
Rahm E, Bernstein PA (2001) A survey of approaches to automatic schema matching. VLDB J 10(4):334–350
Rosenthal A, Seligman L, Renner S (2004) From semantic integration to semantics management: case studies and a way forward. ACM SIGMOD Rec 33(4):44–50
Rosenthal A, Seligman L, Blaustein B (2006) Beyond the sandbox: how integration researchers can actually help integration. In: Workshop on information integration. University of Pennsylvania
Sarma AD, Dong X, Halevy A (2008) Bootstrapping pay-as-you-go data integration systems. In: ACM SIGMOD international conference on management of data (SIGMOD ’08). ACM, Vancouver, pp 861–874
Sciore E, Siegel M, Rosenthal A (1994) Using semantic values to facilitate interoperability among heterogeneous information systems. ACM TODS 19(2):254–290
Seligman L, Rosenthal A, Lehner P, Smith A (2002) Data integration: where does the time go? IEEE Bull Tech Comm Data Eng 25(3):3–10
Shvaiko P, Euzenat J (2005) A survey of schema-based matching approaches. In: Spaccapietra S (ed) Journal on data semantics IV (LNCS 3730). Springer, Berlin, pp 146–171
Wigand RT, Steinfield CW, Markus ML (2005) Information technology standards choices and industry structure outcomes: the case of the U.S. home mortgage industry. J Manag Inf Syst 22(2):165–192
Xia M, Zhao K, Mahoney JT (2012) Enhancing value via cooperation: firms’ process benefits from participation in a standard consortium. Ind Corp Change 21(3):699–729
Zhao K, Xia M, Shaw MJ (2005) Vertical e-business standards and standards developing organizations: a conceptual framework. Electron Mark 15(4):289–300
Zhao K, Xia M, Shaw MJ (2007) An integrated model of consortium-based e-business standardization: collaborative development and adoption with network externalities. J Manag Inf Syst 23(4):247–271
Zhao K, Khan SS, Xia M (2011a) Sustainability of vertical standard consortia as communities of practice—a multi-level framework. Int J Electron Commer 16(1):11–40
Zhao K, Xia M, Shaw MJ (2011b) What motivates firms to contribute to consortium-based e-business standardizations? J Manag Inf Syst 28(2):305–334
Zhu H, Wu H (2011a) Interoperability of XBRL financial statements in the U.S. Int J E-Bus Res 7(2):18–33
Zhu H, Wu H (2011b) Quality of data standards: framework and illustration using XBRL taxonomy and instances. Electron Mark 21(2):129–139
Acknowledgments
We thank Kim Warren and the MITRE Innovation Program for funding this effort. We also thank Rob McCready for helpful comments on health IT data standards.
About this article
Cite this article
Rosenthal, A., Seligman, L., Allen, M.D. et al. Fit for purpose: engineering principles for selecting an appropriate type of data exchange standard. Inf Syst E-Bus Manage 12, 495–515 (2014). https://doi.org/10.1007/s10257-014-0238-3
DOI: https://doi.org/10.1007/s10257-014-0238-3