Abstract
Industry practitioners increasingly develop and use taxonomies to facilitate information interoperability and retrieval. Within a single industrial domain, many taxonomies exist, each intended for a different application. Industry-specific taxonomies often represent the vocabularies commonly used by practitioners, whose multi-faceted jobs include checking for code and regulatory compliance. It is therefore desirable for practitioners to be able to locate and browse regulations of interest easily. In practice, multiple sources of government regulations exist, and they are often organized and classified according to the needs of the issuing agencies that enforce them rather than the needs of the communities that use them. One way to bridge these two distinct needs is to develop methods and tools that enable practitioners to browse and retrieve government regulations using their own terms and vocabularies, for example, via existing industry taxonomies. Mapping a single taxonomy to a single regulation is a trivial keyword matching task. We examine a relatedness analysis approach for mapping a single taxonomy to multiple regulations. We then present an approach for mapping multiple taxonomies to a single regulation by measuring the relatedness of concepts. Cosine similarity, the Jaccard coefficient, and market basket analysis are used to measure the semantic relatedness between concepts from two different taxonomies. Preliminary evaluations of the three relatedness measures are performed using examples from the civil engineering and building industry. These examples illustrate the potential benefits to regulatory usage of mapping between various taxonomies and regulations.
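As a rough illustration of the three relatedness measures named above, the following minimal sketch treats each concept as the set of regulation sections in which it occurs. This set-based formulation, the function names, and the sample data are assumptions made for illustration; the abstract does not state the exact formulation used in the article, which may differ (for example, by weighting term frequencies). Market basket analysis is represented here only by its confidence measure.

```python
from math import sqrt


def cosine(a: set, b: set) -> float:
    """Cosine similarity of two binary occurrence vectors, given as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient: shared sections over all sections touched."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def confidence(a: set, b: set) -> float:
    """Market-basket confidence of the rule a -> b.

    Unlike the other two measures, this one is asymmetric.
    """
    return len(a & b) / len(a) if a else 0.0


# Hypothetical data: concept -> IDs of regulation sections mentioning it.
occurrences = {
    "stair": {"s1", "s2", "s3", "s4"},
    "handrail": {"s2", "s3", "s4", "s5"},
    "sprinkler": {"s6"},
}

stair = occurrences["stair"]
handrail = occurrences["handrail"]
sprinkler = occurrences["sprinkler"]

print(cosine(stair, handrail))      # 3 shared sections, 4 sections each
print(jaccard(stair, handrail))     # 3 shared out of 5 distinct sections
print(confidence(stair, handrail))  # 3 of the 4 "stair" sections
print(cosine(stair, sprinkler))     # no shared sections
```

In this sketch, concepts from two different taxonomies would be considered related when they tend to appear in the same regulation sections, so the regulation corpus acts as the bridge between the taxonomies.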
Acknowledgments
The authors would like to acknowledge the support of the US National Science Foundation, Grant Nos. CMS-0601167 and IIS-0811460, the Center for Integrated Facility Engineering (CIFE) at Stanford University, and the Enterprise Systems Group at the National Institute of Standards and Technology (NIST). The authors would also like to thank the International Code Council (ICC) for providing the XML version of the International Building Code 2006. Any opinions and findings are those of the authors and do not necessarily reflect the views of NSF, CIFE, NIST, or ICC. No approval or endorsement of any commercial product by NIST, NSF, ICC, or Stanford University is intended or implied.
Cite this article
Cheng, C.P., Lau, G.T., Law, K.H. et al. Regulation retrieval using industry specific taxonomies. Artif Intell Law 16, 277–303 (2008). https://doi.org/10.1007/s10506-008-9065-5
DOI: https://doi.org/10.1007/s10506-008-9065-5