An XML-based methodology for parametric temporal database model implementation

https://doi.org/10.1016/j.jss.2007.08.018

Abstract

The parametric data model is one of the dimensional data models. It defines attributes as functions, modeling a real-world object as a single tuple in a database. This one-to-one correspondence between a real-world object and a tuple offers several advantages in modeling dimensional data, notably avoiding the self-joins that frequently appear in temporal data models that fragment an object into multiple tuples. Despite these modeling advantages, it is impractical to implement the parametric data model on top of conventional database systems because of the data model’s variable attribute sizes. However, this implementation challenge can be resolved with XML because XML is flexible with respect to data boundaries. In this paper, we present an XML-based implementation methodology for the parametric temporal data model. In our implementation, we develop our own XML storage called CanStoreX (Canonical Storage for XML) and build the temporal database system on top of CanStoreX.

Introduction

Conventional databases are only capable of storing and querying the current perception of reality and the relationships among objects; such database systems discard old data values when the data is updated. Unlike conventional databases, a temporal database is capable of storing the evolution of data, thereby allowing users to examine complete object histories (Bhargava, 1989). Applications of temporal databases include finance, organizational record-keeping, and scientific applications (Jensen and Snodgrass, 1999).

Temporal databases are an active research area in the database community, and a tremendous amount of research has been done. According to Jensen and Snodgrass’s survey (1999), more than 2000 research papers on temporal databases had been published. There are many different types of temporal data models, each with its own merits in specific applications.

In the temporal database literature, we can find three types of timestamps—instants, intervals, and temporal elements. Based on timestamps, temporal data models are termed point-based models, interval-based models, and temporal element-based models, respectively.

It is important to emphasize that only temporal elements are legitimate domains of objects and events in the real world. However, a temporal element cannot be represented using a fixed length because it can have any number of intervals. Therefore, in many approaches, intervals or instants are used to timestamp fragments of objects that are stored in multiple fixed-length tuples. Typically, such timestamps are attached at tuple level rather than value level.
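To make the variable-size issue concrete, the following Python sketch represents a temporal element as a finite union of disjoint intervals. The class name `TemporalElement`, the half-open integer intervals, and the method names are illustrative assumptions and are not taken from the paper.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end); an assumption of this sketch


def normalize(intervals: Iterable[Interval]) -> List[Interval]:
    """Sort intervals, drop empty ones, and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(i for i in intervals if i[0] < i[1]):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


class TemporalElement:
    """A finite union of intervals (hypothetical helper, not the paper's API)."""

    def __init__(self, intervals: Iterable[Interval] = ()):
        self.intervals = normalize(intervals)

    def union(self, other: "TemporalElement") -> "TemporalElement":
        return TemporalElement(self.intervals + other.intervals)

    def intersection(self, other: "TemporalElement") -> "TemporalElement":
        pieces = []
        for a_start, a_end in self.intervals:
            for b_start, b_end in other.intervals:
                start, end = max(a_start, b_start), min(a_end, b_end)
                if start < end:
                    pieces.append((start, end))
        return TemporalElement(pieces)

    def __repr__(self) -> str:
        return f"TemporalElement({self.intervals})"


employed = TemporalElement([(1, 5), (8, 12)])
in_toys = TemporalElement([(3, 10)])
print(employed.union(in_toys))         # TemporalElement([(1, 12)])
print(employed.intersection(in_toys))  # TemporalElement([(3, 5), (8, 10)])
```

The number of intervals in the result depends on the operands, which is why a temporal element does not fit a fixed-length field.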

In our implementation, we consider the parametric temporal data model, which uses temporal elements for timestamps. The parametric temporal data model defines attributes as functions of time. Because of this, it can model a real-world object as a single tuple in a database. This one-to-one correspondence between an object and a tuple avoids the self-joins that are inevitable in temporal data models that fragment an object into multiple tuples. However, this modeling advantage increases the implementation complexity because of variable attribute sizes. In particular, it is difficult to adapt the parametric temporal data model to relational database systems, which are the most popular platform for temporal database implementation.
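As a rough illustration of attributes as functions of time, the Python sketch below stores each attribute piecewise as (temporal element, value) pairs, so that one dictionary corresponds to one employee. The data layout, the helper names `value_at` and `fragment`, and all salary figures are assumptions made up for this example.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in years; an assumption
# A temporal attribute is a function of time, stored piecewise as
# (temporal element, value) pairs. This layout is hypothetical.
Attribute = List[Tuple[List[Interval], object]]


def value_at(attribute: Attribute, instant: int):
    """Return the attribute's value at one instant, or None if undefined."""
    for element, value in attribute:
        if any(start <= instant < end for start, end in element):
            return value
    return None


def fragment(name: str, attribute: Attribute) -> List[Tuple[str, int, int, object]]:
    """Flatten one attribute into fixed-length, interval-stamped rows,
    roughly as a tuple-timestamped model would store it."""
    return [(name, start, end, value)
            for element, value in attribute
            for start, end in element]


# Made-up data: one tuple (dictionary) per employee.
john: Dict[str, Attribute] = {
    "Name": [([(2000, 2004)], "John")],
    "Salary": [([(2000, 2001)], 30000), ([(2001, 2002)], 31000),
               ([(2002, 2003)], 32000), ([(2003, 2004)], 33000)],
}
tom: Dict[str, Attribute] = {
    "Name": [([(2000, 2004)], "Tom")],
    "Salary": [([(2000, 2004)], 30000)],
}

print(value_at(john["Salary"], 2002))        # 32000
print(len(fragment("John", john["Salary"])))  # 4 rows in a fragmented model
print(len(fragment("Tom", tom["Salary"])))    # 1 row in a fragmented model
```

Both employees remain single tuples here, but their `Salary` attributes have different sizes, which is the property that fixed-length relational storage handles poorly.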

To meet the implementation challenge, we use XML and develop our own XML-based storage technology called CanStoreX (Canonical Storage for XML). CanStoreX is capable of paginating large XML data into small pages. Many artifacts, such as parse trees and expression trees, are also represented in XML, which makes them human-readable and more reliable by removing the reliance on traditional linked list-based trees. We also show how to utilize XML in developing a query execution engine. The primitives for the query language are implemented using the DOM parser. We use nine queries to validate the implemented system.
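The idea of an expression tree that is itself an XML document, evaluated by walking it with a DOM parser, can be sketched as follows. The element names `union`, `intersect`, and `interval` are hypothetical and do not reproduce the system's actual schema; for brevity, a temporal element is simplified to a set of integer instants. The sketch uses Python's standard `xml.dom.minidom` module.

```python
from xml.dom import minidom

# A hypothetical expression tree for a temporal-element expression.
EXPRESSION_XML = """
<intersect>
  <union>
    <interval start="1" end="5"/>
    <interval start="8" end="12"/>
  </union>
  <interval start="3" end="10"/>
</intersect>
""".strip()


def child_elements(node):
    """Return only the element children, skipping whitespace text nodes."""
    return [c for c in node.childNodes
            if c.nodeType == minidom.Node.ELEMENT_NODE]


def evaluate(node):
    """Recursively evaluate an expression-tree node to a set of instants."""
    if node.tagName == "interval":
        start = int(node.getAttribute("start"))
        end = int(node.getAttribute("end"))
        return set(range(start, end))  # half-open [start, end)
    operands = [evaluate(child) for child in child_elements(node)]
    if node.tagName == "union":
        return set().union(*operands)
    if node.tagName == "intersect":
        return set.intersection(*operands)
    raise ValueError(f"unknown operator: {node.tagName}")


root = minidom.parseString(EXPRESSION_XML).documentElement
result = evaluate(root)
print(sorted(result))  # [3, 4, 8, 9]
```

Because the tree is plain XML, it can be inspected or logged as text at any stage, which is one way to read the human-readability claim above.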

Our XML-based implementation approach differs from other approaches in that we do not use existing conventional database systems. It is a native temporal database system built on top of our own storage technology. Furthermore, our approach overcomes the implementation challenges of the parametric temporal data model, which have been debated since the mid-1980s. We therefore expect that our implementation paradigm offers an elegant solution for implementing complex data models that are difficult to build on existing database systems.

The organization of this paper is as follows. Section 2 briefly discusses various approaches to implementing temporal database systems. Section 3 discusses XML’s usability and some possible directions for the parametric temporal data model implementation. Section 4 discusses the parametric temporal data model. Section 5 introduces ParaSQL—the query language of the parametric temporal data model. Section 6 explains our XML-based implementation methodology. Section 7 discusses the system performance for a suite of ParaSQL queries. Section 8 summarizes our findings.

Section snippets

Related work

Relational databases are the most popular commercial databases and are frequently used to implement temporal databases. Major relational database vendors have introduced technologies that allow their systems to be extended with software plug-ins developed by users or third-party vendors, such as Informix’s DataBlades, DB2’s Extenders, and Oracle’s Cartridges.

Yang et al. (2000) introduced TIP (Temporal Information Processor) that is an extension of the Informix database system based on its DataBlade

XML for the parametric temporal data model

Defining attributes as functions leads to a one-to-one correspondence between an object in the real world and a tuple in a database. This property of the data model, however, makes it impractical to utilize existing databases, including their storage technologies. For example, suppose that John’s salary has increased by $1000 every year while Tom’s salary has remained the same. Although they are in the same relation, their salary attributes have different sizes, leading to different tuple sizes. In

Parametric temporal data model

In this section, we will discuss the general concept of the parametric temporal data model. In the parametric data model, attributes are defined as functions and temporal elements are used to represent domains of functions.

Fig. 1 shows an example of a temporal parametric relation whose attributes are Name, Salary, and DName (department name). The relation maintains the history of employees. Based on the example, we will discuss the concept of temporal elements, temporal attribute values,

Parametric structured query language

There are three types of expressions in ParaSQL: relational expressions, domain expressions, and boolean expressions. They evaluate to relations, temporal elements, and boolean values, respectively. These three expressions are mutually recursive; in other words, one expression can be nested within another. In this section, we will discuss the three expressions with simplified BNF forms, user friendliness (Gadia and Nair, 1993, Gadia and Chopra, 1993), and examples of ParaSQL queries.

System architecture

In this section, we will discuss the system architecture of an XML-based temporal database system. The database storage and all internal exchange of data are XML-based. Fig. 6 shows the three-layer system architecture of the framework (Noh and Gadia, 2005, Noh and Gadia, 2004). Layer-1 handles ParaSQL queries: it parses queries and generates expression trees. Layer-2 executes expression trees using the DOM API. Parse trees and expression trees are all XML documents. Layer-3 is a storage manager

Performance evaluation

In this section, we will discuss the test data configuration, the proposed XML schema’s effectiveness, and experimental results for the nine ParaSQL queries.

Conclusion

In this paper, we have introduced an implementation methodology for the parametric temporal data model. Our approach is different from other approaches because we have developed our own storage technology and query language, rather than using existing storage technologies including relational, object-oriented, and native XML database systems.

In addition to XML’s usability in modeling dimensional data, we have noted that parse and expression trees can be represented in XML. Since XML models

Acknowledgement

We would like to thank the anonymous reviewers for their valuable comments on this paper.

Seo-Young Noh received the Ph.D. degree in computer science from Iowa State University, USA, in 2006. He is currently a senior research engineer at the LG Electronics Institute of Technology. His research interests include spatiotemporal databases, XML, database system implementation, embedded database systems, and natural language processing in database systems.

References (26)

  • Noh, S.-Y., et al., 2006. A comparison of two approaches to utilizing XML in parametric databases for temporal data. Information and Software Technology.
  • Apache Foundation, 2004, Apache Xindice. URL<http://xml.apache.org/xindice/>...
  • Barbosa, D., Mendelzon, A.O., Keenleyside, J., Lyons, K.A., 2002. ToXgene: an extensible template-based data generator...
  • Bertino, E., Bevilacqua, M., Ferrari, E., Guerrini, G., 1997. Approaches to Handling Temporal Data in Object-oriented...
  • Bhargava, G., 1989. A 2-dimensional Temporal Relational Database Model for Querying Errors and Updates, and for...
  • Chen, C.X., Kong, J., Zaniolo, C., 2003. Design and implementation of a temporal extension of SQL. In: Proceedings of...
  • Gadia, S.K., 1988. A homogeneous relational model and query languages for temporal databases. ACM Transactions on Database Systems.
  • Gadia, S.K., et al. A relational model and SQL-like query language for spatial databases.
  • Gadia, S.K., et al. Temporal databases: a prelude to parametric data.
  • Gadia, S.K., et al., 1998. Algebraic identities and query optimization in a parametric model for relational temporal databases. IEEE Transactions on Knowledge and Data Engineering.
  • Gadia, S.K., Vaishnav, J.H., 1985. A query language for a homogeneous temporal database. In: Proceedings of the Fourth...
  • Hubler, P., Edelweiss, N., 2000. Implementing a temporal database on top of a conventional database: Mapping of the...
  • IBM XML Generator, XML generator. URL http://www.alphaworks.ibm.com/tech/xmlgenerator/ (December...
This work has been sponsored in part by the Information Infrastructure Institute and a grant from the Baker Foundation at Iowa State University.
