Decomposing relationship types by pivoting and schema equivalence
Introduction
Database modeling aims at designing efficient and appropriate databases. It is now widely accepted that databases are best designed first on a conceptual level. The result of this process is a conceptual schema which describes the requirements that the desired database must achieve. Usually, conceptual design does not consist of a single design step, but of a step-by-step process. Each step refines the schema, for example by adding information, changing the structure of the schema, refining types or generating subtypes. Composition, extension and decomposition are well-known operations to refine types during the design process.
In this paper, we are almost exclusively concerned with decomposition. In the relational model, this transformation has been successfully used for reducing redundancy, splitting overloaded concepts or making constraint enforcement more efficient. There is a fairly extensive literature on this strategy, to which the reader may wish to turn for details, e.g. [2], [24], [33]. The same idea may be used in semantic data models which are frequently used in conceptual design. Decomposition primitives in the entity-relationship model have been discussed in [2], [3], [31].
A new approach towards decomposition in object-oriented and semantic data models was suggested by Biskup et al. [5], [6]. They introduced the operation of pivoting for decomposing relationship types. Pivoting separates some components of a relationship type and assembles them in a newly generated type. Afterwards, the new type is linked to the old one via a pivot component.
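The effect of pivoting on a database instance can be illustrated with a minimal Python sketch. It is not the formal definition used in the paper: the representation of relationships as sets of (component, value) pairs, the function names `pivot` and `unpivot`, and the example components (`Course`, `Lecturer`, `Room`, `Term`) are all assumptions made for illustration only.

```python
from typing import FrozenSet, Set, Tuple

# Hypothetical representation: a relationship is a frozen set of
# (component name, value) pairs.
Relationship = FrozenSet[Tuple[str, str]]


def rel(**components: str) -> Relationship:
    """Build a relationship from keyword arguments (illustrative helper)."""
    return frozenset(components.items())


def pivot(instance: Set[Relationship], s: Set[str]):
    """Split every relationship into its S-part and the remaining part.

    Returns the instance of the newly generated type (the S-parts) and the
    instance of the reduced old type, where each reduced relationship keeps
    a pivot component that refers to its S-part.
    """
    new_type = set()
    pivoted = set()
    for r in instance:
        s_part = frozenset((c, v) for c, v in r if c in s)
        rest = frozenset((c, v) for c, v in r if c not in s)
        new_type.add(s_part)
        pivoted.add((rest, s_part))  # s_part plays the role of the pivot
    return new_type, pivoted


def unpivot(pivoted) -> Set[Relationship]:
    """Rebuild the original relationships by following the pivot component."""
    return {rest | s_part for rest, s_part in pivoted}


# Made-up example data.
instance = {
    rel(Course="DB", Lecturer="Ann", Room="R1", Term="S1"),
    rel(Course="DB", Lecturer="Ann", Room="R2", Term="S2"),
    rel(Course="AI", Lecturer="Bob", Room="R1", Term="S2"),
}
new_type, pivoted = pivot(instance, {"Course", "Lecturer"})
restored = unpivot(pivoted)
```

In this sketch the repeated pair (`DB`, `Ann`) is stored only once in the new type, and `unpivot` recovers the original instance, which mirrors the information-preserving behaviour claimed for pivoting.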
In the relational case, decomposition is usually motivated by functional dependency. In particular, normalization theory up to BCNF is based on this idea. In semantic data models, however, this concept falls short. During the last few decades, the area of integrity constraints has attracted considerable research interest. A large number of different constraint classes have been discussed in the literature and actually used in database design.
Cardinality constraints are among the most popular classes of integrity constraints. They impose lower and upper bounds on the number of relationships an entity of a given type may be involved in. For cardinality constraints and generalizations, see [22], [31]. The objective of this paper is to study pivoting in the presence of cardinality constraints. Our investigation is not restricted to ordinary cardinality constraints, but we also consider co-occurrence constraints, which include functional and numerical dependencies.
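As an informal illustration of the lower and upper bounds described above, the following Python sketch checks a single cardinality constraint against a small instance. The function name `satisfies_cardinality`, the dictionary-based representation of relationships, and the example data are assumptions for illustration and do not reproduce the paper's formal notation.

```python
from collections import Counter
from math import inf


def satisfies_cardinality(entities, relationships, component, a, b):
    """Check that every entity occurs between a and b times (inclusive)
    in the given component of the relationships.

    a is a non-negative integer; b is a non-negative integer or math.inf.
    """
    counts = Counter(r[component] for r in relationships)
    # Counter returns 0 for entities that take part in no relationship.
    return all(a <= counts[e] <= b for e in entities)


# Made-up example: lecturers and the lectures they are involved in.
lecturers = {"Ann", "Bob", "Cid"}
lectures = [
    {"Lecturer": "Ann", "Course": "DB"},
    {"Lecturer": "Ann", "Course": "AI"},
    {"Lecturer": "Bob", "Course": "OS"},
]

ok_0_2 = satisfies_cardinality(lecturers, lectures, "Lecturer", 0, 2)
ok_1_inf = satisfies_cardinality(lecturers, lectures, "Lecturer", 1, inf)
ok_0_1 = satisfies_cardinality(lecturers, lectures, "Lecturer", 0, 1)
```

Here the bounds (0, 2) hold, while (1, ∞) fails because `Cid` takes part in no lecture and (0, 1) fails because `Ann` takes part in two.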
The paper is organized as follows. Section 2 describes the data model to be used. Though cardinality constraints have also been studied in an object-oriented context, they are most popular in semantic data models. Therefore, all our considerations are carried out in an extended entity-relationship context. In Section 3 we briefly review decomposition under functional dependencies and highlight the idea of pivoting. Section 4 gives a formal definition of cardinality constraints. In Section 5 we give a generalized definition of pivoting, and in Section 6 we show how to use pivoting under cardinality constraints.
The major objective of this paper is to discuss the equivalence of the pivoted schema with the original schema. It turns out that decomposition by pivoting is always information preserving, that is, lossless. In Section 7 we discuss the equally important problem of preserving constraints. We point out that the usual concept of cardinality constraints is not sufficient to cope with this problem. Finally, in Section 8, we identify conditions for a schema being equivalent to its pivoted version, namely the expressive power of path cardinality constraints.
Section snippets
Database schemas and instances
We start with a brief outline of the semantic data model to be used throughout this paper. The entity-relationship approach to conceptual design provides a simple way of representing data in the form of entities and relationships among them. In this section, we shall review basic structural aspects of this approach. For excellent surveys on entity-relationship modeling, see [24], [31].
Entities and relationships are objects that are stored in a database. Intuitively, entities may be seen as
N-ary relationship types
Potential users of information systems usually express their system requirements in natural language. The abstraction of the requirements specification results in a first proposal of a conceptual schema which reflects the most important concepts and relationships. This first schema is the basis for further communication with the users and for step-by-step refinement.
In our experience, requirements engineering provides the best results if the users are allowed to express their application
Cardinality constraints
Cardinality constraints are often regarded as one of the basic constituents of the entity-relationship model. They are already present in Chen's seminal paper [8], and have been frequently used in conceptual design since then. Let be a link in the database schema D. A cardinality constraint is an expression , where a is a non-negative integer and b is a non-negative integer or ∞. A database instance Dt satisfies this constraint if every instance u of type plays the role
Equivalence of schemas
Originally [5], [6], the operation of pivoting requires a unary functional dependency specified on the relationship type to be decomposed. To get rid of this prerequisite, we redefine the notion of pivoting. As we shall see later on, our concept of pivoting will further extend the original ideas of [5].
Suppose we are given a database schema D containing a relationship type with component set .
Let S be a subset of , say , where 2⩽m<n. We define a new relationship
Pivoting under cardinality constraints
In this section, we discuss pivoting in the presence of cardinality constraints. We present three cases when pivoting is strongly recommended. The equivalence of the original schema and the pivoted schema can be ensured in all three cases similarly to Lemma 1.
• First case: Assume we are given a co-occurrence constraint with b⩾2. Here, we suggest choosing S equal to X. In the pivoted schema, the original co-occurrence constraint may be expressed by the participation constraint
Preserving cardinality constraints
As discussed above, the original schema and its pivoted version should be equivalent. For the designer, this requirement becomes something of a problem. We defined equivalence in terms of the instances of schemas, but these instances are not explicitly available during the design process. Nevertheless, the designer has to decide whether pivoting is useful or not. Evidently, the designer has to be supported in coping with this task.
The following theorems help us to decide whether the
Path cardinality constraints
In order to cope with the problem of insufficient constraint-preservation, we have to strengthen the concept of cardinality constraint. A (directed) path is a sequence of links in the entity-relationship diagram of a schema. Here, is the initial vertex of the path, and is its terminal vertex. The integer k denotes the length of the path. Thus, paths of length 1 are just links. For standard graph-theoretic terminology, we refer to [15].
So far,
Concluding remarks
Pivoting was suggested in [5] for breaking up a relationship type on the semantic level. It generalizes the well-known decomposition of a relation type guided by a functional dependency. We continued this study by investigating pivoting in the presence of cardinality constraints. In Section 5, we defined a suitable notion of pivoting in an extended entity-relationship model. This is of interest as the entity-relationship model is not only popular as an intuitive tool for conceptual modeling,
References (34)
English sentence structure and entity-relationship diagrams. Inf. Sci. (1983)
Data abstractions: why and how? Data Knowledge Eng. (1999)
Inferences for numerical dependencies. Theoret. Comput. Sci. (1985)
Analysis of binary/ternary cardinality combinations in entity-relationship modeling. Data Knowledge Eng. (1996)
On the satisfiability of dependency constraints in entity-relationship schemata. Inf. Syst. (1990)
Cardinality constraints in semantic data models. Data Knowledge Eng. (1993)
Dependency preserving refinements and the fundamental problem of database design. Data Knowledge Eng. (1998)
Complete rules for n-ary relationship cardinality constraints. Data Knowledge Eng. (1998)
Dependency structures of database relationship. Inf. Process. (1974)
Improving quality in conceptual modelling by the use of schema transformations
Database Design with the ER Model
Transforming an entity-relationship schema into object-oriented database schemas
The entity-relationship model: towards a unified view of data. ACM Trans. Database Syst.
A relational model of data for large shared data banks. Commun. ACM
Sven Hartmann studied Mathematics and Business Sciences in Rostock (Germany) and Grenoble (France). He received his Ph.D. in Mathematics from the University of Rostock in 1996. Currently, he is Assistant Professor at the University of Rostock. His research interests include database semantics, conceptual modeling, design theory, combinatorial and non-linear optimization, neural networks and their application to image recognition, data retrieval and structure biology. He is a member of DMV and MO.