Guidelines for using UML association classes and their effect on domain understanding in requirements engineering

  • Original Article
Requirements Engineering

Abstract

The analysis and description of the application domain are important parts of the requirements engineering process. Domain descriptions are frequently represented as models in the Unified Modeling Language (UML), the de facto standard. Recent research has specified the semantics of various UML language elements for domain modeling, based on ontological considerations. In this paper, we empirically examine ontological modeling guidelines for the UML association construct, which plays a central role in UML class diagrams. Using an experimental study, we find that some, but not all, of the proposed guidelines lead to better application domain models. We use a process-tracing study to investigate the effects of ontological guidelines in more detail. The combined results indicate that ontological guidelines can improve the usefulness of UML class diagrams for describing the application domain, and thus have the potential to improve downstream system development activities and ultimately contribute to the successful implementation of information systems.

References

  1. Wand Y, Weber R (1993) On the ontological expressiveness of information systems analysis and design grammars. J Inf Syst 3:217–237

  2. Evermann J, Wand Y (2005) Toward formalizing domain modeling semantics and language syntax. IEEE Trans Softw Eng 31:21–37

  3. Dobing B, Parsons J (2006) How UML is used? Commun ACM 49:109–113

  4. Fettke P (2009) How conceptual modeling is used. Commun Assoc Inf Syst 25:571–592

  5. Dobing B, Parsons J (2008) Dimensions of UML diagram use: a survey of practitioners. J Database Manag 19:1–18

  6. Davies I, Green P, Rosemann M, Indulska M, Gallo S (2006) How do practitioners use conceptual modeling in practice? Data Knowl Eng 58:358–380

  7. Parsons J (2011) An experimental study of the effects of representing property precedence on the comprehension of conceptual schemas. J AIS 12:441–462

  8. Evermann J (2005) The association construct in conceptual modeling—an analysis using the Bunge ontological model. CAiSE, Porto

  9. Milicev D (2007) On the semantics of associations and association ends in UML. IEEE Trans Softw Eng 33:238–251

  10. Rumbaugh J, Blaha M, Premerlani W, Eddy F, Lorensen W (1991) Object-oriented modeling and design. Prentice Hall, Englewood Cliffs

  11. Martin J, Odell J (1992) Object oriented analysis and design. Prentice Hall, Englewood Cliffs

  12. Bahrami A (1999) Object-oriented systems development using UML, 3rd edn. McGraw-Hill, New York

  13. Object Management Group (2004) UML 2.0 superstructure specification, revised final adopted specification. Available: http://www.omg.org

  14. Stevens P (2002) On the interpretation of binary associations in the unified modeling language. Softw Syst Model 1:68–79

  15. Embley DW (1992) Object-oriented systems analysis: a model-driven approach. Prentice Hall, Englewood Cliffs

  16. Siegfried S (1995) Understanding object-oriented software engineering. IEEE Press, New York

  17. Liu Z, He Z, Li J, Chen Y (2003) A relational model for formal object-oriented requirement analysis in UML. In: LNCS 2885. Springer, Berlin, pp 641–664

  18. Evermann J, Wand Y (2005) Ontology based object-oriented domain modelling: fundamental concepts. Requir Eng J 10:146–160

  19. Bunge M (1977) Ontology I: the furniture of the world, vol 3. D. Reidel, Dordrecht

  20. Evermann J, Wand Y (2006) Ontological modelling rules for UML: an empirical assessment. J Comput Inf Syst 47:156–184

  21. Poels G (2011) Understanding business domain models: the effect of recognizing resource-event–agent conceptual modeling structures. J Database Manag 22(4):69–101

  22. Evermann J, Halimi H (2008) Associations and mutual properties—an experimental assessment. In: Americas conference on information systems, Toronto

  23. Calder BJ, Phillips LW, Tybout AM (1981) Designing research for application. J Consum Res 8:197–207

  24. Mayer R (2001) Multimedia learning. Cambridge University Press, Cambridge

  25. Gemino A (1998) Comparing object oriented with structured analysis techniques in conceptual modeling. PhD thesis, Sauder School of Business, University of British Columbia, Vancouver

  26. Gemino A, Wand Y (2004) A framework for empirical evaluation of conceptual modeling techniques. Requir Eng J 9:248–260

  27. Burton-Jones A, Meso P (2006) Conceptualizing systems for understanding: an empirical test of decomposition principles in object-oriented analysis. Inf Syst Res 17:38–60

  28. Parsons J, Cole L (2005) What do the pictures mean? Guidelines for experimental evaluation of representation fidelity in diagrammatical conceptual modeling techniques. Data Knowl Eng 55(3):327–342

  29. Allen MJ, Yen WM (2002) Introduction to measurement theory. Waveland Press, Long Grove

  30. Nunnally J, Bernstein I (1994) Psychometric theory, 3rd edn. McGraw Hill, New York

  31. Levine T, Krehbiel T, Berenson M (2010) Business statistics: a first course, 5th edn. Prentice-Hall, Englewood Cliffs

  32. Cohen J (1988) Statistical power analysis for the behavioral sciences, 2nd edn. Lawrence Erlbaum Associates, Hillsdale

  33. Owen S, Brereton P, Budgen D (2006) Protocol analysis: a neglected practice. Commun ACM 49:117–122

  34. Hungerford BC, Hevner A, Collins RW (2004) Reviewing software diagrams: a cognitive study. IEEE Trans Softw Eng 30:82–96

  35. Evermann J (2008) An exploratory study of database integration processes. IEEE Trans Knowl Data Eng 20:99–115

  36. Newell A, Simon HA (1972) Human problem solving. Prentice Hall, Englewood Cliffs

  37. Ericsson KA, Simon HA (1984) Protocol analysis: verbal reports as data. MIT Press, Cambridge

  38. Gobet F, Charness N (2006) Chess and games. In: Ericsson KA, Charness N, Feltovich PJ, Hoffman RR (eds) The Cambridge handbook of expertise and expert performance. Cambridge University Press, New York, pp 41–67

  39. Vessey I, Conger S (1994) Requirements specification: learning object, process, and data methodologies. Commun ACM 37:102–113

  40. Bera P, Krasnoperova A, Wand Y (2010) Using OWL as a conceptual modeling language. J Database Manag 21:1–28

  41. Vessey I, Galletta D (1991) Cognitive fit: an empirical study of information acquisition. Inf Syst Res 2:63–84

  42. Gemino A, Wand Y (2005) Complexity and clarity in conceptual modeling: comparison of mandatory and optional properties. Data Knowl Eng 55:301–326

  43. Shanks G, Tansley E, Nuredini J, Tobin D, Weber R (2008) Representing part-whole relations in conceptual modeling: an empirical evaluation. MIS Q 32:553–573

Author information

Corresponding author

Correspondence to Palash Bera.

Appendices

Appendix 1: sample diagrams (fast-food operation) used in the study

See Figs. 6, 7.

Fig. 6 Class diagram developed by following the ontological guidelines

Fig. 7 Class diagram developed by violating guideline 2

Appendix 2: experimental materials

2.1 Variables: modeling and domain knowledge

  1. To what extent do you know data modeling concepts (such as classes, operations, and attributes)?

  2. To what extent do you have experience in using data modeling concepts (such as classes, operations, and attributes)?

  3. To what extent do you know UML association class?

  4. To what extent do you have experience in using UML association class?

  5. To what extent do you know the operation of a food restaurant?

  6. To what extent do you have experience in the operation of a food restaurant?

  7. To what extent do you know the operation of a hotel reservation?

  8. To what extent do you have experience in the operation of a hotel reservation?

2.2 Variable: perceived usefulness of the diagrams

  1. To what extent do you think that the diagrams helped to answer the questions?

  2. To what extent do you think that the diagrams made it easier to complete answering the questions?

  3. To what extent do you think that the diagrams enhanced your effectiveness in answering the questions?

2.3 Variable: domain understanding

  1. A customer tried to order food. She selected the food she wanted to purchase, but no food was delivered to her. What could have caused this problem?

  2. A driver went about his route to drop off the ordered food. However, when he reached a delivery point, he could not deliver the ordered food. What could have caused this problem?

  3. On a particular day, the partner of the restaurant ordered ingredients for preparing food. The ingredients did not arrive on the expected delivery date. What could be the possible reasons?

  4. A guest was not a privileged hotel guest but was allowed to get a car pick up service. How could this have happened?

  5. A guest had 7 days of reservation in the hotel. At the end of the stay, the guest did not pay for her stay. How could this have happened?

  6. A privileged guest received the pick up service even after his membership expired. How could this have happened?

2.4 Task on developing UML class diagram

In the following space draw a UML class diagram for the description below using at least one association class.

A hospital treats patients. For each treatment, the hospital needs to record the doctor, the treatment code, and the date.

2.5 UML class concepts

| Concept | Definition | Example |
| --- | --- | --- |
| Class | A class is a set of objects that share the same properties and/or behaviors | Person and hospital are concepts and therefore are modeled as classes |
| Attribute | Attributes are properties held by the members of a class. Attributes can have constant values (such as date of birth) or variable values (such as address) | The person class can have name and address as attributes |
| Operations | Operations are functions or services that are provided by all the instances of a class to invoke behavior in an object | The two operations of the hospital class are register patients and treat patients |
| Subclasses | A subclass has more attributes and/or more operations than the general class | A patient is a subclass of a person |
| Association | An association is the relationship among instances of classes | Hospital and patient are related as hospital treats patients |
| Association class | An association class is an association that has attributes and/or operations of its own | Registration is an association class that has the attributes registration number and registration date |
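The concepts in the table above can be made concrete with a short sketch. The following Python code is only an illustration and was not part of the experimental materials. The field `patient_id` and the method name `register_patient` are assumptions introduced here. It models Registration as an association class, that is, a link between a hospital and a patient that carries attributes of its own.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Person:
    # Attributes: properties held by every member of the class
    name: str
    address: str


@dataclass
class Patient(Person):
    # Subclass: a patient is a person with more attributes
    # (patient_id is a hypothetical attribute added for illustration)
    patient_id: str = ""


@dataclass
class Registration:
    # Association class: the hospital-patient link with its own attributes
    hospital: "Hospital"
    patient: Patient
    registration_number: int
    registration_date: date


@dataclass
class Hospital:
    name: str
    registrations: list = field(default_factory=list)

    def register_patient(self, patient: Patient, on: date) -> Registration:
        # Operation: creates one Registration per hospital-patient link
        reg = Registration(self, patient, len(self.registrations) + 1, on)
        self.registrations.append(reg)
        return reg
```

A plain association would only record that a hospital and a patient are linked. Here the link itself is an object, so the registration number and date have a natural place to live.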

2.6 Training on answering problem-solving questions

Please look at Fig. 8 carefully. The figure is drawn using the concepts mentioned in the earlier page. The figure describes the following situation.

Fig. 8 A patient admission situation

A patient class has the attributes name and age and an operation get treated. Admitted patient is a subclass of patient because it has additional attributes (admission date and bed number) and an additional operation (get admitted). The physician class is associated with the patient class.

A physician is dissatisfied with her work. Why might this be?

2.7 Sample answers

Using Fig. 8, you come up with answers by making inferences based on the information in the diagram combined with your own background information. For example, to come up with answer 1 (in Table 9), you have to look at the classes admitted patient and patient in Fig. 9 and infer that some patients might not be admitted.

Table 9 Possible answers with explanation
Fig. 9 A patient admission situation: sources of answers

Appendix 3: the ANCOVA statistical technique

The ANCOVA technique evaluates the effect of each treatment or control variable by first calculating the mean of the dependent variable (e.g., problem-solving scores) for each experimental group (“treatment”) or level of a control variable. Next, for every observation, the squared difference between its dependent variable score and the mean score of its group is computed. The sum of these squared differences over all observations is called the sum of squares within groups.

$$ \text{SS}_{\text{within}} = \sum_{i=1}^{g} \sum_{j=1}^{n_i} \left( Y_{ij} - \bar{Y}_{i} \right)^{2} $$

Here, Y denotes the dependent variable, a bar denotes the mean, g is the number of groups, $n_i$ is the number of observations in group i, and n is the overall number of observations. The sum of squares between groups is also computed: for each group, the squared difference between the group mean and the overall mean of the dependent variable is multiplied by the number of observations in that group, and these products are summed over all groups.

$$ \text{SS}_{\text{between}} = \sum_{i=1}^{g} n_{i} \left( \bar{Y}_{i} - \bar{Y} \right)^{2} $$
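The two sums of squares can be computed directly from their definitions. The following Python sketch uses hypothetical scores for three groups (the group names and values are invented for illustration and are not taken from the study) and relies only on the standard library. It includes no covariate adjustment.

```python
from statistics import fmean

# Hypothetical problem-solving scores for g = 3 groups (illustrative only)
groups = {
    "guidelines_followed": [7.0, 8.0, 9.0],
    "guideline_1_violated": [5.0, 6.0, 7.0],
    "guideline_2_violated": [3.0, 4.0, 5.0],
}


def sums_of_squares(groups):
    """Return (SS_within, SS_between) for a dict mapping group -> scores."""
    all_scores = [y for scores in groups.values() for y in scores]
    grand_mean = fmean(all_scores)  # overall mean of the dependent variable
    ss_within = 0.0
    ss_between = 0.0
    for scores in groups.values():
        group_mean = fmean(scores)
        # Squared deviations of each observation from its own group mean
        ss_within += sum((y - group_mean) ** 2 for y in scores)
        # Group size times squared deviation of group mean from overall mean
        ss_between += len(scores) * (group_mean - grand_mean) ** 2
    return ss_within, ss_between


ss_within, ss_between = sums_of_squares(groups)
```

For these made-up scores the group means are 8, 6, and 4 and the overall mean is 6, so the within-group sum of squares is 6 and the between-group sum of squares is 24.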

Next, each of these “sums of squares” is divided by its degrees of freedom, defined as the number of data points for that calculation minus the number of parameters calculated. For the sum of squares between groups, this equals the number of groups minus one (the overall mean is calculated). For the sum of squares within groups, it equals the total number of observations minus the number of groups (a group mean is calculated for each group). This yields the “mean sums of squares” per degree of freedom.

$$ \text{MS}_{\text{within}} = \frac{1}{n - g}\,\text{SS}_{\text{within}} \quad \text{MS}_{\text{between}} = \frac{1}{g - 1}\,\text{SS}_{\text{between}} $$

The logic of the ANCOVA rests on the observation that, if the treatment variable had no effect on the group means (i.e., all group means are equal, and thus equal to the overall mean), the mean sums of squares between the groups would be equal to the mean sums of squares within groups; in other words, their ratio should be one. This ratio is called the F statistic, reported in our tables in the text, and it is distributed according to an F distribution.

$$ F = \frac{{{\text{MS}}_{\text{between}} }}{{{\text{MS}}_{\text{within}} }} $$
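As a minimal sketch of this step, the function below derives the mean sums of squares and the F statistic from the two sums of squares. It assumes the degrees of freedom stated in the text (g − 1 between groups, n − g within groups). The function name and the input values are hypothetical.

```python
def f_statistic(ss_within, ss_between, n, g):
    """Return F = MS_between / MS_within for n observations in g groups."""
    df_between = g - 1  # number of groups minus one
    df_within = n - g   # observations minus number of group means
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within


# Hypothetical inputs: 9 observations in 3 groups
f = f_statistic(ss_within=6.0, ss_between=24.0, n=9, g=3)
```

With these illustrative inputs the mean sums of squares are 12 and 1, so the function returns 12.0. Obtaining the P value additionally requires the F cumulative distribution function, for example `scipy.stats.f.sf(f, g - 1, n - g)` if SciPy is available.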

One can test whether the calculated F statistic is significantly different from the expected value of one. This is done by calculating, from the F cumulative distribution function, the probability with which an F statistic at least as large as the observed one would be found if the true F statistic were one. This is the P value reported in our tables in the text. If this probability is sufficiently low (the cutoff is generally taken to be 0.05), one concludes that the observed F statistic does not come from a distribution for which the true F statistic is one. In other words, the true F statistic is different from one, so the ratio of mean sums of squares is different from one, and therefore the dependent variable mean differs between the treatment groups or categories.

To assess to what extent the different treatment groups or categories explain the observed variation of dependent variable scores, one can compare the sum of squares between groups to the total sum of squares, that is, the sum over all observations of the squared difference of the dependent variable from its overall mean. This ratio is the $r^2$ value, which can be adjusted to account for the effects of sample size and number of groups.

$$ {\text{SS}}_{\text{total}} = \sum\limits_{i = 1}^{n} {\left( {Y_{i} - \bar{Y}} \right)^{2} } $$
$$ r^{2} = \frac{{{\text{SS}}_{\text{between}} }}{{{\text{SS}}_{\text{total}} }}\quad r_{\text{adj}}^{2} = 1 - \frac{n - 1}{n - g}(1 - r^{2} ) $$

Generally, a higher $r^2$ value is better, though a value of approximately 0.25 is suggested to indicate a medium-strength effect [31, 32].
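The two effect-size measures follow directly from the formulas above. The sketch below is illustrative only; the function name and the input values are assumptions, not figures from the study.

```python
def r_squared(ss_between, ss_total, n, g):
    """Return (r2, adjusted r2) for n observations in g groups."""
    r2 = ss_between / ss_total
    # Adjustment penalizes small samples and a large number of groups
    r2_adj = 1 - (n - 1) / (n - g) * (1 - r2)
    return r2, r2_adj


# Hypothetical inputs: 9 observations in 3 groups
r2, r2_adj = r_squared(ss_between=24.0, ss_total=30.0, n=9, g=3)
```

For these inputs the unadjusted value is 0.8 and the adjusted value is roughly 0.73, which shows how the adjustment lowers the estimate when the sample is small relative to the number of groups.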

About this article

Cite this article

Bera, P., Evermann, J. Guidelines for using UML association classes and their effect on domain understanding in requirements engineering. Requirements Eng 19, 63–80 (2014). https://doi.org/10.1007/s00766-012-0159-y

  • DOI: https://doi.org/10.1007/s00766-012-0159-y
