Non-symmetric correspondence analysis with ordinal variables using orthogonal polynomials

https://doi.org/10.1016/j.csda.2006.12.040

Abstract

Non-symmetrical correspondence analysis (NSCA) is a useful tool for graphically detecting the asymmetric relationship between two categorical variables. Most of the theory associated with NSCA does not distinguish between two-way contingency tables of ordinal variables and those of nominal variables. Typically, singular value decomposition (SVD) is used in classical NSCA for dimension reduction. A bivariate moment decomposition (BMD) for ordinal variables in contingency tables using orthogonal polynomials and generalized correlations is proposed. This method not only takes into account the ordinal nature of the two categorical variables, but also permits the detection of significant association in terms of location, dispersion and higher order components.

Introduction

Scientific investigations, including sensory evaluation experiments, market research and health evaluations, often collect data where variables are measured on an ordinal scale. For such data, the partition of Pearson's chi-squared statistic has been central to the study of symmetric association in ordinal two-way contingency tables (Lancaster, 1953; Best and Rayner, 1996; Beh, 1997, 2001). Correspondence analysis (CA) considers a partition of the chi-squared statistic to graphically describe the association of categorical variables (Greenacre, 1984; Lebart et al., 1984). When there exists a non-symmetric association between ordinal variables, an informative analysis can be made based on the partition of the Goodman–Kruskal tau index. It is the decomposition of this statistic that lies at the heart of non-symmetric correspondence analysis (NSCA; D’Ambra and Lauro, 1989; Kroonenberg and Lombardo, 1999). In this paper we propose a special partition of the tau index (Goodman and Kruskal, 1954; Light and Margolin, 1971) using orthogonal polynomials (D’Ambra et al., 2002). It has the advantage that it takes into account the dependence relationship (if one exists) and the ordinal structure of the variables by considering a pre-defined set of scores to reflect this structure. For an analogous, although symmetric, analysis, orthogonal polynomials have been used to perform simple CA (Beh, 1997).

The methodology presented in this paper is referred to as doubly ordinal non-symmetric correspondence analysis (DONSCA). It is designed to allow the visualization of the dependence relationship between categories of a response and a predictor variable. Such a visualization is useful when identifying the structure of this relationship and does so in terms of components that reflect sources of variation in terms of the location (mean), dispersion (spread) and higher order moments. It can be used to identify important characteristics in the behavior of the response variable given the presence of a predictor variable. After a brief description of the tau index numerator and of classical NSCA in Section 2, a presentation of DONSCA will be given for the visual identification of dependence between ordinal variables (Section 3). In Section 4, distance measures and the interpretation of correspondence plots will be investigated. Two examples illustrating the application of the technique will be given in Section 5 and some final comments will be left for the conclusion.

Section snippets

Classic NSCA

Consider a two-way contingency table $N$ of dimension $I\times J$ according to $I$ and $J$ categories of variables $Y$ (response) and $X$ (predictor), respectively. Denote the matrix of joint relative frequencies by $P=(p_{ij})$ so that $\sum_{i=1}^{I}\sum_{j=1}^{J}p_{ij}=1$. Also, define the diagonal matrix $D_I$ whose $(i,i)$th element $p_{i\bullet}$ is the $i$th row marginal frequency. Similarly, let the $(j,j)$th element of the diagonal matrix $D_J$ of column marginal frequencies be $p_{\bullet j}$. The conditional probability that an individual/unit is classified
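The notation above can be illustrated with a minimal Python sketch. It assumes the standard NSCA definitions, which the truncated snippet does not state in full: the centred column profile $\pi_{ij}=p_{ij}/p_{\bullet j}-p_{i\bullet}$ and the Goodman–Kruskal tau index with numerator $\sum_j p_{\bullet j}\sum_i \pi_{ij}^2$ and denominator $1-\sum_i p_{i\bullet}^2$. The function name `nsca_quantities` and the example counts are hypothetical.

```python
import numpy as np

def nsca_quantities(N):
    """Return P, row margins, column margins, centred profiles, tau numerator, tau."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                  # joint relative frequencies, sum to 1
    p_row = P.sum(axis=1)            # p_{i.}, the diagonal of D_I
    p_col = P.sum(axis=0)            # p_{.j}, the diagonal of D_J
    # Assumed (standard NSCA) centred column profile: pi_ij = p_ij / p_.j - p_i.
    Pi = P / p_col - p_row[:, None]
    # Numerator of the Goodman-Kruskal tau index, then the index itself
    tau_num = np.sum(p_col * Pi**2)
    tau = tau_num / (1.0 - np.sum(p_row**2))
    return P, p_row, p_col, Pi, tau_num, tau

# Hypothetical 3x3 table of counts (rows: response Y, columns: predictor X)
N = np.array([[10, 20, 30],
              [20, 20, 20],
              [30, 20, 10]])
P, p_row, p_col, Pi, tau_num, tau = nsca_quantities(N)
print(round(tau, 4))
```

Under these definitions tau is zero when the rows and columns are independent and grows as knowledge of the column category improves prediction of the row category.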

Doubly ordered NSCA

The classical approach to NSCA described above is especially useful in cases where the predictor and response variables are nominal. However, many studies involve variables that are measured using an ordinal scale. When these situations arise, rather than decomposing $\pi_{ij}$ using SVD, one may consider the bivariate moment decomposition (BMD)
$$\pi_{ij}=\sum_{u=1}^{I-1}\sum_{v=1}^{J-1}z_{uv}\,a^{*}_{iu}\,b^{*}_{jv},$$
where $a^{*}_{iu}=p_{i\bullet}^{-1/2}\hat{a}_{iu}$. The vectors $\hat{a}_{u}$ and $b^{*}_{v}$ are orthogonal polynomials of generic order $u$ and $v$ associated with the row
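A rough Python sketch of a decomposition of this kind follows. It is not the authors' implementation: the polynomials are built by a QR factorisation of a weighted Vandermonde matrix (equivalent to weighted Gram-Schmidt) rather than by any recurrence the paper may use, the scores default to the natural scores 1, 2, ..., and the weights (unit weights for rows, $p_{\bullet j}$ for columns) are assumptions, as is the definition $\pi_{ij}=p_{ij}/p_{\bullet j}-p_{i\bullet}$. The names `orthonormal_polynomials` and `bmd` are hypothetical, and the normalisation need not match the $a^{*}_{iu}$ of the text.

```python
import numpy as np

def orthonormal_polynomials(scores, weights):
    """Polynomials in `scores`, orthonormal with respect to `weights`.

    Column 0 is the trivial (constant) polynomial; column u has degree u.
    Scores are assumed to be distinct.
    """
    x = np.asarray(scores, dtype=float)
    w = np.asarray(weights, dtype=float)
    V = np.vander(x, N=len(x), increasing=True)   # columns 1, x, x^2, ...
    Q, R = np.linalg.qr(np.sqrt(w)[:, None] * V)
    Q = Q * np.sign(np.diag(R))                   # positive leading coefficients
    return Q / np.sqrt(w)[:, None]                # A with A.T @ diag(w) @ A = I

def bmd(Pi, row_weights, col_weights, row_scores=None, col_scores=None):
    """Return non-trivial polynomials A, B and the matrix Z with Pi = A Z B^T."""
    Pi = np.asarray(Pi, dtype=float)
    wr = np.asarray(row_weights, dtype=float)
    wc = np.asarray(col_weights, dtype=float)
    I, J = Pi.shape
    if row_scores is None:
        row_scores = np.arange(1, I + 1)          # natural scores (assumption)
    if col_scores is None:
        col_scores = np.arange(1, J + 1)
    A = orthonormal_polynomials(row_scores, wr)[:, 1:]   # drop the constant
    B = orthonormal_polynomials(col_scores, wc)[:, 1:]
    Z = A.T @ (wr[:, None] * Pi * wc) @ B         # z_uv, shape (I-1, J-1)
    return A, B, Z

# Hypothetical 3x4 table of counts
N = np.array([[10, 20, 30, 5],
              [20, 20, 20, 10],
              [30, 20, 10, 40]], dtype=float)
P = N / N.sum()
p_row, p_col = P.sum(axis=1), P.sum(axis=0)
Pi = P / p_col - p_row[:, None]
A, B, Z = bmd(Pi, np.ones(3), p_col)
print(np.allclose(A @ Z @ B.T, Pi))
```

With these weights the trivial polynomials contribute nothing because $\pi_{ij}$ is centred, so the sums run over $u=1,\dots,I-1$ and $v=1,\dots,J-1$ only, and the squared $z_{uv}$ add up to the tau numerator. In this sketch $z_{11}$ carries the linear-by-linear (location) part of the association, while higher $u$ or $v$ pick up dispersion and higher order components.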

Distances in DONSCA

One of the primary reasons for considering NSCA when investigating the asymmetric association between the categorical variables is that a graphical summary (by way of the correspondence plot) of the data can be made. This plot allows the researcher to identify row and/or column categories that are relatively similar or different based on their proximity to one another. For the predictor (column) variable such comparisons can be made by observing the squared distances between the profiles of the
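As an illustration of comparing predictor categories, the sketch below computes squared Euclidean distances between column profiles $p_{ij}/p_{\bullet j}$. The snippet above is cut off before the distance formula, so the unit metric used here is an assumption based on common NSCA practice, and `column_profile_distances` is a hypothetical name.

```python
import numpy as np

def column_profile_distances(N):
    """Squared Euclidean distances between the column profiles of a table."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()
    profiles = P / P.sum(axis=0)                  # column j holds p_ij / p_.j
    # Pairwise differences between columns j and j', shape (I, J, J)
    diff = profiles[:, :, None] - profiles[:, None, :]
    return np.sum(diff**2, axis=0)                # (J, J) matrix of squared distances

# Hypothetical table: columns 0 and 1 are proportional, so their distance is zero
N = np.array([[10, 20, 30],
              [20, 40, 20],
              [30, 60, 10]])
D2 = column_profile_distances(N)
print(np.round(D2, 4))
```

Columns with proportional frequencies share a profile and therefore sit at distance zero, which is the sense in which nearby points in the correspondence plot indicate similar predictor categories.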

Confidence circles for ordinal and classical NSCA

In the framework of classical CA, Lebart et al. (1984) demonstrated the usefulness of confidence circles for displaying CA results. By considering the derivation of these circles, they can also be applied to ordinal symmetric and non-symmetric correspondence analysis (Beh and D’Ambra, 2007). The radii of the circles using BMD are equivalent to those of Lebart et al. (1984), who considered SVD. However, because of the non-symmetric nature of the association between the row and column

Artificial contingency table

Suppose we consider the artificial two-way contingency table of Table 1. Assume that the categories of the row and column variables are ordered.

Table 1 has been constructed so that the (2,c)th element, associated with row 2 and column c, is very large relative to the other elements in the table. The (3,b)th cell frequency has been set to be relatively small.

By assuming that there exists an asymmetric relationship between the row and column categories of Table 1

Conclusion

Recent papers have shown that orthogonal polynomials are an important tool in correspondence analysis for identifying sources of association that exist in two-way contingency tables with one ordinal variable (Beh, 2001) and two ordinal variables (Beh, 1997, 1998), or in three-way contingency tables (Beh and Davy, 1998, 1999; D’Ambra et al., 2006).

In this paper we have discussed the development of non-symmetric correspondence analysis using bivariate moment

References (24)

  • Agresti, A., 1990. Categorical Data Analysis.
  • Beh, E.J., 1997. Simple correspondence analysis of ordinal cross-classifications using orthogonal polynomials. Biometrical Journal.
  • Beh, E.J., 1998. A comparative study of scores for correspondence analysis with ordered categories. Biometrical Journal.
  • Beh, E.J., 2001. Partitioning Pearson's chi-squared statistic for singly ordered two-way contingency tables. Australian New Zealand J. Statist.
  • Beh, E.J., et al., 1998. Partitioning Pearson's chi-squared statistic for a completely ordered three-way contingency table. Australian New Zealand J. Statist.
  • Beh, E.J., et al., 1999. Partitioning Pearson's chi-squared statistic for a partially ordered three-way contingency table. Australian New Zealand J. Statist.
  • Beh, E.J., D’Ambra, L., 2007. Some interpretative tools for nominal and ordinal non symmetric correspondence analysis,...
  • Best, D.J., et al., 1996. Nonparametric analysis for doubly ordered two-way contingency tables. Biometrics.
  • D’Ambra, L., et al. Non-symmetrical correspondence analysis for three-way contingency table.
  • D’Ambra, L., Lombardo, R., 1993. Normalized non symmetrical correspondence analysis for three-way data sets. Bull....
  • D’Ambra, L., Lombardo, R., Amenta, P., 2002. Non symmetric correspondence analysis for ordered two-way contingency...
  • D’Ambra, L., et al., 2005. CATANOVA for two-way contingency tables with ordinal variables using orthogonal polynomials. Commun. Statist.