Non-symmetric correspondence analysis with ordinal variables using orthogonal polynomials
Introduction
Scientific investigations, including sensory evaluation experiments, market research and health evaluations, often collect data where variables are measured on an ordinal scale. For such data, the partition of Pearson's chi-squared statistic has been central to the study of symmetric association in ordinal two-way contingency tables (Lancaster, 1953; Best and Rayner, 1996; Beh, 1997, 2001). Correspondence analysis (CA) considers a partition of the chi-squared statistic to graphically describe the association between categorical variables (Greenacre, 1984; Lebart et al., 1984). When there exists a non-symmetric association between ordinal variables, an informative analysis can instead be based on the partition of the Goodman–Kruskal tau index. It is the decomposition of this statistic that lies at the heart of non-symmetric correspondence analysis (NSCA; D’Ambra and Lauro, 1989; Kroonenberg and Lombardo, 1999). In this paper we propose a special partition of the tau index (Goodman and Kruskal, 1954; Light and Margolin, 1971) using orthogonal polynomials (D’Ambra et al., 2002). It has the advantage that it takes into account both the dependence relationship (if one exists) and the ordinal structure of the variables, by using a pre-defined set of scores to reflect this structure. For an analogous, although symmetric, analysis, orthogonal polynomials have been used to perform simple CA (Beh, 1997).
The methodology presented in this paper is referred to as doubly ordinal non-symmetric correspondence analysis (DONSCA). It is designed to allow the visualization of the dependence relationship between categories of a response and a predictor variable. Such a visualization is useful for identifying the structure of this relationship, and it does so through components that reflect sources of variation in terms of the location (mean), dispersion (spread) and higher order moments. It can be used to identify important characteristics in the behavior of the response variable given the presence of a predictor variable. After a brief description of the tau index numerator and of classical NSCA in Section 2, DONSCA is presented for the visual identification of dependence between ordinal variables (Section 3). In Section 4, distance measures and the interpretation of correspondence plots are investigated. Two examples illustrating the application of the technique are given in Section 5, and some final comments are left for the conclusion.
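As a rough illustration of the statistic being partitioned, the Goodman–Kruskal tau index can be computed directly from a two-way table. The sketch below assumes the standard definition of the index, with rows as the response and columns as the predictor; the function name and the two toy tables are hypothetical and are not taken from the paper.

```python
def goodman_kruskal_tau(table):
    """Return (numerator, tau) for a table with rows = response, columns = predictor.

    Assumes the standard definition:
      numerator   = sum_i sum_j p_.j * (p_ij / p_.j - p_i.)**2
      denominator = 1 - sum_i p_i.**2
    """
    n = float(sum(sum(row) for row in table))
    p = [[cell / n for cell in row] for row in table]   # joint relative frequencies
    row_marg = [sum(row) for row in p]                  # p_i. (response margins)
    col_marg = [sum(col) for col in zip(*p)]            # p_.j (predictor margins)

    numerator = sum(
        col_marg[j] * (p[i][j] / col_marg[j] - row_marg[i]) ** 2
        for i in range(len(row_marg))
        for j in range(len(col_marg))
        if col_marg[j] > 0                              # skip empty predictor categories
    )
    denominator = 1.0 - sum(r ** 2 for r in row_marg)
    return numerator, numerator / denominator


# Hypothetical toy tables: one with independent margins, one with perfect prediction.
independent = [[10, 20], [30, 60]]
perfect = [[50, 0], [0, 50]]

_, tau_independent = goodman_kruskal_tau(independent)
_, tau_perfect = goodman_kruskal_tau(perfect)
```

Under this definition the index is zero when the column profiles all equal the row margins, and one when the predictor category determines the response category exactly. NSCA decomposes the numerator returned here.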
Section snippets
Classic NSCA
Consider a two-way contingency table of dimension $I \times J$ according to I and J categories of variables Y (response) and X (predictor), respectively. Denote the matrix of joint relative frequencies by $\mathbf{P} = (p_{ij})$ so that $\sum_{i}\sum_{j} p_{ij} = 1$. Also, define the diagonal matrix $\mathbf{D}_I$ where the $(i,i)$th element is the row's marginal frequency $p_{i\cdot}$. Similarly, let the $(j,j)$th element of the diagonal matrix $\mathbf{D}_J$ of column's marginal frequencies be $p_{\cdot j}$. The conditional probability that an individual/unit is classified
Doubly ordered NSCA
The classical approach to NSCA described above is especially useful in cases where the predictor and response variables are nominal. However, many studies involve variables that are measured on an ordinal scale. When these situations arise, rather than decomposing using SVD, one may consider the bivariate moment decomposition (BMD) where . The vectors and are orthogonal polynomials of generic order u and associated with the row
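Orthogonal polynomials of the kind used in a moment decomposition can be generated numerically from a set of category scores and weights. The sketch below is a minimal illustration, not the paper's construction: it uses Gram-Schmidt orthonormalization of the powers of the scores, which for distinct scores and positive weights gives, up to sign, the same polynomials as a three-term recurrence. The function name, the natural scores 1 to 4 and the weights are all assumptions chosen for the example.

```python
def orthonormal_polynomials(scores, weights):
    """Return polynomial vectors a_0, ..., a_{k-1} orthonormal under `weights`.

    Assumes distinct scores and strictly positive weights; orthonormality means
    sum_i weights[i] * a_u[i] * a_v[i] equals 1 if u == v and 0 otherwise.
    """
    def inner(x, y):
        return sum(w * a * b for w, a, b in zip(weights, x, y))

    basis = []
    for degree in range(len(scores)):
        vec = [float(s) ** degree for s in scores]      # raw power of the scores
        for q in basis:                                 # remove lower-order components
            coef = inner(vec, q)
            vec = [v - coef * qi for v, qi in zip(vec, q)]
        norm = inner(vec, vec) ** 0.5
        basis.append([v / norm for v in vec])
    return basis


# Hypothetical example: natural scores with weights that sum to one.
scores = [1, 2, 3, 4]
weights = [0.1, 0.2, 0.3, 0.4]
polys = orthonormal_polynomials(scores, weights)

constant, linear, quadratic = polys[0], polys[1], polys[2]
```

With weights that sum to one, the order-zero polynomial is constant and the order-one polynomial is the standardized score, which is why the first component is read as a location effect and the second (quadratic) as a dispersion effect.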
Distances in DONSCA
One of the primary reasons for considering NSCA when investigating the asymmetric association between the categorical variables is that a graphical summary (by way of the correspondence plot) of the data can be made. This plot allows the researcher to identify row and/or column categories that are relatively similar or different based on their proximity to one another. For the predictor (column) variable such comparisons can be made by observing the squared distances between the profiles of the
Confidence circles for ordinal and classical NSCA
In the framework of classical CA, Lebart et al. (1984) demonstrated the usefulness of confidence circles for displaying CA results. The derivation of these circles shows that they can also be applied to ordinal symmetric and non-symmetric correspondence analysis (Beh and D’Ambra, 2007). The radii of the circles obtained using BMD are equal to those of Lebart et al. (1984), who considered SVD. However, because of the non-symmetric nature of the association between the row and column
Artificial contingency table
Suppose we consider the artificial two-way contingency table of Table 1. Assume that the categories of the row and column variables are ordered.
Table 1 has been constructed so that the (2, c)th element, associated with row 2 and column c, is very large relative to the values of the other elements in the table. The th cell frequency has been set to be relatively small.
By assuming that there exists an asymmetric relationship between the row and column categories of Table 1
Conclusion
Recent papers have shown that orthogonal polynomials are an important tool in correspondence analysis for identifying sources of association in two-way contingency tables with one ordinal variable (Beh, 2001) or two ordinal variables (Beh, 1997, 1998), and in three-way contingency tables (Beh and Davy, 1998, 1999; D’Ambra et al., 2006).
In this paper we have discussed the development of non-symmetric correspondence analysis using bivariate moment
References (24)
- Categorical Data Analysis (1990)
- Simple correspondence analysis of ordinal cross-classifications using orthogonal polynomials. Biometrical Journal (1997)
- A comparative study of scores for correspondence analysis with ordered categories. Biometrical Journal (1998)
- Partitioning Pearson's chi-squared statistic for singly ordered two-way contingency tables. Australian New Zealand J. Statist. (2001)
- et al. Partitioning Pearson's chi-squared statistic for a completely ordered three-way contingency table. Australian New Zealand J. Statist. (1998)
- et al. Partitioning Pearson's chi-squared statistic for a partially ordered three-way contingency table. Australian New Zealand J. Statist. (1999)
- Beh, E.J., D’Ambra, L., 2007. Some interpretative tools for nominal and ordinal non symmetric correspondence analysis,...
- et al. Nonparametric analysis for doubly ordered two-way contingency tables. Biometrics (1996)
- et al. Non-symmetrical correspondence analysis for three-way contingency table
- D’Ambra, L., Lombardo, R., 1993. Normalized non symmetrical correspondence analysis for three-way data sets. Bull....
- CATANOVA for two-way contingency tables with ordinal variables using orthogonal polynomials. Commun. Statist.