Abstract
Suppose an experiment is conducted on pairs of objects, and the response is a continuous variable measuring the interaction within each pair. Suppose further that the response is hard to measure numerically, but its values can be coded into low and high levels of interaction (with a possible third, intermediate category when neither label applies). In this paper, we estimate the interaction values from the information contained in the coded data and the design structure of the experiment. We introduce a novel estimation method and show that it enjoys several optimality properties, including maximum explained variance in the responses with a minimum number of parameters, for any probability distribution underlying the responses. Furthermore, the estimated interactions have a simple interpretation as correlations (in absolute value), the size of the error is estimable from the experiment, and only a single run of each pair is needed. We also explore possible applications of the technique. Three applications are presented: one on protein interaction, a second on drug combination, and a third on machine learning. The first two are illustrated using real-life data, while for the third the data are generated via binary coding of an image.
Change history
14 August 2019
The original version of this article unfortunately contained a mistake in Title and reference Thurstone, L. L. (1927).
References
Al-Ibrahim, A. H. (2015). The analysis of multivariate data using semi-definite programming. Journal of Classification, 32, 3. https://doi.org/10.1007/s00357-015-9184-0.
Boyd, S., & Vandenberghe, L. (2004). Convex optimization. UK: Cambridge University Press.
Burt, C. (1950). The factorial analysis of qualitative data. British Journal of Psychology, 3, 166–185.
Chou, T. C., & Talalay, P. (1984). Quantitative analysis of dose-effect relationships: the combined effects of multiple drugs or enzyme inhibitors. Advances in Enzyme Regulation, 22, 27–55.
Cokol, M., Chua, H., Tasan, M., Mutlu, B., Weinstein, Z. B., Yo, S., Nergiz, M., Costanzo, M., Baryshnikova, A., Giaever, G., Nislow, C., Myers, C. L., Andrews, B., Boone, C., & Roth, F. (2011). Systematic exploration of synergistic drug pairs. Molecular Systems Biology, 7, 544 online Pub.
Cox, T., & Cox, M. (2001). Multidimensional scaling (2nd ed.). Boca Raton: Chapman & Hall.
Fisher, R. A. (1940). The precision of discriminant functions. Annals of Eugenics, 10, 422–429.
Globerson, A., & Roweis, S. (2007). Visualizing pairwise similarity via semidefinite programming. Proceedings of the 11th International Conference on Artificial Intelligence and Statistics, pp. 139–146.
Guttman, L. (1941). The quantification of a class of attributes: a theory and method of scale construction. In Horst et al. (Ed.), The prediction of personal adjustment (pp. 319–348). New York: Social Science Research Council.
Horst, P. (1935). Measuring complex attitudes. Journal of Social Psychology, 6, 369–374.
Jackups, R., & Liang, J. (2005). Interstrand pairing patterns in β-barrel membrane proteins: the positive-outside rule, aromatic rescue, and strand registration prediction. Journal of Molecular Biology, 354, 979–993.
Joreskog, K. G., & Moustaki, I. (2001). Factor analysis of ordinal variables: a comparison of three approaches. Multivariate Behavioral Research, 36(3), 347–387.
Li, Z., Liu, J., & Tang, X. (2008). Pairwise constraint propagation by semidefinite programming for semi-supervised classification. Proceedings of the 25th International Conference on Machine Learning, pp. 576–583.
Mandel, J. (1969). The partitioning of interaction in analysis of variance. Journal of Research of the National Bureau of Standards, Section B, 73B(4).
Mandel, J. (1970). A new analysis of variance model for non-additive data. Technometrics, 13(1), 1–18.
McKeon, J. J. (1966). Canonical analysis: Some relations between canonical correlation, factor analysis, discriminant function analysis and scaling theory. [Monograph No. 13]. Psychometrika.
Mignon, A., & Jurie, F. (2012). PCCA: A new approach for distance learning from sparse pairwise constraints. IEEE Conference on Computer Vision and Pattern Recognition.
Nishisato, S. (1980). Analysis of categorical data: dual scaling and its applications. Toronto: University of Toronto Press.
Shepard, R. N. (1962). The analysis of proximities: multidimensional scaling with an unknown distance function. I. Psychometrika, 27(2), 125–140.
Tenenhaus, M., & Young, F. W. (1985). An analysis and synthesis of multiple correspondence analysis, optimal scaling, dual scaling, homogeneity analysis, and other methods for quantifying categorical multivariate data. Psychometrika, 50(1), 91–119.
Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34, 273–286.
About this article
Cite this article
Al-Ibrahim, A.H. A Framework for Quantifying Qualitative Responses in Pairwise Experiments. J Classif 36, 471–492 (2019). https://doi.org/10.1007/s00357-019-09337-1
DOI: https://doi.org/10.1007/s00357-019-09337-1