Abstract
In this paper we extend previous work on matching data [2], placing it in the more general framework of contingency tables and addressing the dimensionality problem that arises when each row and column category is defined by a combination of multiple characteristics. We define two concepts related to the matching process: propensity to match and similarity in the matching. Both measures can be decomposed into partial components, which allow a better understanding of the underlying structure of the data. We illustrate our methodology with a labor market in which each worker category and each job category is defined by the combination of two attributes: location and occupational level.
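As a rough illustration of the setting described above, the following minimal sketch builds a contingency table whose row and column categories combine location and occupational level, then collapses it into one table per attribute. The counts are toy values invented for this example, the helper names are hypothetical, and the observed-to-expected ratio under independence is only an assumed stand-in for an association measure; the abstract does not state how propensity to match or similarity are actually defined.

```python
from collections import defaultdict

# Toy matches, invented for illustration only:
# (worker_location, worker_level, job_location, job_level, count)
matches = [
    ("North", "High", "North", "High", 30),
    ("North", "High", "South", "High", 5),
    ("North", "Low", "North", "Low", 40),
    ("South", "High", "South", "High", 20),
    ("South", "Low", "South", "Low", 35),
    ("South", "Low", "North", "Low", 10),
]

def contingency(records, row_key, col_key):
    """Aggregate match counts into a {(row, column): count} table."""
    table = defaultdict(int)
    for *attrs, count in records:
        table[(row_key(attrs), col_key(attrs))] += count
    return dict(table)

def observed_over_expected(table):
    """Ratio of each observed count to its expected count under independence.

    Assumption: used here only as a generic association measure,
    not as the paper's definition of propensity to match.
    """
    total = sum(table.values())
    row_tot, col_tot = defaultdict(int), defaultdict(int)
    for (r, c), n in table.items():
        row_tot[r] += n
        col_tot[c] += n
    return {(r, c): n * total / (row_tot[r] * col_tot[c])
            for (r, c), n in table.items()}

# Full table: each category is a (location, level) pair, so the number of
# row and column categories grows multiplicatively with the attributes.
full = contingency(matches, lambda a: (a[0], a[1]), lambda a: (a[2], a[3]))

# Collapsed tables, one per attribute.
by_location = contingency(matches, lambda a: a[0], lambda a: a[2])
by_level = contingency(matches, lambda a: a[1], lambda a: a[3])

location_ratio = observed_over_expected(by_location)
```

With two locations and two levels the full table already has up to 4 x 4 cells, while each collapsed table has at most 2 x 2, which is one simple way to see why combining attributes inflates the table's dimensions.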
Notes
- 1.
- 2. We have not considered the propensity to match of each category of each variable on the row side with each category of the other variables on the column side, because the analysis would be more complex and the methodological gain marginal.
References
Agresti, A.: Categorical Data Analysis. Probability and Statistics. Wiley, Somerset (2013)
Álvarez de Toledo, P., Núñez, F., Usabiaga, C.: An empirical approach on labour segmentation. Applications with individual duration data. Econ. Model. 36, 252–267 (2014)
Bishop, Y.M.M., Fienberg, S.E., Holland, P.W.: Discrete Multivariate Analysis: Theory and Practice. MIT Press, Cambridge (1975)
Everitt, B.S., Landau, S., Leese, M., Stahl, D.: Cluster Analysis. Probability and Statistics. Wiley, Chichester (2011)
Fienberg, S.E., Rinaldo, A.: Three centuries of categorical data analysis: log-linear models and maximum likelihood estimation. J. Stat. Plann. Infer. 137(11), 3430–3445 (2007)
Govaert, G., Nadif, M.: Co-clustering. Wiley, New York (2013)
Padilha, V.A., Campello, R.J.G.B.: A systematic comparative evaluation of biclustering techniques. BMC Bioinform. 18(1), 55 (2017)
Stigler, S.: The missing early history of contingency tables. Annales de la Faculté des Sciences de Toulouse. 11(4), 563–573 (2002)
Xu, R., Wunsch, D.: Survey of clustering algorithms. IEEE Trans. Neural Netw. 16(3), 645–678 (2005)
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
de Toledo, P.Á., Núñez, F., Usabiaga, C., Tallón-Ballesteros, A.J. (2017). Understanding Matching Data Through Their Partial Components. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2017. IDEAL 2017. Lecture Notes in Computer Science, vol 10585. Springer, Cham. https://doi.org/10.1007/978-3-319-68935-7_65
DOI: https://doi.org/10.1007/978-3-319-68935-7_65
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68934-0
Online ISBN: 978-3-319-68935-7
eBook Packages: Computer Science, Computer Science (R0)