Abstract
A heat-map type of chart for depicting a large number of cases and up to twenty-five categorical variables with spreadsheet software is presented. It is implemented in Microsoft® Excel using standard formulas, sorting and simple VBA code. The motivating example depicts the accuracy of automated assignment of MeSH® descriptor headings to abstracts of medical articles. Within each abstract, the predicted support for each heading is ranked; then, for each heading actually assigned or not assigned by a human specialist (depicted by a black or white cell), high or low support is depicted on a nine-point two-colour scale. Thus, each case (abstract) is depicted by one row of a table and each variable (heading) by two adjacent columns. A rank-based classification accuracy measure is calculated for each case, and rows are sorted in order of increasing accuracy downwards. Based on an analogous measure, variables are sorted in order of increasing prediction accuracy rightwards. Another biomedical dataset is presented with a similar chart. Different methods for predicting binary outcomes can be visualised, and the procedure is easily extended to polytomous variables.
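The ranking and sorting steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's Excel/VBA implementation: the abstract does not give the formula of the rank-based accuracy measure, so the sketch substitutes a pairwise measure (the share of assigned/non-assigned pairs in which the assigned item has the better rank, ties counted as half). The binning of ranks onto the nine-point scale and all function names are likewise hypothetical.

```python
from typing import List, Sequence, Tuple


def rank_within_case(scores: Sequence[float]) -> List[int]:
    """Rank headings within one case; 1 = strongest predicted support.

    Ties are broken by column position (an assumption of this sketch).
    """
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0] * len(scores)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def nine_point_level(rank: int, n_headings: int) -> int:
    """Map a rank to a level 1..9 (9 = highest support); binning is assumed."""
    if n_headings == 1:
        return 9
    return 9 - (8 * (rank - 1)) // (n_headings - 1)


def pairwise_accuracy(ranks: Sequence[int], assigned: Sequence[int]) -> float:
    """Assumed stand-in for the rank-based accuracy measure.

    Share of (assigned, non-assigned) pairs in which the assigned item has
    the better (smaller) rank; ties count as half. Returns 0.5 when either
    group is empty, because the measure is then undefined.
    """
    pos = [r for r, a in zip(ranks, assigned) if a]
    neg = [r for r, a in zip(ranks, assigned) if not a]
    if not pos or not neg:
        return 0.5
    wins = sum(1.0 if p < n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def order_chart(
    scores: Sequence[Sequence[float]],
    assigned: Sequence[Sequence[int]],
) -> Tuple[List[int], List[int], List[float], List[float]]:
    """Return row and column orders (increasing accuracy) plus the measures."""
    ranks = [rank_within_case(row) for row in scores]
    n_headings = len(scores[0])
    # One accuracy value per case (row), from the ranks within that case.
    row_acc = [pairwise_accuracy(r, a) for r, a in zip(ranks, assigned)]
    # Analogous value per heading (column), from its ranks across cases.
    col_acc = [
        pairwise_accuracy([r[j] for r in ranks], [a[j] for a in assigned])
        for j in range(n_headings)
    ]
    row_order = sorted(range(len(scores)), key=lambda i: row_acc[i])
    col_order = sorted(range(n_headings), key=lambda j: col_acc[j])
    return row_order, col_order, row_acc, col_acc
```

Calling `order_chart(scores, assigned)` with a cases-by-headings matrix of predicted support and a matching 0/1 matrix of human assignments yields the order in which rows would be listed downwards and heading column pairs rightwards; `nine_point_level` gives the colour level of each support cell.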
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Vidmar, G. (2007). Pixelisation-Based Statistical Visualisation for Categorical Datasets with Spreadsheet Software. In: Lévy, P.P., et al. Pixelization Paradigm. VIEW 2006. Lecture Notes in Computer Science, vol 4370. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71027-1_5
DOI: https://doi.org/10.1007/978-3-540-71027-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71026-4
Online ISBN: 978-3-540-71027-1
eBook Packages: Computer Science (R0)