Abstract
Drawing boundaries and attaching text labels for each class of a multi-class scatterplot are two common steps that help people perceive and understand the class-level spatial and semantic information hidden in the scatterplot. However, massive numbers of data points, highly overlapped classes, widespread outliers, and extremely non-uniform point density lead to readability and scalability issues with existing methods. In this paper, we propose a set of methods that form a three-step framework to overcome these issues. The framework makes each boundary compact, readable, and controllable, and finds for each label a position that matches human visual preference. In the first step, we use an MST-based (minimum spanning tree) clustering algorithm to further divide classes into clusters and remove class-level outliers, which avoids distorting the boundaries. A stroke-based interaction is integrated into the algorithm, allowing users to quickly correct the identified clusters or to materialize the clusters they have in mind. In the second step, we design a grid-based boundary construction pipeline that lets the user tighten the boundary around the main distribution region of its class in a controlled manner by gradually filtering out cluster-level outliers. Gridding improves scalability with respect to the number of data points and helps users gain insights by generating different class distributions based on a relative or absolute density threshold. In the third step, we combine three factors, namely the boundary of the target cluster, the boundary of the label, and the density distribution of the target cluster, to place the label closer to its visually ideal position. Extensive illustrations and two case studies demonstrate the effectiveness of our methods.
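The density-threshold idea in the second step can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `dense_cells`, the square-cell binning, and the parameters `cell_size`, `threshold`, and `relative` are all assumptions made for illustration. The sketch bins the points of one cluster into grid cells and keeps only the cells whose count meets an absolute threshold or a fraction of the densest cell's count.

```python
from collections import Counter


def dense_cells(points, cell_size, threshold, relative=False):
    """Return the set of grid cells whose point count meets the threshold.

    points    : iterable of (x, y) pairs belonging to one cluster
    cell_size : side length of a square grid cell (hypothetical parameter)
    threshold : minimum count per cell (absolute), or a fraction in [0, 1]
                of the densest cell's count when relative=True
    """
    # Count how many points fall into each (column, row) cell.
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    if not counts:
        return set()
    # A relative threshold is scaled by the count of the densest cell.
    cutoff = threshold * max(counts.values()) if relative else threshold
    return {cell for cell, n in counts.items() if n >= cutoff}


# Four points crowd into cell (0, 0); one stray point sits alone in cell (5, 5).
points = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.5), (0.9, 0.1), (5.5, 5.5)]

print(dense_cells(points, cell_size=1.0, threshold=1))  # every occupied cell
print(dense_cells(points, cell_size=1.0, threshold=2))  # stray cell dropped
print(dense_cells(points, cell_size=1.0, threshold=0.5, relative=True))
```

Raising the threshold step by step shrinks the set of retained cells toward the densest region, which is one way to read "gradually filtering out cluster-level outliers"; tracing the outline of the retained cells would then give a boundary.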
Cite this article
Li, Z., Wang, T., Wang, M. et al. Construct boundaries and place labels for multi-class scatterplots. J Vis 25, 407–426 (2022). https://doi.org/10.1007/s12650-021-00791-x
DOI: https://doi.org/10.1007/s12650-021-00791-x