Construct boundaries and place labels for multi-class scatterplots

  • Regular Paper
Journal of Visualization

Abstract

Drawing boundaries and attaching text labels to each class in a multi-class scatterplot are two common ways to help people perceive and understand the class-level spatial and semantic information hidden in the scatterplot. However, massive numbers of data points, highly overlapping classes, widespread outliers, and extremely non-uniform point density cause readability and scalability problems for existing methods. In this paper, we propose a set of methods that form a three-step framework to overcome these issues. The framework makes each boundary compact, readable, and controllable, and finds for each label a position that matches human visual preference. In the first step, we use a clustering algorithm based on the minimum spanning tree (MST) to further divide classes into clusters and remove class-level outliers, which avoids distorting the boundaries. A stroke-based interaction is integrated into the algorithm, allowing the user to quickly correct the identified clusters or to materialize the clusters they have in mind. In the second step, we design a grid-based boundary construction pipeline that lets the user tighten the boundary around the main distribution region of its class in a controlled manner by gradually filtering out cluster-level outliers. Gridding improves scalability with respect to the number of data points and helps users gain insights by generating different class distributions based on a relative or absolute density threshold. In the third step, we combine three factors (the boundary of the target cluster, the boundary of the label, and the density distribution of the target cluster) to place the label closer to its visually ideal position. Rich illustrations and two cases demonstrate the effectiveness of our methods.
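The first two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how MST edges are cut or how the grid is sized, so the fixed edge-length `cutoff`, the `min_size` outlier rule, the square `cell` size, and all function names below are assumptions chosen for illustration. The sketch builds a Euclidean MST with Prim's algorithm, cuts long edges to split one class into clusters, marks tiny components as outliers, and then keeps only the grid cells whose point count passes a relative density threshold.

```python
import math
from collections import Counter


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph, O(n^2).

    Returns a list of (length, i, j) edges forming a minimum spanning tree.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known link from the tree to each point
    parent = [-1] * n
    if n:
        best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def mst_clusters(points, cutoff, min_size=3):
    """Cut MST edges longer than `cutoff` (assumed rule) and label components.

    Components with fewer than `min_size` points are labelled -1 (outliers).
    """
    root = list(range(len(points)))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]   # path halving
            i = root[i]
        return i

    for d, i, j in mst_edges(points):
        if d <= cutoff:               # keep short edges, drop long ones
            root[find(i)] = find(j)
    roots = [find(i) for i in range(len(points))]
    sizes = Counter(roots)
    ids, labels = {}, []
    for r in roots:
        labels.append(-1 if sizes[r] < min_size else ids.setdefault(r, len(ids)))
    return labels


def dense_cells(points, cell, rel_threshold):
    """Bin points into square cells of side `cell`; keep cells whose count
    is at least `rel_threshold` times the count of the densest cell."""
    counts = Counter((math.floor(x / cell), math.floor(y / cell)) for x, y in points)
    if not counts:
        return set()
    limit = rel_threshold * max(counts.values())
    return {c for c, k in counts.items() if k >= limit}
```

For example, `mst_clusters(points, cutoff=2.0)` on two tight groups plus one distant point yields two cluster ids and a single `-1` label. Raising `rel_threshold` in `dense_cells` shrinks the retained region toward the densest part of a cluster, which mirrors the controlled tightening the abstract describes; tracing the outline of the retained cells would be a separate step not shown here.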

Notes

  1. https://github.com/AndrewB330/EuclideanMST.

  2. https://dblp.org.

  3. http://csrankings.org/.

Author information

Corresponding author

Correspondence to Jiawan Zhang.

Cite this article

Li, Z., Wang, T., Wang, M. et al. Construct boundaries and place labels for multi-class scatterplots. J Vis 25, 407–426 (2022). https://doi.org/10.1007/s12650-021-00791-x

  • DOI: https://doi.org/10.1007/s12650-021-00791-x
