Abstract
Given a set of samples, each with one numeric variable and a set of nominal variables, we address the problem of constructing a histogram of K bins with variable widths. The histogram should use many narrow bins in ranges where the numeric values are densely distributed and change substantially, and a few wide bins elsewhere, and each bin should be annotated with the nominal values that characterize it. To this end, we propose a new method that combines change point detection on the numeric values, based on an L1 or L2 error criterion, with identification of annotation terms for the bins, based on z-scores with respect to the distribution of nominal values. In experiments on four datasets of humidity deficit (HD) collected from vinyl greenhouses, we show that (1) the proposed method constructs more natural histograms, with appropriate variable bin widths, than the standard equal-width method based on the square-root choice or Sturges’ formula; (2) histograms constructed with the L1 error criterion have more desirable properties than those constructed with the L2 error criterion; and (3) the method produces a series of naturally interpretable annotation terms for the constructed bins.
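To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It computes the two equal-width baselines (square-root choice and Sturges’ formula) and then splits the sorted values into K contiguous bins that minimize a within-bin L1 or L2 error by dynamic programming. The function names, the use of the bin median (L1) or mean (L2) as the bin representative, and the dynamic-programming search are all assumptions made for illustration; the abstract does not specify how the change points are found.

```python
import math
from statistics import mean, median


def sqrt_choice(n):
    """Square-root choice: number of equal-width bins for n samples."""
    return math.ceil(math.sqrt(n))


def sturges(n):
    """Sturges' formula: number of equal-width bins for n samples."""
    return math.ceil(math.log2(n)) + 1


def segment_cost(values, criterion):
    """Within-bin error (hypothetical definition, assumed for this sketch).

    L2: sum of squared deviations from the bin mean.
    L1: sum of absolute deviations from the bin median.
    """
    if criterion == "L2":
        center = mean(values)
        return sum((v - center) ** 2 for v in values)
    center = median(values)
    return sum(abs(v - center) for v in values)


def variable_width_bins(samples, k, criterion="L1"):
    """Split sorted samples into k contiguous bins minimizing total error."""
    xs = sorted(samples)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of samples")

    # Precompute the cost of every contiguous segment xs[i:j].
    # This is cubic in n, which is acceptable only for small illustrations.
    cost = {}
    for i in range(n):
        for j in range(i + 1, n + 1):
            cost[i, j] = segment_cost(xs[i:j], criterion)

    inf = float("inf")
    # best[b][j]: minimal error when the first j values are split into b bins.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for b in range(1, k + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                candidate = best[b - 1][i] + cost[i, j]
                if candidate < best[b][j]:
                    best[b][j] = candidate
                    back[b][j] = i  # start index of the last bin

    # Walk the back-pointers to recover the bins from right to left.
    bins = []
    j = n
    for b in range(k, 0, -1):
        i = back[b][j]
        bins.append(xs[i:j])
        j = i
    bins.reverse()
    return bins


if __name__ == "__main__":
    data = [1.0, 1.1, 1.2, 5.0, 5.1, 9.0]
    print(sqrt_choice(len(data)), sturges(len(data)))
    print(variable_width_bins(data, 3, criterion="L1"))
```

The bin edges of such a histogram can be taken between the last value of one bin and the first value of the next, so dense regions naturally receive narrower bins than sparse ones.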
Acknowledgments
This material is based upon work supported by JSPS Grant-in-Aid for Scientific Research (C) (No. 18K11441), (B) (No. 17H01826) and Early-Career Scientists (No. 19K20417).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Fushimi, T., Iwasaki, K., Okubo, S., Saito, K. (2019). Construction of Histogram with Variable Bin-Width Based on Change Point Detection. In: Kralj Novak, P., Šmuc, T., Džeroski, S. (eds) Discovery Science. DS 2019. Lecture Notes in Computer Science, vol 11828. Springer, Cham. https://doi.org/10.1007/978-3-030-33778-0_4
DOI: https://doi.org/10.1007/978-3-030-33778-0_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33777-3
Online ISBN: 978-3-030-33778-0
eBook Packages: Computer Science, Computer Science (R0)