Abstract
Multivariate histogram snapshots are complex data structures that frequently occur in predictive maintenance. They let devices with small memory capacity store large amounts of data, but analyzing them effectively remains a challenge. In this paper, we propose Z-Hist, a novel framework for representing and temporally abstracting histogram snapshots by converting them into a set of temporal intervals. This conversion makes it possible to apply frequent arrangement mining techniques to extract disproportionally frequent patterns from such complex structures. Our experiments on a turbo failure dataset from a truck Original Equipment Manufacturer (OEM) demonstrate a promising use case of Z-Hist. We also benchmark Z-Hist on six synthetic datasets to study the relationship between distribution changes over time and disproportionality values.
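To make the idea of converting histogram snapshots into temporal intervals concrete, the following is a minimal sketch and not the Z-Hist algorithm itself, whose details the abstract does not give. It assumes that each snapshot is a list of bin counts for one variable, that each bin's share of the total is discretized into three hypothetical labels (`low`, `mid`, `high`) using illustrative cut points, and that consecutive snapshots with the same label are merged into one interval. The function names, the label scheme, and the thresholds are all assumptions made for illustration.

```python
from typing import List, Tuple

# (event label, first snapshot index, last snapshot index), both inclusive
Interval = Tuple[str, int, int]


def normalize(hist: List[float]) -> List[float]:
    """Turn raw bin counts into shares that sum to 1 (all zeros if empty)."""
    total = sum(hist)
    if total == 0:
        return [0.0] * len(hist)
    return [count / total for count in hist]


def label_for(share: float, cuts: Tuple[float, float] = (0.2, 0.5)) -> str:
    """Discretize a bin share; the cut points are illustrative assumptions."""
    if share < cuts[0]:
        return "low"
    if share < cuts[1]:
        return "mid"
    return "high"


def snapshots_to_intervals(snapshots: List[List[float]], variable: str) -> List[Interval]:
    """Merge runs of identical labels per bin into temporal intervals."""
    intervals: List[Interval] = []
    n_bins = len(snapshots[0]) if snapshots else 0
    for b in range(n_bins):
        prev, start = None, 0
        for t, snap in enumerate(snapshots):
            lab = label_for(normalize(snap)[b])
            if prev is None:
                prev, start = lab, t
            elif lab != prev:
                # The label changed, so close the interval that just ended.
                intervals.append((f"{variable}.bin{b}.{prev}", start, t - 1))
                prev, start = lab, t
        if prev is not None:
            # Close the interval that runs to the last snapshot.
            intervals.append((f"{variable}.bin{b}.{prev}", start, len(snapshots) - 1))
    return intervals


# Three snapshots of a hypothetical two-bin "speed" histogram.
example = snapshots_to_intervals([[9, 1], [9, 1], [3, 7]], "speed")
for interval in example:
    print(interval)
```

Intervals of this form, collected across all variables of a device, are the kind of input that frequent arrangement mining operates on. How bins are labeled and merged in practice would follow the paper rather than this sketch.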
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Lee, Z., Anton, N., Papapetrou, P., Lindgren, T. (2021). Z-Hist: A Temporal Abstraction of Multivariate Histogram Snapshots. In: Abreu, P.H., Rodrigues, P.P., Fernández, A., Gama, J. (eds) Advances in Intelligent Data Analysis XIX. IDA 2021. Lecture Notes in Computer Science, vol 12695. Springer, Cham. https://doi.org/10.1007/978-3-030-74251-5_30
DOI: https://doi.org/10.1007/978-3-030-74251-5_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-74250-8
Online ISBN: 978-3-030-74251-5
eBook Packages: Computer Science, Computer Science (R0)