Abstract
Interval data, a special case of symbolic data, are becoming more frequent in many application fields, either because observations are uncertain or because large data volumes must be reduced. The objects in a dataset with two interval variables can be viewed as a set of rectangles in two-dimensional space, where each variable holds an interval describing one attribute of an object. Such a dataset can be called a rectangle dataset. In this paper, we introduce a new notion, the maximal frequent rectangle, which extends the notion of maximal frequent intervals, and we provide solutions for mining maximal frequent rectangles from a rectangle dataset. We also discuss some important properties of rectangles and of maximal frequent rectangles, with related mathematical proofs and probable applications.
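To make the notion concrete, the following is a minimal brute-force sketch and not the mining method of the paper. It assumes, by analogy with maximal frequent intervals, that a rectangle is frequent when at least `min_sup` rectangles of the dataset contain it, and maximal when no frequent rectangle properly contains it; the abstract does not state these definitions, so they are assumptions. The names `Rect`, `support`, and `maximal_frequent_rectangles` are hypothetical.

```python
from itertools import combinations
from typing import NamedTuple, Optional, Sequence


class Rect(NamedTuple):
    """Axis-parallel rectangle: one closed interval per interval variable."""
    x_lo: float
    x_hi: float
    y_lo: float
    y_hi: float


def contains(outer: Rect, inner: Rect) -> bool:
    """True if `inner` lies inside `outer` on both variables."""
    return (outer.x_lo <= inner.x_lo and inner.x_hi <= outer.x_hi
            and outer.y_lo <= inner.y_lo and inner.y_hi <= outer.y_hi)


def support(candidate: Rect, dataset: Sequence[Rect]) -> int:
    """Assumed definition: number of dataset rectangles containing `candidate`."""
    return sum(contains(r, candidate) for r in dataset)


def intersect(rects: Sequence[Rect]) -> Optional[Rect]:
    """Common region of `rects`, or None if they do not all overlap."""
    x_lo = max(r.x_lo for r in rects)
    x_hi = min(r.x_hi for r in rects)
    y_lo = max(r.y_lo for r in rects)
    y_hi = min(r.y_hi for r in rects)
    if x_lo > x_hi or y_lo > y_hi:
        return None
    return Rect(x_lo, x_hi, y_lo, y_hi)


def maximal_frequent_rectangles(dataset: Sequence[Rect], min_sup: int) -> list:
    """Brute force: under the assumed definition, every frequent rectangle
    lies inside the intersection of some `min_sup` dataset rectangles, so
    the maximal ones are the non-dominated intersections of that size."""
    candidates = set()
    for group in combinations(dataset, min_sup):
        region = intersect(group)
        if region is not None:
            candidates.add(region)
    return sorted(c for c in candidates
                  if not any(o != c and contains(o, c) for o in candidates))


dataset = [Rect(0, 4, 0, 4), Rect(2, 6, 1, 5), Rect(3, 8, 2, 7)]
result = maximal_frequent_rectangles(dataset, min_sup=2)
```

The enumeration grows combinatorially with the dataset size, so the sketch only illustrates the definitions on toy data; it says nothing about the efficiency of the solutions proposed in the paper.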
Cite this article
Hazarika, I., Mahanta, A.K. Mining maximal frequent rectangles. Adv Data Anal Classif 16, 593–616 (2022). https://doi.org/10.1007/s11634-021-00451-w