HIL-Tree: A Hierarchical Structure for Guiding Search into Test and Measurement Data Archives

Data Mining and Knowledge Discovery

Abstract

This paper describes a novel algorithm that uses discontinuity detection to discover index vectors in test and measurement data archives containing multidimensional data. The index vectors are generated from individual data series in the archive and hold the locations of jumps and changes in trend (discontinuities). The vectors are related hierarchically, based on how their location information aligns, to form a tree-like structure. We call such trees Hierarchical Index Locations trees (HIL-trees); they are useful for guiding navigation into the raw data and for speeding up the retrieval of data subsets that match given criteria. To demonstrate the practical value of the algorithm, we present a case study in which it is applied to real automotive emission test data archives. We also compare the HIL-tree with the well-known R-tree index structure and show that HIL-trees are advantageous in many respects.
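
The abstract gives only the outline of the method, so the following is a minimal illustrative sketch and not the paper's algorithm. It assumes a simple first-difference threshold as the discontinuity detector and a hypothetical nesting rule: a series is placed under the series with the most discontinuity locations that all reappear, within a tolerance, in its own index vector. The function names, the threshold, and the sample series are all invented for illustration.

```python
from typing import Dict, List, Optional


def index_vector(series: List[float], threshold: float) -> List[int]:
    """Positions i where |series[i] - series[i-1]| exceeds the threshold.

    Assumption: a first-difference test stands in for the paper's
    discontinuity detector, which the abstract does not specify.
    """
    return [i for i in range(1, len(series))
            if abs(series[i] - series[i - 1]) > threshold]


def aligned(coarse: List[int], fine: List[int], tol: int = 0) -> bool:
    """True if every location in `coarse` has a match in `fine` within tol samples."""
    return all(any(abs(c - f) <= tol for f in fine) for c in coarse)


def build_hierarchy(vectors: Dict[str, List[int]],
                    tol: int = 0) -> Dict[str, Optional[str]]:
    """Map each series name to its parent name (None for a root).

    Assumed rule (hypothetical): the parent is the non-empty vector with the
    most locations among those that are strictly shorter than the child and
    whose locations all align with the child's.
    """
    parents: Dict[str, Optional[str]] = {}
    for name, vec in vectors.items():
        candidates = [other for other, ovec in vectors.items()
                      if other != name
                      and 0 < len(ovec) < len(vec)
                      and aligned(ovec, vec, tol)]
        parents[name] = max(candidates,
                            key=lambda o: len(vectors[o]),
                            default=None)
    return parents


# Invented sample series, one value per sample index.
archive = {
    "speed": [0, 0, 0, 5, 5, 5, 5, 9, 9, 9],
    "rpm":   [1, 1, 1, 4, 4, 4, 4, 4, 4, 4],
    "temp":  [2, 2, 2, 2, 2, 6, 6, 6, 6, 6],
    "co":    [0, 0, 0, 3, 3, 7, 7, 9, 9, 9],
}
vectors = {name: index_vector(s, threshold=1.0) for name, s in archive.items()}
hierarchy = build_hierarchy(vectors)
```

Under these assumptions, a query for a region of interest could start at a root vector and descend only into series whose discontinuities align with it, instead of scanning every raw series.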

Author information

Corresponding author

Correspondence to Hassan A. Artail.

Additional information

Hassan Artail worked as a system development supervisor at the Scientific Labs of DaimlerChrysler, Michigan, before joining AUB in 2001. At DaimlerChrysler, he worked for 11 years in software and system development for vehicle testing applications, covering instrument control, computer networking, distributed computing, data acquisition, and data processing. He obtained a B.S. and an M.S. in Electrical Engineering from the University of Detroit in 1985 and 1986, respectively, and a Ph.D. from Wayne State University in 1999. His research interests are in Internet and mobile computing, distributed computing and systems, and computer and network security.

About this article

Cite this article

Artail, H.A. HIL-Tree: A Hierarchical Structure for Guiding Search into Test and Measurement Data Archives. Data Min Knowl Disc 10, 229–250 (2005). https://doi.org/10.1007/s10618-005-0388-5

  • DOI: https://doi.org/10.1007/s10618-005-0388-5
