Abstract
Spatio-temporal datasets are often very large and difficult to analyse. There has recently been considerable interest in data-mining techniques that reduce very large spatio-temporal datasets to relevant subsets and help visualisation tools display the results effectively. Cluster-based mining methods have proven successful at reducing large volumes of raw data by extracting useful knowledge in the form of representatives. Consequently, instead of handling the raw data directly, we can use these representatives to visualise or analyse the data without losing important information. In this paper, we present a new hybrid approach for reducing large spatio-temporal datasets. The approach combines density-based and graph-based clustering. Drawing on the Shared Nearest Neighbour concept, it uses the Euclidean distance to determine nearest-neighbour similarity. We also present and discuss an evaluation of the results obtained with this approach.
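To make the Shared Nearest Neighbour idea concrete, the following minimal sketch computes a Jarvis-Patrick style shared-neighbour count over k-nearest-neighbour lists built with the Euclidean distance. It is an illustration under stated assumptions, not the method evaluated in the paper: the function names `knn_lists` and `snn_similarity`, the value of `k`, the sample points, and the choice to treat time as a third coordinate in the same Euclidean distance are all hypothetical.

```python
import math


def knn_lists(points, k):
    """For each point, return the set of indices of its k nearest
    neighbours under the Euclidean distance (brute force, O(n^2 log n))."""
    neighbours = []
    for i, p in enumerate(points):
        # Rank every other point by its Euclidean distance to p.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbours.append(set(others[:k]))
    return neighbours


def snn_similarity(neighbours, i, j):
    """Shared Nearest Neighbour similarity: the number of neighbours that
    i and j have in common, counted only when each point appears in the
    other's k-nearest-neighbour list; otherwise 0."""
    if i not in neighbours[j] or j not in neighbours[i]:
        return 0
    return len(neighbours[i] & neighbours[j])


# Hypothetical (x, y, t) samples forming two well-separated groups.
# Assumption: space and time share one scale, so a single Euclidean
# distance over all three coordinates is meaningful.
points = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.0),
    (5.0, 5.0, 1.0), (5.1, 5.0, 1.0), (5.0, 5.1, 1.0), (5.1, 5.1, 1.0),
]

nn = knn_lists(points, k=3)
same_group = snn_similarity(nn, 0, 1)   # points in the same group
cross_group = snn_similarity(nn, 0, 4)  # points in different groups
print(same_group, cross_group)
```

Points in the same group share neighbours and receive a positive similarity, while points from different groups score zero. In a hybrid scheme of the kind the abstract describes, such similarities could define the graph on which density is then estimated; how the paper actually does this is not stated here.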
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Whelan, M., Le-Khac, NA., Kechadi, M.T. (2011). A New Hybrid Clustering Method for Reducing Very Large Spatio-temporal Dataset. In: Tang, J., King, I., Chen, L., Wang, J. (eds) Advanced Data Mining and Applications. ADMA 2011. Lecture Notes in Computer Science, vol 7120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25853-4_6
DOI: https://doi.org/10.1007/978-3-642-25853-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25852-7
Online ISBN: 978-3-642-25853-4
eBook Packages: Computer Science (R0)