Online and Offline Trend Cluster Discovery in Spatially Distributed Data Streams

  • Conference paper
Analysis of Social Media and Ubiquitous Data (MUSE 2010, MSM 2010)

Abstract

Emerging real-life applications, such as environmental compliance, ecological studies and meteorology, are characterized by real-time data acquisition through remote sensor networks. The most important aspect of the sensor readings is that they comprise a space dimension and a time dimension, both of which are information bearing. Additionally, they usually arrive at a rapid rate in a continuous, unbounded stream. Streaming prevents us from storing all readings and performing multiple scans of the entire data set. The drift of the data distribution poses the additional problem of mining patterns which may change over time. We address these challenges for trend cluster discovery, that is, the discovery of clusters of spatially close sensors which transmit readings whose temporal variation, called the trend polyline, is similar along the time horizon of a window. We present a stream framework which segments the stream into equally-sized windows, computes online intra-window trend clusters and stores these trend clusters in a database. Trend clusters are queried offline at any time, to determine trend clusters along larger windows (i.e. windows of windows). Experiments with several streams demonstrate the effectiveness of the proposed framework in discovering trend clusters that are accurate and relevant to humans.
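The online step described in the abstract (cut the stream into equally sized windows, then group spatially close sensors whose readings vary similarly within each window) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names `windows` and `trend_clusters`, the `radius` and `delta` thresholds, the point-wise similarity test and the use of connected components are all assumptions made for illustration. Each snapshot is assumed to be a dict mapping a sensor id to its reading at one time point.

```python
from itertools import islice
from math import dist


def windows(stream, size):
    """Yield consecutive, equally sized windows (lists of snapshots)."""
    it = iter(stream)
    while True:
        window = list(islice(it, size))
        if len(window) < size:
            return  # assumption: an incomplete trailing window is dropped
        yield window


def trend_clusters(window, coords, radius, delta):
    """Group sensors into (members, trend polyline) pairs for one window.

    Two sensors are linked when they lie within `radius` of each other and
    their readings differ by at most `delta` at every time point of the
    window (both thresholds are hypothetical parameters). Clusters are the
    connected components of that link relation, so two members of a cluster
    may be similar only through a chain of intermediate sensors.
    """
    sensors = sorted(coords)
    polyline = {s: [snapshot[s] for snapshot in window] for s in sensors}

    def similar(a, b):
        return all(abs(x - y) <= delta
                   for x, y in zip(polyline[a], polyline[b]))

    unseen = set(sensors)
    clusters = []
    while unseen:
        seed = min(unseen)
        unseen.remove(seed)
        members, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            for other in sorted(unseen):
                close = dist(coords[current], coords[other]) <= radius
                if close and similar(current, other):
                    unseen.remove(other)
                    members.append(other)
                    frontier.append(other)
        members.sort()
        # Represent the cluster by the mean reading at each time point.
        trend = [sum(polyline[s][t] for s in members) / len(members)
                 for t in range(len(window))]
        clusters.append((members, trend))
    return clusters


if __name__ == "__main__":
    coords = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (10.0, 0.0)}
    stream = [
        {"a": 1.0, "b": 1.1, "c": 1.0},
        {"a": 2.0, "b": 2.1, "c": 2.0},
    ]
    for window in windows(stream, size=2):
        for members, trend in trend_clusters(window, coords, 2.0, 0.5):
            print(members, trend)
```

In this toy input, sensors `a` and `b` are neighbours with near-identical readings and end up in one cluster, while `c` has similar readings but is too far away to join them. Storing each `(members, trend)` pair per window is the kind of compact summary that an offline query over larger windows could then combine.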

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ciampi, A., Appice, A., Malerba, D. (2011). Online and Offline Trend Cluster Discovery in Spatially Distributed Data Streams. In: Atzmueller, M., Hotho, A., Strohmaier, M., Chin, A. (eds) Analysis of Social Media and Ubiquitous Data. MUSE 2010, MSM 2010. Lecture Notes in Computer Science, vol 6904. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23599-3_8

  • DOI: https://doi.org/10.1007/978-3-642-23599-3_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23598-6

  • Online ISBN: 978-3-642-23599-3

  • eBook Packages: Computer Science, Computer Science (R0)
