A Heuristic Approach for On-line Discovery of Unidentified Spatial Clusters from Grid-Based Streaming Algorithms

  • Conference paper
Big Data Analytics and Knowledge Discovery (DaWaK 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9829)

Abstract

On-line spatial clustering of large position streams is useful for several applications, such as monitoring urban traffic and massive events. To detect these spatial clusters in real time, algorithms have explored grid-based approaches, which segment the spatial domain into discrete cells. The primary benefit of this approach is that it replaces the costly distance comparisons of density-based algorithms with counting the number of moving objects mapped to each cell. However, during this process, the algorithm may fail to identify clusters of spatially and temporally close moving objects that get mapped to adjacent cells. To overcome this answer loss problem, we propose a density heuristic that is sensitive to moving objects in adjacent cells. The heuristic further subdivides each cell into inner slots. Then, we calculate the density of a cell by adding the object count of the cell itself to the object count of the inner slots of its adjacent cells, using a weight function. To avoid side effects and the detection of incorrect clusters, we apply the heuristic only to transient cells, that is, cells whose density is less than, but close to, the density threshold value. We evaluate our approach using real-world datasets and explore how different transient thresholds and numbers of inner slots influence the similarity and the number of detected (correct and incorrect) and undetected clusters when compared to the baseline result.
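The heuristic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the slot layout, the weight function, or how "close to the threshold" is measured. The sketch assumes that each cell is split into a `slots` × `slots` sub-grid, that a slot's weight decays linearly with its distance from the border shared with the transient cell, and that a cell is transient when its count is at least `transient_ratio * threshold`. All function and parameter names (`locate`, `build_grid`, `border_depth`, `heuristic_density`, `transient_ratio`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def locate(x, y, cell_size, slots):
    """Map a position to its (cell, inner slot) indices."""
    cx, cy = math.floor(x / cell_size), math.floor(y / cell_size)
    slot_size = cell_size / slots
    sx = min(int((x - cx * cell_size) / slot_size), slots - 1)
    sy = min(int((y - cy * cell_size) / slot_size), slots - 1)
    return (cx, cy), (sx, sy)


def build_grid(points, cell_size, slots):
    """Count objects per inner slot of every non-empty cell."""
    grid = defaultdict(Counter)
    for x, y in points:
        cell, slot = locate(x, y, cell_size, slots)
        grid[cell][slot] += 1
    return grid


def border_depth(slot_index, direction, slots):
    """Slots between a neighbor's slot and the border it shares with the cell."""
    if direction == 0:
        return 0
    return slot_index if direction == 1 else slots - 1 - slot_index


def heuristic_density(grid, cell, threshold, transient_ratio, slots):
    """Cell count plus weighted neighbor slot counts, for transient cells only."""
    own = sum(grid.get(cell, {}).values())
    # Assumption: "close to the threshold" means at least transient_ratio * threshold.
    if not (transient_ratio * threshold <= own < threshold):
        return float(own)
    density = float(own)
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            neighbor = (cell[0] + dx, cell[1] + dy)
            for (sx, sy), count in grid.get(neighbor, {}).items():
                depth = max(border_depth(sx, dx, slots),
                            border_depth(sy, dy, slots))
                # Assumed linear weight: 1.0 at the shared border, 1/slots farthest away.
                density += count * (1.0 - depth / slots)
    return density
```

Under these assumptions, a cell holding three objects with a threshold of four would be reported as dense once enough objects sit in the neighboring slots nearest its border, whereas a cell whose count is far below the threshold is left untouched.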

Notes

  1. The crawled dataset is available for download and reproduction at: http://www.lac.inf.puc-rio.br/answerloss/.

Acknowledgment

This work was partly funded by CNPq under grants 557128/2009-9, 303332/2013-1, 442338/2014-7, by FAPERJ under grants E-26-170028/2008, E-26/201.337/2014 and E-01/209996/2015, and by Microsoft Research.

Corresponding author

Correspondence to Marcos Roriz Junior.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Roriz Junior, M., Endler, M., Casanova, M.A., Lopes, H., Silva e Silva, F. (2016). A Heuristic Approach for On-line Discovery of Unidentified Spatial Clusters from Grid-Based Streaming Algorithms. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science, vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_9

  • DOI: https://doi.org/10.1007/978-3-319-43946-4_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43945-7

  • Online ISBN: 978-3-319-43946-4

  • eBook Packages: Computer Science (R0)
