Abstract
On-line spatial clustering of large position streams is useful for several applications, such as monitoring urban traffic and massive events. To detect these spatial clusters in real time, algorithms have explored grid-based approaches, which segment the spatial domain into discrete cells. The primary benefit of this approach is that it replaces the costly distance comparisons of density-based algorithms with counting the number of moving objects mapped to each cell. However, during this process, the algorithm may fail to identify clusters of spatially and temporally close moving objects that get mapped to adjacent cells. To overcome this answer loss problem, we propose a density heuristic that is sensitive to moving objects in adjacent cells. The heuristic further subdivides each cell into inner slots. Then, we calculate the density of a cell by adding the object count of the cell itself to the object count of the inner slots of its adjacent cells, using a weight function. To avoid collateral effects and the detection of incorrect clusters, we apply the heuristic only to transient cells, that is, cells whose density is less than, but close to, the density threshold value. We evaluate our approach using real-world datasets and explore how different transient thresholds and numbers of inner slots influence the similarity and the number of detected (correct and incorrect) and undetected clusters when compared to the baseline result.
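The heuristic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the slot layout, the weight function, or how "close to the threshold" is defined. The sketch assumes each cell is split into a `slots` x `slots` sub-grid, that only the neighbour slots touching the cell are borrowed, that the weight is a constant `weight`, and that a cell is transient when its count is at least `transient_ratio * threshold` but below `threshold`. All function and parameter names are hypothetical.

```python
from collections import Counter


def slot_of(x, y, cell_size, slots):
    """Map a position to (cell index, inner-slot index inside that cell)."""
    cx, cy = int(x // cell_size), int(y // cell_size)
    slot_size = cell_size / slots
    sx = min(int((x - cx * cell_size) // slot_size), slots - 1)
    sy = min(int((y - cy * cell_size) // slot_size), slots - 1)
    return (cx, cy), (sx, sy)


def count_objects(points, cell_size, slots):
    """Count objects per cell and per (cell, inner slot)."""
    cell_counts, slot_counts = Counter(), Counter()
    for x, y in points:
        cell, slot = slot_of(x, y, cell_size, slots)
        cell_counts[cell] += 1
        slot_counts[cell, slot] += 1
    return cell_counts, slot_counts


def facing(offset, slots):
    """Slot indices of a neighbour, along one axis, that touch the cell (assumed layout)."""
    if offset == 1:
        return [0]            # neighbour on the positive side: its first slot
    if offset == -1:
        return [slots - 1]    # neighbour on the negative side: its last slot
    return range(slots)       # same row or column: every slot along this axis


def adjusted_density(cell, cell_counts, slot_counts, slots,
                     threshold, transient_ratio, weight):
    """Cell count plus weighted counts of the touching slots of adjacent cells.

    The adjustment is applied only to transient cells (assumed definition:
    transient_ratio * threshold <= count < threshold).
    """
    own = cell_counts[cell]
    if not (transient_ratio * threshold <= own < threshold):
        return own
    borrowed = 0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            neighbour = (cell[0] + dx, cell[1] + dy)
            for sx in facing(dx, slots):
                for sy in facing(dy, slots):
                    borrowed += slot_counts[neighbour, (sx, sy)]
    return own + weight * borrowed


# Three objects in cell (0, 0); two objects just across the border in cell (1, 0).
points = [(8, 5), (9, 5), (9, 6), (11, 5), (12, 5), (18, 5)]
cell_counts, slot_counts = count_objects(points, cell_size=10, slots=2)
density = adjusted_density((0, 0), cell_counts, slot_counts, slots=2,
                           threshold=5, transient_ratio=0.5, weight=1.0)
```

In this toy input, cell (0, 0) holds three objects, which is below the assumed threshold of 5. Borrowing the two objects in the touching slots of cell (1, 0) raises its adjusted density to 5.0, so a cluster split across the cell border would no longer be missed under these assumptions.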
Notes
1. The crawled dataset is available for download and reproduction at: http://www.lac.inf.puc-rio.br/answerloss/.
Acknowledgment
This work was partly funded by CNPq under grants 557128/2009-9, 303332/2013-1, 442338/2014-7, by FAPERJ under grants E-26-170028/2008, E-26/201.337/2014 and E-01/209996/2015, and by Microsoft Research.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Roriz Junior, M., Endler, M., Casanova, M.A., Lopes, H., Silva e Silva, F. (2016). A Heuristic Approach for On-line Discovery of Unidentified Spatial Clusters from Grid-Based Streaming Algorithms. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science, vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_9
DOI: https://doi.org/10.1007/978-3-319-43946-4_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43945-7
Online ISBN: 978-3-319-43946-4
eBook Packages: Computer Science, Computer Science (R0)