Spatial neighborhood based anomaly detection in sensor datasets

Data Mining and Knowledge Discovery

Abstract

The success of anomaly detection, like that of other spatial data mining techniques, depends on how the neighborhood is defined. In this paper, we argue that the anomalous behavior of spatial objects in a neighborhood can be captured faithfully when both (a) spatial autocorrelation (similar behavior of nearby objects due to proximity) and (b) spatial heterogeneity (distinct behavior of nearby objects due to differences in the underlying processes in the region) are taken into account in the neighborhood definition. Our approach begins by generating micro neighborhoods around spatial objects, each encompassing all the information about its spatial object. We then selectively merge these micro neighborhoods into macro neighborhoods, using spatial relationships to account for autocorrelation and inferential relationships to account for heterogeneity. Within these macro neighborhoods, we identify (i) spatio-temporal outliers, where individual sensor readings are anomalous, (ii) spatial outliers, where the entire sensor is an anomaly, and (iii) spatio-temporally coalesced outliers, where a group of spatio-temporal outliers in the macro neighborhood are separated by a small time lag, indicating the traversal of the anomaly. We demonstrate the effectiveness of our approach in neighborhood formation and anomaly detection with experimental results on (i) water monitoring and (ii) highway traffic monitoring sensor datasets. We also compare the results of our approach with an existing approach for spatial anomaly detection.
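
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not state the merge criteria or outlier tests, so the sketch assumes a Euclidean distance cutoff as a stand-in for the spatial relationship, a gap between per-sensor medians as a stand-in for the inferential relationship, and a fixed deviation from the neighborhood median as the outlier test. All function names, parameters (`max_dist`, `max_gap`, `threshold`, `max_lag`), and data values are hypothetical.

```python
from statistics import median


def build_macro_neighborhoods(positions, readings, max_dist, max_gap):
    """Merge single-sensor micro neighborhoods into macro neighborhoods.

    Assumption: two sensors are merged when they are within max_dist of each
    other (proxy for autocorrelation) and their median readings differ by at
    most max_gap (proxy for belonging to the same underlying process).
    """
    n = len(positions)
    parent = list(range(n))  # union-find forest, one tree per neighborhood

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    centers = [median(r) for r in readings]
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            close = (dx * dx + dy * dy) ** 0.5 <= max_dist
            similar = abs(centers[i] - centers[j]) <= max_gap
            if close and similar:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def spatio_temporal_outliers(group, readings, threshold):
    """Flag (sensor, time) pairs that deviate from the neighborhood median."""
    flagged = []
    if len(group) < 3:  # too few sensors for a meaningful median
        return flagged
    for t in range(len(readings[group[0]])):
        center = median(readings[i][t] for i in group)
        for i in group:
            if abs(readings[i][t] - center) > threshold:
                flagged.append((i, t))
    return flagged


def coalesce(flagged, max_lag):
    """Chain flagged readings whose consecutive times differ by <= max_lag.

    Only chains that span more than one sensor are kept, as a rough proxy
    for an anomaly travelling through the neighborhood.
    """
    chains, current = [], []
    for sensor, t in sorted(flagged, key=lambda p: p[1]):
        if current and t - current[-1][1] > max_lag:
            chains.append(current)
            current = []
        current.append((sensor, t))
    if current:
        chains.append(current)
    return [c for c in chains if len({s for s, _ in c}) > 1]


# Hypothetical data: five sensors on a line; sensors 3 and 4 are spatially
# adjacent to sensor 2 but follow a different process (readings near 50).
positions = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
readings = [
    [10, 11, 10, 10],
    [10, 10, 30, 11],  # spike at time index 2
    [11, 10, 10, 10],
    [50, 51, 50, 49],
    [49, 50, 51, 50],
]
macro = build_macro_neighborhoods(positions, readings, max_dist=1.5, max_gap=5)
outliers = spatio_temporal_outliers(macro[0], readings, threshold=10)
```

In this toy input, sensors 2 and 3 are neighbors by distance but are not merged because their readings differ too much, which is the role heterogeneity plays in the neighborhood definition. A purely distance-based neighborhood would place all five sensors together and blur the spike at sensor 1.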



Author information

Corresponding author

Correspondence to Jaideep Vaidya.

Additional information

Responsible editor: Sanjay Chawla.

This work is supported in part by the National Science Foundation under grants IIS-0306838 and CNS-0746943.

About this article

Cite this article

Janeja, V.P., Adam, N.R., Atluri, V. et al. Spatial neighborhood based anomaly detection in sensor datasets. Data Min Knowl Disc 20, 221–258 (2010). https://doi.org/10.1007/s10618-009-0147-0

  • DOI: https://doi.org/10.1007/s10618-009-0147-0
