On detecting space-time clusters

Author:
Vijay S. Iyengar

Thomas J. Watson Research Center, Yorktown Heights, NY


KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 2004, Pages 587–592
https://doi.org/10.1145/1014052.1014124

Published: 22 August 2004

ABSTRACT

Detection of space-time clusters is an important function in various domains (e.g., epidemiology and public health). The pioneering work on the spatial scan statistic is often used as the basis to detect and evaluate such clusters. State-of-the-art systems based on this approach detect clusters with restrictive shapes that cannot model growth and shifts in location over time. We extend these methods significantly by using the flexible square pyramid shape to model such effects. A heuristic search method is developed to detect the most likely clusters using a randomized algorithm in combination with geometric shape processing. The use of Monte Carlo methods in the original scan statistic formulation is continued in our work to address the multiple hypothesis testing issues. Our method is applied to a real data set on brain cancer occurrences over a 19-year period. The cluster detected by our method shows both growth and movement, which could not have been modeled with the simpler cylindrical shapes used earlier. Our general framework can be extended quite easily to handle other flexible shapes for the space-time clusters.
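
The ingredients named in the abstract (a square pyramid region in space-time, a scan-statistic score, and Monte Carlo evaluation) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the pyramid parameterization (an axis-aligned square that morphs linearly between a start and an end time), all function names, and the toy data are assumptions made for illustration. The score is the standard Poisson log-likelihood ratio of the spatial scan statistic, and the paper's randomized heuristic search is not reproduced here; the caller simply supplies the candidate regions.

```python
import math
import random


def in_square_pyramid(x, y, t, t0, t1, c0, h0, c1, h1):
    """True if (x, y, t) lies inside a truncated square pyramid.

    Assumed parameterization (hypothetical, not from the paper): an
    axis-aligned square with center c0 and half-width h0 at time t0 that
    changes linearly into center c1 and half-width h1 at time t1.
    """
    if not (t0 <= t <= t1):
        return False
    f = 0.0 if t1 == t0 else (t - t0) / (t1 - t0)
    cx = c0[0] + f * (c1[0] - c0[0])
    cy = c0[1] + f * (c1[1] - c0[1])
    h = h0 + f * (h1 - h0)
    return abs(x - cx) <= h and abs(y - cy) <= h


def poisson_llr(cases_in, expected_in, cases_total):
    """Poisson scan-statistic log-likelihood ratio for one region.

    Assumes expected counts are scaled to sum to cases_total.
    Regions without an excess of cases score 0.
    """
    c, e, n = cases_in, expected_in, cases_total
    if e <= 0 or c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if c < n:
        llr += (n - c) * math.log((n - c) / (n - e))
    return llr


def max_llr(cases, expected, regions):
    """Best score over all candidate regions (each a list of cell indices)."""
    n = sum(cases)
    return max(
        poisson_llr(sum(cases[i] for i in r), sum(expected[i] for i in r), n)
        for r in regions
    )


def monte_carlo_p_value(cases, expected, regions, replicas=999, seed=0):
    """Rank the observed maximum score among replicas drawn under the null.

    Comparing maxima over all candidates is what accounts for testing
    many regions at once.
    """
    rng = random.Random(seed)
    n = sum(cases)
    total = sum(expected)
    expected = [e * n / total for e in expected]  # scale to the case total
    observed = max_llr(cases, expected, regions)
    exceed = 0
    for _ in range(replicas):
        sim = [0] * len(cases)
        # Null hypothesis: each case falls in a cell in proportion to its
        # expected count.
        for i in rng.choices(range(len(cases)), weights=expected, k=n):
            sim[i] += 1
        if max_llr(sim, expected, regions) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (replicas + 1)


if __name__ == "__main__":
    # Hypothetical toy data: a 3x3 grid of cells observed at 3 time steps.
    cells = [(x, y, t) for t in range(3) for x in range(3) for y in range(3)]
    pyramid = [i for i, (x, y, t) in enumerate(cells)
               if in_square_pyramid(x, y, t, 0, 2, (0, 0), 0, (1, 1), 1)]
    # A fixed square over all time steps, standing in for a static shape.
    prism = [i for i, (x, y, t) in enumerate(cells)
             if in_square_pyramid(x, y, t, 0, 2, (0.5, 0.5), 0.5, (0.5, 0.5), 0.5)]
    cases = [3 if i in pyramid else 1 for i in range(len(cells))]
    expected = [1.0] * len(cells)
    print(monte_carlo_p_value(cases, expected, [pyramid, prism]))
```

In the toy run the excess cases are planted inside a region that both grows and shifts, so the pyramid candidate scores higher than the fixed square. A real application would replace the two hand-picked candidates with a search over many pyramids.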

Index Terms

On detecting space-time clusters
1. Information systems
  1. Information systems applications
    1. Data mining

Published in
KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
August 2004, 874 pages
ISBN: 1581138881
DOI: 10.1145/1014052
General Chairs: Won Kim (Cyber Database Solutions), Ronny Kohavi (Amazon.com)
Program Chairs: Johannes Gehrke (Cornell University), William DuMouchel (AT&T Labs Research)
Copyright © 2004 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Monte Carlo
clusters
search
space-time region
spatial scan statistic
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%