OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Detecting outliers in streaming time series data from ARM distributed sensors

Conference

The Atmospheric Radiation Measurement (ARM) Data Center at ORNL collects data from a number of permanent and mobile facilities around the globe. The data are then ingested to create high-level scientific products. High-frequency streaming measurements from sensors and radar instruments at ARM sites require a high degree of accuracy to enable rigorous study of atmospheric processes. Outliers in the collected data are common because of instrument failure or extreme weather events, so it is critical to identify and flag them. We employed multiple univariate, multivariate, and time series techniques for outlier detection and studied their effectiveness. First, we examined the Pearson correlation coefficient, which measures the pairwise correlations between variables. Singular Spectrum Analysis (SSA) was applied to detect outliers by removing the anticipated annual and seasonal cycles from the signal to accentuate anomalies. K-means clustering was applied for multivariate examination of data from a collection of sensors to identify deviations from expected and known patterns and to flag abnormal observations. The Pearson correlation coefficient, SSA, and K-means methods were then combined in a framework that detects outliers through a range of checks. We applied the developed method to data from meteorological sensors at the ARM Southern Great Plains site and validated the results against an existing database of known data quality issues.
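
As an illustration of the SSA step described in the abstract, the following is a minimal Python sketch and not the authors' implementation. It assumes NumPy, a gap-free one-dimensional series, and hypothetical choices for the window length, the number of retained components, and a robust z-score threshold applied to the residual; none of these values come from this record. The function names `ssa_reconstruct` and `flag_outliers` are illustrative only.

```python
import numpy as np


def ssa_reconstruct(x, window, rank):
    """Reconstruct a series from its leading `rank` SSA components."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    # Keep only the leading components (the expected cyclic signal).
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Diagonal averaging maps the low-rank matrix back to a series.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        recon[j:j + window] += low_rank[:, j]
        counts[j:j + window] += 1
    return recon / counts


def flag_outliers(x, window=48, rank=2, threshold=5.0):
    """Flag points whose SSA residual is far from the residual median.

    The threshold is in units of a MAD-based scale estimate
    (an assumed rule, chosen here only for illustration).
    """
    x = np.asarray(x, dtype=float)
    resid = x - ssa_reconstruct(x, window, rank)
    med = np.median(resid)
    scale = 1.4826 * np.median(np.abs(resid - med))
    if scale == 0:
        return np.zeros(x.size, dtype=bool)
    return np.abs(resid - med) > threshold * scale


if __name__ == "__main__":
    # Synthetic example: a 24-sample cycle with one injected spike.
    rng = np.random.default_rng(0)
    t = np.arange(480)
    series = 5.0 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.1, t.size)
    series[200] += 10.0
    print(np.flatnonzero(flag_outliers(series)))
```

Removing the reconstructed cycle before thresholding is what lets a spike stand out even when it is smaller than the normal cyclic range. Points adjacent to a large spike may also be flagged, because the spike leaks slightly into the reconstruction within one window length.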

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1491320
Resource Relation:
Conference: IEEE International Conference on Data Mining Workshops - Singapore, Singapore - 11/17/2018 10:00:00 AM-11/20/2018 10:00:00 AM
Country of Publication:
United States
Language:
English

Similar Records

Real-time statistical quality control and ARM
Conference · Fri May 01 00:00:00 EDT 1992 · OSTI ID:1491320

Analysis and Calibration of CRF Raman Lidar Cloud Liquid Water Measurements
Technical Report · Wed Oct 31 00:00:00 EDT 2007 · OSTI ID:1491320
