ABSTRACT
The advent of smart meters and advanced communication infrastructures catalyzes numerous smart grid applications such as dynamic demand response, and paves the way to solve challenging research problems in sustainable energy consumption. The space of solution possibilities is restricted primarily by the huge amount of generated data, which requires considerable computational resources and efficient algorithms. To overcome this Big Data challenge, data clustering techniques have been proposed. Current approaches, however, do not scale in the face of the "increasing dimensionality" problem, where a cluster point is represented by the entire customer consumption time series. To overcome this problem, we first rethink the way cluster points are created and designed, and then devise OPTIC, an efficient online time series clustering technique for demand response (DR), in order to analyze high-volume, high-dimensional energy consumption time series data at scale and on the fly. OPTIC is randomized in nature and provides optimal performance guarantees (Section 2.3.2) in a computationally efficient manner. Unlike prior work, we (i) study the consumption properties of the whole population simultaneously rather than developing individual models for each customer separately, claiming it to be a 'killer' approach that breaks the "curse of dimensionality" in online time series clustering, and (ii) provide tight performance guarantees in theory to validate our approach. Our insights are driven by the field of sociology, where collective behavior often emerges as the result of individual patterns and lifestyles. We demonstrate the efficacy of OPTIC in practice using real-world data obtained from the fully operational USC microgrid.
Index Terms
- Challenge: On Online Time Series Clustering for Demand Response: Optic - A Theory to Break the 'Curse of Dimensionality'