Abstract
The integration of the global photovoltaic (PV) market with real-time data loggers has enabled large-scale PV data analytics pipelines for power forecasting and reliability assessment of PV fleets. The performance of such analyses, however, depends on the quality of the PV time-series data. We propose a novel Spatio-Temporal Denoising Graph Autoencoder (STD-GAE) framework to impute missing PV power data. STD-GAE exploits temporal correlation, spatial coherence, and value dependencies from domain knowledge to recover missing data. It is built on two modules. (1) To cope with sparse yet varied missing-data scenarios, STD-GAE incorporates a domain-knowledge-aware data augmentation module that creates plausible variations of missing-data patterns. This makes imputation robust across different seasons and environments. (2) STD-GAE nontrivially integrates spatiotemporal graph convolution layers with a denoising autoencoder to improve imputation accuracy at the PV fleet level. Experimental results on two PV datasets show that STD-GAE can achieve a gain of 43.14% in imputation accuracy and is less sensitive to missing rate, season, and missing-data scenario than state-of-the-art data imputation methods.
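The abstract does not say how missing-data patterns are generated, so the following is only a minimal sketch of the general denoising idea: hide values that are actually known, then train a model to reconstruct them. The function names, the two mask types (scattered points and one contiguous outage per system), and the zero fill are all assumptions made for illustration, not the STD-GAE augmentation module.

```python
import numpy as np

def mask_random(shape, rate, rng):
    """Assumed pattern: hide individual readings uniformly at random."""
    return rng.random(shape) < rate

def mask_block(shape, block_len, rng):
    """Assumed pattern: hide one contiguous outage window per PV system."""
    n_nodes, n_steps = shape
    mask = np.zeros(shape, dtype=bool)
    # Pick a start index per system so the whole window fits in the series.
    starts = rng.integers(0, n_steps - block_len + 1, size=n_nodes)
    for node, start in enumerate(starts):
        mask[node, start:start + block_len] = True
    return mask

def make_training_pair(power, mask):
    """Return (corrupted input, clean target) for a denoising model.

    Hidden entries are zero-filled here; this fill value is an assumption.
    """
    corrupted = np.where(mask, 0.0, power)
    return corrupted, power

rng = np.random.default_rng(0)
# Hypothetical fleet: 4 PV systems, 24 time steps of normalized power.
power = rng.random((4, 24))

point_mask = mask_random(power.shape, rate=0.2, rng=rng)
block_mask = mask_block(power.shape, block_len=6, rng=rng)

corrupted_points, target = make_training_pair(power, point_mask)
corrupted_blocks, _ = make_training_pair(power, block_mask)
```

Each corrupted array paired with the clean `target` forms one training example. A reconstruction loss would typically be evaluated on the hidden entries, since those are the values the model must recover.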