research-article

Spatio-Temporal Denoising Graph Autoencoders with Data Augmentation for Photovoltaic Data Imputation

Published: 30 May 2023

Abstract

The integration of the global photovoltaic (PV) market with real-time data loggers has enabled large-scale PV data analytics pipelines for power forecasting and reliability assessment of PV fleets. Nevertheless, the performance of PV data analysis depends on the quality of the PV time-series data. We propose a novel Spatio-Temporal Denoising Graph Autoencoder (STD-GAE) framework to impute missing PV power data. STD-GAE exploits temporal correlation, spatial coherence, and value dependencies from domain knowledge to recover missing data. It is empowered by two modules. (1) To cope with sparse yet varied missing-data scenarios, STD-GAE incorporates a domain-knowledge-aware data augmentation module that creates plausible variations of missing-data patterns. This makes STD-GAE's imputation robust across different seasons and environments. (2) STD-GAE nontrivially integrates spatiotemporal graph convolution layers with a denoising autoencoder to improve imputation accuracy at the PV fleet level. Experimental results on two PV datasets show that STD-GAE can achieve a gain of 43.14% in imputation accuracy and remains less sensitive to missing rates, seasons, and missing-data scenarios than state-of-the-art data imputation methods.
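As a rough illustration of two ideas named in the abstract, the sketch below hides readings at random (the kind of corruption a denoising setup trains against) and then fills each hidden reading from neighbouring PV systems at the same time step (the fixed-weight aggregation that a graph convolution generalizes with learned weights). It is not the STD-GAE implementation: the function names `corrupt`, `neighbor_fill`, and `masked_mae`, the toy readings, and the fully connected adjacency matrix are all assumptions made for this example, and the abstract does not specify how the paper builds its graph or its augmentation rules.

```python
import random


def corrupt(series, missing_rate, rng):
    """Hide a random fraction of readings (hypothetical helper).

    Returns (corrupted, mask); mask[i][t] is True where the value of
    system i at time t was hidden and replaced by the placeholder 0.0.
    """
    corrupted, mask = [], []
    for row in series:
        hide = [rng.random() < missing_rate for _ in row]
        corrupted.append([0.0 if h else v for v, h in zip(row, hide)])
        mask.append(hide)
    return corrupted, mask


def neighbor_fill(corrupted, mask, adjacency):
    """Fill each hidden reading with the weighted mean of the observed
    readings of neighbouring systems at the same time step.

    If no neighbour has an observed value, the placeholder 0.0 is kept.
    """
    n = len(corrupted)
    filled = [row[:] for row in corrupted]
    for i in range(n):
        for t in range(len(corrupted[i])):
            if not mask[i][t]:
                continue
            num = den = 0.0
            for j in range(n):
                w = adjacency[i][j]
                if j != i and w > 0 and not mask[j][t]:
                    num += w * corrupted[j][t]
                    den += w
            if den > 0:
                filled[i][t] = num / den
    return filled


def masked_mae(truth, estimate, mask):
    """Mean absolute error computed over the hidden entries only."""
    errors = [
        abs(truth[i][t] - estimate[i][t])
        for i in range(len(truth))
        for t in range(len(truth[i]))
        if mask[i][t]
    ]
    return sum(errors) / len(errors) if errors else 0.0


if __name__ == "__main__":
    # Toy power readings for three systems over four time steps (made up).
    truth = [
        [0.0, 2.0, 3.0, 1.0],
        [0.0, 2.2, 3.1, 0.9],
        [0.0, 1.8, 2.9, 1.1],
    ]
    # Assumed graph: every system is an equally weighted neighbour.
    adjacency = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    corrupted, mask = corrupt(truth, 0.3, random.Random(0))
    filled = neighbor_fill(corrupted, mask, adjacency)
    print("MAE on hidden entries:", masked_mae(truth, filled, mask))
```

Scoring only the hidden entries mirrors how imputation accuracy is usually reported: the method is judged on values it could not see, not on values it merely copied through.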




    • Published in

      Proceedings of the ACM on Management of Data  Volume 1, Issue 1
      PACMMOD
      May 2023
      2807 pages
      EISSN:2836-6573
      DOI:10.1145/3603164

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States



