Abstract
Sparse grid imputation (SGI) is a challenging problem: the goal is to infer values for an entire grid from a limited number of cells with known values. Traditionally, the problem is solved using regression methods such as KNN and kriging. In the real world, however, there is often extra information, usually imprecise, that can aid inference and yield better performance. In the SGI problem, in addition to the limited number of fixed grid cells with precise target-domain values, there are contextual data and imprecise observations over the whole grid. To solve this problem, we propose a distribution estimation theory for the whole grid and realize the theory via a composition architecture of the Target-Embedding and the Contextual CycleGAN, trained with contextual information and imprecise observations. Contextual CycleGAN is structured as two generator–discriminator pairs and uses different types of contextual loss to guide the training. We consider the real-world problem of fine-grained PM2.5 inference with realistic settings: a few (fewer than 1%) grid cells have precise PM2.5 data, while all grid cells have contextual information concerning weather and imprecise observations from satellites and microsensors. The task is to infer reasonable values for all grid cells. As there is no ground truth for empty cells, out-of-sample mean squared error and Jensen–Shannon divergence measurements are used in the empirical study. The results show that Contextual CycleGAN supports the proposed theory and outperforms the methods used for comparison.
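The two evaluation measures named above can be illustrated with a minimal sketch. The abstract does not specify how the held-out cells are chosen, how values are binned, or which logarithm base is used, so the function names, the shared-range histogram with 20 bins, and the base-2 logarithm below are all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np


def out_of_sample_mse(predicted, observed):
    """Mean squared error over held-out cells (cells not used for training)."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean((predicted - observed) ** 2))


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Inputs may be unnormalized counts; they are normalized to sum to 1.
    With base-2 logarithms the result lies in [0, 1].
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        # Terms with a == 0 contribute 0; b > 0 wherever a > 0 because b is the mixture.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return float(0.5 * kl(p, m) + 0.5 * kl(q, m))


def histogram_js(values_a, values_b, bins=20):
    """JS divergence between histograms of two samples on a shared value range.

    The bin count is an assumed illustration parameter.
    """
    lo = min(np.min(values_a), np.min(values_b))
    hi = max(np.max(values_a), np.max(values_b))
    hist_a, _ = np.histogram(values_a, bins=bins, range=(lo, hi))
    hist_b, _ = np.histogram(values_b, bins=bins, range=(lo, hi))
    return js_divergence(hist_a, hist_b)
```

In a setting like the one described, `out_of_sample_mse` would compare inferred values with precise measurements at cells withheld from training, while `histogram_js` would compare the distribution of inferred grid values against a reference distribution where no per-cell ground truth exists.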
Index Terms
- Sparse Grid Imputation Using Unpaired Imprecise Auxiliary Data: Theory and Application to PM2.5 Estimation