An implementation of cloud-based platform with R packages for spatiotemporal analysis of air pollution

Yang, Chao-Tung; Chan, Yu-Wei; Liu, Jung-Chun; Lou, Ben-Shen

doi:10.1007/s11227-017-2189-1

Published: 14 November 2017

The Journal of Supercomputing, Volume 76, pages 1416–1437 (2020)

Chao-Tung Yang¹,
Yu-Wei Chan²,
Jung-Chun Liu¹ &
Ben-Shen Lou¹

Abstract

R has recently become a popular tool for big data analysis because of its many mature packages for data analysis and visualization, including packages for analyzing air pollution. Air pollution is of increasing global concern because of its serious impact on the environment and human health. With the rapid development of the Internet of Things (IoT) and the increasing accuracy of geographical information collected by sensors, a huge amount of air pollution data is being generated. It is therefore difficult to analyze these data effectively and reliably in a single-machine environment because of the memory limits inherent in such a design. In this work, we construct a distributed computing environment based on RHadoop and SparkR so that air pollution data can be analyzed and visualized with R more reliably and effectively. We first use EdiGreen AirBox sensors to collect air pollution data in Taichung, Taiwan. We then apply the Inverse Distance Weighting (IDW) method to transform the sensor data into a density map. Finally, the experimental results show the accuracy of short-term PM2.5 predictions made with the ARIMA model, and the prediction accuracy is verified with the mean absolute percentage error (MAPE).
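
To make the two numerical steps named in the abstract concrete, the following is a minimal sketch of Inverse Distance Weighting and of MAPE. It is written in plain Python rather than R, and it is not the authors' implementation: the function names `idw` and `mape`, the distance exponent of 2, the planar Euclidean distance, and the sample readings are all assumptions chosen for illustration.

```python
import math


def idw(stations, x, y, power=2.0):
    """Estimate a value at (x, y) by Inverse Distance Weighting.

    stations: list of (sx, sy, value) tuples in planar coordinates.
    power: distance exponent; 2.0 is a common default and an assumption
    here, not a setting taken from the article.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The query point sits exactly on a station: use its reading.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Every entry of `actual` must be nonzero, since each error is
    divided by the corresponding actual value.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical PM2.5 readings at two sensors (illustrative values only).
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]

# The midpoint is equidistant from both sensors, so both get the same
# weight and the estimate is the plain mean of the two readings.
midpoint_estimate = idw(stations, 1.0, 0.0)

# Each forecast below is off by 10% of the observed value.
forecast_error = mape([20.0, 40.0], [18.0, 44.0])
```

Evaluating `idw` over a regular grid of points yields the values behind a density map; a production version would likely use geodesic distances between sensor coordinates instead of the planar distance assumed here.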

Acknowledgements

This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grants MOST 105-2634-E-029-001 and MOST 106-2621-M-029-001.

Author information

Authors and Affiliations

Department of Computer Science, Tunghai University, No. 1727, Sec. 4, Taiwan Boulevard, Xitun District, Taichung, 40704, Taiwan
Chao-Tung Yang, Jung-Chun Liu & Ben-Shen Lou
College of Computing and Informatics, Providence University, 200, Sec. 7, Taiwan Boulevard, Shalu District, Taichung, Taiwan
Yu-Wei Chan

Corresponding author

Correspondence to Yu-Wei Chan.

About this article

Cite this article

Yang, CT., Chan, YW., Liu, JC. et al. An implementation of cloud-based platform with R packages for spatiotemporal analysis of air pollution. J Supercomput 76, 1416–1437 (2020). https://doi.org/10.1007/s11227-017-2189-1

Issue Date: March 2020
DOI: https://doi.org/10.1007/s11227-017-2189-1
