Abstract
Air pollution has a severe impact on human health, and people have started to pay attention to how monitoring systems can record data in real time for analysis. To keep monitoring and analysis running smoothly, historical data must be managed separately from incoming data: historical data is used for analysis, while real-time incoming data is used to visualize current conditions. To achieve this objective, we collect real-time data from an environmental protection open data resource. However, the data may grow quickly and become very large, and relational databases were not designed to process such large amounts of data. We therefore require a database technology that can handle massive data volumes, namely Not Only SQL (NoSQL). This raises an important question: how to dump the data to NoSQL without changing the relational database (RDB) system. Accordingly, this paper proposes an air pollution monitoring system that combines a Hadoop cluster to dump data from the RDB to NoSQL and to back up data. In this way, the system not only reduces the load on the RDB but also keeps the service available. Dumping data to NoSQL must proceed without affecting the real-time monitoring of the air pollution monitoring system. In this part, we focus on uninterrupted web service, which can be up to 60%; by optimizing it with the dump method and the backup data service, MapReduce can restart the service and distribute the database when the RDB is impaired. In addition, by comparing three different conversion modes, we can obtain the best data conversion for our system. Finally, the air pollution monitoring service provides a variety of air pollution factors as an essential basis for environmental detection and analysis, helping people live in a more comfortable environment.
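As an illustration only, the separation of historical from incoming data and the dump of historical rows into a NoSQL-style layout might be sketched as follows. This is a minimal sketch and not the converter described in the paper: the field names (`station`, `recorded_at`, `pm25`, `o3`), the one-day window, the row-key format, and the column family name `aq` are all assumptions made for the example, and no Hadoop or HBase API is called.

```python
from datetime import datetime, timedelta


def split_by_age(rows, now, window):
    """Partition rows into (historical, recent) by their 'recorded_at' timestamp."""
    cutoff = now - window
    historical = [r for r in rows if r["recorded_at"] < cutoff]
    recent = [r for r in rows if r["recorded_at"] >= cutoff]
    return historical, recent


def to_wide_column(row, family="aq"):
    """Map one relational row to a (row_key, cells) pair in a wide-column layout.

    The row key (station id plus timestamp) and the 'family:qualifier' cell
    names are hypothetical choices for this sketch.
    """
    row_key = f"{row['station']}#{row['recorded_at']:%Y%m%d%H%M}"
    cells = {
        f"{family}:{name}": str(value)
        for name, value in row.items()
        if name not in ("station", "recorded_at")
    }
    return row_key, cells


# Hypothetical sample rows standing in for records read from the RDB.
now = datetime(2018, 1, 16, 12, 0)
rows = [
    {"station": "S01", "recorded_at": datetime(2018, 1, 10, 8, 0), "pm25": 35, "o3": 21},
    {"station": "S01", "recorded_at": datetime(2018, 1, 16, 11, 0), "pm25": 42, "o3": 18},
]

# Only rows older than the window are converted; recent rows stay in the RDB
# so that real-time monitoring is not affected by the dump.
historical, recent = split_by_age(rows, now, timedelta(days=1))
dumped = dict(to_wide_column(r) for r in historical)
```

In this sketch the relational table is never modified; the conversion only reads rows and produces a separate key/value structure, which mirrors the stated goal of dumping to NoSQL without changing the RDB system.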
Acknowledgements
This work was sponsored by the Ministry of Science and Technology (MOST), Taiwan, under Grant Nos. 107-2221-E-029-008 and 108-2119-M-029-001-A.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by Mu-Yen Chen.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Yang, CT., Chen, ST., Liu, JC. et al. On construction of the air pollution monitoring service with a hybrid database converter. Soft Comput 24, 7955–7975 (2020). https://doi.org/10.1007/s00500-019-04079-z
DOI: https://doi.org/10.1007/s00500-019-04079-z