Abstract
Pervasive IoT devices generate a huge amount of sensing data. To extract meaningful information from such big data, pre-processing is essential: many outliers must be removed because the data deteriorate over time. In this paper, we investigate and propose big data pre-processing methods. To evaluate the methods for accurate analysis, we use a collection of digital tachograph (DTG) data, obtained from 6,000 vehicles over a year. We study five kinds of pre-processing methods: filtering ranges, excluding meaningless values, comparing filters from variables, applying statistical techniques, and finding driving patterns. In addition, we developed MapReduce programs on the Hadoop ecosystem and deployed a big data system to perform the pre-processing analysis. Across the pre-processing steps, we confirmed that up to 27.09% of the DTG sensing data contain errors. We also confirmed that the methods detect outliers that are difficult to detect with simple range-error pre-processing alone.
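As a rough illustration of how range filtering and a cross-variable check could be phrased as map and reduce steps, the following is a minimal sketch in plain Python rather than the Hadoop MapReduce programs used in the paper. The record layout, the field names, the thresholds in `VALID_RANGES`, and the rule "speed above zero with zero RPM" are all assumptions made for this example; the abstract does not specify them.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical CSV layout and thresholds, chosen only for illustration.
FIELDS = ("vehicle_id", "speed_kmh", "rpm", "lon", "lat")
VALID_RANGES = {
    "speed_kmh": (0.0, 200.0),
    "rpm": (0.0, 8000.0),
    "lon": (124.0, 132.0),
    "lat": (33.0, 39.0),
}


def parse(line: str) -> Dict[str, object]:
    """Split one CSV line into a record; raise ValueError if it is malformed."""
    parts = line.strip().split(",")
    if len(parts) != len(FIELDS):
        raise ValueError("unexpected number of fields")
    record: Dict[str, object] = {"vehicle_id": parts[0]}
    for name, raw in zip(FIELDS[1:], parts[1:]):
        record[name] = float(raw)  # raises ValueError on non-numeric input
    return record


def map_record(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit exactly one (label, 1) pair per input line."""
    try:
        record = parse(line)
    except ValueError:
        yield ("malformed", 1)
        return
    # Range filter: every numeric field must lie inside its assumed range.
    for name, (low, high) in VALID_RANGES.items():
        if not low <= record[name] <= high:
            yield ("range_error", 1)
            return
    # Cross-variable check (assumed rule): a moving vehicle reporting zero RPM
    # passes the range filter but is still internally inconsistent.
    if record["speed_kmh"] > 0 and record["rpm"] == 0:
        yield ("inconsistent", 1)
        return
    yield ("ok", 1)


def reduce_counts(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reduce step: sum the counts emitted for each label."""
    totals: Counter = Counter()
    for label, count in pairs:
        totals[label] += count
    return dict(totals)


def error_ratio(totals: Dict[str, int]) -> float:
    """Fraction of records whose label is anything other than 'ok'."""
    total = sum(totals.values())
    return 0.0 if total == 0 else 1.0 - totals.get("ok", 0) / total


sample_lines = [
    "v1,62.0,1800,127.0,37.5",   # passes every check
    "v1,310.0,1800,127.0,37.5",  # speed outside the assumed range
    "v2,40.0,0,127.1,37.4",      # in range, but moving with zero RPM
    "v2,bad,1500,127.1,37.4",    # non-numeric speed
]
sample_totals = reduce_counts(
    pair for line in sample_lines for pair in map_record(line)
)
print(sample_totals, error_ratio(sample_totals))
```

The third sample record shows why a range filter alone may miss some outliers: each value is individually plausible, and only the comparison between variables flags it. In a Hadoop deployment the same two functions would correspond to the mapper and reducer, with the framework handling the grouping by label.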
E. Choi—This research was supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the IT/SW Creative research program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2013-H0502-13-1071).
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Cho, W., Choi, E. (2017). Effective Pre-processing Methods with DTG Big Data by Using MapReduce Techniques. In: Park, J., Pan, Y., Yi, G., Loia, V. (eds) Advances in Computer Science and Ubiquitous Computing. UCAWSN CUTE CSA 2016 2016 2016. Lecture Notes in Electrical Engineering, vol 421. Springer, Singapore. https://doi.org/10.1007/978-981-10-3023-9_61
DOI: https://doi.org/10.1007/978-981-10-3023-9_61
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3022-2
Online ISBN: 978-981-10-3023-9
eBook Packages: Engineering, Engineering (R0)