Abstract
Sensors are now widely used in monitoring applications. Because of random environmental effects and sensing failures, the collected sensor data is typically noisy, so it must be cleansed before it is used to answer queries or for data analysis. Popular data cleansing approaches, such as classification, prediction and moving average, are not suited to embedded sensor devices because of their limited storage and processing capabilities. In this paper, we propose a sensor data cleansing approach using relational-based technologies, including constraints, triggers and granularity-based data aggregation. The proposed approach is simple yet effective at cleansing different types of dirty data, including delayed, incomplete, incorrect, duplicate and missing data. We evaluate the proposed approach to verify its efficiency and effectiveness.
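As a rough illustration of the kind of relational mechanisms the abstract names, the following minimal sketch uses Python's standard sqlite3 module. It is not the authors' implementation: the table name `reading`, its columns, the valid range of -40 to 85, the rule that a reading older than the newest stored one counts as delayed, and the hourly granularity are all assumptions made for this example. Constraints reject incomplete, incorrect and duplicate rows, a trigger rejects delayed rows, and a grouped query aggregates the surviving rows to a coarser time granularity. Handling of missing data is not covered here.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a hypothetical cleansing schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE reading (
            sensor_id INTEGER NOT NULL,
            ts        INTEGER NOT NULL,   -- seconds since epoch (assumed unit)
            value     REAL    NOT NULL    -- NOT NULL rejects incomplete rows
                      CHECK (value BETWEEN -40 AND 85),  -- assumed valid range
            UNIQUE (sensor_id, ts)        -- rejects duplicate rows
        );

        -- Assumed policy: a row is "delayed" if it is older than the newest
        -- row already stored for the same sensor.
        CREATE TRIGGER reject_delayed
        BEFORE INSERT ON reading
        WHEN NEW.ts < (SELECT MAX(ts) FROM reading
                       WHERE sensor_id = NEW.sensor_id)
        BEGIN
            SELECT RAISE(ABORT, 'delayed reading');
        END;
        """
    )
    return conn


def insert_clean(conn, rows):
    """Insert rows one by one; return (accepted, rejected) counts."""
    accepted = rejected = 0
    for row in rows:
        try:
            with conn:  # commit on success, roll back on error
                conn.execute("INSERT INTO reading VALUES (?, ?, ?)", row)
            accepted += 1
        except sqlite3.DatabaseError:
            # Raised when a constraint or the trigger rejects the row.
            rejected += 1
    return accepted, rejected


def hourly_average(conn):
    """Aggregate accepted readings to hourly granularity (integer division)."""
    return conn.execute(
        "SELECT sensor_id, ts / 3600, AVG(value) FROM reading "
        "GROUP BY sensor_id, ts / 3600 ORDER BY sensor_id, ts / 3600"
    ).fetchall()


if __name__ == "__main__":
    db = build_db()
    sample = [
        (1, 0, 20.0),      # clean
        (1, 600, 22.0),    # clean
        (1, 600, 22.0),    # duplicate
        (1, 1200, 500.0),  # incorrect (out of range)
        (1, 300, 21.0),    # delayed
        (1, 1800, None),   # incomplete
        (1, 3600, 24.0),   # clean
    ]
    print(insert_clean(db, sample))
    print(hourly_average(db))
```

Pushing the checks into the schema keeps the application code small, which is one plausible reason such an approach suits devices with limited storage and processing capabilities; the actual rules and thresholds would depend on the sensors in use.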
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Iftikhar, N., Liu, X., Nordbjerg, F.E. (2015). Relational-Based Sensor Data Cleansing. In: Morzy, T., Valduriez, P., Bellatreche, L. (eds) New Trends in Databases and Information Systems. ADBIS 2015. Communications in Computer and Information Science, vol 539. Springer, Cham. https://doi.org/10.1007/978-3-319-23201-0_13
DOI: https://doi.org/10.1007/978-3-319-23201-0_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23200-3
Online ISBN: 978-3-319-23201-0
eBook Packages: Computer Science, Computer Science (R0)