ABSTRACT
Web-based sensor data, provided by organizations such as the National Oceanic and Atmospheric Administration, provide a valuable service to the public and scientific communities. However, these data are largely locked in a variety of presentation formats and are computationally inaccessible. In addition, although the data have a spatiotemporal context, both the spatial and temporal components are usually only implicitly defined. Storing the data in a consistent database can partially resolve this problem, but a data-driven programming model coupled with MapReduce capabilities is a more flexible and extensible solution. Our implementation of this programming model allows users to parse a wide array of sensor data and express complex computation in a simple, scalable manner. In addition, our framework uses a simple key-value storage mechanism and provides convenient geospatial output mechanisms. In this paper, we discuss some early results of our programming model within the context of our current Java-oriented implementation, and demonstrate how the system can be used to create many different applications. We also evaluate the system with respect to memory usage and scalability.
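To make the idea of key-value sensor records processed in a MapReduce style concrete, the following is a minimal single-process sketch in Java. It is not the framework's API: the class SensorMapReduceSketch, the Reading type, the 1-degree grid key, and the sample values are all hypothetical names and data chosen for illustration, and an in-memory grouping step stands in for a distributed shuffle. It assumes each observation has already been parsed so that its location and timestamp are explicit fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the framework described in the abstract.
class SensorMapReduceSketch {

    /** One parsed observation with its spatial and temporal context made explicit. */
    static final class Reading {
        final String stationId;
        final double lat;
        final double lon;
        final long epochSeconds;
        final double value;

        Reading(String stationId, double lat, double lon, long epochSeconds, double value) {
            this.stationId = stationId;
            this.lat = lat;
            this.lon = lon;
            this.epochSeconds = epochSeconds;
            this.value = value;
        }
    }

    /** Map step: derive a key from the spatial context (an assumed 1-degree grid cell). */
    static String gridKey(Reading r) {
        return (long) Math.floor(r.lat) + "," + (long) Math.floor(r.lon);
    }

    /** Reduce step: average all values that share a key. */
    static double mean(List<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN);
    }

    /** Runs map, an in-memory grouping (standing in for the shuffle), and reduce. */
    static Map<String, Double> meanByCell(List<Reading> readings) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (Reading r : readings) {
            grouped.computeIfAbsent(gridKey(r), k -> new ArrayList<>()).add(r.value);
        }
        Map<String, Double> result = new TreeMap<>();
        grouped.forEach((key, values) -> result.put(key, mean(values)));
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample readings; stations A and B fall in the same grid cell.
        List<Reading> readings = List.of(
            new Reading("A", 35.9, -84.3, 1000L, 20.0),
            new Reading("B", 35.2, -84.8, 1000L, 22.0),
            new Reading("C", 40.7, -74.0, 1000L, 15.0));
        System.out.println(meanByCell(readings));
    }
}
```

Keying on a value derived from latitude and longitude is one way the implicit spatial context could become an explicit grouping key; a time bucket computed from epochSeconds could be appended to the key in the same manner to group temporally.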
Index Terms
- A programming framework for integrating web-based spatiotemporal sensor data with MapReduce capabilities