DOI: 10.1145/2675743.2772588
research-article

A high throughput processing engine for taxi-generated data streams

Published: 24 June 2015

Abstract

The ACM DEBS Grand Challenge 2015 focuses on real-time analytics over a high-volume geospatial data stream composed of taxi trip reports from New York City. The goal of the challenge is to provide a solution that continuously identifies the most frequent routes (query 1) and the most profitable areas (query 2) for taxis in New York City. The solution needs to process the incoming data stream in near real time to provide valid information about taxi positions to end-users in a real-world deployment. We propose a modular processing engine design which is configured to offer efficient performance with a high data throughput and low processing latency. It consists of three main components: an input processor which pre-processes data objects to detect outliers, and two independent query processors tailored to the requirements of the challenge queries. To efficiently compute query results, the query processors use algorithms customized to the distribution of the taxi-generated data stream. Our experimental evaluation shows that the system can process, on average, 350,000 input events per second in a distributed mode, while achieving an average latency of less than 1 ms for both queries. Due to their excellent performance, the proposed algorithms are well suited for efficient tracking of the large number of vehicles present in modern urban areas.
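To make the described pipeline more concrete, the following is a minimal Python sketch of an input processor that filters outliers and feeds a query-1 style processor, which reports the most frequent routes over a sliding time window. It is an illustration only and not the authors' implementation: the abstract does not specify the algorithms, so the grid size, window length, value of k, the `Trip` record, and the names `is_valid` and `FrequentRoutes` are all hypothetical. The sketch also assumes that trips arrive ordered by drop-off time.

```python
from collections import Counter, deque
from dataclasses import dataclass
from typing import Deque, List, Tuple

Cell = Tuple[int, int]     # (column, row) of a grid cell
Route = Tuple[Cell, Cell]  # (pickup cell, drop-off cell)

# Hypothetical parameters chosen for illustration, not taken from the paper.
GRID_SIZE = 300            # cells per axis
WINDOW_SECONDS = 30 * 60   # sliding-window length
TOP_K = 10                 # number of routes to report


@dataclass(frozen=True)
class Trip:
    dropoff_time: int      # seconds since some epoch
    pickup_cell: Cell
    dropoff_cell: Cell


def is_valid(trip: Trip) -> bool:
    """Input-processor step: reject trips whose cells lie outside the grid."""
    return all(
        1 <= coord <= GRID_SIZE
        for cell in (trip.pickup_cell, trip.dropoff_cell)
        for coord in cell
    )


class FrequentRoutes:
    """Counts routes over a sliding time window and reports the top k.

    Assumes process() is called with non-decreasing dropoff_time values.
    """

    def __init__(self, window: int = WINDOW_SECONDS, k: int = TOP_K) -> None:
        self.window = window
        self.k = k
        self.events: Deque[Trip] = deque()
        self.counts: Counter = Counter()

    def process(self, trip: Trip) -> List[Tuple[Route, int]]:
        # Evict trips that have fallen out of the window.
        cutoff = trip.dropoff_time - self.window
        while self.events and self.events[0].dropoff_time <= cutoff:
            old = self.events.popleft()
            route = (old.pickup_cell, old.dropoff_cell)
            self.counts[route] -= 1
            if self.counts[route] == 0:
                del self.counts[route]
        # Register the new trip and return the current top-k routes.
        self.events.append(trip)
        self.counts[(trip.pickup_cell, trip.dropoff_cell)] += 1
        return self.counts.most_common(self.k)
```

A caller would pass each incoming trip through `is_valid` first and forward only accepted trips to `FrequentRoutes.process`. Re-ranking all counts on every event is the simplest possible choice here; an engine tuned for the throughput reported above would presumably maintain the ranking incrementally.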


Cited By

  • (2020) Analytic Study of Containerizing Stateful Stream Processing as Microservice to Support Digital Twins in Fog Computing. Programming and Computing Software, 46:8 (511-525). DOI: 10.1134/S0361768820080083. Online publication date: 1-Dec-2020
  • (2019) Stateful Stream Processing for Digital Twins: Microservice-Based Kafka Stream DSL. 2019 International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON) (0804-0809). DOI: 10.1109/SIBIRCON48586.2019.8958367. Online publication date: Oct-2019
  • (2016) Activity Detection in Smart Home Environment. Procedia Computer Science, 96:C (672-681). DOI: 10.1016/j.procs.2016.08.249. Online publication date: 1-Oct-2016

    Published In

    DEBS '15: Proceedings of the 9th ACM International Conference on Distributed Event-Based Systems
    June 2015
    385 pages
    ISBN: 9781450332866
    DOI: 10.1145/2675743

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. ACM DEBS grand challenge
    2. complex event processing
    3. smart city
    4. traffic monitoring

    Qualifiers

    • Research-article

    Conference

    DEBS '15

    Acceptance Rates

    Overall Acceptance Rate 145 of 583 submissions, 25%
