An integrated architecture for future studies in data processing for smart cities

https://doi.org/10.1016/j.micpro.2017.03.004

Abstract

Data processing for Smart Cities is becoming more challenging because it involves several distinct handling steps: data is collected from heterogeneous sources, processed (sometimes in real time), and then delivered to the high-level services and applications used in Smart Cities. Applications for intelligent transportation systems, crowd management, water resources management, and noise and air pollution management require different data processing techniques. This paper proposes an architecture for data processing in Smart Cities. The architecture follows the flow of data from the source to the end user. We describe seven steps of data processing: collection of data from heterogeneous sources, data normalization, data brokering, data storage, data analysis, data visualization and decision support systems. We consider two case studies, one on crowd management in smart cities and one on Intelligent Transportation Systems (ITS), and we provide experimental highlights.

Introduction

More and more applications today use, generate and handle very large volumes of data. This is particularly true for Smart City applications, which attract rapidly increasing interest from governments, companies, citizens, developers and scientists. They cover a large spectrum of needs in public safety, water and energy management, smart buildings, government and agency administration, social programs, transportation, health and education. They are fed with huge amounts of input data, in various formats, from a continuously increasing number of sources (sensors; governmental, regional and municipal sources; citizens; public open data sources; etc.). They are described by complex workflows and in many cases require real-time processing capabilities to support decision making.

The large volume of data coming from a variety of sources and in various formats, with different storage, transformation, delivery or archiving requirements, complicates the task of context data management. At the same time, fast responses are needed for real-time applications. Despite the potential improvements of the Smart City infrastructure, the number of concurrent applications needing quick data access will remain very high. With the emergence of the recent cloud infrastructures, achieving highly scalable data management in such contexts is a critical challenge, as the overall application performance is highly dependent on the properties of the data management service.

Extracting valuable information from raw data is especially difficult given how fast data volumes grow from year to year and the fact that 80% of data is unstructured. In addition, data sources are heterogeneous (various sensors, users with different profiles, etc.) and are located in different situations or contexts. This is why the Smart City infrastructure has to run reliably and permanently, providing context as a “public utility” to different services. Context-aware applications exploit the context to adapt the timing, quality and functionality of their services accordingly. The value of these applications and their supporting infrastructure lies in the fact that end users always operate in a context: their role, intentions, locations and working environment constantly change.

As the scale, complexity and dynamism of distributed systems grow dramatically, their configuration and data management have started to become a limiting factor in their development. This is particularly true when the Cloud is used both for data storage and for data processing, where managing hundreds or thousands of nodes while delivering highly reliable services entails intrinsic complexity. Furthermore, Cloud computing introduces another challenge that affects resource management decisions. In these contexts, self-management mechanisms have to take into account the cost-effectiveness of the decisions they adopt.

Considering all of these aspects, this paper proposes an architecture for Big Data processing in Smart Cities. The architecture follows the flow of data from the source to the end user. We describe seven steps of data processing: collection of data from heterogeneous sources, data normalization, data brokering, data storage, data analysis, data visualization and decision support systems. We describe two case studies, one on crowd management in smart cities and one on Intelligent Transportation Systems (ITS).
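To make the seven named steps concrete, the following is a minimal Python sketch that chains one function per step. It is only an illustration under stated assumptions: the `Reading` type, the function names, the two noise-sensor records, the single "noise" topic and the 65 dB threshold are all hypothetical and do not come from the paper, and an in-memory dictionary stands in for real storage.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Reading:
    """Hypothetical normalized record; field names are illustrative only."""
    source: str
    value: float
    unit: str


def collect() -> List[dict]:
    """Step 1: collect raw records from heterogeneous sources (hard-coded here)."""
    return [
        {"src": "noise-sensor-1", "db": 62.0},
        {"src": "noise-sensor-2", "level": "71.5 dB"},
    ]


def normalize(raw: List[dict]) -> List[Reading]:
    """Step 2: map each source-specific format onto one common schema."""
    readings = []
    for record in raw:
        if "db" in record:
            value = float(record["db"])
        else:
            value = float(record["level"].split()[0])
        readings.append(Reading(record["src"], value, "dB"))
    return readings


def broker(readings: List[Reading]) -> Dict[str, List[Reading]]:
    """Step 3: route readings to topics (a single assumed topic, 'noise')."""
    topics: Dict[str, List[Reading]] = {}
    for reading in readings:
        topics.setdefault("noise", []).append(reading)
    return topics


def store(topics: Dict[str, List[Reading]]) -> Dict[str, List[Reading]]:
    """Step 4: persist the data; a plain dict stands in for a database."""
    return dict(topics)


def analyze(db: Dict[str, List[Reading]]) -> Dict[str, float]:
    """Step 5: compute a simple statistic, the mean value per topic."""
    return {t: sum(r.value for r in rs) / len(rs) for t, rs in db.items()}


def visualize(stats: Dict[str, float]) -> List[str]:
    """Step 6: render the statistics, here as plain text lines."""
    return [f"{topic}: {mean:.2f} dB" for topic, mean in stats.items()]


def decide(stats: Dict[str, float], threshold: float = 65.0) -> Dict[str, str]:
    """Step 7: decision support; the threshold is an arbitrary assumption."""
    return {t: "alert" if mean > threshold else "ok" for t, mean in stats.items()}


def run_pipeline() -> Tuple[List[str], Dict[str, str]]:
    """Run the data through all seven steps, from source to end user."""
    stats = analyze(store(broker(normalize(collect()))))
    return visualize(stats), decide(stats)
```

Calling `run_pipeline()` returns the rendered text lines together with a per-topic decision. Keeping each step as a separate function mirrors the source-to-end-user flow, so any single stage could be swapped for a real component without touching the others.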

The paper is structured as follows. Section 2 presents the related work on crowd data in smart cities and on ITS. The proposed architecture is presented in Section 3. Two use cases are described in Section 4. The experimental results obtained for these use cases are presented in Section 5. The paper ends with conclusions and future work in Section 6.

Related work

Smart Cities [1] represent an important goal which can dramatically improve the life of citizens. A large body of research aims to bring us closer to this goal. The idea of a smart city is in line with other research movements such as the Internet of Things [2], [3], [4] and Big Data [5]. The New York Times actually declared this period the “Age of Big Data”.

In order to enable Smart Cities technologies such as Internet of

Data flow based architecture

We propose an architecture whose steps follow the flow of data from the source to the end user. Data is the input of our system; it is used to create value for users, usually in the form of information or automatic actions.

We created a seven-step architecture to accomplish our goal of turning data into value (see Fig. 1): first we need to aggregate the data sources, then we need to perform data normalization, but before doing that we need to

Intelligent transportation systems

Large cities present many problems with their systems, but it is the transportation system that sustains the dynamics of this environment. Using classical traffic systems, it can no longer cope with the enormous number of cars driven on its streets. Problems that affect us daily in a city, such as congestion, accidents, high fuel consumption and pollution, can have as root causes the poor use of the current infrastructure or too few streets for the current traffic flow.

Trying to solve the

Experiments

The first use case considers Intelligent Transportation Systems. The application model is as follows. The cabs are viewed as clients, which generate data on a sporadic schedule and in a variety of sizes. The car's GPS position is recorded every 15 s and, by default, the cluster client on the car sends the last 4 known positions every minute. However, if a car experiences a loss of connectivity, it may exhibit a pause in generating jobs and submit a larger data task when connectivity is
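The client behaviour described above can be illustrated with a small simulation. This is a sketch under assumptions, not the authors' implementation: the `CabClient` class, its method names and the synthetic coordinates are hypothetical, and the sketch assumes that positions buffered while offline are sent together in the first reporting slot in which the car is connected again. Only the 15 s sampling period and the one-minute reporting period come from the text.

```python
from typing import List, Optional, Tuple

Position = Tuple[float, float]  # (latitude, longitude)

SAMPLE_PERIOD_S = 15  # from the text: GPS position recorded every 15 s
SEND_PERIOD_S = 60    # from the text: positions are sent every minute


class CabClient:
    """Hypothetical in-car client that buffers positions until it can send."""

    def __init__(self) -> None:
        self.buffer: List[Position] = []

    def record(self, position: Position) -> None:
        """Store one GPS sample locally."""
        self.buffer.append(position)

    def try_send(self, connected: bool) -> Optional[List[Position]]:
        """Return the pending batch and clear the buffer, or None if offline."""
        if not connected or not self.buffer:
            return None
        batch, self.buffer = self.buffer, []
        return batch


def simulate(minutes_connected: List[bool]) -> List[int]:
    """Return the batch size submitted in each minute (0 while offline)."""
    cab = CabClient()
    samples_per_minute = SEND_PERIOD_S // SAMPLE_PERIOD_S  # 4 samples
    sizes = []
    for minute, connected in enumerate(minutes_connected):
        for k in range(samples_per_minute):
            # Synthetic coordinates, used only to have something to buffer.
            cab.record((44.43 + 0.001 * minute, 26.10 + 0.001 * k))
        batch = cab.try_send(connected)
        sizes.append(len(batch) if batch else 0)
    return sizes
```

With a connectivity trace such as `simulate([True, False, False, True])`, the connected first minute yields the default batch of 4 positions, the two offline minutes yield nothing, and the last minute yields a single larger batch that carries everything buffered in the meantime. This is the sporadic, variable-size workload that the application model describes.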

Conclusions and future work

In this paper we proposed a generic architecture for handling data flows in Smart Cities. We described the functions and components of each step and identified specific technologies. We then presented two use cases, on crowd management and on intelligent transportation systems, and highlighted experimental results from applications developed using the model proposed in our architecture. As further work we will analyze self-adaptive optimization methods used in this architecture, focusing on data

Acknowledgment

The research presented in this paper is supported by projects: DataWay: Real-time Data Processing Platform for Smart Cities: Making sense of Big Data - PN-II-RU-TE-2014-4-2731; MobiWay: Mobility Beyond Individualism: an Integrated Platform for Intelligent Transportation Systems of Tomorrow - PN-II-PT-PCCA-2013-4-0321; CyberWater grant of the Romanian National Authority for Scientific Research, CNDI-UEFISCDI, project number 47/2012; clueFarm: Information system based on cloud services accessible

References (38)

  • A. Gandomi et al. Beyond the hype: big data concepts, methods, and analytics. Int. J. Inf. Manag. (2015)
  • M. Zhou et al. Scanme: location tracking system in large-scale campus wi-fi environment using unlabeled mobility map. Expert Syst. Appl. (2014)
  • V. Albino et al. Smart cities: Definitions, dimensions, performance, and initiatives. J. Urban Technol. (2015)
  • A. Whitmore et al. The internet of things–a survey of topics and trends. Inf. Syst. Front. (2015)
  • J.M. Batalla et al. Conception of id layer performance at the network level for internet of things. Pers. Ubiquit. Comput. (2014)
  • J.M. Batalla et al. Id-based service-oriented communications for unified access to iot. Comput. Electr. Eng. (2016)
  • C.S. Raghavendra et al. Wireless Sensor Networks (2006)
  • H. Ma et al. Opportunities in mobile crowd sensing. Commun. Mag. IEEE (2014)
  • G.P. Hancke et al. The role of advanced sensing in smart cities. Sensors (2012)
  • K. Su et al. Smart city and the applications. Electronics, Communications and Control (ICECC), 2011 International Conference on (2011)
  • G.K. Still. Crowd Dynamics (2000)
  • H. Du et al. Group mobility classification and structure recognition using mobile devices. 2016 IEEE International Conference on Pervasive Computing and Communications, PerCom 2016 (2016)
  • D. Kumar et al. Understanding urban mobility via taxi trip clustering. 2016 17th IEEE International Conference on Mobile Data Management (MDM) (2016)
  • M.S. Grewal et al. Global Positioning Systems, Inertial Navigation, and Integration (2007)
  • Y. Chon et al. Sensing WiFi packets in the air. Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing - UbiComp ’14 Adjunct (2014)
  • M. Dash et al. Visualize people’s mobility - both individually and collectively - using mobile phone cellular data. Proceedings - IEEE International Conference on Mobile Data Management, 2016-July (2016)
  • A.J. Ruiz-Ruiz et al. Analysis methods for extracting knowledge from large-scale wifi monitoring to inform building facility planning. Pervasive Computing and Communications (PerCom), 2014 IEEE International Conference on (2014)
  • L. Vu et al. Joint bluetooth/wifi scanning framework for characterizing and leveraging people movement in university campus. Proceedings of the 13th ACM International Conference on Modeling, Analysis, and Simulation of Wireless and Mobile Systems (2010)
  • B. Bonne et al. Wifipi: Involuntary tracking of visitors at mass events. World of Wireless, Mobile and Multimedia Networks (WoWMoM), 2013 IEEE 14th International Symposium and Workshops on a (2013)
    Cristian Chilipirea is currently a Ph.D. student in computer science at University Politehnica of Bucharest, Romania. He received his M.S. degree in computer science at University Politehnica of Bucharest, Romania and an M.S. degree in science from Vrije Universiteit, Amsterdam, Netherlands, in 2014. He received his B.S. degree in computer science in 2012 at the same University. He has held the position of Assistant Professor at University Politehnica of Bucharest, Romania, since 2014. His research interests include distributed network systems, delay tolerant networks, crowd sensing as well as cloud computing and Internet of Things.

    Andreea-Cristina Petre is currently a Ph.D. student at University Politehnica of Bucharest, Romania. She received her M.S. degree in computer science at University Politehnica of Bucharest, Romania, in 2014. She received her B.S. degree in computer science in 2012, at the same University. In 2014 she studied for a year at Vrije Universiteit of Amsterdam, Netherlands by means of an Erasmus study grant. Her areas of interest are crowd sensing, distributed systems, multi-agent systems, natural language processing and operating systems.

    Loredana-Marsilia Groza, a graduate engineer in computer science, is a Ph.D. student at University Politehnica of Bucharest, Romania, Faculty of Automatic Control and Computers, Computer Science Department. She is an active member of the Distributed Systems Laboratory, with a research background in Big Data processing, data analytics, multi-criteria optimization, heterogeneous distributed computing and smart cities. She also works on the performance analysis of communication protocols.

    Ciprian Dobre received his Ph.D. in computer science at the University Politehnica of Bucharest, Romania in 2008. He received his M.S. in computer science in 2004 and the engineering degree in computer science in 2003, at the same University. His main research interests are Grid Computing, Monitoring and Control of Distributed Systems, Modeling and Simulation, Advanced Networking Architectures, Parallel and Distributed Algorithms. He is a member of the RoGrid consortium and is involved in a number of national and international projects. His research activities were recognized with the Innovations in Networking Award for Experimental Applications in 2008 by the Corporation for Education Network Initiatives (CENIC).

    Florin Pop, Ph.D., is a professor in the Computer Science Department and an active member of the Distributed Systems Laboratory at University Politehnica of Bucharest. His general research interests are: Large-Scale Distributed Systems (design and performance), Grid Computing and Cloud Computing, Peer-to-Peer Systems, Big Data Management, Data Aggregation, Information Retrieval and Ranking Techniques, Bio-Inspired Optimization Methods. He is a member of the RoGrid consortium and is involved in a number of national and international projects. His work on optimization based on bio-inspired techniques received three Best Paper Awards, an IBM Assistantship Award and an IBM Faculty Award. He is a member of the IEEE and ACM professional organizations.