An integrated architecture for future studies in data processing for smart cities
Introduction
More and more applications today use, generate and handle very large volumes of data. This is particularly true for Smart City applications, which attract rapidly increasing interest from governments, companies, citizens, developers and scientists. They cover a large spectrum of needs in public safety, water and energy management, smart buildings, government and agency administration, social programs, transportation, health and education. They are fed with huge amounts of input data, in various formats, from a continuously increasing number of sources (sensors, governmental, regional and municipal sources, citizens, public open data sources, etc.). They are described by complex workflows and in many cases require real-time processing capabilities that support decision making.
The large volume of data coming from a variety of sources and in various formats, with different storage, transformation, delivery or archiving requirements, complicates the task of context data management. At the same time, fast responses are needed for real-time applications. Despite potential improvements to the Smart City infrastructure, the number of concurrent applications needing quick data access will remain very high. With the emergence of cloud infrastructures, achieving highly scalable data management in such contexts is a critical challenge, as the overall application performance is highly dependent on the properties of the data management service.
Extracting valuable information from raw data is especially difficult considering how quickly data volumes grow from year to year and the fact that 80% of data is unstructured. In addition, data sources are heterogeneous (various sensors, users with different profiles, etc.) and are located in different situations or contexts. This is why the Smart City infrastructure must run reliably and permanently to provide context as a “public utility” to different services. Context-aware applications exploit the context to adapt the timing, quality and functionality of their services accordingly. The value of these applications and their supporting infrastructure lies in the fact that end users always operate in a context: their role, intentions, locations and working environment constantly change.
As the scale, complexity and dynamism of distributed systems grow dramatically, their configuration and data management have started to become a limiting factor in their development. This is particularly true when the Cloud is used for both data storage and data processing, where the task of managing hundreds or thousands of nodes while delivering highly reliable services entails an intrinsic complexity. Furthermore, Cloud computing introduces another challenge that affects resource management decisions. In these contexts, self-management mechanisms have to take into account the cost-effectiveness of the adopted decisions.
Considering all of these aspects, this paper proposes an architecture for Big Data processing in Smart Cities. The architecture is organized around the flow of data from the source to the end user. We describe seven steps of data processing: collection of data from heterogeneous sources, data normalization, data brokering, data storage, data analysis, data visualization and decision support systems. We describe two case studies, on crowd management in smart cities and on Intelligent Transportation Systems (ITS).
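To make the seven-step flow concrete, the following is a minimal sketch in Python, not the implementation used in the paper. It chains the seven steps in the order they are listed above. All function names, field names, sample readings, the dictionary used as storage, and the traffic threshold are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Context:
    """State handed from one step to the next (all field names are hypothetical)."""
    raw: list = field(default_factory=list)
    normalized: list = field(default_factory=list)
    topics: dict = field(default_factory=dict)
    store: dict = field(default_factory=dict)
    analysis: dict = field(default_factory=dict)
    report: list = field(default_factory=list)
    decisions: list = field(default_factory=list)


def collect(ctx):
    # Step 1: collection from heterogeneous sources (hard-coded toy readings).
    ctx.raw = [
        {"source": "sensor", "kind": "temperature", "value": "21.5", "unit": "C"},
        {"source": "citizen_app", "kind": "temperature", "value": 71.6, "unit": "F"},
        {"source": "sensor", "kind": "traffic", "value": "120", "unit": "cars/h"},
    ]
    return ctx


def normalize(ctx):
    # Step 2: bring every reading to one schema (numeric values, Celsius).
    for rec in ctx.raw:
        value = float(rec["value"])
        if rec["unit"] == "F":
            value = (value - 32) * 5 / 9
        ctx.normalized.append({"kind": rec["kind"], "value": round(value, 2)})
    return ctx


def broker(ctx):
    # Step 3: route each normalized record to a topic named after its kind.
    for rec in ctx.normalized:
        ctx.topics.setdefault(rec["kind"], []).append(rec["value"])
    return ctx


def persist(ctx):
    # Step 4: storage; a dictionary stands in for a real storage backend.
    ctx.store = {kind: list(values) for kind, values in ctx.topics.items()}
    return ctx


def analyze(ctx):
    # Step 5: a trivial analysis, the mean value per kind.
    ctx.analysis = {kind: mean(values) for kind, values in ctx.store.items()}
    return ctx


def visualize(ctx):
    # Step 6: plain-text lines stand in for a dashboard.
    ctx.report = [f"{kind}: {avg:.2f}" for kind, avg in sorted(ctx.analysis.items())]
    return ctx


def decide(ctx, traffic_threshold=100.0):
    # Step 7: decision support with an invented rule and threshold.
    if ctx.analysis.get("traffic", 0.0) > traffic_threshold:
        ctx.decisions.append("suggest alternative routes")
    return ctx


STEPS = [collect, normalize, broker, persist, analyze, visualize, decide]


def run_pipeline():
    ctx = Context()
    for step in STEPS:
        ctx = step(ctx)
    return ctx


if __name__ == "__main__":
    result = run_pipeline()
    print("\n".join(result.report))
    print(result.decisions)
```

Keeping each step as a separate function that takes and returns the same context object mirrors the idea of data flowing from the source to the end user, and lets any single step be swapped for a real component without touching the others.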
The paper is structured as follows. Section 2 presents the related work on crowd data in smart cities and on ITS. The proposed architecture is presented in Section 3. Two use cases are described in Section 4. The experimental results obtained for these use cases are presented in Section 5. The paper ends with conclusions and future work in Section 6.
Related work
Smart Cities [1] represent an important goal which can dramatically improve the life of citizens. A large body of research aims to bring us closer to this goal. The idea of a smart city is in line with other research movements such as the Internet of Things [2], [3], [4] and Big Data [5]. The New York Times actually declared this period the “Age of Big Data”.
In order to enable Smart Cities technologies such as Internet of
Data flow based architecture
We propose an architecture which contains several steps that follow the flow of data from the source to the end user. The data is the input of our system and is used to create value for users, usually in the form of information or automatic actions.
We created a seven-step architecture to accomplish our goal of turning data into value (see Fig. 1): first we need to aggregate the data sources, then we need to perform data normalization, but before doing that we need to
Intelligent transportation systems
Large cities present many problems with their systems, but the transportation system is the one that sustains the dynamics of this environment. Classical traffic systems can no longer cope with the enormous number of cars driven on city streets. Problems such as congestion, accidents, high fuel consumption and pollution, which affect us daily in a city, can have as root causes the poor use of the current infrastructure or insufficient street capacity for the current traffic flow.
Trying to solve the
Experiments
The first use case considers Intelligent Transportation Systems. The application model is as follows. The cabs are viewed as clients, which generate data on a sporadic schedule and in a variety of sizes. The car's GPS position is recorded every 15 s and, by default, the cluster client on the car sends the last 4 known positions every minute. However, if a car experiences a loss of connectivity, it may exhibit a pause in generating jobs and submit a larger data task when connectivity is
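The client behavior described above can be illustrated with a small simulation. This is a hedged sketch rather than the code used in the experiments: the class and function names, the made-up coordinates, and the outage window are hypothetical. It also assumes that the larger task is submitted at the first scheduled send after connectivity returns, which the truncated text above does not state explicitly.

```python
from dataclasses import dataclass, field

SAMPLE_PERIOD_S = 15  # from the text: one GPS position every 15 s
SEND_PERIOD_S = 60    # from the text: one upload per minute


@dataclass
class CabClient:
    pending: list = field(default_factory=list)  # positions not yet uploaded
    uploads: list = field(default_factory=list)  # batches that reached the server

    def tick(self, t, position, connected):
        """Record the position taken at second t and upload if a send is due."""
        self.pending.append((t, position))
        if t % SEND_PERIOD_S == 0 and connected:
            # Assumption: everything buffered while offline goes out in one batch.
            self.uploads.append(self.pending)
            self.pending = []


def simulate(duration_s, offline):
    """Run one cab for duration_s seconds.

    offline is an inclusive (start, end) window, in seconds, without connectivity.
    Returns the number of positions contained in each uploaded batch.
    """
    cab = CabClient()
    for t in range(SAMPLE_PERIOD_S, duration_s + 1, SAMPLE_PERIOD_S):
        position = (44.43 + t * 1e-5, 26.10)  # made-up coordinates
        connected = not (offline[0] <= t <= offline[1])
        cab.tick(t, position, connected)
    return [len(batch) for batch in cab.uploads]


if __name__ == "__main__":
    # Four minutes of driving with an outage that covers the send at t = 120 s.
    print(simulate(240, offline=(100, 130)))  # [4, 8, 4]
```

With no outage every batch holds 4 positions (60 s / 15 s). An outage that covers one scheduled send doubles the next batch, which is the sporadic, variable-size workload the application model refers to.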
Conclusions and future work
In this paper we proposed a generic architecture for data flow handling specific to Smart Cities. We described the functions and components of each step and identified specific technologies. We then provided two use cases, on crowd management and intelligent transportation systems, and highlighted experimental results from applications developed using the model proposed in our architecture. As further work we will analyze self-adaptive optimization methods used in this architecture, focusing on data
Acknowledgment
The research presented in this paper is supported by projects: DataWay: Real-time Data Processing Platform for Smart Cities: Making sense of Big Data - PN-II-RU-TE-2014-4-2731; MobiWay: Mobility Beyond Individualism: an Integrated Platform for Intelligent Transportation Systems of Tomorrow - PN-II-PT-PCCA-2013-4-0321; CyberWater grant of the Romanian National Authority for Scientific Research, CNDI-UEFISCDI, project number 47/2012; clueFarm: Information system based on cloud services accessible
References (38)
- et al.
Beyond the hype: big data concepts, methods, and analytics
Int. J. Inf. Manag.
(2015)
- et al.
Scanme: location tracking system in large-scale campus wi-fi environment using unlabeled mobility map
Expert Syst. Appl.
(2014)
- et al.
Smart cities: Definitions, dimensions, performance, and initiatives
J. Urban Technol.
(2015)
- et al.
The internet of things–a survey of topics and trends
Inf. Syst. Front.
(2015)
- et al.
Conception of id layer performance at the network level for internet of things
Pers. Ubiquit. Comput.
(2014)
- et al.
Id-based service-oriented communications for unified access to iot
Comput. Electr. Eng.
(2016)
- et al.
Wireless Sensor Networks
(2006)
- et al.
Opportunities in mobile crowd sensing
Commun. Mag. IEEE
(2014)
- et al.
The role of advanced sensing in smart cities
Sensors
(2012)
- et al.
Smart city and the applications
Electronics, Communications and Control (ICECC), 2011 International Conference on
(2011)
Crowd Dynamics
Group mobility classification and structure recognition using mobile devices
2016 IEEE International Conference on Pervasive Computing and Communications, PerCom 2016
Understanding urban mobility via taxi trip clustering
2016 17th IEEE International Conference on Mobile Data Management (MDM)
Global Positioning Systems, Inertial Navigation, and Integration
Sensing WiFi packets in the air
Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing - UbiComp ’14 Adjunct
Visualize people’s mobility - both individually and collectively - using mobile phone cellular data
Proceedings - IEEE International Conference on Mobile Data Management, 2016-July
Analysis methods for extracting knowledge from large-scale wifi monitoring to inform building facility planning
Pervasive Computing and Communications (PerCom), 2014 IEEE International Conference on
Joint bluetooth/wifi scanning framework for characterizing and leveraging people movement in university campus
Proceedings of the 13th ACM International Conference on Modeling, Analysis, and Simulation of Wireless and Mobile Systems
Wifipi: Involuntary tracking of visitors at mass events
World of Wireless, Mobile and Multimedia Networks (WoWMoM), 2013 IEEE 14th International Symposium and Workshops on a
Cristian Chilipirea is currently a Ph.D. student in computer science at University Politehnica of Bucharest, Romania. He received his M.S. degree in computer science at University Politehnica of Bucharest, Romania and an M.S. degree in science from Vrije Universiteit, Amsterdam, Netherlands, in 2014. He received his B.S. degree in computer science in 2012 at the same University. He has held the position of Assistant Professor at University Politehnica of Bucharest, Romania, since 2014. His research interests include distributed network systems, delay tolerant networks, crowd sensing, cloud computing and the Internet of Things.
Andreea-Cristina Petre is currently a Ph.D. student at University Politehnica of Bucharest, Romania. She received her M.S. degree in computer science at University Politehnica of Bucharest, Romania, in 2014. She received her B.S. degree in computer science in 2012, at the same University. In 2014 she studied for a year at Vrije Universiteit of Amsterdam, Netherlands by means of an Erasmus study grant. Her areas of interest are crowd sensing, distributed systems, multi-agent systems, natural language processing and operating systems.
Loredana-Marsilia Groza, computer science diplomat engineer, is a Ph.D. student at University Politehnica of Bucharest, Romania, Faculty of Automatic Control and Computers, Computer Science Department. She is an active member of the Distributed Systems Laboratory, with a research background in Big Data processing, data analytics, multi-criteria optimization, heterogeneous distributed computing and smart cities. She is also interested in the performance analysis of communication protocols.
Ciprian Dobre received his Ph.D. in computer science at the University Politehnica of Bucharest, Romania in 2008. He received his M.S. in computer science in 2004 and the engineering degree in computer science in 2003, at the same University. His main research interests are Grid Computing, Monitoring and Control of Distributed Systems, Modeling and Simulation, Advanced Networking Architectures, and Parallel and Distributed Algorithms. He is a member of the RoGrid consortium and is involved in a number of national and international projects. His research activities were recognized with the Innovations in Networking Award for Experimental Applications in 2008 by the Corporation for Education Network Initiatives in California (CENIC).
Florin Pop, Ph.D., is a professor in the Computer Science Department and an active member of the Distributed Systems Laboratory at University Politehnica of Bucharest. His general research interests are: Large-Scale Distributed Systems (design and performance), Grid Computing and Cloud Computing, Peer-to-Peer Systems, Big Data Management, Data Aggregation, Information Retrieval and Ranking Techniques, and Bio-Inspired Optimization Methods. He is a member of the RoGrid consortium and is involved in a number of national and international projects. His work on optimization based on bio-inspired techniques has received three Best Paper Awards, an IBM Assistantship Award and an IBM Faculty Award. He is a member of the IEEE and ACM professional organizations.