
1 Introduction

With rapid economic development worldwide, most of the global population now resides in urban areas. According to UN statistics, more than two thirds of the global population will live in cities by 2050. Beyond a sharp increase in demand for services, issues regarding energy, water resources, transportation, disaster prevention, public security, health, education, and medical care can be expected to emerge. Hence, the planning of smart cities [1] and smart services has become a social issue that local governments and companies in relevant industries must address. According to IDC statistics [2], the Asia/Pacific region represents over 40% of total spending on smart city initiatives, while the Americas represent around one third, and Europe, the Middle East and Africa around one quarter of the global opportunity. As countries around the world promote the development of smart cities, the Ministry of Science and Technology in Taiwan initiated the four-year Smart Science Park Project in 2016 [3], using ICT, the Internet of Things (IoT), and other state-of-the-art technologies to transform the science parks into living labs for new smart services, focusing on transportation, sustainability, and governance.

Hsinchu Science Park [4] was established in 1980 as the first science park in Taiwan. Its top three industrial focuses are integrated circuits, optoelectronics, and computers and peripherals. Southern Taiwan Science Park [5] was established in 1997. Its top three industrial focuses are biotechnology, precision machinery, and optoelectronics. Central Taiwan Science Park [6] was established in 2003. Its top three industrial focuses are precision machinery, biotechnology, and optoelectronics. Currently, more than 850 companies are located in the three science parks, the number of employees has reached 276,000, and overall turnover reached 80 billion dollars in 2018. According to statistics, 95% of the 153,000 commuters in Hsinchu Science Park use private vehicles (cars or motorcycles) as their means of transportation. Traffic congestion during peak commuting hours is hence very serious. Congestion not only leaves drivers in gridlock, but also causes economic losses in time and fuel, as well as excessive pollutant emissions while cars run at low speeds, which worsens the impact on the environment [7]. In the long run, the accumulated losses can be considerable. In the early days, limited by technology, the administration could only collect traffic and air quality data in the science parks manually. In conjunction with the Smart Science Park Project, the administration built a variety of IoT-enabled systems in 2017, including the eTag [8] system and continuous emission monitoring systems (CEMS), to collect traffic and air quality data. These data were expected to support more accurate traffic management and air quality monitoring.

Many organizations have begun to use big data analytics to develop smart applications for goals such as smart sustainability, smart governance, and smart management [9]. IoT Innovation [10] indicates that big data and IoT differ in three respects: concept, time sequencing, and analytical goals. Big data analytics analyzes large amounts of mostly human-generated data to support longer-duration use cases such as predictive maintenance. The IoT aggregates and compresses massive amounts of low-latency, low-duration, high-volume machine-generated data coming from a wide variety of sensors to support real-time use cases such as operational optimization. Although big data and IoT are different, they are intricately linked. It is therefore important to determine how emerging technology can support IoT-enabled big data management, so that accurate information is provided in real time to help the administration observe the historical trends and current conditions of traffic congestion and air pollution, and formulate research questions on traffic control and air pollution prevention strategies.

Many previous studies [11,12,13,14,15,16,17] point out that data visualization technology and dashboards can bridge the gap between users and data, and improve the quality and efficiency of decision-making. Accordingly, this paper has two research purposes: (1) to build a Science Park Smart Governance Platform [18] to collect, process, and present data from the Science Park IoT device systems; (2) to design and develop, via big data and visualization technology, two smart application services for traffic dynamics and air quality monitoring respectively, converting complicated raw data into dynamic charts in the shortest time possible and providing them to the administration in the form of dashboards. To that end, this paper gathers eTag traffic flow data and air quality monitoring data via a standardized data exchange tool (e.g. an application programming interface, API). The data are then parsed and stored in the platform database. Finally, the traffic flow data and air quality data are visualized by building models via statistical analysis and data visualization technology.

2 Related Works

In recent years, with advances in software and hardware technologies, research on big data has grown steadily. This paper summarizes the findings of Nuaimi et al. [10] and Mehmood et al. [19] to list the main applications of big data in smart cities. As shown in Table 1, IoT combined with big data analysis can improve service quality and decision-making efficiency. However, real-time IoT big data places high technical demands on big data management, big data processing platforms, algorithms, open standard technology, etc. Therefore, without the relevant technologies, it is difficult for the administration to further explore the value of IoT-enabled big data.

Table 1. Services of big data applications in smart city

Data visualization [20] has emerged as a research area in recent years. Its objective is to support the collection and interpretation of heterogeneous data in clearer and more effective ways. Previous studies [10,11,12,13,14,15,16] have pointed out that data visualization technologies can bridge the gap between users and data, and thus improve the quality and efficiency of decision-making. Some studies, such as [11,12,13, 16], present the core information most relevant to decision-makers in the form of a dashboard. In practice, studies [11, 13] present multiple sources of data on one interface with multiple dashboards, while study [12] presents multiple sources of data on a single dashboard in various colors and graphs. In addition, study [16] uses cross-theme dashboards to allow users to perform data exploration, extraction, visualization, transformation, etc. in an interactive manner. Since a dashboard offers decision-makers a comprehensive view of all key information in an intuitive way, it has been adopted in city governance [21,22,23] in recent years. Examples such as the London City Dashboard [24] and the Boston Smart City dashboard [25] present information of multiple categories (e.g. traffic, weather, finance, news, etc.) on a single dashboard, whereas examples such as the Bandung City Dashboard [26] and the Smart CEI Moncloa Dashboard [27] list categories (e.g. education, population, environment, etc.) on the left of the dashboard for users to click on, and present the relevant information on the right via another page or a drop-down menu. Finally, the Dublin Dashboard [28] provides different dashboard interfaces that present different information (see Figs. 1 and 2).

Fig. 1. Three examples of city dashboards (Source: [24,25,26])

Fig. 2. Two examples of city dashboards (Source: [27, 28])

As the above studies show, big data generated by smart city services can help decision-makers work more efficiently once it has been analyzed, visualized, and presented with all key information on a dashboard. Therefore, this paper uses a standardized data exchange tool, the application programming interface (API), to obtain traffic flow data and air quality monitoring data in the science parks. The data are then parsed and stored in the platform's back-end database. Finally, via statistical analysis and data visualization technologies, models are built and the traffic and air quality data are presented visually on the dashboard. The details are explained in the next section.

3 Methods

Previous studies [20, 29, 30] have established the framework of the “visual analytics pipeline” (see Fig. 3), which has four core concepts: (1) Data, which refers to the collection and pre-processing of heterogeneous raw data. Common pre-processing includes data parsing, data integration, data cleaning (e.g. removing redundant, erroneous, and invalid entries), data transformation (e.g. normalization), and data reduction; (2) Models, which refers to the conversion from data to information. Common conversion methods include feature selection and generation, model building, selection and validation, etc.; (3) Visualization, which refers to the visualization and abstract transformation of data, such as visual mapping (e.g. parallel coordinates, force-directed graph, chord graph, scatter matrix), view generation and coordination (e.g. overview + detail, small-multiples), human-computer interaction, etc.; (4) Knowledge (also called intelligence-gaining, sense-making, decision-making, or concept-building), which refers to the process in which humans interact with machines to generate knowledge.

Fig. 3. Visual analytics pipeline (Source: [20, 29, 30])

This paper applies the theoretical framework of the visual analytics pipeline to illustrate the establishment of the smart traffic monitoring service and the smart air quality monitoring service. The objective of Data and Model is to collect, process, and discover the value of data, whereas the purpose of Visualization and Knowledge is to present the results of data and model visualization. Therefore, this paper will first explain the two phases of Data and Model in this section. The two phases of Visualization and Knowledge will be further explored later in Sect. 4.

3.1 Smart Traffic Monitoring Service

In the Data phase, this paper collects the raw traffic data in the science parks and converts it into a format accessible by the database. There are three steps in the establishment process (see Fig. 4): (1) Using APIs to interface with the eTag device system and collect the traffic data at intersections; (2) Writing a program to parse the data structure in the JSON file format; (3) Cleaning and converting the data.

Fig. 4. Data flow of smart traffic monitoring service
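The three Data-phase steps can be illustrated with a minimal Python sketch. The paper does not publish the eTag API schema, so the field names (`intersection_id`, `timestamp`, `vehicle_count`), the sample payload, and the cleaning rules (dropping duplicates, malformed timestamps, and negative counts) are assumptions made only for illustration.

```python
import json
from datetime import datetime

# Hypothetical payload: field names are assumptions, not the real eTag schema.
SAMPLE = """
[
  {"intersection_id": "A01", "timestamp": "2018-07-02T08:00:00", "vehicle_count": "42"},
  {"intersection_id": "A01", "timestamp": "2018-07-02T08:00:00", "vehicle_count": "42"},
  {"intersection_id": "A02", "timestamp": "2018-07-02T08:00:00", "vehicle_count": "-1"},
  {"intersection_id": "A03", "timestamp": "bad", "vehicle_count": "17"},
  {"intersection_id": "A04", "timestamp": "2018-07-02T08:01:00", "vehicle_count": "8"}
]
"""


def parse_and_clean(raw_json):
    """Parse a JSON array of readings, drop invalid or duplicate rows, convert types."""
    seen = set()
    rows = []
    for rec in json.loads(raw_json):
        try:
            station = rec["intersection_id"]
            ts = datetime.strptime(rec["timestamp"], "%Y-%m-%dT%H:%M:%S")
            count = int(rec["vehicle_count"])
        except (KeyError, ValueError, TypeError):
            continue  # malformed record: missing field or unparsable value
        if count < 0:
            continue  # a negative count is treated as a sensor error
        key = (station, ts)
        if key in seen:
            continue  # redundant record for the same intersection and time
        seen.add(key)
        rows.append({"intersection_id": station, "timestamp": ts, "vehicle_count": count})
    return rows


if __name__ == "__main__":
    for row in parse_and_clean(SAMPLE):
        print(row)
```

The cleaned rows carry typed values (a `datetime` and an `int`), which is the form a database insert would typically expect.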

In the Model phase, this paper builds a traffic prediction model to predict the traffic volume in the next 10 min. There are three steps in the establishment process (see Fig. 5): (1) manually collecting the map and traffic data for 10 intersections from Google Maps; (2) comparing the color of the Google traffic map image with the traffic volume at the intersections. If the color of the road segment is red (indicating traffic congestion), it is marked as 1; otherwise it is marked as 0 (indicating smooth traffic); (3) integrating climate information [31] (e.g. rainfall probability, highest temperature, lowest temperature) and building a traffic flow prediction model for each intersection with multiple logistic regression analysis [32]. The equation is specified as Eq. 1.

Fig. 5. Model building of smart traffic monitoring service (Color figure online)

$$ y = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \beta_{3} x_{3} + \beta_{4} x_{4} $$
(1)

where the variable y indicates the traffic congestion at the intersection (0 means smooth traffic, 1 means congestion), β0–β4 are the regression coefficients, the variable x1 is traffic flow, the variable x2 is rainfall probability, the variable x3 is the highest temperature of the day, and the variable x4 is the lowest temperature of the day.
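As a rough illustration of how Eq. 1 could yield a 0/1 congestion label, the sketch below treats the right-hand side of Eq. 1 as the linear predictor of a logistic regression and passes it through the logistic function. This interpretation, the 0.5 decision threshold, and the coefficient values are assumptions; the paper does not report fitted coefficients.

```python
import math


def congestion_probability(coef, features):
    """Return P(congestion) for one intersection.

    coef     -- (b0, b1, b2, b3, b4), the coefficients of Eq. 1
    features -- (x1, x2, x3, x4): traffic flow, rainfall probability,
                highest temperature, lowest temperature
    """
    z = coef[0] + sum(b * x for b, x in zip(coef[1:], features))
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_congestion(coef, features, threshold=0.5):
    """Return 1 (congestion) or 0 (smooth traffic)."""
    return 1 if congestion_probability(coef, features) >= threshold else 0


if __name__ == "__main__":
    # Placeholder coefficients, chosen only to make the example run.
    coef = (-6.0, 0.01, 0.02, 0.0, 0.0)
    print(predict_congestion(coef, (800, 50, 30, 22)))
    print(predict_congestion(coef, (100, 0, 30, 22)))
```

In practice one set of coefficients would be fitted per intersection from the labeled Google Maps observations described in steps (1) and (2).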

3.2 Smart Air Quality Monitoring Service

In the Data phase, this paper collects the air quality data in the science parks and converts it into a database-accessible format. The establishment process has three steps (see Fig. 6): (1) using APIs to interface with CEMS and collecting air quality data from the monitoring station(s) in the area; (2) writing a program to parse the data structure in the XML file format; (3) cleaning and converting the data.

Fig. 6. Data flow of smart air quality monitoring service
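The parsing and cleaning steps for the XML data can be sketched with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`stations`, `station`, `id`, `time`, `pm25`, `so2`) are hypothetical, since the paper does not describe the CEMS XML schema; unparsable readings are stored as `None` rather than dropped, which is also an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout: tag names are assumptions, not the real CEMS schema.
SAMPLE_XML = """
<stations>
  <station id="S1">
    <time>2018-05-15T10:00:00</time>
    <pm25>12.5</pm25>
    <so2>3.1</so2>
  </station>
  <station id="S2">
    <time>2018-05-15T10:00:00</time>
    <pm25>n/a</pm25>
    <so2>2.0</so2>
  </station>
</stations>
"""

POLLUTANTS = ("pm25", "so2")


def parse_air_quality(xml_text):
    """Parse station readings into dicts; invalid or missing values become None."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for station in root.findall("station"):
        row = {"station_id": station.get("id"), "time": station.findtext("time")}
        for name in POLLUTANTS:
            try:
                row[name] = float(station.findtext(name))
            except (TypeError, ValueError):
                row[name] = None  # missing element or non-numeric text
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in parse_air_quality(SAMPLE_XML):
        print(row)
```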

In the Model phase, this paper converts air quality data into Air Quality Index (AQI). For the calculation, see Eqs. 2, 3, 4 and 5.

$$ \text{AQI}_{s1} = \max\left( \max\left( \text{O}_{3,8\text{hr}}, \text{O}_{3} \right), \text{PM}_{2.5}, \text{PM}_{10}, \text{CO}, \text{SO}_{2}, \text{NO}_{2} \right) $$
(2)
$$ \text{AQI}_{s2} = \max\left( \max\left( \text{O}_{3,8\text{hr}}, \text{O}_{3} \right), \text{PM}_{2.5}, \text{PM}_{10}, \text{CO}, \text{SO}_{2,24\text{hr}}, \text{NO}_{2} \right) $$
(3)
$$ \text{AQI}_{s3} = \max\left( \text{O}_{3}, \text{PM}_{2.5}, \text{PM}_{10}, \text{CO}, \text{SO}_{2,24\text{hr}}, \text{NO}_{2} \right) $$
(4)
$$ \text{AQI} = \max\left( \text{AQI}_{s1}, \text{AQI}_{s2}, \text{AQI}_{s3} \right) $$
(5)

where the variable O3,8hr is the average concentration of ozone over the last eight hours; the variable O3 is the instantaneous concentration of ozone; PM2.5 and PM10 are each calculated as (0.5 × the average over the last 12 h + 0.5 × the average over the last 4 h); the variable CO is the average concentration of carbon monoxide over the last eight hours; the variable SO2 is the instantaneous concentration of sulfur dioxide; the variable SO2,24hr is the average concentration of sulfur dioxide over the last 24 h; and the variable NO2 is the instantaneous concentration of nitrogen dioxide.
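Eqs. 2 to 5 and the particulate-matter weighting can be sketched as follows. The sketch assumes that each input to `overall_aqi` has already been converted from a concentration to a pollutant sub-index; that breakpoint conversion is not given in the paper and is omitted here. The dictionary keys and sample values are hypothetical.

```python
def pm_weighted(hourly):
    """Weighted particulate value: 0.5 x mean of last 12 h + 0.5 x mean of last 4 h.

    hourly -- hourly concentrations, oldest first, most recent last (>= 12 values)
    """
    if len(hourly) < 12:
        raise ValueError("need at least 12 hourly values")
    last12 = hourly[-12:]
    last4 = hourly[-4:]
    return 0.5 * sum(last12) / 12 + 0.5 * sum(last4) / 4


def overall_aqi(sub):
    """Combine pollutant sub-indices following Eqs. 2-5.

    sub -- dict with keys 'o3_8hr', 'o3', 'pm25', 'pm10', 'co',
           'so2', 'so2_24hr', 'no2' (hypothetical key names)
    """
    o3 = max(sub["o3_8hr"], sub["o3"])
    s1 = max(o3, sub["pm25"], sub["pm10"], sub["co"], sub["so2"], sub["no2"])              # Eq. 2
    s2 = max(o3, sub["pm25"], sub["pm10"], sub["co"], sub["so2_24hr"], sub["no2"])         # Eq. 3
    s3 = max(sub["o3"], sub["pm25"], sub["pm10"], sub["co"], sub["so2_24hr"], sub["no2"])  # Eq. 4
    return max(s1, s2, s3)                                                                # Eq. 5


if __name__ == "__main__":
    sample = {"o3_8hr": 40, "o3": 55, "pm25": 72, "pm10": 30,
              "co": 10, "so2": 20, "so2_24hr": 90, "no2": 15}
    print(overall_aqi(sample))
    print(pm_weighted([10] * 8 + [20] * 4))
```

Because every step is a maximum, the overall AQI is driven by whichever pollutant currently has the worst sub-index.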

4 Results

This study established the Science Park Smart Governance Platform to manage the IoT-enabled big data and provide smart application services. Its user interface is shown in Fig. 7. All the features of the Science Park Smart Governance Platform are listed in the menu on the left. From top to bottom, they are front end, back end, front end settings, IoT data management center, statistics of water/electricity consumption, back end settings, map-based smart application services, and IoT data stream monitoring. The IoT data management center allows the administration to manage all IoT-enabled big data in the science parks. Its two functions are (1) converting unstructured data into structured data, and (2) exporting data in three file formats (i.e. CSV, JSON, and XML). Map-based smart application services allow the administration to visually examine data processing and analysis results. Similar to studies [11, 13], this paper also integrates multiple data sources into a single dashboard, with the purpose of reducing the time and cognitive effort that users need to understand the information presented.

Fig. 7. User interface of science park smart governance platform (Source: [18])
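The export function of the IoT data management center (structured records to CSV, JSON, and XML) might look like the following sketch, which uses only the Python standard library. The record fields and the XML tag names (`records`, `record`) are hypothetical; the platform's actual export layout is not described in the paper.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def to_csv(rows):
    """Serialize a non-empty list of flat dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_json(rows):
    """Serialize a list of flat dicts to a JSON array."""
    return json.dumps(rows)


def to_xml(rows, root_tag="records", row_tag="record"):
    """Serialize a list of flat dicts to XML, one child element per field."""
    root = ET.Element(root_tag)
    for row in rows:
        element = ET.SubElement(root, row_tag)
        for key, value in row.items():
            ET.SubElement(element, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    rows = [{"station": "S1", "aqi": 42}]
    print(to_csv(rows))
    print(to_json(rows))
    print(to_xml(rows))
```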

4.1 Smart Traffic Monitoring Service

In the Visualization phase, this paper visualizes traffic-related data, and the resulting user interface is shown in Fig. 8. The user interface has five main blocks; from top to bottom they are (1) the big data exploration block, (2) the intersections block, (3) the traffic prediction block, (4) the traffic flow block, and (5) the weather condition block. The intersections block presents the traffic flow of 10 intersections simultaneously via data visualization. The traffic prediction block predicts the change in traffic flow at each intersection in the next 10 min. The traffic flow block provides the instantaneous traffic flow data of each intersection. The weather condition block provides the local rainfall probability and temperature, as well as other weather conditions described in text.

Fig. 8. User interface of smart traffic monitoring service

In the Knowledge phase, this paper accumulates the traffic data from 10 intersections since July 2018 (approximately 500,000 records per week) and presents it in a line chart (as shown in Fig. 9). Users can click on the buttons at the top right to view the trend of historical traffic flow data in different timeframes (i.e. the current day, the current month, and the current year). In the example shown, the traffic volume on the 14th is abnormally high, and the traffic volume in the first half of the month is higher than that in the second half.

Fig. 9. User interface of data exploration area
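One way to produce the series behind the three timeframe buttons is to bucket the per-minute counts by a timestamp key. The sketch below assumes that the day view uses hourly buckets, the month view daily buckets, and the year view monthly buckets; the paper does not state the bucket sizes, and the sample records are made up.

```python
from collections import defaultdict
from datetime import datetime

# Assumed bucket granularity per timeframe button (not specified in the paper).
BUCKET_FORMATS = {
    "day": "%Y-%m-%d %H:00",  # hourly totals
    "month": "%Y-%m-%d",      # daily totals
    "year": "%Y-%m",          # monthly totals
}


def aggregate(records, timeframe):
    """Sum vehicle counts per bucket.

    records   -- iterable of (datetime, vehicle_count) pairs
    timeframe -- 'day', 'month', or 'year'
    """
    fmt = BUCKET_FORMATS[timeframe]
    totals = defaultdict(int)
    for timestamp, count in records:
        totals[timestamp.strftime(fmt)] += count
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    records = [
        (datetime(2018, 7, 14, 8, 0), 30),
        (datetime(2018, 7, 14, 8, 1), 20),
        (datetime(2018, 7, 14, 9, 0), 5),
        (datetime(2018, 7, 15, 8, 0), 7),
    ]
    print(aggregate(records, "month"))
```

The sorted dictionary can be passed directly to a line chart as x/y pairs.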

In short, the smart traffic monitoring service provides three services: (1) providing accurate traffic flow data, updated every minute, so that the administration can grasp the traffic status almost instantly; (2) providing three kinds of historical traffic flow trends, allowing the administration to know when traffic congestion occurs; (3) predicting traffic flow for the next 10 min, which the administration can use to adjust the traffic lights at intersections.

4.2 Smart Air Quality Monitoring Service

In the Visualization phase, this paper presents information such as the air quality index (AQI), wind direction, wind speed, etc., and the results are shown in the upper part of the dashboard (see Fig. 10). Users can therefore see directly in which directions pollutants are diffusing, and can read the air quality status of each monitoring station from the colors shown on the dashboard. Green (0–50) indicates good air quality. Yellow (51–100) indicates normal air quality. Orange (101–150) indicates that the air quality is unhealthy for sensitive groups. Red (151–200) means that the air quality is unhealthy for all people. Purple (201–300) means that the air quality is very unhealthy, and maroon (301–500) means that the air quality is harmful.

Fig. 10. User interface of smart air quality monitoring service (Color figure online)
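The color coding described above amounts to a lookup over AQI bands. A minimal sketch follows; the band limits and labels are taken from the text, while the function name and the handling of out-of-range values are assumptions.

```python
# (upper bound, color, description), in ascending order, as listed in the text.
AQI_BANDS = [
    (50, "green", "good"),
    (100, "yellow", "normal"),
    (150, "orange", "unhealthy for sensitive groups"),
    (200, "red", "unhealthy for all people"),
    (300, "purple", "very unhealthy"),
    (500, "maroon", "harmful"),
]


def aqi_band(aqi):
    """Return (color, description) for an AQI value between 0 and 500."""
    if aqi < 0 or aqi > 500:
        raise ValueError("AQI must be between 0 and 500")
    for upper, color, description in AQI_BANDS:
        if aqi <= upper:
            return color, description


if __name__ == "__main__":
    for value in (42, 120, 310):
        print(value, aqi_band(value))
```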

In the Knowledge phase, this paper collects hourly air quality data since mid-May 2018 (approximately 3,600 records per month) and sets an air pollution warning threshold for the platform. When the AQI exceeds the threshold, the time, location, and AQI value of the air pollution event are listed at the lower left of the dashboard. The administration can thus explore whether air quality anomalies follow regional or seasonal trends.
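The warning list can be sketched as a simple filter over hourly readings. The threshold value of 100 and the record fields (`time`, `station`, `aqi`) are hypothetical; the paper does not state the threshold used by the platform.

```python
def find_exceedances(readings, threshold):
    """Return readings whose AQI exceeds the threshold, ordered by time.

    readings -- list of dicts with keys 'time' (ISO 8601 string), 'station', 'aqi'
    """
    hits = [r for r in readings if r["aqi"] > threshold]
    # ISO 8601 strings of equal length sort chronologically.
    return sorted(hits, key=lambda r: r["time"])


if __name__ == "__main__":
    readings = [
        {"time": "2018-05-16T09:00:00", "station": "S2", "aqi": 135},
        {"time": "2018-05-15T14:00:00", "station": "S1", "aqi": 88},
        {"time": "2018-05-15T15:00:00", "station": "S1", "aqi": 104},
    ]
    # 100 is an assumed threshold, used only for illustration.
    for hit in find_exceedances(readings, threshold=100):
        print(hit["time"], hit["station"], hit["aqi"])
```

Grouping the returned records by station or by month would be a natural next step for examining the regional and seasonal trends mentioned above.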

In short, the smart air quality monitoring service provides three services: (1) the hourly concentration data of air pollutants from three monitoring stations are collected, allowing the administration to monitor the air quality status in various areas of the science parks immediately; (2) the hourly air pollutant concentration, wind direction, and wind speed data from the three stations have been collected since mid-May 2018, presenting a historical trend of air quality and allowing the administration to understand in which month and at what time air pollution occurs; and (3) when the concentration of a certain air pollutant exceeds a particular threshold, the smart environmental monitoring chatbot service pushes warning messages to the administration in the science park.

5 Conclusion and Future Work

Big data and visual analysis technology have been widely used to tackle real-world problems such as network traffic analysis, engaging education, sports analysis, database analysis, and biological data analysis. In recent years, industry and academia have begun to pay attention to data visualization applications for time-oriented data, spatio-temporal data, network data, etc. Unlike previous studies, this paper discusses IoT-enabled big data and its visualization in smart application services. First, the researchers developed the IoT-enabled Science Park Smart Governance Platform, which allows the administration to manage all on-site IoT-enabled data through the IoT data management center. Next, the researchers introduced the dashboard concept to the smart traffic monitoring service and the smart air quality monitoring service, allowing the administration to view traffic congestion and air pollution information in a comprehensive manner.

The contributions of this paper are as follows: (1) the visual analytics pipeline framework is applied to explain the operation of the smart traffic monitoring service and the smart air quality monitoring service at each phase (i.e. Data, Model, Visualization, and Knowledge), adding a new example to the development and application of IoT-enabled big data presented via dashboards; (2) the smart traffic monitoring service presents the traffic volume at each intersection and predicts the traffic change in the next 10 min, and traffic signal technology automatically adjusts the duration of red lights to ease congestion and reduce the number of stops and the amount of fuel consumed; (3) the instant air pollution monitoring of the smart air quality monitoring service, together with wind direction and wind speed data, can be accessed through digital signage, the Science Park Mobile Wizard 2.0 APP [33], and other terminal devices. The system also automatically notifies people downwind to take self-protection measures (such as wearing masks or reducing the time spent outdoors) to reduce the chances of respiratory dysfunction, cardiovascular diseases (CVDs), and asthma attacks; (4) using big data technology, the administration can determine whether traffic congestion or air pollution follows a regional or seasonal trend. If traffic control, air quality control, and pollution prevention strategies can be formulated early, the public can enjoy a safer and more livable environment.

In conclusion, the development of the Science Park Smart Governance Platform enables the administration to accurately observe and grasp both historical and current trends of traffic congestion and air pollution. Three directions are worthy of further exploration: (1) a warning service should be developed to detect abnormal data reception from the IoT devices in real time, enabling the administration to check and repair the devices and systems in the shortest time and improving preventive maintenance; (2) integrated multi-data analysis results can be presented via map-based dashboard visualization, providing the administration with more comprehensive information to optimize the quality and efficiency of the decision-making process; and (3) traffic congestion and air quality issues need to be investigated more thoroughly in future studies, in order to increase the accuracy of the information dashboard provided by the Science Park Smart Governance Platform.