Deep spatiotemporal residual early-late fusion network for city region vehicle emission pollution prediction

doi:10.1016/j.neucom.2019.04.040

Neurocomputing

Volume 355, 25 August 2019, Pages 183-199

https://doi.org/10.1016/j.neucom.2019.04.040

Abstract

Regulating urban vehicle emissions has a great impact on daily life and helps protect public health. However, emission remote sensing stations are sparse in a city, and vehicle emission data are non-stationary in both space and time, influenced by various internal and external factors such as spatial dependencies (nearby and distant), temporal dependencies (closeness, period, trend) and the environment (road network, meteorology, events, traffic flow and POIs). In this paper, we introduce a semi-supervised learning approach with a co-training geographical weighted regression model, which aims to construct the historical emission observations from the insufficient station records. We then formulate region emission prediction as a spatiotemporal sequence forecasting problem and propose a deep spatiotemporal residual early-late fusion network, based on the unique properties of spatiotemporal data, to predict vehicle emissions in each region of a given city. A residual convolutional network is employed to model the temporal properties of region vehicle emissions. Finally, we present experiments on the remote sensing records of Hefei, where the proposed model outperforms the baselines. This result demonstrates that combining the deep spatiotemporal residual early-late fusion network with semi-supervised geographical weighted regression can effectively predict vehicle emissions in each region of a city.

Introduction

With the growing number of urban motor vehicles, the ecological and environmental problems caused by automobile exhaust emissions have become increasingly prominent and a topic of wide social concern. Vehicles emit greenhouse gases and air pollutants such as CO₂, HC, NO_x and PM_2.5. Obtaining real-time vehicle emission information is important for supporting urban traffic pollution control and protecting the environment. Knowing the vehicle emissions of each city region at any time would enable regional pollution alerts and help the relevant government agencies improve the design of the city transportation infrastructure.

For example, consider a pollution source located where geographical and meteorological conditions do not favour ventilation. To alleviate severe atmospheric pollution, strict actions such as closing schools and industries and restricting vehicle circulation are necessary. If places with a high probability of pollution could be predicted one or two days in advance, more efficient actions could be taken to alleviate the potential regional pollution [1].

Existing methods for region vehicle emission prediction can roughly be categorized into two classes, namely classical dispersion models and satellite remote sensing. Classical dispersion models include Gaussian plume models, operational street canyon models and computational fluid dynamics. The mobile source emission factor (MOBILE) model and the computer programme to calculate emissions from road transport (COPERT) model, developed in the USA and Europe respectively, are the most frequently used emission factor models [2]. David D. Parrish evaluated the estimation of the mass of the emitted species, the temporal evolution of the annual average emissions over decadal scales, and the speciation of the VOC emissions [3]. Dane Westerdahl et al. characterized Beijing on-road emission factors and the impact of on-road transportation on air quality, to provide data and control measures in advance of the 2008 Olympics [4]. David R. Lyon et al. constructed a spatially resolved methane emission inventory for the 25-county Barnett Shale region, estimating emissions from oil and gas (O&G) and other sources using bottom-up approaches. These models are usually complex combinations of meteorology, road network geometry, geographical locations, traffic volumes and emission factors, based on a number of empirical assumptions and parameters that might not be applicable to all city regions [5]. These parameters are usually hard to obtain, so the results generated by these kinds of models may be inaccurate. Satellite remote sensing of surface air pollution, which can be regarded as a top-down method, has been studied intensively in past decades [6]. Donkelaar et al. [7] compared PM_2.5 inferred from the moderate resolution imaging spectroradiometer. Lamsal et al. [8] estimated surface NO₂ concentrations by applying local scaling factors from a global three-dimensional model. Zou et al. [9] employed a geographically weighted regression (GWR) model to predict urban PM_2.5 using high-resolution satellite aerosol optical depth. However, such approaches are strongly affected by clouds and are sensitive to other environmental factors, such as humidity, temperature, pressure and geographical location [7].

Although these methods have made much progress, estimating region vehicle emissions in the real world remains a challenging problem [10].

First, a city has too few remote sensing vehicle measurement stations, because such equipment is expensive to build and maintain. In addition, urban vehicle emissions vary non-linearly by location and depend on many complex external factors, such as road networks, meteorology, traffic, green land ratio and living functions. Moreover, the region emission distribution exhibits spatiotemporal dependencies. We summarize the challenges in region vehicle emission prediction as follows:

1) Data sparsity and spatial heterogeneity [11]. In Fig. 1, the blue grids denote the locations of the monitoring stations in Hefei. The vehicle emission monitoring stations in the city are very sparse, so it is hard to predict emissions for city regions from the limited remote sensing monitoring stations. Moreover, the spatial heterogeneity of the Emission-ExternalFactor relationship should be taken into account; in other words, the strength of the Emission-ExternalFactor correlation is not constant across space but changes with spatial context. Most existing studies [12] on spatial interpolation are based on geostatistical methods, which rely on the first law of geography: near things are more related than distant things [13]. In the region emission interpolation problem, these methods neglect the influence of road network topology and traffic volume: two regions that are far apart in geographic space but have similar road network topology and traffic volume may have similar emission distributions. Relying on the distance-only assumption may therefore cause unpredictable errors. This motivates us to propose an extended co-training geographical weighted regression model, which incorporates the road network topology and the traffic effects.

2) Spatiotemporal dependencies. On the spatial scale, the vehicle emission of region R2 (in Fig. 2) is affected by emission diffusion from nearby regions (R1 and R3). On the temporal scale, the region vehicle emission is influenced by recent, near and distant time intervals. For instance, traffic congestion occurring at 10 am will affect the conditions at 11 am. Moreover, traffic conditions in the morning may be similar on consecutive workdays, repeating every 24 h. The traditional air pollution method [14] treats the task as a single-point time series prediction problem and then performs spatial interpolation to obtain the overall region prediction. However, it does not take spatiotemporal dependence into account, that is, the fact that the emission of each grid is affected by its temporal and spatial neighbors. Without considering this dependence, we cannot achieve a precise region emission prediction. Therefore, we consider the spatiotemporal dependence in each prediction step with the spatiotemporal residual network.
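To make the three temporal scales concrete, the following minimal Python sketch selects the past frames that would feed a closeness, a period and a trend branch when predicting the frame at index t. It follows a convention that is common in the spatiotemporal residual network literature and is not taken from this paper: the function names, the default lengths, the hourly resolution (`steps_per_day=24`) and the weekly trend span (`days_per_trend=7`) are all assumptions made for illustration.

```python
def dependency_indices(t, len_closeness=3, len_period=2, len_trend=1,
                       steps_per_day=24, days_per_trend=7):
    """Return the past time indices used to predict the frame at index t.

    All default values are illustrative assumptions, not the paper's settings.
    """
    trend_step = steps_per_day * days_per_trend
    lookback = max(len_closeness,
                   len_period * steps_per_day,
                   len_trend * trend_step)
    if t < lookback:
        raise ValueError(f"need at least {lookback} past frames, got {t}")
    # Closeness: the most recent consecutive intervals.
    closeness = [t - i for i in range(len_closeness, 0, -1)]
    # Period: the same time of day on previous days.
    period = [t - i * steps_per_day for i in range(len_period, 0, -1)]
    # Trend: the same time of day on previous weeks.
    trend = [t - i * trend_step for i in range(len_trend, 0, -1)]
    return closeness, period, trend


def gather(frames, indices):
    """Pick the emission grids at the given time indices, oldest first."""
    return [frames[i] for i in indices]


# Hypothetical usage: frames[k] stands in for the emission grid at hour k.
frames = list(range(300))
closeness_idx, period_idx, trend_idx = dependency_indices(200)
closeness_frames = gather(frames, closeness_idx)
```

Each of the three lists would then be stacked along the channel axis and passed to its own convolutional branch; how the paper sets the lengths and how it merges the branches is not stated in this excerpt.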

These challenges inspire us to rethink the region vehicle emission prediction problem using deep learning models and the large amount of available spatiotemporal data [15]. To address the aforementioned issues, in this paper we predict region vehicle emissions with a data-driven method, using a variety of datasets, including remote sensing records, meteorological data, traffic data, road network data and POIs. Specifically, we present a semi-supervised learning approach with co-training geographical weighted regression (COGWR) to address the data sparsity and spatial heterogeneity, and formulate region emission prediction as a spatiotemporal sequence forecasting problem that can be solved by constructing a deep learning framework. To model the spatiotemporal relationships, we propose a deep spatiotemporal residual early-late fusion network (ResNetELF) to collectively predict vehicle emissions in every region. The effectiveness of the proposed method is demonstrated by comparison with several baseline methods on a real-world dataset. The main contributions of this paper are as follows:

1) To deal with the data sparsity and spatial heterogeneity of vehicle emission records, we propose a semi-supervised learning approach with co-training geographical weighted regression (COGWR), which leverages the available emission data combined with meteorology, traffic, POI and road network data to fill in the missing entries at unmeasured locations.

2) To capture the spatiotemporal dependencies of region emission, we propose an end-to-end deep neural network architecture (ResNetELF), which employs convolution layers to model spatial dependencies while extracting the temporal properties of vehicle emission as closeness, period and trend dependencies. Moreover, to analyze the impact of external factors, an AutoEncoder module and an early-late fusion strategy are proposed to integrate the extracted spatiotemporal features with multiple external factors (e.g., traffic, weather conditions, road network).

3) The proposed approach is evaluated on the vehicle emission remote sensing data of Hefei. The results demonstrate the superiority of our proposed approach against the state-of-the-art methods.
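As background for contribution 1, the sketch below illustrates plain geographically weighted regression (GWR), the building block that COGWR extends. It is not the authors' COGWR: it uses a single external factor, a Gaussian distance kernel and a closed-form weighted least-squares fit, and the function name `gwr_local_fit`, the `bandwidth` parameter and the toy station data are all assumptions made for illustration. The point it shows is spatial heterogeneity: the fitted emission-factor slope changes with the location at which the model is evaluated.

```python
import math


def gwr_local_fit(stations, target_xy, bandwidth):
    """Fit y = b0 + b1 * x by weighted least squares centred on target_xy.

    stations  : list of (px, py, x, y) tuples, where (px, py) is a station
                position, x an external factor and y the observed emission.
    target_xy : (px, py) of the location whose local coefficients we want.
    bandwidth : Gaussian kernel bandwidth, in the units of the positions.
    Returns the local coefficients (b0, b1).
    """
    tx, ty = target_xy
    # Gaussian distance-decay weights: nearby stations count more.
    weights = [
        math.exp(-0.5 * ((px - tx) ** 2 + (py - ty) ** 2) / bandwidth ** 2)
        for px, py, _, _ in stations
    ]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("target is too far from every station for this bandwidth")
    x_mean = sum(w * s[2] for w, s in zip(weights, stations)) / total
    y_mean = sum(w * s[3] for w, s in zip(weights, stations)) / total
    s_xx = sum(w * (s[2] - x_mean) ** 2 for w, s in zip(weights, stations))
    s_xy = sum(w * (s[2] - x_mean) * (s[3] - y_mean)
               for w, s in zip(weights, stations))
    if s_xx == 0.0:
        raise ValueError("external factor has no weighted variance near the target")
    b1 = s_xy / s_xx            # local slope
    b0 = y_mean - b1 * x_mean   # local intercept
    return b0, b1


# Made-up toy data: emission = 1 * traffic in the west, 3 * traffic in the east.
west = [(0.0, 0.0, 1.0, 1.0), (0.0, 1.0, 2.0, 2.0), (1.0, 0.0, 3.0, 3.0)]
east = [(10.0, 0.0, 1.0, 3.0), (10.0, 1.0, 2.0, 6.0), (11.0, 0.0, 3.0, 9.0)]
b0_west, b1_west = gwr_local_fit(west + east, (0.0, 0.0), bandwidth=1.0)
b0_east, b1_east = gwr_local_fit(west + east, (10.0, 0.0), bandwidth=1.0)
```

With a small bandwidth, each fit is dominated by the stations around its target, so the two targets recover different slopes. A very large bandwidth weights all stations almost equally and collapses to one global regression. Because the kernel here depends on geographic distance only, it has exactly the limitation discussed under challenge 1: it cannot treat two distant regions with similar road networks and traffic as similar.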

The rest of this paper is organized as follows. Related work is summarized in Section 2. The system overview is presented in Section 3. The proposed COGWR algorithm and the deep spatiotemporal residual early-late fusion network are described in Section 4. Section 5 gives the details of the experiment settings and the related experiment results. Finally, we conclude the paper in Section 6.

Section snippets

Related work

In this section, we briefly review some representative research on sparse data analysis, spatial data interpolation, spatiotemporal sequence forecasting and feature fusion.

Overview

Firstly, we introduce some definitions in the vehicle emission region prediction problem, and then present the framework of our approach. Finally, we clarify the formulation of the vehicle emission region prediction problem.

Methodology

We propose the COGWR ResNetELF prediction model based on the framework of co-training geographical weighted regression and the deep spatiotemporal residual early-late fusion network, as shown in Fig. 6. For problem 1, we propose a co-training geographical weighted regression (COGWR) model, which leverages unlabeled data to fill the missing entries and improve the accuracy of prediction. For problem 2, we convert the emission pollutants in the target region at each time interval into

Data and setup

The proposed method is implemented on a personal computer, whose detailed configuration is shown in Table 2. The Python libraries Keras [38] and Theano [39] are used to build our model.

We utilize the following four real datasets for evaluation, which are detailed in Table 3.

(1) Meteorological data: We collect fine-grained meteorological data, consisting of weather, temperature, humidity, barometric pressure and wind strength, from a public website every hour.

(2) Remote sensing records:

Conclusion

In this paper, we introduce a semi-supervised learning approach with a co-training geographical weighted regression model (COGWR), which aims to construct the historical emission observations from the insufficient station records. We then formulate region emission prediction as a spatiotemporal sequence forecasting problem and propose a deep spatiotemporal residual early-late fusion network, based on the unique properties of spatiotemporal data, to predict vehicle emissions in each region of

Conflict of interest

There are no conflicts of interest.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China (61725304, 61673361, 61472380 and 61872327), as well as the Fundamental Research Funds for the Central Universities under Grant WK2380000001.

Zhenyi Xu was born in 1993. He received the B.S. degree in automation from the Nanjing Institute of Technology, Nanjing, China, in 2015, and is currently pursuing the Ph.D. degree in Control Science and Engineering in the Department of Automation at the University of Science and Technology of China. His research interests include deep learning, urban computing, intelligent transportation, machine learning and data mining.

References (43)

P. Pérez et al., Prediction of PM2.5 concentrations several hours in advance using neural networks in Santiago, Chile, Atmos. Environ. (2000)

R. Smit et al., Validation of road vehicle and traffic emission models - a review and meta-analysis, Atmos. Environ. (2010)

D.D. Parrish, Critical evaluation of US on-road vehicle emission inventories, Atmos. Environ. (2006)

D. Westerdahl et al., Characterization of on-road vehicle emission factors and microenvironmental air quality in Beijing, China, Atmos. Environ. (2009)

S. Vardoulakis et al., Modelling air quality in street canyons: a review, Atmos. Environ. (2003)

R.V. Martin, Satellite remote sensing of surface air quality, Atmos. Environ. (2008)

J. Zhang et al., Spatial heterogeneity of urban residential carbon emissions in China (2013)

J. Li et al., A review of comparative studies of spatial interpolation methods in environmental sciences: performance and impact factors, Ecol. Inf. (2011)

L. Contreras et al., Wind-sensitive interpolation of urban air pollution forecasts, Procedia Comput. Sci. (2016)

A. Van Donkelaar et al., Estimating ground-level PM2.5 using aerosol optical depth determined from satellite remote sensing, J. Geophys. Res. (2006)

L. Lamsal et al., Ground-level nitrogen dioxide concentrations inferred from the satellite-borne ozone monitoring instrument, J. Geophys. Res.: Atmos. (2008)

B. Zou et al., High-resolution satellite mapping of fine particulates based on geographically weighted regression, IEEE Geosci. Remote Sens. Lett. (2016)

D.R. Lyon et al., Constructing a spatially resolved methane emission inventory for the Barnett Shale region, Environ. Sci. Technol. (2015)

H.J. Miller, Tobler’s first law and spatial analysis, Ann. Assoc. Am. Geogr. (2004)

X. Shi, D.-Y. Yeung, Machine learning for spatiotemporal sequence forecasting: a survey, arXiv:1808.06865...

X. Luo et al., Incorporation of efficient second-order solvers into latent factor models for accurate prediction of missing QoS data, IEEE Trans. Cybern. (2018)

X. Luo et al., An inherently nonnegative latent factor model for high-dimensional and sparse matrices from industrial applications, IEEE Trans. Ind. Inform. (2018)

X. Luo et al., Generating highly accurate predictions for missing QoS data via aggregating nonnegative latent factor models, IEEE Trans. Neural Netw. Learn. Syst. (2016)

X. Luo et al., An effective scheme for QoS estimation via alternating direction method-based matrix factorization, IEEE Trans. Serv. Comput. (2016)

X. Luo et al., Non-negativity constrained missing data estimation for high-dimensional and sparse matrices from industrial applications, IEEE Trans. Cybern. (2019)

X. Luo et al., A nonnegative latent factor model for large-scale sparse matrices in recommender systems via alternating direction method, IEEE Trans. Neural Netw. Learn. Syst. (2016)

Yang Cao (M’-) was born in 1980. He received the B.S. degree and the Ph.D. degree in information engineering from Northeastern University, Shenyang, China, in 1999 and 2004, respectively. Since 2004, he has been with the Department of Automation, University of Science and Technology of China, Hefei, China, where he is currently an Associate Professor. His current research interests include machine learning and computer vision. Dr. Cao is a member of the IEEE Signal Processing Society.

Yu Kang (M’09-SM’-) received the Dr. Eng. degree in control theory and control engineering from the University of Science and Technology of China, Hefei, China, in 2005. From 2005 to 2007, he was a Post-Doctoral Fellow with the Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China. He is currently a Professor with the Department of Automation, University of Science and Technology of China. His current research interests include adaptive/robust control, variable structure control, mobile manipulator, and Markovian jump systems.
