A distributed EEMDN-SABiGRU model on Spark for passenger hotspot prediction

Passenger hotspot prediction with a Spark-based distributed EEMDN-SABiGRU model

Frontiers of Information Technology & Electronic Engineering

Abstract

To address the imbalance between taxi supply and passenger demand, this paper proposes a distributed model on Spark for accurate passenger hotspot prediction. The model, EEMDN-SABiGRU, combines ensemble empirical mode decomposition with normalization (EEMDN) and a spatial attention mechanism based bi-directional gated recurrent unit (SABiGRU). It aims to reduce blind cruising costs, improve carrying efficiency, and maximize income. Specifically, the EEMDN method is put forward to process the passenger hotspot data in each grid. It addresses non-stationary sequences and the degradation of prediction accuracy caused by excessive numerical differences, while avoiding the mode mixing that affects the intrinsic mode functions (IMFs) of EMD. Next, a spatial attention mechanism is constructed to capture the characteristics of passenger hotspots in each grid, taking passenger boarding and alighting hotspots as weights and emphasizing the spatial regularity of passengers in the grid. Furthermore, a bi-directional GRU is incorporated because a standard GRU obtains only forward information and ignores backward information; using both directions improves the accuracy of feature extraction. Finally, accurate prediction of passenger hotspots is achieved with the EEMDN-SABiGRU model using real-world taxi GPS trajectory data in the Spark parallel computing framework. The experimental results demonstrate that, on the four datasets in the 00-grid, compared with LSTM, EMD-LSTM, EEMD-LSTM, GRU, EMD-GRU, EEMD-GRU, EMDN-GRU, CNN, and BP, the mean absolute percentage error (MAPE), mean absolute error (MAE), root mean square error (RMSE), and maximum error (ME) values of EEMDN-SABiGRU decrease by at least 43.18%, 44.91%, 55.04%, and 39.33%, respectively.
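For readers unfamiliar with the four error measures named in the abstract, the following is a minimal sketch of how they are commonly computed and how a percentage decrease relative to a baseline can be derived. It uses the standard textbook definitions, which are assumed here and may differ in detail from those in the full paper. The function names error_metrics and relative_decrease and the sample values are hypothetical and are not taken from the article.

```python
import math


def error_metrics(actual, predicted):
    """Return MAPE (%), MAE, RMSE, and maximum error for two equal-length series.

    Standard definitions are assumed; MAPE requires all actual values to be non-zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    n = len(actual)
    return {
        "mape": 100.0 * sum(e / abs(a) for e, a in zip(abs_errors, actual)) / n,
        "mae": sum(abs_errors) / n,
        "rmse": math.sqrt(sum(e * e for e in abs_errors) / n),
        "me": max(abs_errors),
    }


def relative_decrease(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


# Hypothetical hotspot counts for one grid cell (illustrative only).
actual = [100, 200, 400]
predicted = [110, 190, 380]
metrics = error_metrics(actual, predicted)
```

A statement such as "the MAE decreases by at least 44.91%" then corresponds to relative_decrease(baseline_mae, proposed_mae) being at least 44.91 for every baseline model compared.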

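Abstract (Chinese version, translated)

To address the imbalance between taxi supply and passenger demand, this paper proposes a Spark-based distributed model that combines normalized ensemble empirical mode decomposition with a spatial attention mechanism based bi-directional gated recurrent unit (EEMDN-SABiGRU) for accurate passenger hotspot prediction, aiming to reduce blind cruising costs, improve carrying efficiency, and maximize income. First, a normalized ensemble empirical mode decomposition method (EEMDN) is proposed to process the passenger hotspot data in each grid; it addresses non-stationary sequences and the loss of prediction accuracy caused by excessive numerical differences, and avoids the mode mixing found in the intrinsic mode functions (IMFs) of EMD. Second, a spatial attention mechanism based on the weights of passenger boarding and alighting hotspots and on the spatial regularity of passengers is constructed to capture the passenger hotspot features in each grid. Third, a bi-directional gated recurrent unit (GRU) algorithm is incorporated to solve the problem that GRU obtains only forward information and ignores backward information, improving the accuracy of feature extraction. Finally, accurate passenger hotspot prediction is achieved with the EEMDN-SABiGRU model on real taxi GPS trajectory data under the Spark parallel computing framework. The experimental results show that, on the four datasets in the 00-grid, compared with LSTM, EMD-LSTM, EEMD-LSTM, GRU, EMD-GRU, EEMD-GRU, EMDN-GRU, CNN, and BP, the mean absolute percentage error, mean absolute error, root mean square error, and maximum error of EEMDN-SABiGRU decrease by 43.18%, 44.91%, 55.04%, and 39.33%, respectively.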
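The abstract attributes part of the accuracy gain to normalization, which counters excessive numerical differences in the hotspot counts. It does not state which normalization is used, so the sketch below assumes plain min-max scaling to the range [0, 1] purely for illustration. The function names min_max_normalize and denormalize and the sample counts are hypothetical.

```python
def min_max_normalize(series):
    """Scale a series to [0, 1]; return the scaled values and the (lo, hi) bounds.

    Min-max scaling is an assumption here, not necessarily the paper's method.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries no spread; map every value to 0.0.
        return [0.0 for _ in series], lo, hi
    return [(x - lo) / (hi - lo) for x in series], lo, hi


def denormalize(scaled, lo, hi):
    """Map scaled predictions back to the original count range."""
    return [x * (hi - lo) + lo for x in scaled]


# Hypothetical pickup counts for one grid cell over four time slots.
counts = [5, 20, 80, 35]
scaled, lo, hi = min_max_normalize(counts)
restored = denormalize(scaled, lo, hi)
```

Keeping the bounds alongside the scaled values matters in practice, because model outputs have to be mapped back to counts before error measures such as MAE are reported.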

Data availability

The data that support the findings of this study are available from the corresponding authors upon reasonable request.

Author information

Contributions

Dawen XIA and Jian GENG designed the research. Dawen XIA, Jian GENG, and Huaqing LI proposed the approaches and performed the experiments. Ruixi HUANG, Bingqi SHEN, and Yang HU processed the data. Dawen XIA, Jian GENG, and Huaqing LI drafted the paper. Dawen XIA, Jian GENG, Yang HU, Yantao LI, and Huaqing LI revised and finalized the paper.

Corresponding authors

Correspondence to Dawen Xia (夏大文) or Huaqing Li (李华青).

Ethics declarations

Dawen XIA, Jian GENG, Ruixi HUANG, Bingqi SHEN, Yang HU, Yantao LI, and Huaqing LI declare that they have no conflict of interest.

Additional information

Project supported by the National Natural Science Foundation of China (Nos. 62162012, 62173278, and 62072061), the Science and Technology Support Program of Guizhou Province, China (No. QKHZC2021YB531), the Natural Science Research Project of Department of Education of Guizhou Province, China (Nos. QJJ2022015 and QJJ2022047), the Science and Technology Foundation of Guizhou Province, China (Nos. QKHJCZK2022YB195, QKHJCZK2022YB197, and QKHJCZK2023YB143), the Scientific Research Platform Project of Guizhou Minzu University, China (No. GZMUSYS202104), and the 7th Batch High-Level Innovative Talent Project of Guizhou Province, China

List of electronic supplementary materials

Fig. S1 Comparisons of models using the 1-day dataset

Fig. S2 Comparisons of models using the 10-day dataset

Fig. S3 Comparisons of models using the 20-day dataset

Fig. S4 Comparisons of models using the 30-day dataset

About this article

Cite this article

Xia, D., Geng, J., Huang, R. et al. A distributed EEMDN-SABiGRU model on Spark for passenger hotspot prediction. Front Inform Technol Electron Eng 24, 1316–1331 (2023). https://doi.org/10.1631/FITEE.2200621

  • DOI: https://doi.org/10.1631/FITEE.2200621
