Estimation of Bus Passenger Attributes Using Swin Transformer

Authors:

Soichiro Yokoyama,

Tomohisa Yamashita,

Hidenori Kawamura,

Miyuki Hirasawa

AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition

Pages 121 - 128

https://doi.org/10.1145/3573942.3573961

Published: 16 May 2023

Abstract

Human attribute recognition is an emerging research topic in computer vision and pattern recognition, closely related to person re-identification. Existing research has focused mainly on video surveillance and security applications. This paper proposes and evaluates an algorithm for bus passenger attribute recognition that uses Swin Transformer as its backbone. To evaluate the performance and effectiveness of the algorithm, a real bus passenger dataset with thousands of images was collected by digital cameras installed on a route bus in Sapporo. Experimental results on single passenger images show that the proposed algorithm achieves high accuracy in most attribute categories, which supports the feasibility of a practical application. For the attributes with lower accuracy, the experiments reveal that the algorithm struggles with passenger images affected by occlusion, mosaic cover, and viewpoint change. The results also highlight the possibility of improving the performance and robustness of attribute recognition by using continuous passenger images for inference.
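The abstract does not say how continuous passenger images would be combined at inference time. As a minimal sketch of one plausible approach, the Python below averages per-frame attribute probabilities over a single passenger track and thresholds the mean. The function name `aggregate_track`, the attribute names, the probability values, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from statistics import fmean


def aggregate_track(frame_probs, threshold=0.5):
    """Average per-attribute probabilities over the frames of one track.

    frame_probs: list of dicts, one per frame, mapping an attribute name
    to a probability in [0, 1]. An attribute missing from a frame is
    simply skipped for that frame.
    Returns (mean probability per attribute, boolean decision per attribute).
    """
    collected = {}
    for probs in frame_probs:
        for name, p in probs.items():
            collected.setdefault(name, []).append(p)
    mean_probs = {name: fmean(values) for name, values in collected.items()}
    decisions = {name: p >= threshold for name, p in mean_probs.items()}
    return mean_probs, decisions


# Hypothetical per-frame outputs of a single-image attribute classifier.
track = [
    {"backpack": 0.9, "hat": 0.2},
    {"backpack": 0.3, "hat": 0.1},  # e.g. a partially occluded frame
    {"backpack": 0.8, "hat": 0.3},
]
mean_probs, decisions = aggregate_track(track)
```

In this illustration a single low-confidence frame does not flip the track-level decision, which is the kind of robustness gain the abstract suggests; other aggregation rules (majority vote, confidence weighting) would be equally valid assumptions.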

Index Terms

Estimation of Bus Passenger Attributes Using Swin Transformer
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Activity recognition and understanding
2. General and reference
  1. Cross-computing tools and techniques
    1. Empirical studies

Information & Contributors

Information

Published In

AIPR '22: Proceedings of the 2022 5th International Conference on Artificial Intelligence and Pattern Recognition

September 2022

1221 pages

ISBN:9781450396899

DOI:10.1145/3573942

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

AIPR 2022

AIPR 2022: 2022 5th International Conference on Artificial Intelligence and Pattern Recognition

September 23 - 25, 2022

Xiamen, China
