Transit smart card data mining for passenger origin information extraction

Journal of Zhejiang University SCIENCE C

Abstract

The automated fare collection (AFC) system, also known as the transit smart card (SC) system, has become increasingly popular among transit agencies worldwide. Compared with conventional manual fare collection, an AFC system offers inherent advantages in low labor cost and in efficient fare collection and transaction data archival. Although highly valuable data can be collected from transit SC transactions, extracting them requires substantial effort and dedicated methodologies, because most AFC systems were not originally designed for data collection. This is especially true of the Beijing AFC system, where a passenger's boarding stop (origin) on a flat-rate bus is not recorded at the check-in scan. To extract passengers' origin data from recorded SC transaction information, a Markov chain based Bayesian decision tree algorithm is developed in this study. Using the time invariance property of the Markov chain, the algorithm is further optimized and simplified to have linear computational complexity. The algorithm is verified against data from transit vehicles equipped with global positioning system (GPS) data loggers. The verification results demonstrate that the proposed algorithm extracts transit passengers' origin information from SC transactions with relatively high accuracy. Such transit origin data are highly valuable for transit system planning and route optimization.
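
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that card taps on one bus run can be clustered by time gap into boarding events, that the number of stops advanced between events follows a fixed (time-invariant) prior, and that the gap between events is Gaussian around a constant per-link travel time. The names `cluster_taps`, `infer_boarding_stops`, `skip_prior`, `mean_link_time`, and `sigma` are hypothetical. Because the same prior is reused at every step and each decision depends only on the previous one, a single pass over the events suffices, which gives linear running time.

```python
import math


def cluster_taps(tap_times, max_gap=30.0):
    """Group tap timestamps (seconds) into boarding events.

    Assumption: taps closer than max_gap seconds belong to the same stop.
    """
    clusters = []
    for t in sorted(tap_times):
        if clusters and t - clusters[-1][-1] <= max_gap:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    return clusters


def infer_boarding_stops(clusters, skip_prior, mean_link_time, sigma, first_stop=0):
    """Assign a stop index to each boarding event in one forward pass.

    skip_prior maps "stops advanced since the previous event" to a prior
    probability; it is the same at every step (time-invariant chain).
    The likelihood of the observed gap is Gaussian around
    k * mean_link_time (a simplifying assumption of this sketch).
    """
    if not clusters:
        return []
    stops = [first_stop]
    for prev, cur in zip(clusters, clusters[1:]):
        gap = cur[0] - prev[-1]  # last tap at previous stop to first tap here
        best_k, best_score = None, -math.inf
        for k, prior in skip_prior.items():
            log_like = -0.5 * ((gap - k * mean_link_time) / sigma) ** 2
            score = math.log(prior) + log_like  # unnormalized log posterior
            if score > best_score:
                best_k, best_score = k, score
        stops.append(stops[-1] + best_k)
    return stops


taps = [0, 5, 10, 130, 135, 370, 375]
clusters = cluster_taps(taps)
stops = infer_boarding_stops(clusters, {1: 0.7, 2: 0.25, 3: 0.05}, 120.0, 30.0)
print(stops)  # [0, 1, 3]
```

In this toy run the second gap (235 s) is close to two link times, so the sketch decides the bus passed one stop without any boarding. In practice the prior and travel-time parameters would have to be estimated from data such as the GPS logs mentioned above.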

Corresponding author

Correspondence to Yin-hai Wang.

Additional information

Project supported by the National Natural Science Foundation of China (No. 51138003) and the Beijing Transportation Research Center (BTRC), China.

Cite this article

Ma, Xl., Wang, Yh., Chen, F. et al. Transit smart card data mining for passenger origin information extraction. J. Zhejiang Univ. - Sci. C 13, 750–760 (2012). https://doi.org/10.1631/jzus.C12a0049

  • DOI: https://doi.org/10.1631/jzus.C12a0049
