ETL Processing in Business Intelligence Projects for Public Transportation Systems

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1151)

Abstract

Defining a business intelligence project for a transportation system with more than 10,000 users per day can be a challenging problem. Such a system generates more than 400 million records per month when each user is monitored every minute. For this reason, the ETL process needs specific strategies to handle the data that large transportation systems generate. This paper explores different operational database (OD) architectures and analyzes their impact on the processing time of the ETL stage of a business intelligence project. The architectures reviewed are a single centralized OD, a single logically centralized OD, and a distributed OD. The model is being tested with the transportation system of the city of Poza Rica, Mexico. This system contains more than three million simulated records per day, and the entire ETL process completes in under 136 s. The model runs on a computer with a quad-core Intel Xeon processor and 8 GB of RAM, running OS X Yosemite 10.10.5.
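
The monthly volume and the ETL throughput implied by the abstract's figures can be checked with a short calculation. The sketch below is illustrative only. It assumes a 30-day month, exactly one record per user per minute, and monitoring around the clock. None of these assumptions is stated in the abstract, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# Assumptions (not from the paper): 30-day month, one record per user
# per minute, monitoring 24 hours a day.

MINUTES_PER_DAY = 24 * 60  # 1,440 samples per user per day


def records_per_month(users_per_day: int, days: int = 30) -> int:
    """Records generated when every user is sampled once per minute."""
    return users_per_day * MINUTES_PER_DAY * days


def throughput(records: int, seconds: float) -> float:
    """Average number of records processed per second."""
    return records / seconds


monthly = records_per_month(10_000)   # 10,000 users per day
rate = throughput(3_000_000, 136)     # three million records in 136 s

print(f"{monthly:,} records per month")    # 432,000,000 records per month
print(f"{rate:,.0f} records per second")   # 22,059 records per second
```

Under these assumptions, 10,000 users yield 432 million records per month, consistent with the "more than 400 million" figure. Because the abstract reports more than three million records processed in under 136 s, the rate of roughly 22,000 records per second is a lower bound.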

Acknowledgements

This research is sponsored in part by the Mexican Agency for International Development Cooperation (AMEXCID) and the Uruguayan Agency for International Cooperation (AUCI) through the Joint Uruguay-Mexico Cooperation Fund. This research work reflects only the points of view of the authors and not those of the AMEXCID or the AUCI.

Author information

Corresponding author

Correspondence to Alfredo Cristobal-Salas.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Cristobal-Salas, A. et al. (2019). ETL Processing in Business Intelligence Projects for Public Transportation Systems. In: Torres, M., Klapp, J. (eds) Supercomputing. ISUM 2019. Communications in Computer and Information Science, vol 1151. Springer, Cham. https://doi.org/10.1007/978-3-030-38043-4_4

  • DOI: https://doi.org/10.1007/978-3-030-38043-4_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-38042-7

  • Online ISBN: 978-3-030-38043-4

  • eBook Packages: Computer Science, Computer Science (R0)
