Abstract
Recent advances in computer vision and Convolutional Neural Networks (CNNs) have facilitated the use of street view imagery (SVI) for the automatic assessment of physical and perceived attributes of walkable areas. However, these methods still overlook the broader urban context and fail to capture and communicate to the user the qualitative factors influencing the assessed walkability score. This paper addresses these challenges by leveraging a Multimodal Large Language Model (MLLM) to provide a holistic assessment of walkability, consisting of both quantitative scores and linguistic qualitative insights. This approach offers a more comprehensive understanding of the factors contributing to the walkability score attributed to the image and enhances the interpretability and practical applicability of the assessments for urban planners and policymakers. Preliminary experiments demonstrate that an MLLM-based methodology can effectively capture a diverse range of walkability factors, suggesting a promising direction for the future development of evaluation tools aimed at supporting the design of pedestrian-friendly environments.
Acknowledgments
This study was carried out within the MOST-Sustainable Mobility National Research Centre and received funding from the European Union Next-GenerationEU (PNRR, Missione 4, Componente 2, Investimento 1.4, D.D. 1033 17 June 2022, CN00000023). The research was also supported by funding from the Collaboration agreement between Autonomous Region of Sardinia (CRP) and Universities of Cagliari (DICAAR) and Sassari (DADU) on “Support to local authorities and the regional administration for the construction and implementation of planning, management and monitoring models on transport systems and services for sustainable mobility” related to Priority 4 of the PR FESR 2021-2027 and to the National Strategy for the Internal Areas 2021-2027 (RAS Convenzione n. 87 Prot. n. 10469 23.12.2022).
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Blečić, I., Saiu, V., Trunfio, G.A. (2024). Enhancing Urban Walkability Assessment with Multimodal Large Language Models. In: Gervasi, O., Murgante, B., Garau, C., Taniar, D., Rocha, A.M.A.C., Faginas Lago, M.N. (eds) Computational Science and Its Applications – ICCSA 2024 Workshops. ICCSA 2024. Lecture Notes in Computer Science, vol 14819. Springer, Cham. https://doi.org/10.1007/978-3-031-65282-0_26
DOI: https://doi.org/10.1007/978-3-031-65282-0_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-65281-3
Online ISBN: 978-3-031-65282-0
eBook Packages: Computer Science (R0)