
Transforming Large-Scale Participation Data Through Topic Modelling in Urban Design Processes

  • Conference paper
Computer-Aided Architectural Design. INTERCONNECTIONS: Co-computing Beyond Boundaries (CAAD Futures 2023)

Abstract

Advances in digital tools and data collection methods ensure the continuing growth of textual data obtained through large-scale participation processes in urban contexts. To extract the thematic content of such underutilized textual datasets, topic modeling (TM) and content analysis have been deployed as promising AI-based Natural Language Processing (NLP) techniques. Yet these techniques remain underexploited in urban design domains because of the complexity of textual datasets and the lack of a systematic evaluation framework. In this paper, we addressed the challenges in the utilization of large textual data by using a real-world dataset collected via a digital participation platform in Madrid, Spain. Firstly, we identified prominent data structures and potential information embedded in the dataset by using a document-oriented NoSQL database. In this step, we systematically discussed the data pre-processing steps needed to convert the raw data into a series of structured data collections. Secondly, we evaluated three different TM algorithms, i.e. LDA, LSI, and HDP, according to a number of hyperparameters controlling the learning process. This step aimed to reveal the number of topics required to extract meaningful content through the algorithms. Lastly, we presented possible textual data visualization techniques to enable the use of textual information in digital participation processes. Consequently, this paper facilitates the use of large textual datasets by investigating data structures and processing, revealing the potential of different TM algorithms, and eventually analyzing the results with the support of urban big data analytics and computational linguistic techniques for informed urban design processes.
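The abstract does not state which score guided the comparison of topic models or the choice of the number of topics. As a minimal sketch, assuming a UMass-style coherence score (an assumption for illustration, not the authors' stated method), the example below scores two hand-written candidate topics against a tiny invented corpus of tokenized proposals. The function name `umass_coherence`, the smoothing constant `eps`, and all of the toy data are hypothetical.

```python
import math


def umass_coherence(top_words, documents, eps=1.0):
    """Average UMass-style coherence of one topic.

    For every ordered pair (w_i, w_j) with j < i in the topic's top-word
    list, add log((D(w_i, w_j) + eps) / D(w_j)), where D counts the
    documents containing all of the given words. Pairs whose conditioning
    word w_j never occurs in the corpus are skipped.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for i in range(1, len(top_words)):
        for j in range(i):
            denom = doc_freq(top_words[j])
            if denom == 0:
                continue
            joint = doc_freq(top_words[i], top_words[j])
            scores.append(math.log((joint + eps) / denom))
    return sum(scores) / len(scores) if scores else float("nan")


# Toy corpus (invented): each document is a tokenized citizen proposal.
docs = [
    ["bike", "lane", "street", "safety"],
    ["bike", "lane", "traffic"],
    ["park", "tree", "green", "bench"],
    ["park", "tree", "playground"],
    ["street", "traffic", "safety"],
]

# Two hypothetical topics, as a topic model might report them.
coherent_topic = ["bike", "lane", "traffic"]  # words that co-occur
mixed_topic = ["bike", "tree", "bench"]       # words that rarely co-occur

coherent_score = umass_coherence(coherent_topic, docs)
mixed_score = umass_coherence(mixed_topic, docs)
print(f"coherent topic: {coherent_score:.3f}")
print(f"mixed topic:    {mixed_score:.3f}")
```

Higher (less negative) scores indicate top words that tend to appear in the same documents. In a model comparison, such a score would typically be averaged over all topics of each fitted model and plotted against the number of topics; fitting the LDA, LSI, or HDP models themselves is outside the scope of this sketch.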

Acknowledgment

This research is supported by the “Designing mobile-friendly cartograms for visualising geospatial data” grant from the Ministry of Education, Singapore, under its Academic Research Fund Tier 2 programme (award number MOE-T2EP20221-0007), and by the Singapore International Graduate Award (SINGA).

Author information

Corresponding author

Correspondence to Cem Ataman.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ataman, C., Tunçer, B., Perrault, S. (2023). Transforming Large-Scale Participation Data Through Topic Modelling in Urban Design Processes. In: Turrin, M., Andriotis, C., Rafiee, A. (eds) Computer-Aided Architectural Design. INTERCONNECTIONS: Co-computing Beyond Boundaries. CAAD Futures 2023. Communications in Computer and Information Science, vol 1819. Springer, Cham. https://doi.org/10.1007/978-3-031-37189-9_18

  • DOI: https://doi.org/10.1007/978-3-031-37189-9_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-37188-2

  • Online ISBN: 978-3-031-37189-9

  • eBook Packages: Computer Science, Computer Science (R0)
