DOI: 10.1145/3613905.3650815
Work in Progress

Experiential Views: Towards Human Experience Evaluation of Designed Spaces using Vision-Language Models

Published: 11 May 2024

Abstract

Experiential Views is a proof-of-concept in which we explore a method of helping architects and designers predict how building occupants might experience their designed spaces using AI technology based on Vision-Language Models. Our prototype evaluates a space using a pre-trained model that we fine-tuned with photos and renders of a building. These images were evaluated and labeled based on a preliminary set of three human-centric dimensions that characterize the Social, Tranquil, and Inspirational qualities of a scene. We developed a floor plan visualization and a WebGL-based 3D-viewer that demonstrate how architectural design software could be enhanced to evaluate areas of a built environment based on psychological or emotional criteria. We see this as an early step towards helping designers anticipate emotional responses to their designs to create better experiences for occupants.
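The prototype fine-tunes a pre-trained vision-language model on labeled photos and renders of a building and then scores views of the space along the Social, Tranquil, and Inspirational dimensions. As a rough illustration of the underlying idea only, and not the authors' fine-tuned model, the sketch below scores a single render against the three dimensions using an off-the-shelf CLIP checkpoint in zero-shot mode; the model name, prompt wording, input file name, and softmax scoring are illustrative assumptions.

```python
# Minimal zero-shot sketch (assumption: an off-the-shelf CLIP checkpoint,
# illustrative prompts, and a hypothetical render file).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# One illustrative text prompt per experiential dimension from the paper.
dimensions = {
    "Social":        "a space that encourages social interaction",
    "Tranquil":      "a calm, tranquil space",
    "Inspirational": "an inspiring, uplifting space",
}

image = Image.open("render.png")  # hypothetical render of a designed space
inputs = processor(text=list(dimensions.values()), images=image,
                   return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)

# Softmax over image-text similarities gives a relative score per dimension.
scores = outputs.logits_per_image.softmax(dim=-1).squeeze(0)
for name, score in zip(dimensions, scores):
    print(f"{name}: {score.item():.2f}")
```

In the paper, scores like these are computed from a model fine-tuned on labeled images of the building rather than zero-shot prompts, and are aggregated across viewpoints to drive the floor plan visualization and WebGL 3D viewer.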

Supplemental Material

MP4 File: Video Preview (with transcript)


Cited By

  • (2024) Toward Facilitating Search in VR With the Assistance of Vision Large Language Models. Proceedings of the 30th ACM Symposium on Virtual Reality Software and Technology, 1–14. https://doi.org/10.1145/3641825.3687742. Online publication date: 9 Oct 2024.

        Published In

        CHI EA '24: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems
        May 2024
        4761 pages
        ISBN: 9798400703317
        DOI: 10.1145/3613905
        Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 11 May 2024


        Author Tags

        1. architectural design
        2. human-centric building design
        3. vision-language models

        Qualifiers

        • Work in progress
        • Research
        • Refereed limited

        Conference

        CHI '24

        Acceptance Rates

        Overall Acceptance Rate 6,164 of 23,696 submissions, 26%


