
The City as a Personal Assistant: Turning Urban Landmarks into Conversational Agents for Serving Hyper Local Information

Published: 07 July 2022


Conversational agents are increasingly becoming digital partners in our everyday computational experiences. Although rich and fresh in content, they are oblivious to users' locality beyond geospatial weather and traffic conditions. We introduce Lingo, a hyper-local conversational agent embedded deeply into the urban infrastructure that provides rich, purposeful, detailed, and in some cases playful information relevant to a neighbourhood. Drawing lessons from a mixed-method contextual study (online survey, n = 1992, and semi-structured interviews, n = 21), we identify requirements for such a hyper-local conversational agent and a sample set of questions serving urban neighbourhoods of Belgium. Our agent design is realised as a two-part system. First, a multi-modal reasoning engine serves as a hyper-local information source using automated machine-learning models operating on camera, microphone, and environmental sensor data. Second, a smart conversational speaker and a smartphone application serve as hyper-local information access points. Finally, we introduce a covert communication mechanism over Wi-Fi management frames that bridges the two parts of the Lingo system and enables privacy-preserving proxemic interactions. We describe the design, implementation, and technical assessment of Lingo, together with usability (n = 20) and real-world deployment (n = 5) studies. We reflect on information quality, accessibility benefits, and interaction dynamics, and demonstrate the efficacy of Lingo in offering hyper-local information at the finest granularity in urban neighbourhoods while reducing access time by up to a factor of 25.
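
The abstract does not say which management frames or which encoding Lingo uses for its covert channel. As a rough illustration of how a short hyper-local message could travel in a Wi-Fi management frame without association, the Python sketch below packs a payload into an IEEE 802.11 vendor-specific information element (element ID 221, a one-byte length, then a 3-byte OUI followed by data) and recovers it from a buffer of elements. The OUI value, the function names, and the example message are hypothetical and are not taken from the paper.

```python
import struct
from typing import List

VENDOR_SPECIFIC_ID = 221  # element ID of vendor-specific elements in IEEE 802.11
HYPOTHETICAL_OUI = bytes([0x00, 0x11, 0x22])  # placeholder OUI, an assumption


def encode_element(payload: bytes, oui: bytes = HYPOTHETICAL_OUI) -> bytes:
    """Wrap a payload in a vendor-specific element: ID, length, OUI, data."""
    body = oui + payload
    if len(body) > 255:  # the length field is a single byte
        raise ValueError("element body exceeds 255 bytes")
    return struct.pack("BB", VENDOR_SPECIFIC_ID, len(body)) + body


def decode_payloads(elements: bytes, oui: bytes = HYPOTHETICAL_OUI) -> List[bytes]:
    """Walk a run of ID/length/value elements and collect matching payloads."""
    payloads = []
    offset = 0
    while offset + 2 <= len(elements):
        element_id = elements[offset]
        length = elements[offset + 1]
        body = elements[offset + 2 : offset + 2 + length]
        if len(body) < length:
            break  # truncated element, stop parsing
        if element_id == VENDOR_SPECIFIC_ID and body[:3] == oui:
            payloads.append(body[3:])
        offset += 2 + length
    return payloads


# Example: an SSID element (ID 0) followed by a hypothetical Lingo message.
ssid_element = bytes([0, 5]) + b"lingo"
frame_body = ssid_element + encode_element(b"crowd=low")
print(decode_payloads(frame_body))  # [b'crowd=low']
```

A receiver that only listens for such elements never has to associate with the access point, which is one plausible reading of how proximity-bound, privacy-preserving delivery could work. The actual Lingo mechanism may differ.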

Supplemental Material

ZIP File - acer
Supplemental movie, appendix, image, and software files for "The City as a Personal Assistant: Turning Urban Landmarks into Conversational Agents for Serving Hyper Local Information"








Published In

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 6, Issue 2
June 2022
1551 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 July 2022
Published in IMWUT Volume 6, Issue 2



Author Tags

  1. Citizen Engagement
  2. Conversational Agent
  3. Edge AI
  4. Spontaneous Interaction


  • Research-article
  • Research
  • Refereed



Bibliometrics & Citations


Article Metrics

  • Downloads (Last 12 months): 72
  • Downloads (Last 6 weeks): 7
Reflects downloads up to 02 Mar 2025



Cited By

  • (2024) On Multimodal Emotion Recognition for Human-Chatbot Interaction in the Wild. Proceedings of the 26th International Conference on Multimodal Interaction, 12-21. DOI: 10.1145/3678957.3685759. Online publication date: 4-Nov-2024.
  • (2024) Conversational Localization. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7:4, 1-32. DOI: 10.1145/3631404. Online publication date: 12-Jan-2024.
  • (2023) SensiX++: Bringing MLOps and Multi-tenant Model Serving to Sensory Edge Devices. ACM Transactions on Embedded Computing Systems 22:6, 1-27. DOI: 10.1145/3617507. Online publication date: 9-Nov-2023.
  • (2023) Experiencing Data on Location: A Case Study of Visualizing Air Quality for Citizens (Daten vor Ort erleben: Eine Fallstudie zur Visualisierung der Luftqualität für Bürgerinnen und Bürger). KN - Journal of Cartography and Geographic Information 73:2, 97-108. DOI: 10.1007/s42489-023-00140-y. Online publication date: 13-Jun-2023.
