1 Introduction
The deployment of
autonomous delivery robots (ADRs) is becoming increasingly prevalent in urban spaces, holding promise for enhancing last-mile delivery efficiency [
59], as well as reducing energy consumption and emissions [
23]. ADRs encompass various robotic platforms, including self-driving vehicles, autonomous delivery pods, and sidewalk delivery robots, also known as
personal delivery devices (PDDs). This study focuses on small-scale PDDs weighing between 20 and 250 kilograms and operating at a maximum speed of 12 m/s, which often share sidewalks with other sidewalk users [
63].
While other types of mobile robots, such as assistive robots [
53] or cleaning robots [
26], typically operate in closed-off and more predictable environments (e.g., workplaces or domestic spaces [
18,
100]), delivery robots must navigate complex and dynamic urban traffic environments and interact with a diverse range of road and sidewalk users, including pedestrians, cyclists, and vehicle users [
29]. This requires delivery robots to be highly adaptable and capable of handling unexpected situations, which can pose challenges to their smooth deployment. Although advances in robotics technology continue to enhance the operational capability of delivery robots, potential challenges during real-world deployment are difficult to fully anticipate during the robot development process [
83]. To shed light on the practical considerations of these delivery robots, recent studies have turned their attention to the real-world deployment of delivery robots, for example, evaluating traffic accessibility [
4,
21,
65] or observing people's interactions with these robots [
2,
27,
92,
97].
When designing ubiquitous technologies for urban contexts, it is crucial to consider non-users as stakeholders [
87]. When delivery robots operate in public spaces, they can affect road and sidewalk users who encounter the robot without intending to interact with it, a group referred to as
incidentally co-present persons (InCoPs) [
72]. In situations such as unmarked crossings and shared spaces where formal traffic rules are lacking, road and sidewalk users often rely on social norms to convey intentions and anticipate behaviours [
69,
71]. To navigate such complex environments effectively, delivery robots and other
autonomous vehicles (AVs) must be equipped with the ability to communicate with other road and sidewalk users and follow social norms [
79,
94]. Numerous studies have focused on probing driver-pedestrian interaction patterns and designing
external human–machine interfaces (eHMIs) for traditional road conditions [
20]. However, the communication requirements for delivery robots extend beyond these contexts, as they operate more frequently in unstructured areas such as sidewalks and shared pedestrian zones and often encounter sidewalk users in close proximity. Furthermore, the lack of transparency in
artificial intelligence (AI)-driven systems can create difficulties in understanding the behaviours of delivery robots, negatively impacting the quality of interaction and diminishing people's trust and acceptance of the technology [
17,
79].
In September 2022, a video capturing a delivery robot ignoring police tape and barging through a crime scene went viral on social media, sparking discussion about the readiness of such robotics technology in real-world settings. This incident is symbolic of the many scenarios that delivery robots need to address in real-world operations and underscores the need for
human–robot interaction (HRI) research to study how these systems interact with people in real-world settings. Delivery robot technology companies have been increasingly conducting pilot programs in public spaces, such as
the Starship robot pilot programs across various U.S. campuses and in Europe, as well as
the Kiwibot pilot program in Pittsburgh [
27,
29,
77], resulting in a notable presence of delivery robots in urban environments. Consequently, there is a growing amount of user-generated content on social media about people's encounters with these robots. These resources provide valuable data for HRI researchers to study real-world deployment scenarios and people's attitudes towards delivery robots, which can offer insights for designing better external interfaces to facilitate interactions between road and sidewalk users and delivery robots [
62].
To inform the interaction design of delivery robots through insights from real-world deployment scenarios and the public's attitudes towards delivery robots, we conducted an online ethnographic study. Our study utilised a systematic search approach to collect user-generated videos depicting delivery robot operations in urban spaces on the video-sharing platform
TikTok, along with their corresponding comments. We conducted video content analysis, identifying scenarios in which effective communication from the delivery robot is essential to facilitate smooth interactions between the robot and sidewalk users, as well as people's behavioural patterns when encountering robots. Furthermore, through a thematic analysis of the corresponding comments on these videos, we identified several themes regarding people's attitudes towards delivery robots, including acceptance, perceptions and information needs. The contributions of this study are twofold. First, the identified scenarios reveal potential design opportunities to augment communication between delivery robots and other road and sidewalk users in complex urban contexts and situations, going beyond the conventional focus on path negotiation. Second, our triangulated analysis of videos and comments provides insights into design considerations for the interaction design between delivery robots and other road and sidewalk users.
3 Methodology
To address the current lack of empirical investigation into the real-world deployment of delivery robots, we conducted an online ethnography study [
88] to analyse user-uploaded videos on
TikTok that capture encounters between delivery robots and the recording person or other surrounding people. We analysed those videos to identify real-world scenarios that necessitate communication from delivery robots to facilitate smooth interactions with road and sidewalk users. Additionally, we conducted a thematic analysis of comments in response to the videos to gain further insights into people's attitudes towards delivery robots and how these can inform the interaction design of delivery robots.
In sum, our online ethnography was guided by the following research questions:
RQ1: What are the real-world encounters documented online that reflect scenarios in which communication between delivery robots and road and sidewalk users is necessary?
RQ2: What are the public's attitudes towards delivery robots, and how can these attitudes inform the interaction design of delivery robots?
3.1 Data Collection
3.1.1 Initial Search Phase.
To ensure a systematic and comprehensive approach, we conducted an initial search across diverse video-sharing platforms such as
TikTok and
YouTube, as well as the search engine
Google, to obtain a broad overview of videos depicting interactions with delivery robots. After trialling various combinations of keywords across these platforms, we selected
TikTok as the most suitable platform for our study due to its extensive collection of user-generated content that captures genuine and spontaneous encounters with delivery robots. Furthermore, we decided on a search keyword strategy of combining technology-related terms and case-specific keywords (as proposed in prior research studies [
3,
50]), and ultimately settled on the search terms
delivery robot +
lost and
delivery robot +
why. To broaden the scope of the search, the names of three popular delivery robots, namely
Starship robot,
Coco robot and
Postmates robot, were included as alternative search terms for
delivery robot. To ensure that our search was comprehensive yet efficient, we followed the stop criterion suggested in previous research [
3,
62] and stopped the search after reviewing at least 25 successive videos that were deemed irrelevant.
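The search procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the function names, the space-joined query format, and the `is_relevant` callback (standing in for the researcher's manual judgement of each video) are all assumptions.

```python
from itertools import product
from typing import Callable, Iterable, List

# Search terms reported in the text; joining them with a space is an assumption.
ROBOT_TERMS = ["delivery robot", "Starship robot", "Coco robot", "Postmates robot"]
CASE_TERMS = ["lost", "why"]


def build_queries() -> List[str]:
    """Combine every technology-related term with every case-specific keyword."""
    return [f"{robot} {case}" for robot, case in product(ROBOT_TERMS, CASE_TERMS)]


def collect_until_irrelevant_run(
    videos: Iterable[str],
    is_relevant: Callable[[str], bool],
    stop_after: int = 25,
) -> List[str]:
    """Keep relevant videos; stop once `stop_after` irrelevant ones occur in a row."""
    collected: List[str] = []
    irrelevant_run = 0
    for video in videos:
        if is_relevant(video):
            collected.append(video)
            irrelevant_run = 0  # a relevant video resets the streak
        else:
            irrelevant_run += 1
            if irrelevant_run >= stop_after:
                break
    return collected
```

Resetting the counter on every relevant video means the search ends only after an unbroken run of irrelevant results, which matches the "successive" wording of the stop criterion.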
The search was conducted by the first author between 5 November and 12 November 2022, and yielded an initial set of 612 video samples. Meta information for each video was recorded during the search process, including the link to the video, the upload date, the comments, the video's views and the number of likes.
Table 1 provides an overview of all the delivery robots included in the dataset.
3.1.2 Video Screening and Filtering.
After collecting the initial video samples, we conducted two rounds of screening and filtering to ensure the quality of the data. During the first round, duplicated and inaccessible videos were removed following the common criteria used in the initial filtering process [
50,
62,
64,
73], which resulted in 475 videos. In the second round, we adopted exclusion criteria similar to those used in [
50,
62] to exclude videos that did not feature a delivery robot in operation or contained advertisements, staged acts or non-English speech. The exclusion of non-English videos, totalling 17, was based on concerns regarding potential misinterpretation and loss of originality due to translation. Additionally, videos with excessive editing, disrupted chronological order [
43] or rapid short clips were removed, following the concerns raised by [
62] regarding the impact on the validity and neutrality of the recording. Two videos that were no longer accessible during the later analysis process were also excluded. In the end, the dataset consisted of a total of 117 videos that were eligible for analysis.
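The two screening rounds can be expressed as a simple filtering pipeline. The sketch below is illustrative only: the `Video` record and its boolean fields are hypothetical stand-ins for judgements that were made manually by the researchers, not an automated classifier.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Video:
    """Hypothetical record; each flag encodes a manual screening judgement."""
    url: str
    accessible: bool = True
    robot_in_operation: bool = True
    advertisement: bool = False
    staged: bool = False
    english: bool = True
    heavily_edited: bool = False  # excessive editing, reordered or rapid clips


def first_round(videos: List[Video]) -> List[Video]:
    """Round 1: drop duplicates (same URL) and inaccessible videos."""
    seen = set()
    kept = []
    for video in videos:
        if video.url in seen or not video.accessible:
            continue
        seen.add(video.url)
        kept.append(video)
    return kept


def second_round(videos: List[Video]) -> List[Video]:
    """Round 2: apply the content-based exclusion criteria."""
    return [
        v for v in videos
        if v.robot_in_operation
        and v.english
        and not (v.advertisement or v.staged or v.heavily_edited)
    ]
```

Keeping the rounds as separate functions makes it easy to report the intermediate count after round 1 alongside the final count.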
3.1.3 Comment Extraction.
To gain further insights into the public's attitudes towards delivery robots, we extracted the top 50 independent comments (i.e., comments that are not replies or threads) from each of the 117 eligible videos. We decided on this criterion to address the wide variation in comment counts across videos and to create a dataset that is both rich and manageable, thus allowing us to maintain a balance between in-depth analysis and practicality. This comment extraction protocol was adapted from previous online ethnography studies in HRI [
37,
86]. To ensure the richness and diversity of video scenarios, we included videos with a lower number of comments, even though similar studies tend to exclude such videos if they have fewer comments than the number they aim to extract. Thus, in cases where the number of comments was less than 50, all eligible comments were extracted.
To standardise the dataset, the comments were further processed by all three coders (i.e., the first, third, and fourth authors) based on a set of exclusion criteria agreed upon by consensus. Comments that met the following exclusion criteria were excluded: (1) non-English comments; (2) comments unrelated to robots; (3) comments with ambiguous meaning or reference, which could hinder coders from accurately interpreting the underlying sentiments and (4) comments that contained jokes which, upon discussion and agreement among all coders, were determined not to reflect people's true attitudes. The filtering process resulted in a final dataset of 2,067 comments.
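The extraction rule (up to 50 independent comments per video, all of them when fewer exist) followed by the coder-agreed exclusions can be sketched as follows. The `Comment` type, its fields, and the assumption that comments arrive in the platform's own ranking order are illustrative; the exclusion decisions themselves were made by human coders and appear here only as a pre-assigned label.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Comment:
    """Hypothetical record for one comment under a video."""
    text: str
    is_reply: bool = False
    # Set by the coders, e.g. "non-English", "unrelated", "ambiguous", "joke".
    exclusion_reason: Optional[str] = None


def top_independent(comments: List[Comment], limit: int = 50) -> List[Comment]:
    """Return the first `limit` non-reply comments, or all of them if fewer exist."""
    independent = [c for c in comments if not c.is_reply]
    return independent[:limit]


def apply_exclusions(comments: List[Comment]) -> List[Comment]:
    """Keep only comments that the coders did not mark for exclusion."""
    return [c for c in comments if c.exclusion_reason is None]
```

Because slicing a short list simply returns the whole list, videos with fewer than 50 independent comments need no special handling.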
3.2 Data Analysis
This section outlines the methodology used in the study for analysing both the video content and the comments posted under the videos. The methodology includes the use of thematic analysis [
11] to analyse and look for patterns in the comments and an approach inspired by open coding [
15] to identify the encounter scenarios in the video content.
3.2.1 Identifying the Scenarios.
We began the video analysis procedure with open coding of the content depicted in the videos [
15]. To do so, we first transcribed the spoken words in the videos and annotated the videos with a comprehensive description of the scenarios and the behaviours of individuals captured in the footage, including both the recorder and other people featured in the video. Timestamps were added to the transcripts to offer traceability for the analysis process. Additionally, the behaviours of individuals were coded by all three coders using a coding scheme developed and agreed upon by all coders to supplement the scenario analysis. An example video annotation can be found in
Table 2.
Because RQ1 aims to identify real-world encounters documented online that reflect scenarios requiring communication between delivery robots and road and sidewalk users, not every eligible video from the filtering process is relevant to it. For example, some videos may only capture a functioning delivery robot without featuring instances that highlight potential communication breakdowns with passersby. To ensure that only videos containing relevant scenarios were included in the scenario identification process, we applied a set of selection criteria. Specifically, we only included scenarios where the lack of communication between the delivery robot and road and sidewalk users could result in confusion, misunderstanding, degraded experience, or potential interaction failure, resulting in 89 videos. The inclusion of scenarios was determined by the lack of communication that disrupted smooth interactions, as directly observed in the video or articulated by individuals depicted. Note that these selection criteria applied only to scenario identification: the comments on all 117 eligible videos, including those not selected at this step, were retained for the comment analysis described in Section 3.2.2. This decision was made to enable an examination of people's attitudes towards the delivery robots within a broader context, rather than limiting the analysis solely to their opinions on communication breakdown situations.
The selected videos were then grouped and summarised into high-level categories based on the robot's behaviours and the interactions that occurred between robots and road and sidewalk users as observed in the video. The behaviours of road and sidewalk users were first directly transcribed from the videos, then systematically coded and categorised based on recurring patterns. The derived categories of the scenarios and the observed behaviours of road and sidewalk users are reported in the results section.
3.2.2 Comment Analysis.
We employed a combination of deductive and inductive thematic analysis approaches to analyse the comment dataset. The coding process began with a bottom-up approach and was refined through an iterative process to develop a robust coding scheme. In order to incorporate diverse perspectives from researchers in related fields, the data analysis was conducted collaboratively with three coders, including the first author, as well as the third and fourth authors, who are HCI researchers specialising in AV–pedestrian interaction.
The first author conducted a comprehensive examination of the data and selected a representative subset that constituted 10% of the total comments. This subset was subjected to independent coding using an inductive approach by all three coders. The resulting coded data were consolidated into one spreadsheet, and a 1.5-hour meeting was held to review the codes, discuss agreements and disagreements among the coders and deliberate on the initial themes identified. Following the meeting, the coders collaborated to develop and agree upon an initial coding scheme. The coders then applied a deductive approach to independently code another subset of comments (10%) using the collaboratively developed initial coding scheme. The inter-coder reliability check [
25] was performed on the second subset of 10% comments, which yielded a moderate level of percentage agreement at 0.65.
To further increase the reliability of the coding scheme, we initiated another discussion to iterate over the initial coding scheme. During the second coding discussion meeting, we adhered to a process similar to that of the first meeting, with a specific emphasis on addressing codes with lower agreement rates, in order to ensure a consistent and coherent understanding of the coding scheme among the three coders. We then applied the revised coding scheme to another subset of the 10% comments. The second round of inter-coder reliability checks yielded a high level of inter-coder reliability [
25], indicated by a high percentage agreement of 0.85, and a good Krippendorff's alpha [
31] of 0.746. These values suggest that the three coders applied the coding scheme consistently, which supports the credibility of the findings derived from the coded data. Finally, each of the three coders independently applied the coding scheme to an equal share of the remaining data. The themes that were identified throughout our comment analysis are presented as part of the results section.
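For readers unfamiliar with the two reliability measures, the following sketch shows one way to compute them for nominal codes. It is not the authors' script. It assumes that every comment received exactly one code from each coder, and it treats percentage agreement as the share of agreeing coder pairs, since the text does not state how agreement among three coders was aggregated.

```python
from collections import Counter
from itertools import combinations, permutations
from typing import Hashable, Sequence


def pairwise_percent_agreement(units: Sequence[Sequence[Hashable]]) -> float:
    """Share of coder pairs, over all units, that assigned the same code."""
    agree = 0
    total = 0
    for labels in units:
        for a, b in combinations(labels, 2):
            agree += int(a == b)
            total += 1
    return agree / total


def krippendorff_alpha_nominal(units: Sequence[Sequence[Hashable]]) -> float:
    """Krippendorff's alpha for nominal data via the coincidence matrix."""
    coincidences: Counter = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a unit coded by one coder carries no agreement information
        for a, b in permutations(labels, 2):  # ordered pairs of distinct coders
            coincidences[(a, b)] += 1 / (m - 1)
    totals: Counter = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one code is used")
    return 1.0 - observed / expected
```

Each element of `units` holds the codes that the coders assigned to one comment, for example `("acceptance", "acceptance", "perception")`. Unlike raw percentage agreement, alpha corrects for the agreement expected by chance, which is why the two figures reported above differ.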
4 Video Content Analysis Results
This section presents the results of our video content analysis, which starts with an introduction to the various categories of scenarios where road and sidewalk users require communication from robots to improve their interactions. The related issues that may arise in each situation in the absence of effective communication are also highlighted alongside these categories. We then discuss people's behaviour patterns when encountering robots, as identified through video content analysis. Our analysis considered various agents of behaviour, including pedestrians, vehicle drivers, cyclists, and other individuals present on sidewalks (e.g., people sitting at cafes). The inclusion of this diverse group aligns with the definition of InCoPs in [
72], as they are all stakeholders who may potentially be influenced by the presence of delivery robots.
4.1 Extracted Scenarios
To address RQ1, a detailed analysis was conducted by annotating, open-coding, and clustering the video content. The analysis identified five typical scenario categories where the lack of communication between the delivery robot and road and sidewalk users could lead to confusion, misunderstanding, degraded experience, or potential interaction failure. In this section, we will present these scenario categories and discuss the issues identified during the analysis process.
4.1.1 Scenario 1: The Robot Is Incapable of Performing Its Task in Complex Traffic Conditions.
Despite significant technological advancements that have enhanced delivery robots’ mobility, our analysis found that these robots could still face substantial challenges in complex urban traffic environments, impeding their smooth operation. Unpredictable obstacles (as shown in
Figure 1(a)) and diverse urban terrains, such as road curbs (as depicted in
Figure 1(b)), can obstruct the path of delivery robots and cause them to become immobilised. Moreover, traffic infrastructures dedicated to human use can present additional challenges for delivery robots. These robots are primarily designed for transportation purposes and lack manipulation functionality, leading to further inefficiencies and delays that require human intervention.
In these scenarios, despite passersby showing care towards the delivery robot, the lack of effective communication from the robot creates uncertainty among people about its status and whether they should offer assistance. This is exemplified by the recorder in
Figure 1(a) discussing with two other pedestrians whether they should help the robot:
‘What happened? Should we help the robot?’. Furthermore, the absence of communication has the potential to undermine people's trust in the robot. For instance, a comment under a video capturing a stuck robot doubted the capability of the robot,
‘Why are we helping them? They’re meant to be smart[…]’.
4.1.2 Scenario 2: The Robot Abruptly Stops or Redirects, Interrupting Its Smooth Operation.
The sudden stops or redirections of a delivery robot, which disrupt its consistent operating state, can often cause confusion among nearby road users. As shown in
Figure 2(a), where the robot suddenly stopped on a sidewalk with no visible obstacles, the recorder of the video wrongly assumed that the robot had detected their presence and stopped, saying,
‘It sees me, so it stopped’. A similar situation occurred in
Figure 2(b), where the recorder assumed that the delivery robot's repetitive back-and-forth turning was due to their presence obstructing the robot's path, as reflected by their intention to give way for the robot,
‘Hold on, let me get out of your way’. The uncertainty surrounding the cause of delivery robots’ operating interruptions and whether pedestrians are involved can cause them to hesitate or alter their course, resulting in reduced pedestrian efficiency and potential safety hazards in complex urban environments.
Furthermore, unexpected movement interruptions can also lead to assumptions of robot malfunction, which can decrease people's trust. For instance, in
Figure 2(c), the recorder mistakenly assumed that the robot was
‘lost’ when it turned around and headed in a different direction. In a previous study examining people's interactions with service robots in public spaces, unexpected path deviations such as detours were also found to have a negative impact on people's trust in the robot [
62]. Therefore, it is essential for delivery robots to provide explanations for changes in their operating state to prevent misunderstandings and avoid disruptions in the traffic flow of other road users.
4.1.3 Scenario 3: The Robot Needs Negotiation with Other Road and Sidewalk Users at Intersections.
Intersections require effective communication between delivery robots and road and sidewalk users to facilitate successful negotiation among multiple parties. Our video analysis found that the path-planning mechanism of delivery robots often assigns the robot the lowest priority at intersections, leading to extended wait times until there are no vehicles on the road before crossing. However, while this prioritisation may be for safety reasons, the lack of communication with vehicle users can hinder traffic efficiency and lead to frustration among drivers. For example, in
Figure 3(a), even though the robot did not exhibit any signs of crossing intention, three out of five drivers came to a full stop and waited for the delivery robot to cross, causing unnecessary delays. In a similar situation, a delivery robot's prolonged wait at the intersection made a driver mistakenly believe that they were obstructing the robot's path, resulting in the driver reversing their vehicle to give way to the robot. Moreover, the absence of effective communication can lead to impatience and frustration among drivers. This was exemplified by a driver's angry shouting towards a stopped robot at a zebra crossing (see
Figure 3(b)), where the robot had to stop to give way to another vehicle.
From the pedestrian perspective, even though they are not engaged in the same negotiation process with delivery robots as drivers at intersections, the lack of communication can negatively impact pedestrian mobility, experience and safety when crossing the street. In some instances, when the delivery robot stopped and waited to cross, pedestrians were observed attempting to guide the robot in various ways, which not only impeded their own crossing but also potentially increased the risk of traffic accidents. Moreover, the delivery robot's waiting behaviour can influence pedestrians’ decision to cross. For example, a pedestrian asked a waiting delivery robot
‘Are you waiting for me to move?’ (
Figure 3(c)), instead of crossing the street immediately after the traffic light turned green.
4.1.4 Scenario 4: The Robot Encounters Other Road and Sidewalk Users in Close Proximity.
When delivery robots navigate urban environments, conflicts with other road and sidewalk users are inevitable, particularly on narrow sidewalks or in bottleneck traffic situations. While delivery robots can typically avoid pedestrians autonomously, the lack of transparency regarding the rules they follow to navigate around people can raise questions about the right of way in such situations.
Figure 4(a) illustrates a scenario where an elderly woman on a mobility scooter had to stop and find a way to navigate around the robot, while another man had to step out onto the driveway. In response to this video, one comment raised the question,
‘Can she not give way?’ Furthermore, the lack of communication can also impact the social interactions of groups of pedestrians and their overall comfort in the urban environment. As shown in
Figure 4(b), a group of three pedestrians who were chatting together had to scatter as the robot approached, disrupting their social activity by causing them to stop their conversation and temporarily shift their attention from the interpersonal interaction to the robot's movements.
Moreover, as shown in
Figure 4(c), pedestrians may be startled by oncoming delivery robots that lack communication regarding their intention to stop. In this scenario, the video recorder was shouting
‘stop!’ in terror at the robot approaching them due to concerns about being hit by the robot.
4.1.5 Scenario 5: The Robot Does Not Comply with Conventions or Regulations.
Our video analysis revealed instances where delivery robots failed to comply with conventions and traffic regulations, potentially due to the challenges posed by complex urban environments and technological imperfections. A typical example of such a scenario is a video showing a delivery robot ignoring police tape and entering a crime scene, as shown in
Figure 5(a). A similar case can be seen in
Figure 5(b), where a person tried to direct a robot that had mistakenly entered a marching band to leave. These unexpected behaviours led some people to doubt the reliability of the technology, as demonstrated in a comment under the video of
Figure 5(a) referring to the robot's actions as a
‘tech blunder’.
Furthermore, it is inevitable that delivery robots may make errors while in operation, and in some cases, even violate traffic regulations. As illustrated in
Figure 5(c), the robot was observed jaywalking and crossing over the motor vehicle lane. These behaviours elicited comments under the video such as
‘drives as crazy as a human’, as well as mentions of the potential for
‘causing a car accident’, indicating people's safety concerns about the robots. In addition, errors in robot behaviour can lead to a decrease in people's trust in them, as suggested in [
30]. Even though effective communication of the robot's internal state cannot prevent the malfunction from happening, it can still help people better comprehend the situation and plan their own path accordingly, potentially avoiding hazardous outcomes.
4.2 Behaviours of Road and Sidewalk Users
In this section, we present six typical behaviour categories summarised from people's interactions with the robot captured in the video. These behaviour categories provide a comprehensive understanding of the dynamics and responses exhibited by individuals when encountering delivery robots, which can offer valuable insights into the design of interactions between robots and other road and sidewalk users.
To account for the potential influence of being recorded, we highlight the number of behaviour instances initiated by the video recorder, as well as the number of protagonists aware of the recording, when reporting the behaviour instance count.
4.2.1 Attention.
When encountering the delivery robot, the most frequent behaviour among road and sidewalk users that we observed was slowing down or stopping to gaze at the robot (n = 59). In two instances, pedestrians even followed the robot for a brief period of time to observe it more closely. Notably, people's attention was more attracted to scenarios where the robot's movement was interrupted or the robot behaved abnormally, as shown in
Figure 6.
Furthermore, our observations indicated that some people showed noticeable interest in the robot's perception channels, such as the camera or sensors. This is consistent with the finding from our comment analysis that the robot's perception is one aspect of people's potential information needs. Specifically, five pedestrians approached the front of the delivery robot or got close to its camera to inspect it (three of whom were protagonists aware of the recording).
4.2.2 Making Way for the Robot.
During encounters with the delivery robot, road and sidewalk users frequently altered their paths (n = 20, 6 recorders) or stopped (n = 8, 3 recorders) to make way for the robot, particularly when it was in close proximity (as shown in
Figure 6). These behaviours suggested that road and sidewalk users were generally respectful and accommodating towards the delivery robot. Notably, some pedestrians even stepped off the sidewalk onto the driveway (n = 1) or the lawn (n = 2) to allow the robot to pass on narrow sidewalks. People did not yield only to oncoming robots: two pedestrians were observed stopping and stepping aside for a robot approaching from behind after noticing it.
Apart from pedestrians, vehicle users also showed willingness to yield to the robot (n = 5, two recorders) when encountering a stopped delivery robot at an intersection, with three of them stopping completely in front of the robot to make way for it. In three instances, drivers (all recorders) even backed their cars or drove away to maintain a larger distance from the delivery robot, as they believed that their car had been detected by the robot and was blocking its path.
4.2.3 Assistance.
Among the videos in our dataset, there were 14 captured instances where the delivery robot was unable to operate independently due to challenging traffic conditions, and in 13 of those cases, the robot received assistance from passing pedestrians. Seven pedestrians physically pushed the delivery robot when it was stuck (of whom five were recorders and two were protagonists aware of the recording), with one of them even observed escorting the robot with their arms surrounding it after it resumed movement. Additionally, six pedestrians assisted the robot by pressing the traffic light button or removing obstacles, given the robot's inability to manipulate traffic infrastructure or move objects (four recorders). Furthermore, we noted instances of expressed joy and excitement following the provision of assistance, as evidenced by laughing and changes in speech tone (n = 7, of whom 4 were recorders and 3 were protagonists aware of the recording).
In addition to offering help when the delivery robot encountered difficulty, 11 road and sidewalk users were observed attempting to aid the robot's operation by directing it through verbal or gestural cues when the robot was not moving or had entered a restricted area (e.g., the crime scene as shown in
Figure 5(a)) (six recorders). For instance, in one video where the robot was not moving despite the green traffic light being on, a pedestrian used hand gestures of curling their fingers towards themselves and said
‘come on’ to direct the robot to cross the intersection.
4.2.4 Displaying Etiquette.
Our observations indicate that some individuals interacted with the delivery robot in a socially conscious manner. The demonstration of social etiquette towards the robot suggests that road and sidewalk users may perceive the robot as a social agent rather than a simple machine, which is consistent with the results of the comment analysis. The social interactions observed include greetings upon encountering the robot (n = 10, of whom 8 were recorders and 1 was a protagonist aware of the recording), bidding farewell when it left (n = 8, of whom 5 were recorders and 1 was a protagonist aware of the recording), and expressing apologies after obstructing the robot's path (n = 2, of whom 1 was a recorder and 1 was a protagonist aware of the recording). For example, one pedestrian made a prayer-like hand gesture to express apology towards the robot and gestured an ‘after you’ motion to indicate their intention of yielding the way for the robot after blocking its path.
Moreover, one of the delivery robots included in our study was equipped with verbal communication capabilities to express gratitude to pedestrians who provided assistance. In these instances, all eight individuals responded to the robot's gratitude with expressions such as ‘you’re welcome’ (thereof 6 recorders and 2 protagonists aware of the recording), accompanied by a surprised (n = 8) or joyful emotional expression (n = 7). This observation suggests that reciprocal social etiquette from a robot could lead to positive social interaction between humans and robots.
4.2.5 Interference.
The actions of pedestrians may pose a challenge to the smooth operation of delivery robots. Six pedestrians tested the robot's operation by intentionally stepping in front of it (thereof 2 were recorders and 2 were protagonists aware of the recording). Notably, one of them pretended to tie their shoelaces to mask their intent from the robot instead of standing directly in front of it. Moreover, we observed eight instances of playful behaviours towards the robot, including people playfully chasing it (n = 5, including 3 children, 1 recorder), pretending to kick it (n = 2) or placing a beverage can on top of it as it passed by (n = 1). Notably, no interference from road and sidewalk users was observed in scenarios where robots encountered operational difficulties and were incapable of performing their tasks.
4.2.6 Conversational Communication.
Our video content analysis recorded 43 instances of conversational interactions between the delivery robot and video recorders, with 35 initiated by the recorder or protagonists who were aware of the recording, and 8 initiated by the robot expressing gratitude as mentioned in the above section. Among these interactions, 11 were questions posed by the recorder to the robot, such as inquiring about its destination, ‘Where are you going?’, or checking its status, ‘Are you lost?’ when the delivery robot exhibited less-than-smooth operation. In such cases, three recorders expressed encouragement for the robot, for instance, by shouting ‘You made it!’ when the robot resumed movement from a temporary breakdown. In contrast, one driver expressed frustration by yelling angrily at the robot for blocking their path. Of the observed conversational interactions between the delivery robot and video recorders, 21 instances of verbal communication were related to social etiquette as introduced above.
5 Comment Analysis Results
In this section, we present the results from the thematic analysis of the user-generated comments pertaining to the general public's attitudes toward delivery robots (addressing RQ2). The analysis reveals three broad themes.
The first theme pertains to people's Perceptions of delivery robots, including the tendency for people to anthropomorphise the robots, the perception of the robots as social agents, and the overall impression that delivery robots are cute and novel. The second theme concerns the Acceptance of delivery robots, including people's attitudes towards the robot presence, their willingness to collaborate with delivery robots, and the factors that influence their acceptance. The third theme covers the Information that people would like to know about the delivery robots, such as reasons behind delivery robots’ behaviours, as well as information regarding several technical aspects.
5.1 Perception
5.1.1 Anthropomorphism.
The analysis of user-generated comments revealed that people tend to anthropomorphise delivery robots, despite the robots evaluated in the study featuring predominantly mechanical appearances or exhibiting only minimal anthropomorphic traits (e.g., displayed eyes). This was supported by the frequent use of gendered pronouns (n = 87, 22.1%)⁸ or personification appellations such as ‘little guy’ or ‘little buddy’ (n = 39, 10.0%) when referring to the robots. In contrast, a smaller proportion of people perceived the delivery robots as mere machines (n = 20, 5.1%), as demonstrated by their use of robotic appellations when referring to the robot, such as ‘box on wheels’. In addition, five comments suggested adding anthropomorphic features to the robot, such as putting ‘googly eyes on them’.
The tendency to anthropomorphise delivery robots is also reflected by people's attribution of human thoughts (n = 73, 18.6%), traits (n = 37, 9.4%) and feelings (n = 30, 7.6%) to the robots. This process, known as mentalisation in psychology [28], has been found to exist in how people interpret robots, as suggested in previous research [58, 75]. People's interpretation of the robot's behaviour was often guided by their assignment of human-like thoughts to the robot, as demonstrated by one comment assigning thoughts ‘Why is it stopped like “I remember you! What do you want human […]”’ to the robot stopping in front of the human in the video. In addition, people also speculated about the robot's characteristics and feelings, as reflected in comments describing the robot as ‘polite’ or ‘sensitive’, ‘embarrassed’, ‘tired’, or ‘nervous’. The assignment of the emotion of being ‘scared’ (n = 5, 1.3%) was used when the robot was observed waiting at an intersection to cross the road or stopped because of human presence. In addition, people often draw analogies between the behaviours of robots and those of humans (n = 47, 12.0%). For instance, one comment described a delivery robot stopping at the sidewalk as ‘fell asleep’.
Moreover, our analysis results identified that science fiction films have an impact on the public's perception of delivery robots. A number of comments mentioned their associations with robot domination (n = 39, 10.0%) or well-known robot characters, such as ‘Wall-E’ (n = 16, 4.1%) upon seeing the robots from the video. Although some of these comments may contain a humorous tone, they underscore the pervasive influence of science fiction narratives in media, such as movies, on shaping people's anthropomorphic perceptions. This finding highlights the role that media and cultural representations play in shaping the public's attitudes toward autonomous robot delivery.
5.1.2 Social Agent.
Our analysis revealed that people tend to perceive delivery robots as social agents, as indicated by their expectation and appreciation of robots’ adherence to social norms (n = 67, 69.8%). In contrast, only a limited number of comments expressed concerns about the potential decrease in social interactions that could result from relying on robots for delivery instead of humans (n = 3, 3.1%), such as ‘(losing) small talks with the delivery drivers.’
Our analysis further highlights the significance of robots’ social communication (n = 49, 51.0%) abilities in determining their perceived sociability. For instance, one of the delivery robots in our study was equipped with the communication ability to verbally request human assistance and express gratitude through phrases like ‘thank you’, which elicited generally positive reactions from the comments (n = 29, 30.2%). Furthermore, politeness is a crucial element of social communication, and people expect robots to display it when seeking human assistance. Some comments criticised the robots for their lack of politeness (n = 3, 3.1%), such as one comment stating ‘not even a please, it can wait’. In addition, non-verbal communication modalities such as facial expressions (i.e., screens displaying simple facial expressions) (n = 7, 7.3%) and music responses (i.e., playing a short music tune after customers picked up their delivery) (n = 4, 4.2%) received positive feedback in 11 (11.5%) comments, which could also contribute to the delivery robot's perceived sociability.
In addition to social communication abilities, people also expect robots to navigate around other road and sidewalk users in a socially polite manner (n = 10, 9.6%). For instance, eight comments considered it polite behaviour when the robot stopped or altered its trajectory to give way to other road and sidewalk users. In contrast, in a video where an elderly woman in a mobility scooter was giving way to a robot, two comments argued that the robot should have given way to the woman as a sign of politeness.
Commenters' own intentions to interact socially with delivery robots, and their appreciation of people in the videos who did so, also demonstrated their perception of these robots as social agents. Twenty-six (17.1%) comments expressed people's intentions to engage in social interactions with the delivery robots, including actions such as greeting, ‘hug(ging)’ and ‘hold(ing) hands’. Furthermore, eight comments (8.3%) expressed appreciation for individuals in the video who demonstrated social etiquette when interacting with the robots. In one video, a person's response of ‘You are welcome’ to the robot's gratitude elicited a positive reaction from a comment, which stated, ‘It made me smile and giggle when it thanked her and she said “you’re welcome”.’
5.1.3 Cuteness and Novelty.
Our analysis revealed that people have generally positive impressions of the delivery robot (n = 113, 57.7%), with cuteness as a type of attractiveness being the predominant impression that people associate with the robot (n = 96, 49.0%). This was indicated by the adjectives used to describe them, such as ‘cute’ or ‘adorable’. While this could be related to people's tendency to anthropomorphise robots, three commenters explicitly pointed out that they found the robot to be cute despite its mechanical appearance, e.g., ‘WHY ARE THEY SO CUTE?!? They’re boxes on wheels and I still have feelings for them!’. In contrast, unpleasant impressions of robots were relatively rare (n = 12, 6.1%). A few comments used terms such as ‘scary’ or ‘creepy’ to describe the delivery robot.
The novelty of delivery robots is another common impression that they leave on people, as this technology has not yet been widely adopted as a common delivery method in most parts of the world. This is reflected in 73 (37.2%) comments expressing people's surprise upon seeing the delivery robot in the video or asking about it, with phrases like ‘What is that [the robot]?’. In addition, 17 (8.7%) comments used adjectives like ‘cool’ or ‘awesome’ to express admiration for the robot representing an innovative technology.
5.2 Acceptance
5.2.1 Robot Presence.
The results of our comment analysis suggest a generally positive attitude towards the presence of delivery robots as a service in urban settings (n = 119, 68.8%). Specifically, many comments expressed affection for the delivery robot (n = 63, 36.4%) and interest in seeing or using the delivery robot (n = 43, 24.9%). In addition, 13 (7.5%) comments suggested that the robot represents the ‘future’. In contrast, a relatively small minority of comments expressed a resistant attitude towards accepting delivery robot deployment, with some expressing reluctance to use the service (n = 15, 9.2%) or aversions towards the robots (n = 4, 2.3%). Negative attitudes towards the presence of robots are also reflected in people's stated intentions to behave recklessly towards the robot (n = 35, 20.2%), such as ‘kick it over’ or ‘ram it with my car.’
5.2.2 Collaboration with Robot.
In addition to the explicit attitudes expressed towards delivery robots, the acceptance of these robots by the general public can also be inferred from people's opinions on emerging human–robot collaborations (HRCs). Most comments (n = 204, 98.6%) expressed supportive views towards humans offering help or expressed sympathy for the robots in situations in which they encounter operational difficulties and require human intervention. In contrast, only a minority of comments (n = 3, 1.4%) expressed opposition to offering assistance to delivery robots.
Eighty-eight (42.5%) comments agreed with, or expressed the opinion, that people should help robots when they are in need. In several instances, commenters even expressed their intention to assist the robot captured in the video when it encountered operational difficulties (n = 15, 7.2%). For example, one comment stated that ‘I’d have walked it across the road’ in reference to a robot in the video waiting for a long time to cross an intersection. Furthermore, some comments even criticised the people in the video for not helping the robot or intentionally interfering with it (n = 33, 15.9%), with one person describing being ‘heated’ at drivers who did not yield to the delivery robot.
Moreover, we found that emotional connections and sympathy could be formed with the delivery robot (n = 68, 32.9%), as evidenced by people expressing feelings of sadness (n = 23, 11.1%) or a desire to ‘cry’ for the robot (n = 11, 5.3%) when it struggled to complete its tasks. The question of the robot's right of way was also raised, with three comments advocating for the robot to have the same rights as pedestrians, as noted by one comment: ‘[…] cars are supposed to stop for them. They’re considered pedestrians, and it's illegal not to stop at crosswalks.’ These findings suggest that some people tend to view delivery robots as entities deserving of respect and certain rights in public traffic settings.
5.2.3 Factors Influencing Acceptance.
The comments analysed in our study provide insights into various factors that could impact the acceptance of delivery robot deployments by the general public. A key concern identified in the analysis is the robot's capability to efficiently carry out delivery tasks. While some comments expressed favourable evaluations of the robots’ capabilities (n = 40, 12.1%) and even preferred them over human delivery (n = 5), a larger proportion of comments expressed concerns regarding the robots’ capability to complete delivery tasks (n = 56, 17.0%) or their delivery efficiency (n = 37, 11.2%).
Instead of directly doubting the delivery robots’ operational capabilities, 130 comments expressed concerns about the robots’ vulnerability in complex or challenging scenarios, particularly regarding their ability to withstand deliberate interference by pedestrians (n = 97, 29.4%), such as bullying or delivery theft. For example, one person regretfully commented: ‘The sad thing is as soon as we saw the robots we knew they were going to get stolen and kicked and messed with.’ In addition to concerns about intentional human interference, challenging road conditions, which could potentially hinder the robots’ operation, emerged as another concern raised in some comments (n = 33, 10.0%). For example, one commenter expressed their worries about the robot's performance in snowy areas as follows: ‘Good luck on our snow-covered sidewalks’.
Some comments also addressed the potential impacts of delivery robot deployment on traffic (n = 25, 7.5%) or society at large (n = 42, 12.7%), highlighting the significance of these factors in shaping people's acceptance of delivery robot technology. People's concerns regarding the societal impact of delivery robots were particularly focused on the prospect of job loss resulting from increased automation (n = 42, 12.7%). With regard to traffic impacts, despite the smaller size of delivery robots and their consequently reduced likelihood of posing a threat to the safety of other road and sidewalk users, the potential for increased hazards on sidewalks could still result in the reluctance to accept the robots (n = 14, 4.2%), as expressed in one comment: ‘More hazards on the pavements. It's a no from me’. Besides traffic safety concerns, people also expressed concerns about the possible negative impact of delivery robots on traffic efficiency (n = 11, 3.3%), as demonstrated by one comment referring to the video in which a robot blocked the traffic as ‘the future of waiting.’
5.3 Information
5.3.1 Behaviour Information.
The comment analysis shows that people want delivery robots to communicate more information than their current features of simple flashing lights or facial expressions convey. The most frequently discussed topic among the comments was the need to understand the reasons behind the robots’ behaviours (n = 98, 94.2%). This was evidenced by comments where people attempted to interpret the robots’ behaviours (n = 85, 81.7%) or questioned why they behaved in a particular way (n = 13, 12.5%). For example, one commenter attempted to understand the unexpected path-planning of a delivery robot when it drove off the sidewalk and into the lawn, asking ‘Is that his tracks from before? Maybe it's done the route before and it goes in the exact same pattern’. Compared to the reasons behind the robots’ behaviour and actions, the delivery robots’ future actions received less attention among commenters (n = 6, 9.6%). For instance, one commenter assumed that ‘after the robot waits 5 minutes it auto returns’ in response to a video depicting a delivery robot stopping on a sidewalk.
5.3.2 Technical Information.
Technological knowledge related to delivery robots is another common topic of discussion among people. The mode of operation of the robot, specifically whether it is controlled by humans or operated autonomously, is the most frequently discussed aspect (n = 62, 72.1%). The majority of these comments held the belief that current technology does not allow for fully autonomous operation, so delivery robots must be controlled manually by human operators (n = 41, 47.7%). Given that our study included robots with varying levels of autonomy, either fully autonomous (e.g., Starship robot) or with human operators assisting their operation (e.g., Coco robot), people's assumptions about robot control suggest a limited understanding of robot technology and the necessity of making the operational modes of the delivery robot visible to road and sidewalk users.
Furthermore, specific technical information regarding the delivery robot's perception and localisation was of interest to some commenters (n = 24, 28.0%). Comments concerning the robot's sensors (e.g., camera and lidar) and inquiries about whether the robot was ‘watching’ revealed their curiosity about how the robot perceives its surroundings (n = 17, 19.8%). The localisation system, such as maps and GPS, was another topic of interest in several comments (n = 7, 8.1%). For instance, in one video where the delivery robot chose a circuitous path on the pavement instead of a more direct route, one comment assumed that the robot must have ‘mapped all pavements separately from streets’, attempting to justify the robot's behaviour. These comments indicate a need to disclose more technical details of delivery robots to other road and sidewalk users, which could enhance public understanding of the robots’ capabilities in urban environments, thereby fostering greater trust and acceptance.