Abstract
The COVID-19 pandemic has highlighted the importance of social distancing to prevent the spread of infectious diseases. However, enforcing social distancing in public spaces with traditional methods, such as hiring workers or deploying robots to remind people, can be unwelcome. In this paper, we propose a new technique to help maintain social distancing using augmented reality. We hypothesize that visualizing respiratory droplets with an augmented-reality interface can help individuals avoid getting too close to others. To test this hypothesis, we developed an augmented reality prototype that combines real-time head pose tracking with particle systems to align special effects with subjects in the real world. We conducted user studies to evaluate the effectiveness of our method in daily communication scenes. Experimental results confirmed the effectiveness of the proposed method and showed the potential of augmented reality for future epidemiological protection.
1 Introduction
In the past three years, the world has witnessed the impact of the pandemic. The fast spread of the COVID-19 virus poses a threat to public health and puts great pressure on healthcare infrastructure. While clinical treatment is essential to reducing the infected population, public health measures also play a significant role in stopping the spread of the virus. The effectiveness of keeping a social distance in the physical world in slowing the spread of the virus has been widely recognized by researchers as well as social workers (Kamga and Eickemeyer 2021). However, as social beings, humans naturally need to communicate with each other in the real world. Even if we are aware of the significance of social distance for delaying the spread of a virus, it is still difficult to always maintain that distance through self-awareness alone. Although the shock of COVID-19 is gradually receding, we still need to prepare for the next potentially harmful virus using the lessons learnt in the past pandemic era (Fig. 1).
Modern human–computer interfaces have been adapted to assist our daily life under the pandemic. Online office tools and meeting rooms have popularized the work-from-home lifestyle to reduce the possibility of infection while working (Darus and Saahar 2022). However, as the significance of face-to-face communication is generally acknowledged (Duffy et al. 2005), we cannot rely solely on remote office and communication tools to build a virtual world in which human beings avoid living in the real one. As social distance is known to be important in reducing the possibility of infection, a common practice is to regularly remind people in public spaces to keep their distance, using social workers, volunteers, or even loudspeakers repeatedly playing recorded reminders. Combining recent advances in robotics and computer vision, researchers have even developed robots that move along streets and serve as reminders for social distance maintenance. Generally, such reminder-style approaches are not welcomed, as people feel forced, or softly forced, to obey commands from others. Fundamentally, we still resort to education to better motivate citizens to keep a social distance in public spaces. However, people may still involuntarily forget about social distance in daily communication, as it is counter-intuitive to always keep a distance of about 1.8 m in social activities (Chen et al. 2023).
In this work, we propose a new technique to help maintain social distance based on augmented reality (AR). Motivated by educational campaigns on the significance of social distance for pandemic prevention, where dedicated videos sometimes show the spread of respiratory droplets with special effects, we propose to highlight exhaled air with an augmented reality interface. Our insight is that humans are sensitive to secretions or expired gas generated by other subjects. We hypothesize that if the respiratory droplets of the people around each individual could be visualized, the subject may avoid staying too close to other individuals. With this insight, we develop an AR prototype that augments daily communication scenes with highlighted visualizations of breathing droplets. We combine real-time head pose tracking with particle systems, automatically aligning the special effects to the subjects in the real world through AR glasses. With the proposed AR interface, we design user studies to test the effects of our method in daily communication scenes. The experimental results provide positive support for our hypothesis, and we also report other findings from our experiments.
The main contributions of this paper can be summarized as follows:
- we propose a new AR-based interface to efficiently help maintain social distance;
- we demonstrate the effectiveness of our framework by conducting user studies with quantitative and qualitative comparisons.
2 Related works
2.1 Social distancing
People have been looking for effective methods to maintain social distance, especially during periods of infectious disease outbreaks, as the importance of social distancing has become more apparent. During the early stage of the COVID-19 pandemic, various social distancing policies were implemented in different countries, such as limited stay-at-home orders, nonessential business closures, bans on large gatherings, school closures, mandates, and limits on restaurants and bars (Fong et al. 2020). These measures were aimed at enforcing social distancing (Abouk and Heydari 2021). In addition, at the beginning of the pandemic, online video conferences gradually became part of people's lives, reducing face-to-face interactions among individuals through network communication (Marks 2020). Some researchers have designed software systems to assess the risk of COVID-19 aerosol transmission, examining infection risks across various indoor environments and measures (Vanhaeverbeke et al. 2023; Lelieveld et al. 2020).
In addition, many researchers have been working on devices to encourage people to maintain social distance. For example, fully autonomous monitoring robots based on a quadruped platform, equipped with multiple cameras and a three-dimensional light detection and ranging sensor, have been developed. These robots can move freely in dynamic scenes and send friendly voice prompts to advise crowded people to disperse (Chen et al. 2021). Others have developed wearable smart devices that emit an alert to the user when someone is detected within six feet of their proximity (Nadikattu et al. 2020). Research by Chakraborty et al. (2021) indicates that mobile AR applications can be used to obtain accurate estimations of distances to virtual others in real-world settings. Researchers have also proposed new applications of digital twins to measure the social distance between individuals (Mukhopadhyay et al. 2022; Mukhopadhyay et al. 2021). These devices have been tested and found useful in maintaining social distance. Most of them use sensors to detect distances and provide warnings to the user, but there is relatively little research on technology that encourages users to proactively avoid others.
We have also observed attempts to use virtual reality (VR) to encourage individuals to maintain a certain social distance. VR systems have been built to simulate real environments and keep users at a certain distance from virtual people (Martí Mason et al. 2020). Anastasiou et al. (2022) developed camera-based systems and other surveillance applications that can recommend real-time optimal movement paths to avoid crowded indoor or outdoor environments. These alert-based devices put users in a passive state of receiving commands, which leads to a lack of sustained social distancing. In contrast, we aim to devise a method for spontaneous social distancing, which can help users develop a habit of keeping a certain distance from others at all times, making it possible for users to remember to keep their distance even when not wearing the device.
2.2 AR for social communication
The rapid growth of augmented reality and virtual reality technologies in the past decade has attracted significant research and development efforts from academia and industry. By seamlessly integrating virtual content with the real world, AR provides users with a sensory experience that goes beyond reality (Chakareski 2017). AR has been widely applied in the field of interpersonal communication, but its practicality still requires further technological development and integration with other fields (Carmigniani et al. 2011).
The combination of emerging AR technology and communication enables users to employ 360-degree videos for remote communication. With the advancement of the latest communication technologies, the transmission speed and quality of videos have also improved. In addition, AR can reduce the amount of verbal description in interpersonal communication by providing auxiliary information. A typical example is displaying the objects others are interacting with or the environment in which they are located, allowing users to directly obtain a large amount of information through visual feedback, which brings a sense of face-to-face communication even in remote work (Lorusso et al. 2018; Ahied et al. 2020). In addition to enhancing the experience of communication, companies have also developed AR systems to enhance the social skills of children with special needs (Wilson 2020).
From the beginning of the COVID-19 pandemic to date, AR-related studies have also appeared to meet challenges related to COVID-19. Most of the research is in the medical field (Ahied et al. 2020; Luck et al. 2021; AlMazeedi et al. 2020), employing AR in conducting surgeries. Other studies revolve around education (Labib et al. 2021; Munzil and Rochmawati 2021; He et al. 2021), improving the remote learning environment to clearly deliver information to students. AR has also seen applications in commercial fields tailored to the social situation. For example, during lockdowns when people were unable to try on and purchase clothing in person, AR technology enabled users to try on clothes remotely. This innovative approach has to some extent changed the business model of the retail industry during the pandemic (Papagiannis 2020). Our application of AR to help people maintain a certain social distance is also a new attempt in the field of social communication.
3 Method
We base our idea of helping individuals keep an appropriate social distance from others on an AR-based interface. Because the air exhaled by different individuals has no color or smell, human beings are not sensitive to it and are not aware of the potential aerosol infection. We hypothesize that if the respiratory droplets were observable to subjects, they would naturally react to avoid entering the region containing the exhaled gas. As the visual channel is very efficient in conveying information, we visualize respiratory droplets in situ for different individuals and expect the visually highlighted breathing effects to stimulate subjects to keep an appropriate social distance from others.
In this regard, an AR-based interface fits our design goals well. With the progress of AR displays and programming tools, it is possible to augment the real world with virtual objects or effects. In order to produce virtual animations that align well with the real world, we aim to recognize the individuals in the public space, track their head poses to simulate the breathing effects, align the virtual effects to each individual, and render them with a head-mounted display (HMD) to highlight the exhaled gas in the real world. All the involved computation should be completed in real time, so that the augmentation of the real world appears natural to each subject. Figure 2 illustrates the pipeline of our prototype AR system.
Pipeline of our approach. (1) The camera on the AR HMD captures images and inputs them to our system. (2) We use face recognition to locate faces in the captured images. (3) We pass the positions of the detected faces to the ResNet50 network to regress the pose angles of the heads. (4) With the detected face and its orientation, we use the Unity engine to render the special effects. Users wearing the AR HMD see the displayed virtual effects
The HoloLens 2 is an MR device, but it is well-suited for AR tasks due to its ability to overlay virtual elements onto the real world effectively. Our primary goal is to deliver an AR experience that integrates virtual components with the real environment to assist with social distancing. This choice of device ensures that we can achieve the AR functionality required for our application (Speicher et al. 2019).
Using the camera view of the HoloLens 2 device as visual input, we use a facial recognition method to identify and annotate the pixel position of the face region. Three pose angles are then regressed from the annotated region using ResNet50 (He et al. 2015). Finally, the collected information is rendered by the Unity engine as special effects and displayed on the HoloLens 2, creating an augmented reality (AR) effect that is continuously updated in the camera video stream.
In the overall implementation, we use the HoloLens as the device for image acquisition and for displaying the rendered AR effects. Because the default vision toolkits provided by the current version of the HoloLens MRTK do not provide reliable head pose tracking for one or multiple people in the view, we only adopt the raw image data collected by the HoloLens sensor and integrate a dedicated head pose tracking module in our system. The computational tasks, however, are performed on a connected computer after the HoloLens transmits the data to it. The computer involved in the computation and the HoloLens device worn by the user are on the same local area network and communicate using sockets. The HoloLens transmits the image data of the current frame to the computer, which processes it and then sends all head pose data for that frame back to the HoloLens. The specific workflow is illustrated in Fig. 3. This approach is necessitated by the limited computational power of the HoloLens. Additionally, apart from processing the required data, the device needs to compute other parameters related to the AR display. Hence, relying solely on the computing unit of the HoloLens does not meet the computational speed required for our purposes.
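To make the frame exchange concrete, the following is a minimal sketch of the computer-side loop, assuming one JPEG-encoded frame per UDP datagram and a simple text reply; detect_faces and estimate_head_pose are hypothetical placeholders for the modules of Sect. 3.1, not our exact implementation.

```python
import socket

import cv2
import numpy as np

def detect_faces(frame):
    """Placeholder for the face detector of Sect. 3.1;
    returns a list of (x, y, w, h) bounding boxes."""
    return []

def estimate_head_pose(head_crop):
    """Placeholder for the ResNet50 pose regressor of Sect. 3.1;
    returns (yaw, pitch, roll) in degrees."""
    return 0.0, 0.0, 0.0

HOST, PORT = "0.0.0.0", 9000
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind((HOST, PORT))

while True:
    # one JPEG-encoded camera frame per datagram from the HoloLens
    data, addr = sock.recvfrom(65507)
    frame = cv2.imdecode(np.frombuffer(data, np.uint8), cv2.IMREAD_COLOR)
    if frame is None:
        continue
    poses = []
    for (x, y, w, h) in detect_faces(frame):
        yaw, pitch, roll = estimate_head_pose(frame[y:y + h, x:x + w])
        poses.append((x, y, w, h, yaw, pitch, roll))
    # send all head poses of this frame back; the Unity side transforms
    # the duplicated particle systems with these values
    reply = ";".join(",".join(f"{v:.2f}" for v in p) for p in poses)
    sock.sendto(reply.encode("utf-8"), addr)
```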
3.1 Head pose tracking
In our AR solution, the first problem to solve is recognizing each subject in the field of view of the AR HMD and tracking the pose of their heads. This is enabled by advanced learning-based computer vision techniques. Challenging technical problems in this task include tracking the head pose when it is in an extreme orientation or occluded, as well as dealing with cases where there are multiple persons in the scene. To robustly locate and track head poses, we incorporate a face tracking step followed by a head pose estimation step. For face tracking, we use a one-stage lightweight network to recognize the facial region (He et al. 2019), with the model trained on the WIDER FACE benchmark dataset. We use the tracked face to crop the raw image and solve for the head pose in the cropped region. For head pose estimation, traditional methods usually detect facial landmarks first and then use the Perspective-n-Point method (Zhu and Ramanan 2012; Bulat and Tzimiropoulos 2017) to restore the 3D pose from 2D information. However, this approach is less stable when only a small number of feature points are found in the view. Ruiz et al. (2018) demonstrated that a direct, holistic approach that estimates 3D head pose from image intensities using convolutional neural networks delivers superior accuracy compared with keypoint-based methods. Therefore, we use this deep-learning-based method to directly regress the head pose, with a model trained on the 300W-LP dataset (Zhu et al. 2016). After obtaining the image region containing the head, the network used to regress the head pose consists of convolutional and fully connected parts. The convolutional part is based on ResNet50, which directly receives the 3-channel RGB data of the head image. The output of ResNet50 is processed through an average pooling layer and a fully connected layer to estimate the three head pose angles. We note that ResNet50 is just one implementation of the head pose regression network in our system; lighter networks such as MobileNet are also applicable to trade off performance against computational cost. Although ResNet50 does not match the speed of such lightweight networks, it demonstrates slightly higher accuracy in scenarios involving occlusions and offers better robustness.
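For concreteness, the following is a minimal PyTorch sketch of such a regressor, assuming the 2048-dimensional pooled features of torchvision's stock ResNet50; the training setup on 300W-LP is omitted, and the code is illustrative rather than our exact implementation.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class HeadPoseNet(nn.Module):
    """ResNet50 backbone -> global average pooling -> fully connected
    layer regressing the three head pose angles (yaw, pitch, roll)."""

    def __init__(self):
        super().__init__()
        backbone = resnet50()
        # keep all layers up to and including the global average pool
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.fc = nn.Linear(2048, 3)   # 2048 = pooled feature width of ResNet50

    def forward(self, x):              # x: (B, 3, H, W) cropped head images
        f = self.features(x).flatten(1)
        return self.fc(f)              # (B, 3) pose angles

net = HeadPoseNet().eval()
with torch.no_grad():
    angles = net(torch.randn(1, 3, 224, 224))   # e.g. yaw, pitch, roll in degrees
```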
3.2 AR effects
We simulate the breathing droplets using a particle system built in the Unity engine. Higher-quality breathing effects based on, e.g., fluid simulation are also available with current graphics tools; however, to keep a higher running performance, we trade off the quality of the dynamic effects and use particle systems to highlight the respiratory droplets at an appropriate computation and rendering cost. After obtaining all the head pose information, we align the virtual effects with the images captured by the camera on the HMD to form AR views. The obtained head pose information includes the position and orientation angles of all heads within the current field of view. We collect the head pose data, send them to the Unity engine together with the position of each head in the pixel coordinate system and the 3D orientation estimated by the learning-based algorithm, and transform the particle system with the obtained head pose. If multiple subjects are detected, we duplicate the particle system and transform each copy with the corresponding head pose. To generate AR effects for all individuals when multiple people appear in the field of view, we maintain a list mapping each person to an AR effect and dynamically update it as the situation changes. Each time a new head appears in the current frame, its pose information occupies a new position in the list; when a head disappears from the field of view, its position in the list is cleared. The rendering of AR effects is based on the data stored in this list, with different AR effects distinguished by their storage positions. The visual effects are refreshed as the detected head poses are updated. We chose Microsoft's HoloLens 2 as the AR device; after deploying all the programs on the device, users see the visualization effects when wearing the HMD.
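The sketch below illustrates this list bookkeeping in Python; the actual implementation runs on the Unity side, and the slot count and field layout here are illustrative assumptions.

```python
MAX_HEADS = 8                 # illustrative capacity of the effect list
# slots[i] drives the i-th duplicated particle system; None = effect hidden
slots = [None] * MAX_HEADS

def update_slots(detections):
    """detections: list of (x, y, yaw, pitch, roll) tuples for the current
    frame. As described in Sect. 3.3, no identity tracking is performed:
    each detected head occupies the next position in the list, and slots
    beyond the current head count are cleared when heads leave the view."""
    for i in range(MAX_HEADS):
        slots[i] = detections[i] if i < len(detections) else None

# example: two heads in view this frame, so slots 2..7 stay cleared
update_slots([(320, 180, 5.0, -2.0, 0.5), (520, 200, -10.0, 1.0, 0.0)])
```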
One limitation of our work is that we cannot track a subject who is behind the HoloLens user, because the device has no rear-facing sensors that can be utilized. Rather than installing extra cameras behind the HMD, we note that when considering aerosol transmission, face-to-face interactions or encounters usually pose a higher infection risk. Therefore, in our current prototype, we did not consider tracking people outside the user's field of view.
3.3 Other implementation details
Considering the limited computational resources of AR devices and the requirement for real-time AR effects, we did not use labeled recognition when tracking the positions of human heads. Instead, we only identified the position information of all heads in the field of view without establishing each person's identity. This approach may make it impossible to associate the position changes of different individuals across frames. To address this, we let Unity process as many frames as possible per unit time and only generate special effects for the current frame. When the frame rate reaches a certain value, the effect presents a coherent sense of position to the human visual system to some extent.
Limited by current AR development toolkits, we can only call one monocular camera, and it is difficult to obtain depth information without resorting to external depth cameras. To further reduce the computational workload of AR devices and improve the processing speed, we did not use deep learning methods for depth estimation. Instead, we analyzed the pixel size of the identified heads. We fit a function relating the pixel size and the rotation angle to the depth of the person and found a good functional relationship between them, so we directly compute the estimated depth after obtaining all head information. The error introduced by this relationship was measured in a pre-experiment, which included depth estimation under different head poses; we compared the estimated depths with the actual distances to calculate the error. This quick method introduces a depth estimation error of less than 10%. Although deep learning methods can obtain more accurate depth information, in the AR rendering the depth only affects the visual scale of the special effects, and the size deviation caused by the error is small compared with the random size of the special effects. In subsequent experiments, the subjects did not report any abnormal perception of the size of the visual effects.
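As an illustration of this quick estimate, the sketch below fits an inverse-proportional model relating head pixel height and yaw angle to depth from a few calibration samples; both the model form and the numbers are assumptions for demonstration, not the calibrated values from our pre-experiment.

```python
import numpy as np

def fit_depth_model(head_heights_px, yaws_deg, depths_m):
    """Least-squares fit of depth ~ (k0 + k1*cos(yaw)) / pixel_height.
    A head of roughly fixed physical size projects to a pixel height
    inversely proportional to depth; the cosine term absorbs the
    foreshortening of the face region under rotation."""
    h = np.asarray(head_heights_px, float)
    c = np.cos(np.radians(yaws_deg))
    X = np.column_stack([1.0 / h, c / h])
    coef, *_ = np.linalg.lstsq(X, np.asarray(depths_m, float), rcond=None)
    return coef

def estimate_depth(coef, head_height_px, yaw_deg):
    k0, k1 = coef
    return (k0 + k1 * np.cos(np.radians(yaw_deg))) / head_height_px

# hypothetical calibration samples: pixel heights, yaw angles, true depths
coef = fit_depth_model([220, 150, 110, 88], [0, 10, 5, 20], [0.8, 1.2, 1.6, 2.0])
print(estimate_depth(coef, 120, 8))    # estimated depth in metres
```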
4 Results
We show the results produced by our AR special effects in this section. In our experiment, we manually adjusted the design parameters of the particle system, setting its emission distance to 0.8 m and its emission angle to 15° with reference to the reported range of breathing aerosols (Johnson and Morawska 2009). We plot a set of our results in Fig. 4. In Fig. 4a, we adjusted the color and transparency of the particles and show the bubble-like breathing droplets of a man talking in a street. We also show the effect of tracking two persons talking in a group in Fig. 4b. Each individual in the view of the AR glasses was well tracked, producing particle effects aligned with their head poses. Besides dynamic effects based on particle systems, we can also augment the scene with a pyramid shape to directly show the spatial range of the possible droplets, as shown in Fig. 4c. With the head pose tracked in real time, we can directly replace the particle animation with other shapes or scenes in the Unity engine for any other customized special effects. In Fig. 5, we also show a sequence from a video captured in real time, in which the speaker actively rotates his head, to test the online head pose tracking of our method. With AR particle effects generated for different head poses, we found that our prototype worked for many dynamically moving head poses.
In our experiments, we tested our model using a laptop with an NVIDIA GeForce RTX 4060 GPU, which was connected to the AR HMD using the UDP protocol. The average time taken for face recognition, pose estimation, and rendering is listed in Table 1. It can be observed that the output of special effects remains limited to approximately 30 frames per second. We employed an extrapolation method to ensure smoother transitions between frames, thereby avoiding abrupt changes in subsequent special effects generation. We used the actual data from the preceding three frames for second-order extrapolation to predict the head position, and spherical linear interpolation (slerp) to predict the pose angles. We upsampled the tracked poses to triple the frame rate, reaching more than 30 fps after the extrapolation. Note that we did not directly show the coordinate system of the tracked head and only utilized it as the source of the particles. Therefore, even if the extrapolated pose exhibited a subtle jump when a newly tracked pose was returned, the system did not expose noticeable artefacts: the tiny shift of the particle source emitted only a few randomly distributed moving particles, which was not observable among the large set of existing particles.
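The extrapolation step can be sketched as follows, with second-order (quadratic) extrapolation through the last three tracked positions and slerp with t > 1 to extrapolate the orientation; quaternions are in (w, x, y, z) order, and all names and values are illustrative.

```python
import numpy as np

def extrapolate_position(p0, p1, p2, t):
    """Fit a quadratic through the last three samples (at times -2, -1, 0)
    and evaluate at t in (0, 1] to predict intermediate positions."""
    p0, p1, p2 = (np.asarray(p, float) for p in (p0, p1, p2))
    a = (p0 - 2 * p1 + p2) / 2          # quadratic coefficient
    b = (p0 - 4 * p1 + 3 * p2) / 2      # linear coefficient
    return p2 + b * t + a * t * t       # t = 1 gives the classic p0 - 3*p1 + 3*p2

def slerp(q1, q2, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z);
    t > 1 extrapolates beyond q2 along the same rotation arc."""
    q1, q2 = np.asarray(q1, float), np.asarray(q2, float)
    dot = float(np.dot(q1, q2))
    if dot < 0.0:                       # take the shorter path
        q2, dot = -q2, -dot
    if dot > 0.9995:                    # nearly parallel: linear fallback
        q = q1 + t * (q2 - q1)
        return q / np.linalg.norm(q)
    theta = np.arccos(np.clip(dot, -1.0, 1.0))
    return (np.sin((1 - t) * theta) * q1 + np.sin(t * theta) * q2) / np.sin(theta)

# predict poses a third of a tracking interval ahead (tripling the rate)
p_pred = extrapolate_position([0.00, 0, 1.9], [0.02, 0, 1.9], [0.04, 0, 1.9], 1 / 3)
q_pred = slerp([1, 0, 0, 0], [0.996, 0, 0.087, 0], 4 / 3)   # extrapolate past q2
```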
5 User experiments
Our basic experiment aimed to explore the performance of users in maintaining social distance with and without AR effects. Our hypothesis is that with the proposed AR interface, users would actively keep a larger social distance. We conducted user experiments to check if our hypothesis was supported.
5.1 Participants
We invited a total of 50 volunteers to participate in the user experiment, including 36 college students aged between 18 and 25 (mean = 21, SD = 2.13) and 14 working adults aged over 30 (mean = 35, SD = 5.41). The participants consisted of 32 males and 18 females, and none of them had extreme visual impairments such as severe myopia or color blindness. Among these volunteers, only 2 had experience using an AR HMD.
5.2 Design procedure
Our user experiments were conducted in an empty corridor. Participants wore a HoloLens 2 HMD and stood in front of a wall with a simple background. Each of them was asked to listen to a speaker facing him/her, talking about an introduction to augmented reality technology for one minute. The participants were allowed to move freely to wherever they felt comfortable listening to the speaker. To make the test scene consistent for all participants, we asked the speaker to stand at the same position while talking in all the experiments, so that we could observe the behaviors of the participants in a consistent manner.
5.2.1 Comparison group
We performed three different experiments to form the comparison group: the scene without AR effects (no AR), with AR effects using a particle system to display the respiratory droplets (AR Particle), and with AR effects using a pyramid aligned to the head of the speaker to display the range of exhaled gas (AR Pyramid). Each participant wore the AR glasses and observed the three scenes of the comparison group in a random order. They were free to maintain whatever social distance from the speaker felt satisfactory in the three different scenes. We measured the quantitative metrics by observing their behaviors. After the tests were completed, we gave the participants a questionnaire and interviewed them for subjective feedback.
5.2.2 Measurements
As quantitative measurements, we tracked the standing position of each participant to obtain the social distance. The standing points of the recorder and the speaker were marked when we recorded the speaking scene, and we marked the standing position of each participant when the trial ended. Thus, we were able to compute the Euclidean distance to the speaker when the audience felt comfortable, which was used to approximate the social distance (Alshaweesh et al. 2023) in our experiment. After the experiment, we distributed a survey which mainly included ratings of the AR effects in terms of their assistance in maintaining social distance, covering naturalness (feeling the augmented scene physically meaningful), effectiveness (helpful in maintaining a greater social distance), and acceptability in practical application (willingness to use in daily life). We used a 5-point Likert scale to collect subjective ratings (1: strongly disagree, 5: strongly agree) on the three dimensions.
5.3 Results
During the experiment, we observed that when the AR effects (including the AR particle and AR pyramid effects) were turned on, the participants would naturally move to keep a larger distance from the place where the speaker stood. Although the speaking scene lasted for about 1 min, the audience spent only about 10 s finding a listening place where they felt comfortable. They did not move afterwards, so the distance measured after the scene was played was considered converged.
We processed all the data and obtained the estimated social distance that participants maintained under the three different conditions. The results are shown as a boxplot in Fig. 6a. We also plotted the final standing points of the participants in a coordinate system with respect to the speaker, as shown in Fig. 6b. Different colors in the legend represent the experimental conditions: no AR, AR with particle effects, and AR with pyramid effects. From the figure, we found that the participants preferred stepping back rather than moving to the sides. When we interviewed several participants about this observation after the data analysis, we were told that they felt better engaged in the conversation when facing the speaker. Some participants reported that the fastest way to maintain social distance is simply to step backwards, maintaining a straight-line distance from others.
Moreover, in reality, moving sideways while talking to someone is not considered polite, which was also reported by two participants to explain why they preferred to step backward for social distancing. In addition, because the deviation angle after moving was not large during testing, participants reported no abnormal visual effects when facing the recorded speaking scene.
We conducted Shapiro–Wilk normality tests on the data from the three conditions (no AR, AR with particle effects, and AR with pyramid effects), which showed that they all follow normal distributions (\(p = 0.18\), 0.24, 0.22). A t-test found a significant difference between the condition with no AR and the conditions with AR effects (both particle and pyramid) at the 99.9% confidence level. We also tested the difference between the AR effects using the particle system and the pyramid range; the t-test reported a significant difference at the 95% confidence level. Therefore, our hypothesis that users would actively keep a larger social distance with the proposed AR interface was supported. In addition, we found that among the different AR effects, the particle system had a stronger effect; we believe this difference is significant and not due to random factors.
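The analysis follows standard SciPy routines; the sketch below shows the procedure on hypothetical placeholder distances (paired t-tests are used under the within-subjects design of Sect. 5.2.1, and the Friedman test corresponds to the color study of Sect. 5.4.3).

```python
from scipy import stats

# hypothetical per-participant distances in metres, NOT the study data
no_ar       = [1.35, 1.42, 1.30, 1.55, 1.48]
ar_particle = [2.05, 2.10, 1.95, 2.20, 2.08]
ar_pyramid  = [1.85, 1.92, 1.78, 2.00, 1.90]

# normality check per condition
for name, d in [("no AR", no_ar), ("particle", ar_particle), ("pyramid", ar_pyramid)]:
    _, p = stats.shapiro(d)
    print(f"{name}: Shapiro-Wilk p = {p:.3f}")   # p > 0.05: normality not rejected

# within-subjects design: each participant saw all conditions -> paired t-tests
_, p1 = stats.ttest_rel(no_ar, ar_particle)
_, p2 = stats.ttest_rel(ar_particle, ar_pyramid)
print(f"no AR vs particle: p = {p1:.4f}; particle vs pyramid: p = {p2:.4f}")

# repeated measures over more than two related groups (particle colors)
blue  = [2.02, 1.95, 2.08, 1.99, 2.04]
pink  = [1.98, 2.03, 1.92, 2.05, 1.97]
clear = [2.00, 1.90, 2.06, 1.95, 2.01]
_, p_friedman = stats.friedmanchisquare(blue, pink, clear)
print(f"Friedman test across colors: p = {p_friedman:.3f}")
```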
In the questionnaire survey, we analyzed the subjective ratings of whether the AR interface was helpful. The results are shown in Fig. 6c. It can be seen that the particle group performs better, particularly in terms of naturalness. Analyzing the particle group data, the Shapiro–Wilk test indicated that the effectiveness ratings followed a normal distribution (\(p = 0.23\)), while normality was violated for acceptability and naturalness (\(p < 0.001\)). We also found positive feedback in the subjective ratings. Almost all participants believed that the AR effect was very effective in helping people maintain a certain distance; the average effectiveness rating given by the participants was approximately 4.35. Additionally, most participants found the AR effect visually natural and comfortable. However, when asked if they would be willing to use it in their daily lives, the evaluations were not consistent. Those who were less willing felt that wearing AR devices in daily life was inconvenient and that additional effects in their field of vision could affect their normal vision. Those who were willing felt that this portable AR device not only met their expectations for new technology but was also very effective in maintaining distance when required. In the survey, we also found that the vast majority of college students were willing to use AR devices, while those less willing were mostly older users.
In the follow-up interviews, participants also told us why they felt the interface was effective for social distancing. Many participants reported that they perceived the AR effects as if a physical object were moving toward them, making them more conscious of avoiding collisions, while others believed that the particle system produced an effect similar to sneeze droplets, making them perceive the airflow as carrying contaminants and prompting them to avoid close contact. In addition, some participants said that even after removing the AR device, they still recalled the particle effects from the previous experiment, which naturally led them to maintain a certain social distance. This, to some extent, achieved the intended application goal of our system.
Based on the above experimental data, we concluded that AR effects can help people maintain a greater social distance and induce a subjective tendency to avoid others, thereby preventing close contact between people.
5.4 Further study
Given that the AR interface using particle systems to highlight the breathing effects was commonly preferred in our first user study, we kept this design and continued studying the influence of further design parameters of the particle system in our application. The particle system was guided by a conic region with its apex at the mouth and nose area. The particles were generated from the apex and spread towards the base of the cone; the height and angle of the cone corresponded to the emission distance and spray angle of the particles. Here we studied how these parameters, which control the range of the particle system, affect social distancing. Furthermore, we adjusted the color of the particles to investigate whether the appearance would improve the overall performance. We selected 30 volunteers from the basic experiment who were more willing to use AR devices and invited them to participate in our further study.
5.4.1 Particle emission distance
The particle emission distance corresponded to the height of the cone that generated the particle effects, which was controlled by the velocity and the life span of the particles. In the AR system, a longer emission distance meant that the user would perceive the particles on the HMD travelling a longer distance from the mouth and nose before dissipating, which visually created a more significant effect. We set up six different particle effects with total emission distances ranging from 0.2 to 1.2 m (the value was 0.8 m in our basic experiment), while keeping other parameters consistent with the baseline experiment, and repeated the social distancing experiment. The experimental results are shown in Fig. 7a.
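Since the emission distance is the product of particle speed and lifetime, varying it reduces to adjusting those two parameters; the helper below illustrates the mapping, where the parameter names echo Unity's ParticleSystem main module, and the fixed speed and the even spacing of the six distances are assumptions.

```python
def particle_params(emission_distance_m, start_speed_mps=0.4):
    """Map a desired emission distance to illustrative particle settings:
    a particle moving at start_speed for start_lifetime seconds travels
    their product, so we fix the speed and solve for the lifetime."""
    return {
        "startSpeed": start_speed_mps,                           # m/s (assumed)
        "startLifetime": emission_distance_m / start_speed_mps,  # seconds
        "coneAngle": 15.0,                                       # degrees, as in Sect. 4
    }

for d in (0.2, 0.4, 0.6, 0.8, 1.0, 1.2):    # six tested distances (spacing assumed)
    print(d, particle_params(d))
```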
As the traveling distance of the particles increased, the social distance that the subjects maintained showed an upward trend. However, when the visual length increased beyond a certain threshold, the maintained social distance no longer increased prominently: when the particle emission distance was greater than 1 m, the social distance remained in the range of 2–2.1 m.
Regarding the above observation, we interviewed some of the subjects about the possible reasons. We found that as the emission length increased from a smaller value, the subjects tried to avoid the emitted particles even from a farther distance, and therefore moved to keep a longer social distance. However, because the particles spread outward at a certain angle, when the emission distance was too long, the density of the particles dropped to a low level after traveling a long distance. Thus, the visual impact weakened and subjects tended not to over-react to the sparse particles. Some subjects reported that if the emission distance was too far, the particles no longer looked like aerosols produced by breathing, which also weakened their awareness of avoidance psychologically.
5.4.2 Particle spray angle
We varied the particle spray angle, represented by the cone angle controlling the range of the particle effects. In our AR particle system, a larger particle spray angle meant that the particles would be observed spreading over a wider angle, creating a visually larger coverage area. We set up five different particle effects with spray angles ranging from 5° to 25° (the baseline experiment used 15°), while keeping other parameters consistent with the baseline experiment, and repeated the social distancing experiment with the participants. The results are shown in Fig. 7b.
When the angle was small (5° and 10°), the change in the social distances maintained by the participants was also small; the average social distance was close to the cases without the AR interface (less than 1.5 m). When the angle increased to a certain value, the social distance increased (approaching 2 m), but above a threshold (about 20°), the social distance decreased to some extent.
For the above situation, we interviewed several participants and summarized the reasons as follows. When the particle spray angle was too small, the visual effect was basically similar to a ray in space; it was difficult to identify a dynamic conical region, which made it hard for users to associate the effect with actual breathing aerosols, resulting in less motivation to avoid it. When the spray angle was too large, the particles rapidly expanded outward, causing the particle effects to block most of the speaker's face from the front view. This also made it hard for users to associate the effect with breathing aerosols. Furthermore, it hindered the participants' movement as well as their recognition of the people in front of them.
5.4.3 Particle color
Particle colors visually represented different particle materials rendered by the particle system. We used different bubble textures to produce particle systems with different visual effects, with colors simulating light blue, purple, pink, yellow, green, or transparent bubbles. In this study, we again kept the other parameters consistent with the basic experiment. We conducted repeated experiments and report the social distances maintained by the participants. The results are illustrated in Fig. 7c.
From the data, we found little difference in the performance of the various particle colors in helping to maintain social distance. We conducted a Friedman test on the data grouped by color, which showed no significant differences between the groups (\(p = 0.34\)). The pink, yellow, and transparent bubbles performed slightly worse than the darker colored bubbles: the average social distance was about 2.03 m with darker colors, while for the transparent effects it was about 1.79 m. This might be due to the limitations of the HoloLens display, as lighter colored particle effects were not very clear to the user in bright lighting conditions. Some participants indicated that in bright lighting, the light-colored bubbles appeared almost transparent and their changes were difficult to observe, similar to the transparent bubbles; therefore, they may not be suitable for our system. However, considering that different people might have perceptual biases towards colors, the generalization of these results needs further study.
In our further study, we did not experiment with different particle sizes. Based on the above three design parameters, it was natural to expect that if the particle size were too small, the density of the particles would not be high enough, whereas the view would be blocked by extremely large particles. When selecting the size of the particles, we referred to the common sizes of bubbles in daily life (such as those produced by a bubble machine), aiming for a visually comfortable value, and we kept this size unchanged in subsequent experiments.
6 Limitation and future work
Our experiments verified the power of the proposed AR interface in motivating users to actively keep a good social distance. Based on the recommended social distance for COVID-19, which is 1.8 m, we are able to select appropriate design parameters to assist social communication in the physical world. With the AR tools currently available, there are still some issues to be discussed.
Firstly, head pose tracking is still not a completely solved problem in computer vision. Our system works well for most typical scenes; however, for head poses at an extremely large angle or in cases of severe occlusion, the visual information may not be sufficient for tracking the head pose. Such extreme cases are also challenging for face recognition. In a dynamic environment, people outside the field of view may also face the AR user, and it is important to detect these cases by considering the temporal continuity of the scene. Therefore, we expect that more robust human detection and head pose estimation algorithms will help improve the performance of our system.
Secondly, the computational cost is essential for the proposed AR interface, which is why we do not use physics-based simulation of the droplets in our system. Considering that our application will be used in street-level environments in the future, we must ensure that the system's computational capability meets real-time requirements. Limited by the HoloLens's computing power, our current solution uses an external GPU to assist the computation, which imposes certain constraints on portability. To address this, optimizing lightweight networks to further reduce computational costs could allow the application to fit within the HoloLens's processing power without additional computing devices. Another solution is to improve the hardware performance of AR deployment devices or to build a cloud computing structure for our application with the help of modern communication techniques.
During deployment, we also found that the mixed reality toolkit (MRTK) provided for the HoloLens, working with the Unity engine, does not optimize the real-time streaming of camera data to Python APIs well, causing some unnecessary networking time cost. Moreover, because the experimental setup involves connecting the HoloLens to a laptop, portability is limited. Although our pipeline could also be transferred to other AR environments such as Apple ARKit, we work with wearable AR glasses rather than a mobile-phone AR device because it is more natural to observe the augmented scene through AR glasses: users do not need to hold tablets or phones and remain free for common social communication with hand gestures. We believe that real-time AR will not be a problem given the updates from the community of AR development environments and upcoming hardware such as the Apple Vision Pro.
Our current research is primarily based on data from younger individuals, with less consideration given to older adults and other demographic groups. In future studies, experiments will involve older adults and other populations. Given that older adults may experience sensory decline and slower reaction times, we may need to customize the system differently for them compared to younger users; for example, extending the particle emission distance or making the particles more vibrant might be helpful. This could involve adjusting certain parameters of the current AR system to ensure a better user experience for older individuals. In addition, the aerosols produced by breathing are influenced by various factors, such as lung capacity; for example, individuals with larger body sizes or greater body mass might generally expel more particles. To ensure the AR system's effectiveness across different demographics, it needs to account for these variations and adjust its parameters and models accordingly. For example, in future work the system could dynamically adjust the density and range of the aerosol particle effects based on the detected body size to more accurately simulate real-world conditions. This approach can enhance the system's applicability and effectiveness for diverse populations.
In our study, the background of the scenario was relatively simple, and we note that in extreme cases the head pose estimation module may still fail if the human head is badly occluded or the image is corrupted. In practice, however, we tested our system and found that it commonly worked well in real-world scenarios with cluttered environments. With the advancement of modern computer vision techniques (Roberto et al. 2020), we believe this learning-based module will be ready for daily use.
There are also issues with the display quality of AR devices. In our testing, there is a certain difference in display quality between the HoloLens 2 HMD and the general LED screens we see in daily life, especially in high-brightness outdoor environments (Kress and Cummings 2017). In addition, the field of view of current AR glasses is still not large enough. With the release of new versions of AR glasses, we expect a better experience in social communication with the proposed AR interface in terms of visual quality.
In this work, we evaluated social distances in a face-to-face communication scene with a speaker talking and an audience listening. This is a representative scene, and the pairwise social distance is easy to evaluate. According to existing reports (Al-Sa'd et al. 2022), the average social distance for multiple persons is highly correlated with the pairwise social distance. Therefore, we consider that our experiment produces meaningful results in examining the effects of the proposed AR interface, although we only study a simple scenario. In the future, we will study our AR interface in more complicated social communication cases, such as walking on a street with random pedestrians approaching from the opposite direction, or even a party scene with many random groupings of participants. Furthermore, it will be interesting to work on collaborative AR to study the case in which all participants wear HMDs and observe each other through our AR interface. We will also explore making our system more intelligent to cope with, e.g., speakers wearing masks or suddenly sneezing; this will require more computational resources, and efficient learning-based solutions remain to be researched.
7 Conclusion
We propose an AR-based system that can assist individuals in maintaining a safe social distance. Through user experiments, we validated the effectiveness of the AR effects, and our further study investigated the impact of various AR parameters on user experience and on the system's effectiveness in promoting social distancing. The findings highlight the potential of AR systems to enhance social distancing through real-time visualization of aerosol spread and head pose tracking. The implications of this study extend beyond controlled environments to real-world applications, such as crowded urban settings and public transportation systems, and we may also customize our design for different populations. By demonstrating the effectiveness of AR in visualizing invisible risks like aerosol transmission, this research provides valuable insights for future innovations in public health and safety technology.
Although the era of the coronavirus is leaving us, the shock of the pandemic has significantly changed our lives and our perspective on how to share the world with others. History does not repeat itself, but it rhymes. We hope our work will help those who come after us face the next unknown epidemic disease. With the growing ecosystem of AR devices and platforms, we also hope the proposed AR interface will develop into a natural and useful tool for pandemic prevention in the future.
Data availability
No datasets were generated or analysed during the current study.
References
Abouk R, Heydari B (2021) The immediate effect of COVID-19 policies on social-distancing behavior in the United States. Public Health Rep 136(2):245–252
Ahied M, Muharrami L, Fikriyah A, Rosidi I (2020) Improving students’ scientific literacy through distance learning with augmented reality-based multimedia amid the COVID-19 pandemic. J Pendidikan IPA Indonesia 9(4):499–511
AlMazeedi S, AlHasan A, AlSherif O, Hachach-Haram N, Al-Youha S, Al-Sabah S (2020) Employing augmented reality telesurgery for COVID-19 positive surgical patients. Br J Surg 107(3):386–387
Al-Sa’d M, Kiranyaz S, Ahmad I, Sundell C, Vakkuri M, Gabbouj M (2022) A social distance estimation and crowd monitoring system for surveillance cameras. Sensors 22(2):418
Alshaweesh O, Wedyan M, Alazab M et al (2023) A new augmented reality system for calculating social distancing between children at school. Electronics 12(2):358
Anastasiou C, Costa C, Chrysanthis PK, Shahabi C, Zeinalipour-Yazti D (2022) ASTRO: reducing COVID-19 exposure through contact prediction and avoidance. ACM Trans Spatial Algorithms Syst (TSAS) 8(2):11
Bulat A, Tzimiropoulos G (2017) How far are we from solving the 2d & 3d face alignment problem? (and a dataset of 230,000 3d facial landmarks). In: International conference on computer vision, pp 1–5
Carmigniani J, Furht B, Anisetti M, Ceravolo P, Damiani E, Ivkovic V (2011) Augmented reality technologies, systems and applications. Multimed Tools Appl 51:341–377
Chakareski J (2017) VR/AR immersive communication: Caching, edge computing, and transmission trade-offs. In: Proceedings of the workshop on virtual reality and augmented reality network, pp 36–41
Chakraborty S, Stefanucci J, Creem-Regehr S, Bodenheimer B (2021) Distance estimation with social distancing: a mobile augmented reality study. In: 2021 IEEE international symposium on mixed and augmented reality adjunct (ISMAR-adjunct), pp 87–91
Chen Z, Fan T, Zhao X et al (2021) Autonomous social distancing in urban environments using a quadruped robot. IEEE Access 9:8392–8403
Chen J, Jia L, Liang Y, Li Y (2023) Investigation and design response of social distancing in urban outdoor spaces under the normalized epidemic prevention background. Chin J Urban For 21(01):81–86
Darus NM, Saahar S (2022) Effective communication and organization culture in enhancing employee’s work performance during work from home (WFH). Malays J Soc Sci Humanit (MJSSH) 7(5):e001478–e001478
Duffy B, Smith K, Terhanian G, Bremer J (2005) Comparing data from online and face-to-face surveys. Int J Mark Res 47(6):615–639
Fong MW, Gao H, Wong JY et al (2020) Nonpharmaceutical measures for pandemic influenza in nonhealthcare settings-social distancing measures. Emerg Infect Dis 26(5):976
He F, Li M, Maniker RB, Kessler DO, Feiner SK (2021) Augmented reality guidance for configuring an anesthesia machine to serve as a ventilator for COVID-19 patients. In: Proceedings of the 2021 IEEE conference on virtual reality and 3D user interfaces abstracts and workshops (VRW). IEEE, Lisbon, pp 701–702
He Y, Xu D, Wu L, Jian M, Xiang S, Pan C (2019) LFFD: a light and fast face detector for edge devices. arXiv:1904.10633
He K, Zhang X, Ren S, Sun J (2015) Deep residual learning for image recognition. arXiv:1512.03385
Johnson GR, Morawska L (2009) The mechanism of breath aerosol formation. J Aerosol Med Pulm Drug Deliv 22(3):229–237
Kamga C, Eickemeyer P (2021) Slowing the spread of COVID-19: review of “social distancing’’ interventions deployed by public transit in the united states and Canada. Transp Policy 106:25–36
Kress BC, Cummings WJ (2017) Towards the ultimate mixed reality experience: Hololens display architecture choices. In: SID symposium digest of technical papers, vol 48. pp 127–131
Labib UA, Subiantoro AW, Hapsari WP (2021) Augmented reality based media for learning biology during the COVID-19 pandemic: Student admission. In: Proceedings of the 6th international seminar on science education (ISSE 2020). Atlantis Press, Yogyakarta, pp 899–905
Lelieveld J et al (2020) Model calculations of aerosol transmission and infection risk of COVID-19 in indoor environments. Int J Environ Res Public Health 17(21):8114
Lorusso ML, Giorgetti M, Travellini S, Gelsomini M, Roccetti M, Ferretti S, Casadei M, Cantamesse M, Marfia G, Bellini P (2018) Giok the alien: an AR-based integrated system for the empowerment of problem-solving, pragmatic, and social skills in pre-school children. Sensors 18(7):2368
Luck J, Gosling N, Saour S (2021) Undergraduate surgical education during COVID-19: Could augmented reality provide a solution? Br J Surg 108(1):129–130
Marks P (2020) Virtual collaboration in the age of the coronavirus. Commun ACM 63(9):21–23
Martí Mason D, Kapinaj M, Pinel Martínez A, Stella L (2020) Impact of social distancing to mitigate the spread of COVID-19 in a virtual environment. In: 26th ACM symposium on virtual reality software and technology (VRST ’20). ACM, New York, p 3
Mukhopadhyay A et al (2022) Virtual-reality-based digital twin of office spaces with social distance measurement feature. Virtual Real Intell Hardw 4(1):55–75
Mukhopadhyay A, Reddy GSR, Ghosh S, Murthy LRD, Biswas P (2021) Validating social distancing through deep learning and VR-based digital twins. In: Proceedings of the 27th ACM symposium on virtual reality software and technology, VRST ’21. Association for Computing Machinery, New York
Munzil M, Rochmawati S (2021) Development of e-learning teaching materials based on guided inquiry models equipped with augmented reality on hydrocarbon topics as teaching materials for COVID-19 pandemic. In: AIP conference proceedings, vol 2330. AIP Publishing LLC, p 20025
Nadikattu RR, Mohammad SM, Whig P (2020) Novel economical social distancing smart device for COVID-19. Int J Electr Eng Technol (IJEET)
Papagiannis H (2020) How AR is redefining retail in the pandemic. Harv Bus Rev 7
Roberto V, Buenaposada JM, Baumela L (2020) Multi-task head pose estimation in-the-wild. IEEE Trans Pattern Anal Mach Intell 43(8):2874–2881
Ruiz N, Chong E, Rehg JM (2018) Fine-grained head pose estimation without keypoints. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 2074–2083
Speicher M, Hall BD, Nebeling M (2019) What is mixed reality? In: Proceedings of the 2019 CHI conference on human factors in computing systems, pp 1–15
Vanhaeverbeke J et al (2023) Real-time estimation and monitoring of COVID-19 aerosol transmission risk in office buildings. Sensors 23(5):2459
Wilson AD (2020) Combating the spread of coronavirus by modeling fomites with depth cameras. Proc ACM Hum Comput Interact 4(ISS):1–13
Zhu X, Lei Z, Liu X, Shi H, Li SZ (2016) Face alignment across large poses: A 3d solution. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 146–155
Zhu X, Ramanan D (2012) Face detection, pose estimation, and landmark localization in the wild. In: Proceedings IEEE conference on computer vision and pattern recognition (CVPR), pp 2879–2886
Acknowledgements
This work has been supported by the NSFC under Grants No.62133009 and 92148205, the Natural Science Foundation of Jiangsu Province Major Project under grant BK20232008, Jiangsu Key Research and Development Plan under Grant BE2023023-4, the Joint Fund Project 8091B042206 and the Fundamental Research Funds for the Central Universities.
Author information
Contributions
Chaoran Li conducted software development and wrote the main manuscript text. Lifeng Zhu coordinated the planning and wrote the main manuscript text. Heng Zhang conducted experimental investigation and data presentation. Aiguo Song was responsible for managing and execution of the research activities. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary file 1 (mp4 101804 KB)
Supplementary file 2 (mp4 17661 KB)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.