1 Introduction

As urban sensing technologies advance, an increasing amount of macroscopic and microscopic data about urban spaces is being collected and stored in digital form. For example, mobile phones generate large volumes of Call Detail Records (CDRs) that allow researchers to analyze macroscopic patterns of urban dynamics, and surveillance cameras generate large amounts of image data that can be used to analyze microscopic patterns of pedestrians and vehicles. Collecting such data enables effective spatial analytics for designing and improving commercial spheres, railway station spheres, etc. However, it remains difficult to collect rich, microscopic data about large spaces such as city blocks and neighborhoods.

In this paper, we build on our previous research [1, 4] to explore techniques and tools for collecting detailed behavioral data in large public spaces by deploying a small number of technology-armed researchers who act according to mobile notifications. To go beyond the limitations of conventional urban sensing, we first examine the challenges of human-in-the-loop sensing. We then propose a mobile behavior sampling tool based on smart notifications to address the challenge of in-situ sampling.

One of the challenges in designing smart notifications is to optimize collective data collection efforts so that they satisfy clearly defined criteria such as the minimization of biases. We thus discuss notifications that are based on statistical and spatial models to address this challenge. When the goal of data collection is to construct datasets for machine learning algorithms, we could exploit proactive and active machine learning techniques as well.

Another challenge arises from the wickedness [12] of data collection for understanding (and designing for) people and their practices in urban spaces. In particular, we need to consider shifting modes of observation [13] and exploratory processes [5], and thus focus on satisficing rather than optimal solutions in many cases.

2 Limitations to Conventional Urban Sensing

Despite the recent advances in urban sensing technologies, it is still difficult to collect sufficient data for detailed analysis of various human behaviors at scale.

Conventional approaches to urban data collection have a number of limitations as discussed below:

  (i) Pedestrian traffic census is widely used to quantify the number of people who pass by a particular location in a city. An apparent limitation of this approach is the high cost of direct observation.

  (ii) Mobile phone carriers collect Call Detail Records (CDRs), which allow for tracking of mobile-phone users. However, the granularity of the data collected by using this approach is often quite coarse in terms of time and space.

  (iii) We can also ask volunteers to carry location tracking devices such as GPS receivers. However, doing so requires time-consuming and costly processes to prepare the devices and set up complicated technological infrastructure. In addition, it is difficult to collect data from a large number of unbiased samples using this approach. Thus, we cannot easily understand the behaviors of the entire population in the physical space of concern.

  (iv) We can use radio signals from WiFi and/or Bluetooth-enabled commodity devices such as smartphones to estimate the locations of people in indoor spaces. In addition, advanced GNSS (Global Navigation Satellite System) technologies allow for detailed location tracking in outdoor spaces as well. However, the data collected by using these technologies would not be representative of everyone in the physical space of concern. Also, it would be difficult to employ this approach successfully without addressing privacy concerns.

  (v) Networks of surveillance cameras may facilitate collection of detailed unbiased data. In particular, Benenson et al. [8] have exploited deep learning algorithms to derive pedestrians’ detailed behavioral information from video-camera images. However, it may be difficult to install video cameras in certain urban spaces because of privacy concerns. Moreover, trees, vehicles and buildings can occlude the views of video cameras.

  (vi) Device-free localization and activity recognition techniques [9] exploit various sensor-detectable patterns such as the changes of ambient radio signals. Despite the developments in this area, it is still difficult to collect detailed data at scale just by relying on this technology.

Crowd replication [1] has been proposed to address some of these limitations. It relies on sensor-armed volunteers who mimic the behaviors of people in public spaces. The volunteers record data from their own mobile and wearable devices while replicating the behaviors of people in proximity. The feasibility of this approach has been tested through a field experiment involving four sensor-armed volunteers who collected data about a large space near a train station in Japan. A critical aspect of this approach is the sampling strategy for determining the people whose behaviors are to be mimicked by researchers. Without a proper sampling strategy, we may end up collecting biased and/or useless data. In this paper, we propose approaches that could complement this technique.

3 Challenges for Human-in-the-Loop Urban Sensing

We introduce five approaches to enable meaningful analysis of various human behaviors at scale, i.e., in-situ sampling, estimating social activities and emotions, improving data quality in context, meta-sensing, and context-aware privacy and data modeling.

3.1 In-Situ Sampling

In-situ sampling is the act of selecting samples in the field. For example, an urban researcher may select, in situ, the next person to observe in a public space. We can devise computational tools for supporting this process, such as a mobile tool that recommends targets for crowd replication [1] or direct observation. We can design such a tool based on relevant statistical models, thereby supporting or scaffolding the in-situ decision making by researchers.

In-situ sampling tools can consider various models of the real world to help researchers collect data efficiently. For instance, Tobler’s first law of geography (near things are more related than distant things) or known patterns of spatial autocorrelation can be exploited when investigating a relevant real-world phenomenon.
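As one illustration of how spatial autocorrelation could inform such a tool, the following minimal sketch computes global Moran's I, a standard autocorrelation statistic, over measurements taken at several observation points. It is not taken from the system described here: the function names and the inverse-distance weighting are assumptions chosen for illustration. A value near zero would suggest that nearby points carry little redundant information, whereas a strongly positive value would suggest that sparser sampling may suffice.

```python
import math


def inverse_distance_weights(coords):
    """Build a spatial weight matrix with w_ij = 1 / d_ij and w_ii = 0.

    The inverse-distance form is an assumption made for this sketch.
    """
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(coords[i], coords[j])
                weights[i][j] = 1.0 / d if d > 0 else 0.0
    return weights


def morans_i(values, weights):
    """Global Moran's I for values x_i and spatial weights w_ij."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / w_sum) * (num / den)
```

For example, `morans_i(counts, inverse_distance_weights(points))` could be evaluated on pedestrian counts gathered so far, where `counts` and `points` are hypothetical inputs.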

3.2 Estimating Social Activities and Emotions

We can use machine learning techniques to infer social activities and emotions of the people in public spaces. Our preliminary analysis of human activities in public spaces suggested that the group size and the strength of body motion are correlated with different social activities. Moreover, in line with existing research (e.g., [10]), we can exploit off-the-shelf devices such as smartphones, smart watches and smart glasses to analyze detailed motions and infer geospatial emotional perception. In addition, those devices allow for collection and analysis of proximity patterns that may influence or reflect social interactions and activities. These provide starting points for constructing richer datasets about behaviors in public spaces.

3.3 Improving Data Quality in Context

We can exploit contextual information to improve the quality of collected data. For example, when researchers observe or mimic behaviors of people in cities, we should consider the impact of distances on human perception [11]. We should also consider the temporal distance between the time something happens and the time at which it is recorded. These contextual factors may affect the quality of the data produced by different researchers in different ways. In this context, we can employ data-centric approaches to develop various types of personalized mechanisms for the improvement of data quality.

3.4 Meta-sensing

There are other approaches to improving the quality of collected data. For example, we can use notifications to ask multiple researchers to observe the same target person. We can also use notifications to have researchers observe other researchers. These types of observation can be considered ‘sensing’ of the urban sensing environment itself, which we call urban meta-sensing. They allow for the evaluation of researchers as well as the ‘calibration’ of the human-in-the-loop urban sensing system.
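When two researchers are asked to observe the same targets, one conventional way to quantify their agreement is Cohen's kappa, which corrects the raw agreement rate for agreement expected by chance. The sketch below is an illustration only, not part of the proposed system; the function name and the activity labels in the usage note are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two observers labelling the same sequence of targets."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both observers must label the same non-empty set of targets")
    n = len(labels_a)
    # Fraction of targets on which the two observers agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each observer's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used a single identical label throughout
    return (observed - expected) / (1.0 - expected)
```

For instance, `cohens_kappa(["walk", "stand"], ["walk", "stand"])` indicates perfect agreement, while values near zero would flag a pair of observers whose records may need calibration.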

3.5 Context-Aware Privacy and Data Modeling

We should collect data carefully in order to address the privacy concerns of people in public spaces. Even when researchers collect only anonymous data, observing or mimicking the behaviors of pedestrians in close proximity may disturb them. In this case, the concern is not so much the privacy of the collected data. However, it is clearly important to protect the privacy of people in the physical space, since privacy is inseparable from physical distance. It also seems inseparable from temporal factors such as the length of observation (e.g., milliseconds, seconds, minutes, hours, and days).

As distances and temporal lengths can affect data quality as well as privacy concerns, we argue for a modeling approach that considers both privacy and data quality in relation to their physical context, such as distances and temporal lengths. This can lead to the development of a customizable, privacy-aware and data quality-aware framework for data collection.

4 Smart Notifications for In-Situ Sampling

Next we discuss the design and development of smart notifications that help researchers and volunteers select the targets for data collection in situ in urban spaces. In general, it is too time-consuming and costly to observe everyone in a public space. One might consider deploying a large number of researchers for exhaustive data collection. However, observer effects would make such an approach infeasible. We thus focus on the need to collect unbiased, useful data by observing or replicating the behaviors of a limited number of people. In this context, we propose a mobile tool that recommends appropriate targets for observation (or replication) based on relevant sampling methods [14,15,16].

The mobile tool is intended to support the following three sampling methods:

  1. Notification-based sampling method: We assume that researchers and/or volunteers use mobile devices such as smartphones, smart watches, or smart glasses to receive notifications from the computational backend system. The backend system collaborates with mobile clients to trigger notifications based on different sampling strategies so as to enhance the perceptions and support the in-situ decision making of researchers and volunteers.

  2. Simulation-based spatial and cluster sampling method: Although we sometimes desire to collect perfect data, we must consider a number of practical constraints when collecting data in the real world. There are a number of practical sampling methods for observational studies [14]. In general, these practical sampling approaches consider the nature of data collection and relevant research goals in order to maximize the usefulness of the collected data while keeping the costs of data collection reasonably small. Of particular interest in the context of this research are spatial sampling [16] and cluster sampling [15], as they can be applied to the analytics of behaviors in urban spaces. One of the challenges in applying these techniques to urban spaces is the potential limitation of the models of spaces and clusters. We thus consider the use of simulation-based modeling of spaces and clusters.

  3. Adaptive sampling method: When researchers and volunteers receive a notification, it guides them to a location at which appropriate samples can be observed. It also shows how they can select samples at the location (e.g., “select the person that arrives at the location next”). In this manner, the system combines computational algorithms and “physically-based algorithms” to select samples. As they receive multiple notifications and observe multiple targets, the system incrementally accumulates data that is useful for improving and refining future sampling processes. In this context, we exploit adaptive sampling as part of the mobile tool.
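To make the idea of adaptive sampling concrete, the following sketch shows one simple rule for choosing where to send the next notification: visit under-sampled locations first, then the location whose running estimate is currently least certain. This rule, the function name, and the data layout are assumptions made for illustration; the paper does not specify the adaptive strategy at this level of detail.

```python
import math
import statistics


def next_location(observations, min_samples=2):
    """Pick the location to which the next notification should guide an observer.

    observations maps a location id to the list of numeric measurements
    collected there so far (a hypothetical data layout).
    """
    if not observations:
        raise ValueError("at least one candidate location is required")
    if min_samples < 2:
        raise ValueError("min_samples must be at least 2 to estimate a variance")

    # Locations with too few samples are visited first (fewest samples wins).
    under = [loc for loc, xs in observations.items() if len(xs) < min_samples]
    if under:
        return min(under, key=lambda loc: (len(observations[loc]), loc))

    # Otherwise choose the location with the largest standard error of the mean.
    def standard_error(loc):
        xs = observations[loc]
        return statistics.stdev(xs) / math.sqrt(len(xs))

    return max(observations, key=standard_error)
```

Each completed observation would be appended to the corresponding list before the rule is evaluated again, so the allocation shifts as data accumulates.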

We intend to evaluate the effectiveness and usability of the mobile tool and the backend system, and to improve them iteratively. This will include simulation-based evaluation of different aspects of the collected data as well as usability evaluation based on common assessment tools such as NASA-TLX.

5 Simulation-Based Modeling of Spaces and Clusters

As discussed in the previous section, we consider the uses of simulation-based modeling of spaces and clusters. We first use a model of pedestrian behaviors to generate large datasets of simulated pedestrian behaviors within the specified city blocks and neighborhoods. Subsequently, we derive relevant clusters and spatial patterns based on the datasets.

We have extended the Social Force Model (SFM) [17] by integrating it with a probabilistic model of route-choice behavior [19]. SFM emulates the motion of pedestrians as if they act according to “social forces.” This model can consider the shapes and structures of roads and the interactions among pedestrians, and can produce finer-grained pedestrian movements than simpler models such as the Random Waypoint Model.

In this model, the velocity \( \vec{v}_{i} \) of pedestrian \( i \) is governed by four force terms and a fluctuation term:

$$ \frac{d\vec{v}_{i}}{dt} = \vec{f}_{i} + \vec{f}_{iB} + \sum_{j \ne i} \vec{f}_{ij} + \sum_{k} \vec{f}_{ik} + \text{fluctuations} $$

  1. \( \vec{f}_{i} \) is the acceleration toward the next destination considering a desired speed

  2. \( \vec{f}_{iB} \) is the repulsive force due to borders

  3. \( \vec{f}_{ij} \) is the similar repulsive force due to pedestrian \( j \)

  4. \( \vec{f}_{ik} \) is the attractive force due to people, objects, and events at position \( \vec{k} \)

The above equation also takes into consideration fluctuations due to accidental or deliberate deviations from the optimal behavior. The desired speed is approximately Gaussian distributed with a mean value of 1.3 m/s and a standard deviation of 0.3 m/s [18].
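The following sketch illustrates how one time step of such a model might be integrated. It is a simplified illustration rather than the implementation used in this work: it includes only the driving term \( \vec{f}_{i} \) and the pedestrian repulsion \( \vec{f}_{ij} \), and omits border forces, attractive forces, and fluctuations. The functional forms (relaxation toward the desired velocity and exponentially decaying repulsion) are common choices in the SFM literature and are assumed here, and the constants `TAU`, `A`, and `B` are placeholder values. Only the desired-speed distribution (mean 1.3 m/s, standard deviation 0.3 m/s) comes from the text above.

```python
import math
import random

TAU = 0.5  # relaxation time in seconds (assumed placeholder)
A = 2.0    # repulsion strength in m/s^2 (assumed placeholder)
B = 0.3    # repulsion range in metres (assumed placeholder)


def sample_desired_speed(rng):
    """Draw a desired speed from N(1.3, 0.3) m/s; the 0.1 m/s floor is an assumption."""
    return max(0.1, rng.gauss(1.3, 0.3))


def driving_force(pos, vel, goal, desired_speed):
    """Acceleration that relaxes the velocity toward the desired velocity."""
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = math.hypot(dx, dy)
    ex, ey = (dx / dist, dy / dist) if dist > 0 else (0.0, 0.0)
    return ((desired_speed * ex - vel[0]) / TAU,
            (desired_speed * ey - vel[1]) / TAU)


def repulsive_force(pos_i, pos_j):
    """Exponentially decaying repulsion pushing pedestrian i away from j."""
    dx, dy = pos_i[0] - pos_j[0], pos_i[1] - pos_j[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return (0.0, 0.0)
    magnitude = A * math.exp(-dist / B)
    return (magnitude * dx / dist, magnitude * dy / dist)


def step(positions, velocities, goals, speeds, dt=0.1):
    """Advance all pedestrians by one explicit Euler step of length dt."""
    new_positions, new_velocities = [], []
    for i, (p, v) in enumerate(zip(positions, velocities)):
        fx, fy = driving_force(p, v, goals[i], speeds[i])
        for j, q in enumerate(positions):
            if j != i:
                rx, ry = repulsive_force(p, q)
                fx, fy = fx + rx, fy + ry
        vx, vy = v[0] + fx * dt, v[1] + fy * dt
        new_velocities.append((vx, vy))
        new_positions.append((p[0] + vx * dt, p[1] + vy * dt))
    return new_positions, new_velocities
```

Calling `step` repeatedly, with `speeds` drawn once per pedestrian from `sample_desired_speed(random.Random(seed))`, would yield simulated trajectories of the kind described in this section.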

As noted above, we have extended the SFM by integrating it with a simple probabilistic route-choice behavior. At an intersection, a pedestrian who walks on the left sidewalk of a street turns left or goes straight with equal probability (0.5 each). We determined the probabilities for pedestrians walking on the right sidewalk of a street in a similar manner. The map illustrated in Fig. 1 is drawn on the basis of real streets in downtown Tokyo. Three circular sensing areas (A, B, and C) with a radius of 10 m, used for evaluation purposes, are also shown in the figure. The sensing areas A, B, and C were selected to represent a vertical street, a horizontal street, and an intersection, respectively.

Fig. 1. Sample visualization of simulated pedestrian behaviors based on the Social Force Model.
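The route-choice rule and the circular sensing areas described in this section can be sketched as follows. The table contains only the left-sidewalk probabilities stated in the text; entries for the right sidewalk would be added analogously, and their values are not given here. The function names, the action labels, and the planar coordinate convention are assumptions made for illustration.

```python
import math
import random

# Left-sidewalk probabilities as stated in the text. Right-sidewalk entries
# are intentionally omitted because their values are not given.
ROUTE_CHOICES = {
    "left": {"turn_left": 0.5, "straight": 0.5},
}


def choose_route(sidewalk, rng, table=ROUTE_CHOICES):
    """Sample an action at an intersection for a pedestrian on the given sidewalk."""
    options = table[sidewalk]
    actions = list(options)
    weights = [options[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def count_in_sensing_area(positions, center, radius=10.0):
    """Count pedestrians inside a circular sensing area (radius in metres)."""
    return sum(1 for p in positions if math.dist(p, center) <= radius)
```

Counting simulated pedestrians inside areas such as A, B, and C at each time step would give per-area traffic series against which sampled observations could be compared.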

6 The Mobile Behavior Sampling Tool

Next we describe the client-server system architecture of our mobile behavior sampling tool (see Fig. 2).

Fig. 2. System architecture of the mobile behavior sampling tool.

The software on the client side is based on Community Reminder, a smartphone-based platform that helps community members design and use context-aware reminders [6]. Its mobile context sensing module exploits the AWARE Framework [20] to detect various mobile contexts, which are used to trigger notifications at the right time. Researchers and volunteers can use Android smartphones, smart watches, and smart glasses to receive and respond to notifications.

The server manages the collected data and notifications, and provides the core mechanisms for mobile behavior sampling including the simulation of pedestrian behaviors.
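As a rough illustration of how the server and clients might exchange sampling notifications, the sketch below models a notification as a small record with a location-based trigger. Everything in it is hypothetical: the record fields, the local planar coordinates in metres, and the distance-based trigger are assumptions for illustration and do not describe the actual interfaces of Community Reminder or the AWARE Framework.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingNotification:
    """Hypothetical notification record managed by the server."""
    notification_id: str
    target_xy: tuple         # (x, y) in metres in a local planar frame (assumed)
    trigger_radius_m: float  # fire when the observer is within this distance
    instruction: str         # e.g. how to select the sample at the location


def due_notifications(observer_xy, pending):
    """Return pending notifications whose trigger area contains the observer."""
    return [
        n for n in pending
        if math.dist(observer_xy, n.target_xy) <= n.trigger_radius_m
    ]
```

In such a design, the server would decide which notifications are pending (for example, from a sampling rule), while the client would evaluate the trigger against its sensed context.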

7 Conclusion

To go beyond the limitations of conventional urban sensing, we have examined the challenges for human-in-the-loop sensing, including in-situ sampling, estimating social activities and emotions, improving data quality in context, meta-sensing, and context-aware privacy and data modeling. We then proposed smart notifications with the aim of addressing the challenge of in-situ sampling. Our smart notifications play critical roles in the proposed mobile tool, which supports the notification-based sampling method, the simulation-based spatial and cluster sampling method, and the adaptive sampling method. Moreover, the tool employs SFM-based modeling of spaces and clusters to improve sampling. We also described the system architecture that integrates the different components and provides services on different mobile devices, including smartphones, smart watches, and smart glasses.

We expect that the proposed system will help people collect rich, microscopic data about large spaces such as city blocks and neighborhoods as it has the following advantages:

  1. It can help collect data about all kinds of people in public spaces including elderly people and children who may not have GPS-enabled smartphones. It would also facilitate the uses of collected data for various purposes including inclusive design of urban environments, and development of personalized, context-aware digital services.

  2. It allows for smart sampling of the targets for observation. We therefore expect that researchers and volunteers will be able to collect quality data with smaller biases than what can be collected by using existing urban sensing systems. Our smart sampling mechanisms could also be used in different spaces besides urban public spaces.

  3. We also discussed approaches to improve data quality by exploiting contextual factors and meta-sensing. They can be considered in the design of the next versions of the system.

  4. Context-aware privacy and data modeling can help select appropriate data collection methods in different situations and enhance the privacy of urban inhabitants. This can also be considered in the design of the next versions of the system.

Our future plans include iterative refinement of the system architecture and integration of system components as well as a full test of feasibility in different urban spaces. We also intend to incorporate more features for supporting exploratory data collection processes and shifting modes of observation.