1 Introduction

Real-time visualization of the degree of indoor congestion improves users’ experience in public spaces such as event spaces, and it also helps managers who are trying to equalize the density of people. In a public space, people gather at popular shops or exhibition booths and form crowds around them. These crowds can often cause trouble. For example, because the space available for walking is narrow, pedestrians are prone to falling over, and there is a high possibility that others will be tripped. If the degree of indoor congestion is visualized, users can identify and avoid crowded places, and they can also reduce their waiting times. Moreover, managers of public facilities can equalize the density of people to prevent problems caused by crowding. Therefore, visualization of the degree of indoor congestion helps in keeping public spaces safe and in improving users’ experience and the management of facilities.

However, visualization of indoor congestion is difficult because of its cost. If we installed fixed sensors in all areas of a public space, the costs would be too high. For example, when the event space covers an area of \(20{,}000\,\mathrm {m^2}\), if the manager wants to sense congestion in each \(400\,\mathrm {m^2}\), 50 sensors would be needed. Installing sensors in all areas is also a difficult task. Sensors would need to be placed in each individual area, and in some cases they would also need to be calibrated. This process would require considerable time and human resources. Moreover, the layout of an event space changes frequently as its contents change, and sensors would need to be reinstalled after each change.

As mentioned above, visualization of the degree of congestion is helpful, but it is difficult because of the large costs involved. Hence, we propose a low-cost method for the visualization of the degree of indoor congestion with smartphone-based participatory sensing in this paper. In our proposal, we use the smartphones of the visitors in an event space as sensors and scan Bluetooth devices and Wi-Fi access points around them. Then, the degree of congestion of each area can be visualized from the scanned data.

The contributions of this work are:

  • the proposal of a low-cost system for the visualization of the degree of indoor congestion;

  • the development of a prototype system; and

  • the experimental evaluation of the proposed system.

The rest of the paper is organized as follows. Section 2 highlights the related work and Sect. 3 presents the requirements of the low-cost visualization system. Section 4 shows how we designed the system. Section 5 demonstrates the implementation of the prototype system and an evaluation is given in Sect. 6. Finally, Sect. 7 concludes our paper.

2 Related Works

Sensing pedestrian flow and pedestrian density has attracted a lot of attention, and it is an active area of study and development. Previous research has developed methods of sensing pedestrian flow and pedestrian density, but suitable methods for short-term low-cost sensing are limited.

Some enterprises already provide service-based visualization of the degree of congestion, made possible by the spread of smartphones [1, 2]. Users’ smartphones upload their positions, obtained from a global positioning system (GPS) sensor, when users use the map services; the server of the enterprise then processes the data and visualizes congestion. However, we cannot use GPS in an indoor environment because GPS signals are weakened by the roofs and walls of buildings.

Light detection and ranging (LiDAR)-based sensing is one method of pedestrian-flow sensing [3, 12]. In [3], the degree of congestion of the entire event space is clearly visualized. However, LiDAR is too expensive to cover the whole space. Moreover, LiDAR does not respond well to changes in the layout of a space, because it detects pedestrians from the difference between the obtained image and a background image of the space. Therefore, a new background image must be prepared whenever the layout of the space changes.

Audio-based and acceleration-based congestion classification is useful for estimating the smoothness of pedestrian flows [6]. The authors proposed a method to measure congestion by collecting surrounding sound and acceleration data from smartphones. They analyzed the relationship between step intervals and congestion and the relationship between surrounding sounds and congestion using the fast Fourier transform (FFT). However, it is difficult to use this method in our case, because we need to measure congestion in public spaces such as event spaces, where the sounds and the exhibition style differ from case to case. For example, a music concert is very loud, but a museum is very quiet.

Bluetooth-scan-based sensing is the method we have chosen. In recent years, many Bluetooth devices, such as smartphones, portable headsets, and wearable bands, have entered daily life. A previous experiment that scanned Bluetooth devices in a museum succeeded in analyzing the flow of visitors [11]. The authors installed fixed sensors at seven significant places in the museum and collected the MAC addresses emitted by Bluetooth devices. They reported that about 8.2% of visitors activated Bluetooth on their mobile device while in the museum. Moreover, previous research used the number of devices, the mean signal strength, and the variance of the signal strength of scanned Bluetooth devices to categorize the degree of congestion [10]. The degree of congestion has also been classified using a classifier tree with six features of the scanned Bluetooth devices [9]. This research succeeded in classifying crowd density with over 75% recognition accuracy on seven discrete classes.

As we have shown above, there are various methods for congestion sensing. Many of these methods achieved congestion sensing, but only Bluetooth-based sensing suits our goal of a low-cost system for real-time visualization of indoor congestion. The selection of the sensing method is discussed in detail in Sect. 4.3.

3 Requirements

The method for congestion sensing in public space needs to be low cost. In public spaces, we need frequent preparation for sensing because the layout of the public space can often change. For example, in a museum, the contents of a special exhibition change about every 3 months. In an exhibition hall, the layout changes more frequently. In extreme cases, changes can occur every day. Thus, we consider a method to reduce the costs of sensing.

We focused on three aspects of costs for sensing: time, money, and labor. Time means the time required for preparation and sensing. Installing sensors in all areas of the event space would require a large amount of time. Some types of sensors also need calibration for sensing. Money means the costs of purchasing and installing sensors. For example, LiDAR is a very elaborate sensor, but it often costs over US$5000. In addition, if we need to install many sensors in the space, these costs are multiplied. We also want to reduce the labor required for preparation and sensing. Installing sensors in a large space can require a lot of labor, and installing sensors in areas such as the ceiling can be a difficult task.

4 System Design

4.1 System Overview

In this section, we design the system for real-time visualization of the degree of indoor congestion. The system has three components: preparation for indoor localization, data collection, and visualization, as shown in Fig. 1. To localize sensing points, we first need to prepare for indoor localization, for example by installing and calibrating sensors. After that, participants collect data to visualize congestion. The system then repeats the cycle of collection and visualization. We describe the design of each of these three components below.

Fig. 1. Process of visualization

4.2 Preparation for Indoor Localization

We decided to adopt a Wi-Fi fingerprinting-based indoor localization method for the system because of its low-cost installation. In a fingerprint-based approach, we make a fingerprint by recording the received signal strength indicators (RSSIs) of Wi-Fi access points at each area in advance. We can localize the object by comparing the RSSIs of Wi-Fi access points with the fingerprint. Hence, we only need to scan Wi-Fi access points in each area in this method when we prepare for indoor localization.
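The comparison step can be illustrated with a minimal sketch. It assumes a nearest-neighbor match on Euclidean distance between RSSI vectors, which is one common choice and is not necessarily the matching rule used in our system. The function names, the area labels, the access point identifiers, and the floor value substituted for access points missing from a scan are all hypothetical.

```python
import math

# Assumed floor value (dBm) substituted when an access point is absent from a scan.
MISSING_RSSI = -100.0


def fingerprint_distance(scan, fingerprint):
    """Euclidean distance between two {access point ID: RSSI} dictionaries."""
    ap_ids = set(scan) | set(fingerprint)
    return math.sqrt(sum(
        (scan.get(ap, MISSING_RSSI) - fingerprint.get(ap, MISSING_RSSI)) ** 2
        for ap in ap_ids
    ))


def localize(scan, fingerprints):
    """Return the area whose stored fingerprint is closest to the scan."""
    return min(
        fingerprints,
        key=lambda area: fingerprint_distance(scan, fingerprints[area]),
    )


# Illustrative fingerprints recorded in advance, one per area (made-up values).
fingerprints = {
    "cell-A": {"ap1": -40.0, "ap2": -70.0},
    "cell-B": {"ap1": -75.0, "ap2": -45.0},
}

print(localize({"ap1": -42.0, "ap2": -68.0}, fingerprints))
```

Because the match uses only the recorded RSSI values, the positions of the access points never enter the computation.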

There are two reasons why we selected the Wi-Fi access points fingerprint-based approach for localization. First, some methods need some installation for indoor localization, but in the Wi-Fi access points fingerprint-based approach, Wi-Fi access points are often already installed in public spaces. Second, in this method, we do not need to know the positions of Wi-Fi access points, unlike in other Wi-Fi-based methods.

Indoor localization is an area of active research. However, some methods require expensive sensors to be installed, and some require sensors to be calibrated or replaced frequently. There are four approaches to Wi-Fi-based indoor localization: triangulation by the signal strength of multiple access points, fingerprinting of the signal strength of access points, triangulation by the angle of arrival of signals, and triangulation by the time of flight of signals. Recent research has realized decimeter-level indoor localization [5, 7]. As mentioned previously, unlike in other Wi-Fi-based approaches, in a fingerprint-based approach we do not need to know the positions of Wi-Fi access points. Radio-frequency identification (RFID) is also used for indoor localization. In the RFID-based method, reference RFID tags are deployed, and a reader measures the signal strength from the tags. We can then locate the target RFID tag by comparing its signal strength with those of the reference tags. RFID-based localization has also realized decimeter-level localization [8]. RFID tags are cheap sensors, but we would need to fix many reference RFID tags in place before localization. LiDAR and cameras can also be used for localization. They can localize pedestrians without the pedestrians holding any devices, but they need calibration at installation and they are expensive.

4.3 Data Collection

In the data-collection phase, we need to collect congestion data and location data to visualize the degree of indoor congestion. We can collect location data by just scanning Wi-Fi access points. Hence, we need to consider a method to obtain congestion data.

We adopted a Bluetooth-based method for congestion sensing because of its low cost and precision. Bluetooth is a wireless communication technology at 2.4 GHz and is implemented on many devices such as smartphones, portable audio players, and smart watches. A Bluetooth device can scan for other devices using the protocol itself; we now introduce the scanning process. The core architecture of Bluetooth is composed of three elements: controller stack, host stack, and host–controller interface (HCI). The controller stack defines the lower-level layers as a physical protocol including the physical layer and radio transceiver, and the host stack defines the higher-level layers as logical protocols including application programming interfaces (APIs) and profiles. HCI delivers data between the host stack and controller stack. Bluetooth Low Energy (LE) is the controller stack that is available from Bluetooth version 4, and this protocol is not compatible with legacy Bluetooth (the Basic Rate/Enhanced Data Rate (BR/EDR)). In Bluetooth BR/EDR, to discover other devices, a Bluetooth device can broadcast inquiry messages, and a device that can be discovered responds to inquiry messages with its ID. In this way, a Bluetooth device can discover nearby devices in BR/EDR. In Bluetooth LE, a Bluetooth device can enter the advertising mode to show its existence, and other devices can scan the advertising device by entering the scanning mode. In this way, Bluetooth devices can also discover nearby devices in LE [4]. Bluetooth devices are categorized into three classes by their power: devices that can communicate within 1 m are categorized as class 3, within 10 m as class 2, and within 100 m as class 1. Almost all smartphones and headsets are class 2, so they can communicate within 10 m.

Now, we compare methods for congestion sensing from the five points of view of time cost, money cost, labor cost, indoor support, and precision of sensing congestion in public spaces. We compare service-based (with GPS), LiDAR-based, audio- and acceleration-based, and Bluetooth-based. GPS is a low-cost sensor in participatory sensing because we do not need to install sensors or purchase any equipment owing to pedestrians’ smartphones already including GPS modules. However, we cannot use GPS indoors, so the service-based method is difficult to adapt. The LiDAR-based method is not low cost because the sensors need to be installed and calibrated. Moreover, LiDAR sensors are far too expensive. The audio- and acceleration-based method is low cost and supports indoor sensing. However, as we mentioned in Sect. 2, this method is specialized to detecting the smoothness of pedestrian flow, so it is not suitable for our purpose. The Bluetooth-based method is good for low-cost sensing in participatory sensing because pedestrians’ smartphones already have Bluetooth modules, so we do not need to prepare additional equipment. Moreover, the Bluetooth-based method is available indoors and the precision of congestion estimation is good. Thus, we consider that the Bluetooth-based method with participatory sensing is the best method of congestion sensing. In Table 1 we present a comparison of the sensing methods.

Table 1. Comparison of methods for our use

4.4 Visualization

In the visualization phase, we considered two elements. First, we visualize the congestion map on the participants’ smartphones. Second, the server interpolates the congestion data if there are no data from some areas. In participatory sensing, it is difficult to recruit participants without rewards, so we decided to visualize the congestion map on participants’ smartphones. Participants thus receive beneficial information as the reward for participation. It is possible that there are no congestion data from some areas, and there would be labor costs if someone needed to collect data in each period. Congestion data are spatiotemporal data, so we decided to interpolate missing data using a spatiotemporal data interpolation method.

5 Implementation

5.1 Overview of Our Prototype System

In this section, we describe the details of our implementation. The system is composed of two elements: a smartphone application and a server application. The smartphone application is implemented as an Android application. This application manages the scanning of Wi-Fi access points and Bluetooth devices, and visualization of the congestion map. The main functions of the server application are the localization of smartphones, congestion estimation, data storage, and data interpolation. An overview of the prototype system is shown in Fig. 2.

In preparation for the indoor localization phase, managers send information about the positions and Wi-Fi access points data using the smartphone application. Then, the server application stores the data of Wi-Fi access points in the database of Wi-Fi access points. In the data-collection phase, the smartphone application automatically scans Wi-Fi access points and Bluetooth devices and sends data to the server. The server application stores the data in the congestion database. In the visualization phase, the server application interpolates the missing data by kriging. When the smartphone application requests the congestion data, the server application returns the interpolated data and the smartphone application visualizes the data on the map. We now demonstrate the details of each phase.

Fig. 2. System overview

5.2 Preparation for Indoor Localization

The manager of a public space can easily make a Wi-Fi fingerprint using the Wi-Fi scan mode of the smartphone application. In the Wi-Fi scan mode, the smartphone application sends the position and data of Wi-Fi access points to the server. Then the server application stores received data in the database. We can make a Wi-Fi access point fingerprint by simply doing this in each area.

The detailed usage of the Wi-Fi scan mode is as follows. First, the manager fills in the forms about the position (building name, level, room number (specification of the place), latitude, and longitude). Second, the manager pushes the send button; the smartphone application then scans the Wi-Fi access points and sends the information about the scanned access points and the input position to the server. A sample image of the Wi-Fi scan mode is given in Fig. 3a.

The Wi-Fi scan mode has a data delete function. If the manager sends data with delete checked, corresponding data will be deleted from the database on the server. Moreover, the smartphone application stores the history of sent data, so the manager can fill in the forms from history.

5.3 Data Collection

A participant can collect location data and congestion data by just carrying a smartphone with the smartphone application installed. The smartphone application automatically scans Bluetooth devices and Wi-Fi access points every 30 s. After scanning, the application sends the data to the server as a JavaScript Object Notation (JSON) object over Hypertext Transfer Protocol (HTTP). The server receives the data from smartphone applications and then localizes the participant by comparing the scanned Wi-Fi access point data with the Wi-Fi fingerprints. After localization, the Bluetooth data are stored in the database with the localized position. When the server estimates the congestion, it looks up the Bluetooth data and counts the number of Bluetooth devices. Hence, participants can collect data for visualizing congestion by just walking around with the smartphone application.
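As a rough illustration of this exchange, the following sketch serializes one scan cycle as a JSON string and counts the Bluetooth devices reported for one area. The field names (`wifi`, `bluetooth`, `bssid`, `address`, `rssi`) and both function names are hypothetical, since the actual message format of the prototype is not specified here. Counting distinct addresses rather than raw records is also an assumption.

```python
import json


def build_payload(wifi_scan, bluetooth_scan):
    """Serialize one scan cycle as a JSON string (field names are hypothetical)."""
    return json.dumps({
        "wifi": [{"bssid": bssid, "rssi": rssi} for bssid, rssi in wifi_scan],
        "bluetooth": [
            {"address": address, "rssi": rssi} for address, rssi in bluetooth_scan
        ],
    })


def count_devices(payloads):
    """Count distinct Bluetooth addresses across payloads localized to one area."""
    addresses = set()
    for payload in payloads:
        for device in json.loads(payload)["bluetooth"]:
            addresses.add(device["address"])
    return len(addresses)


# Two illustrative scan cycles (made-up identifiers and RSSI values).
first = build_payload([("ap1", -40)], [("AA", -60), ("BB", -80)])
second = build_payload([], [("BB", -70), ("CC", -90)])
print(count_devices([first, second]))
```

In this sketch the device seen in both cycles is counted once, so overlapping scans from nearby participants do not inflate the estimate.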

5.4 Visualization

When a participant wants to see the real-time congestion map, they can load the congestion map by pushing the request button. When the request button is pushed, the smartphone application requests congestion data from the server. Then the server returns congestion data as a JSON object over HTTP. The smartphone application visualizes the degree of congestion by painting colors on the map. At present, we have decided to paint the high-density areas red and the low-density areas blue.

If the server does not have Bluetooth data from some areas, the server interpolates the congestion with kriging. In the prototype implementation, the interpolation program always runs in the background. The program interpolates the data and stores the interpolated congestion in the database.
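To make the interpolation step concrete, the following is a minimal sketch of ordinary kriging in pure Python. It is not the prototype's implementation: the exponential variogram, its sill and range parameters, the use of planar coordinates in meters, and all function names are assumptions chosen for illustration.

```python
import math


def variogram(h, sill=1.0, rng=40.0):
    """Exponential variogram; the sill and range are assumed values."""
    return sill * (1.0 - math.exp(-h / rng))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def krige(points, values, target):
    """Ordinary kriging estimate at target from observed (x, y) points."""
    n = len(points)
    # Variogram matrix bordered by the unbiasedness constraint (weights sum to 1).
    a = [
        [variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
        for i in range(n)
    ]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * v for w, v in zip(weights, values))


# Illustrative device counts at three cell centers (made-up values).
points = [(0.0, 0.0), (20.0, 0.0), (0.0, 20.0)]
counts = [5.0, 9.0, 2.0]
print(krige(points, counts, (20.0, 20.0)))
```

With no nugget term, this estimator reproduces the observed value at an observed location, so only cells without data are actually changed by the interpolation.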

Fig. 3. Screenshot of the smartphone application

6 Evaluation

We set up two experiments to evaluate our system. First, we evaluated the cost of preparing for indoor localization, because a low setup cost is a requirement of the system. Second, we evaluated the relationship between congestion and the number of Bluetooth devices. Previous research has already evaluated Bluetooth-based congestion sensing, but these approaches did not consider how wide an area is covered when scanning Bluetooth devices. Thus, we need to evaluate the Bluetooth-based congestion-sensing method.

We performed our experiments at Makuhari Messe International Exhibition Hall 9–11 and at the Yaesu Shopping Mall. Makuhari Messe is one of the largest exhibition halls in Japan. Tokyo Auto Salon 2017 was being held at the Makuhari Messe Hall during our experiment. Tokyo Auto Salon 2017 is an automotive industry showcase that demonstrates the latest technologies of automobile companies. The Yaesu Shopping Mall is one of the largest shopping malls in Japan. The experimental environments are shown in Table 2. We partitioned the two venues into grid cells (\(20\,\mathrm {m}\,\times \,20\,\mathrm {m}\)) as shown in Fig. 8. Red stars denote sensing points, red lines denote main thoroughfares, and yellow rectangles denote booths or shops.

Table 2. Information about the experimental space

6.1 Cost to Prepare for Indoor Localization

We made Wi-Fi access point fingerprints by scanning Wi-Fi access points at each grid cell at Makuhari Messe International Exhibition Hall 9–11. Only three people were involved. Explaining how to use the application took 10 min. After that, we scanned Wi-Fi access points at 40 grid cells, which took 30 min. Thus, it took only 40 min to prepare for indoor congestion visualization in the main hall.

Table 3. Assigned IDs to data

6.2 Relationship Between Congestion and Number of Bluetooth Devices

To evaluate the relationship between congestion and the number of Bluetooth devices, we scanned the Bluetooth devices and counted the number of people in Makuhari Messe and the Yaesu Shopping Mall. We first describe how we collected the Bluetooth data and the pedestrian data. For scanning Bluetooth devices, we developed an Android application. Using this application, an Android smartphone scans Bluetooth BR/EDR devices for about 12 s and scans Bluetooth LE devices for 5 s. We scanned Bluetooth devices from three smartphones simultaneously and then collected all device data (address, RSSI) scanned by the three smartphones; if a device was scanned more than once, we kept the record whose RSSI was the maximum. We counted the number of pedestrians differently at Makuhari Messe and at the Yaesu Shopping Mall. In Makuhari Messe, we took photographs using a Ricoh Theta S and counted the number of people from the photographs. We counted the number of people within a distance of \(10\,\mathrm {m}\). In some photographs, we also counted the number of people within a distance of \(15\,\mathrm {m}\). In the Yaesu Shopping Mall, we counted the number of surrounding people within a distance of \(15\,\mathrm {m}\). The scenes of our measurements are shown in Fig. 4d.
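The merging of the three simultaneous scans can be sketched as follows. The function name and the representation of a scan as a list of (address, RSSI) pairs are hypothetical, and the sample values are made up for illustration.

```python
def merge_scans(scans):
    """Merge (address, RSSI) lists from several phones, keeping the maximum RSSI per address."""
    strongest = {}
    for scan in scans:
        for address, rssi in scan:
            if address not in strongest or rssi > strongest[address]:
                strongest[address] = rssi
    return strongest


# Three illustrative scans taken at the same sensing point.
merged = merge_scans([
    [("AA", -70), ("BB", -60)],
    [("AA", -55)],
    [("CC", -90), ("BB", -65)],
])
print(len(merged), merged)
```

The number of keys in the merged dictionary is the device count for that sensing point, and the retained RSSI is the strongest observation of each device.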

Fig. 4. Experiment

The results of the experiments are shown in Fig. 5. For the sake of simplicity, we assigned IDs to the obtained pedestrian data as shown in Table 3. The lines on the graphs are regression lines. We assume the model \(y = ax\) for the relationship between the number of pedestrians and the number of Bluetooth devices. In Fig. 5b, we can see a relationship between the number of Bluetooth devices and the number of pedestrians, but no such relationship is visible in Fig. 5a.
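For reference, if the slope of the \(y = ax\) model is obtained by ordinary least squares (an assumption, as the fitting procedure is not stated here), it has a closed form. With \((x_i, y_i)\) denoting the paired values plotted on the two axes for the \(i\)-th measurement, setting the derivative of the squared error with respect to \(a\) to zero gives

\[ \hat{a} = \mathop{\mathrm{arg\,min}}_{a} \sum_{i} (y_i - a x_i)^2 = \frac{\sum_{i} x_i y_i}{\sum_{i} x_i^2}. \]

Unlike a regression with an intercept, this line is forced through the origin, which matches the expectation that no pedestrians implies no detected devices.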

Fig. 5. Relationship between the number of pedestrians and the number of Bluetooth devices

We calculated Pearson correlation coefficients between the number of pedestrians and the number of Bluetooth devices to examine the relationship between them. Before calculating the Pearson correlation coefficients, we applied the Shapiro–Wilk test to both the Bluetooth data and the pedestrian data at an \(\alpha \) level of 0.05. The p-values of the tests are shown in Table 4. We cannot reject the null hypothesis that the data are from a normally distributed population except for the Bluetooth LE data of C. Hence, we calculated the Pearson correlation coefficients, and the results are shown in Table 5. As shown in Table 5, the number of pedestrians and the number of Bluetooth devices have some correlation in data B and C, but hardly any correlation in data A. We consider that the reason why we could not find a correlation in data A was the short radius used for counting pedestrians. These results suggest that our application detected Bluetooth devices beyond a range of \(10\,\text {m}\).
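For readers who wish to reproduce this kind of analysis, a minimal sketch of the Pearson correlation coefficient follows. The function name is hypothetical and the sample values are made up for illustration; they are not our measurements. The Shapiro–Wilk test is not shown because it is normally taken from a statistics package.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Illustrative paired counts (pedestrians, Bluetooth devices); not measured data.
pedestrians = [1, 2, 3, 4]
devices = [1, 3, 2, 4]
print(pearson(pedestrians, devices))
```

The coefficient ranges from -1 to 1, with values near 0 indicating no linear relationship between the two counts.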

Table 4. The p-values of the Shapiro–Wilk test
Table 5. Pearson correlation coefficient between the number of pedestrians and the number of Bluetooth devices
Fig. 6. Pearson correlation coefficient between the number of pedestrians and the number of Bluetooth devices with various RSSI thresholds

Fig. 7. Visualization of pedestrian data B and corresponding Bluetooth data (Color figure online)

Fig. 8. Layouts of the experimental space (Color figure online)

We also calculated the Pearson correlation coefficients with various thresholds on the RSSI of Bluetooth devices, because an RSSI threshold can exclude some of the influence of distant devices. The results are shown in Fig. 6. In Fig. 6a, the Pearson correlation coefficient is constantly low. It is possible that the threshold could not exclude the influence of distant devices in this case. In Fig. 6c, the shape of the graph is particularly different from the others. We consider that the environment of the Yaesu Shopping Mall affected the results. Unlike Makuhari Messe, the Yaesu Shopping Mall has many walls, and walls can interfere with the Bluetooth signal. However, in Fig. 6b, the Pearson correlation coefficients have a peak around \(-83\,\text {dB}\). This result suggests that the threshold removed the noise from distant devices.
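The thresholding step can be sketched as follows. The function names and the representation of a scan as an {address: RSSI} dictionary are hypothetical, the sample values are made up, and keeping devices whose RSSI is at or above the threshold is an assumption about how the cutoff is applied.

```python
def count_above(devices, threshold):
    """Number of devices whose RSSI is at or above the threshold."""
    return sum(1 for rssi in devices.values() if rssi >= threshold)


def sweep(scans, thresholds):
    """For each threshold, return the per-scan device counts after filtering."""
    return {t: [count_above(scan, t) for scan in scans] for t in thresholds}


# Two illustrative scans, each mapping a device address to its RSSI.
scans = [{"AA": -60, "BB": -85, "CC": -95}, {"DD": -80}]
print(sweep(scans, [-100, -83, -70]))
```

Each list of filtered counts would then be correlated with the pedestrian counts for the same sensing points, giving one coefficient per threshold.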

Finally, we visualized the pedestrian data B and the corresponding Bluetooth data as shown in Fig. 7. We categorized congestion degrees into four classes as shown in Table 6; red means high density and blue means low density. The threshold of Bluetooth RSSI is set to \(-83\,\text {dB}\), and the Bluetooth devices include both Bluetooth BR/EDR and Bluetooth LE. In this setting, the Pearson correlation coefficient is 0.482.

Table 6. Congestion degree categories

7 Conclusion

In this paper, we proposed a system of real-time visualization of the degree of indoor congestion with smartphone-based participatory sensing. Then, we demonstrated the design and implementation of the prototype system. We also set up two experiments, and the results of the experiments show the ease of preparing and installing the proposed system, and also show the relationship between the number of pedestrians and the number of Bluetooth devices. These results support the applicability of our proposed system.

As future work, we have two tasks. First, we need to investigate the validity period of scanned data. In participatory sensing, we cannot control the sensing interval in one area, so we sometimes cannot obtain data for some period. Thus, we need to decide the validity period of scanned data. Second, we plan to develop a function for predicting congestion. Congestion data are spatiotemporal data, so we consider that congestion can be predicted from historical data and data from the surrounding areas.