1 Introduction

Wearable devices have become important tools for data-driven information services such as healthcare, medical care, and education. For example, smart watches (e.g., Fitbit [1], Apple Watch [2]) can continuously sense users’ activities and model their daily behaviors from the sensed data to provide personalized information services. Currently, most wearable devices are used to sense users’ physical status, e.g., body movements and heart rate, and largely ignore the background of those behaviors, namely context information. However, to understand human behaviors more deeply, context information such as the surrounding environment and ongoing activities is essential. If context information can be obtained with wearable devices, new services based on a deeper understanding of behavior become possible.

For example, we have developed a wearable telecare system using smart watches and wireless biosensors [3]. The service enables family members to share the physiological information of cared persons in a peer-to-peer manner. If family members can also obtain the context information of their cared persons, it helps them interpret the meaning of the physiological data.

In this paper, we describe how to extract auditory context information from ambient sound data sensed by smart watches. Although mobile audio sensing using smartphones has been studied [4, 5], studies that use wearable devices to extract context information are few. As research on wearable devices is still immature compared to smartphones, it is necessary to investigate the potential of wearable sensing.

This paper is organized as follows: first, we present the prototype of our wearable ambient sound sensing system based on smart watches. Then, we describe an analysis of the sound data sensed by the system. As the first step of the study, we applied non-negative matrix factorization (NMF) [6] and k-means clustering to the unsupervised segmentation of the time-series data. Finally, based on the implementation and analysis, we discuss issues and future work in sensing and sharing context information with wearable devices.

2 Wearable Ambient Sound Sensing System

We describe the prototype of our wearable ambient sound sensing system based on smart watches.

2.1 Wearable Device

The wearable ambient sound sensing system uses a commercially available smart watch, the Polar M600 [7]. We implemented the system as an application on Wear OS by Google™, the smartwatch OS.

The system provides sensing and analysis functions to extract auditory features from ambient sound data. A graphical user interface (GUI) to start/stop the sensing process is also implemented. The GUI, built with MPAndroidChart [8], shows the recording status of the smart watch, such as the current auditory feature vector. Communication facilities to share the sensed data with other users are not yet implemented.

2.2 Auditory Sensing Data

The analysis pipeline of the prototype system consists of three processes: sensing ambient sound, converting it to auditory feature vectors, and recording them.

In the sensing process, the device’s microphone captures ambient sound in 16 kHz, 16-bit, monaural audio format. In the converting process, the device immediately converts the data to auditory feature vectors: Mel-frequency cepstrum coefficients (MFCC), pitch (F0), and sound pressure level (SPL). The system uses the fast Fourier transform (FFT) to obtain the power spectrum of the signal and calculates F0 and SPL from it. To derive the MFCC, it then applies transformations to the spectrum that reflect human auditory characteristics (e.g., the Mel scale and vocal tract characteristics). We implemented the sensing and converting processes using the open source library TarsosDSP [9]. In the recording process, the system stores the auditory data in internal memory, from which it can be retrieved over a USB connection.
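The core of the converting step can be illustrated with a minimal Python sketch. The prototype itself runs TarsosDSP on the watch; the frame length, Hann windowing, and the full-scale (dBFS) reference for SPL below are illustrative assumptions, and F0 and MFCC extraction are omitted.

```python
import numpy as np

def frame_features(frame, sample_rate=16000):
    """Power spectrum and sound pressure level of one audio frame."""
    # FFT of the Hann-windowed frame gives the power spectrum.
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed)) ** 2
    # SPL here is the RMS level in dB relative to full scale (assumption).
    rms = np.sqrt(np.mean(frame ** 2))
    spl = 20 * np.log10(rms + 1e-12)
    return spectrum, spl

# Example: a 1 kHz tone sampled at 16 kHz.
t = np.arange(1024) / 16000.0
frame = 0.5 * np.sin(2 * np.pi * 1000.0 * t)
spectrum, spl = frame_features(frame)
peak_hz = np.argmax(spectrum) * 16000.0 / 1024  # spectral peak near 1 kHz
```

The same per-frame features, stacked over time, form the multi-dimensional time series analyzed in the next section.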

3 Extracting Context Information

Using our prototype system, we experimentally obtained auditory data of a user who came home from his office by bicycle. We then analyzed the data offline to investigate the feasibility of extracting context information.

In this study, we assume that auditory contexts are latent, continuous periods characterized by auditory features. For example, if a user stayed at home until noon and then moved to a hospital by car, the home, the car, and the hospital are contexts of the user because each can be characterized by its auditory features. As the auditory data recorded by the system is multi-dimensional time-series data, each context corresponds to a segment of the time series classified by auditory features.

3.1 Segmentation of Multi-dimensional Time-Series Data

As the first step of the study, we applied NMF and k-means clustering to the segmentation of the multi-dimensional time-series data.

Fig. 1. A heat map of feature vectors averaged every 30 s, obtained by the wearable sensing system.

Fig. 2. A heat map of the feature vectors derived by non-negative matrix factorization.

The obtained data is 32-dimensional time-series data representing the MFCC, SPL, and F0, with a length of 48,271 samples. First, we averaged the data over 30 s windows and normalized it so that it contained no negative values. The normalized data is shown as a heat map in Fig. 1, where the x-axis represents each feature of the data and the y-axis represents the time at which the data was recorded. Each cell corresponds to a feature value at one time; darker cells represent larger values and lighter cells smaller values.
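The averaging and normalization can be sketched as follows. This is a minimal illustration on random placeholder data; the number of feature rows per 30 s window is an assumption about the frame rate, and min-max scaling is one way (not necessarily the paper's) to make the data non-negative for NMF.

```python
import numpy as np

def preprocess(X, window):
    """Average a (T, 32) feature matrix over fixed-size windows and
    min-max normalize each feature to [0, 1], so the result satisfies
    NMF's non-negativity requirement."""
    T = (len(X) // window) * window          # drop the ragged tail
    avg = X[:T].reshape(-1, window, X.shape[1]).mean(axis=1)
    mins = avg.min(axis=0)
    rng = avg.max(axis=0) - mins
    return (avg - mins) / np.where(rng == 0.0, 1.0, rng)

# Placeholder standing in for the recorded 48,271 x 32 feature sequence.
X = np.random.default_rng(0).normal(size=(48271, 32))
Z = preprocess(X, window=30)                 # rows per window: an assumption
```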

As the data still contained artifacts caused by body movements, we applied NMF to factorize it, extracting salient features and reducing the artifacts. In this experiment, the dimension of the factorized data is 6. Figure 2 shows a heat map of the factorized data representing the change of the auditory features over time; the intensity of color indicates the degree of correlation with each factor.

NMF factorizes a matrix under non-negativity constraints. Its bases tend to represent local features of the data, namely “parts.” Because such parts correspond better to humans’ intuitive notions, they are useful for visualizing and understanding auditory contexts. The bases produced by other methods, such as principal component analysis and vector quantization, do not always correspond to such intuitive notions [10].
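A parts-based factorization of this kind can be reproduced with scikit-learn's NMF on the normalized matrix. This is a sketch on random placeholder data: the rank of 6 follows the paper, while the initializer and iteration count are assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF

# V (periods x 32 features) is approximated as W @ H, where the 6 rows
# of H are non-negative basis "parts" and each row of W holds one 30 s
# period's activation of those parts.
V = np.random.default_rng(0).random((1609, 32))   # placeholder data
model = NMF(n_components=6, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(V)   # (1609, 6) activations, as visualized in Fig. 2
H = model.components_        # (6, 32) basis parts
```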

We then classified each vector of the factorized data using k-means clustering (k = 4), based on the Euclidean distance between the 6 features of each period.
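The clustering step can be sketched as follows; placeholder activations stand in for the factorized data, while k = 4 and the Euclidean metric follow the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

W = np.random.default_rng(0).random((1609, 6))    # placeholder NMF output
# KMeans minimizes within-cluster Euclidean distances, matching the
# distance measure used for the 6 features of each period.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(W)
```

Each entry of `labels` assigns one 30 s period to one of four clusters, i.e., one of the candidate auditory contexts.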

Figure 3 is a line graph plotting the class labels, 0–3, assigned to each period along the time axis. Consecutive periods tend to be classified into the same cluster, which means that we succeeded in extracting temporally continuous acoustic features and segmenting the time-series data into four contextual periods. Moreover, the extracted segments roughly correspond to actual contexts, such as riding a bicycle and eating at home.

Fig. 3. Transitions of the auditory contexts classified by k-means clustering.
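Reading the segmentation out of a label sequence like the one in Fig. 3 amounts to collapsing it into contiguous runs. A small helper (hypothetical, not part of the prototype) illustrates this:

```python
def to_segments(labels):
    """Collapse a per-period label sequence into (label, start, end)
    runs; each run is one candidate auditory context."""
    segments, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[i - 1]:
            segments.append((labels[start], start, i))
            start = i
    return segments

print(to_segments([0, 0, 1, 1, 1, 3]))  # [(0, 0, 2), (1, 2, 5), (3, 5, 6)]
```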

4 Discussion and Future Work

This study is at a preliminary stage, so many issues remain in realizing a system that senses and shares auditory context information using wearable devices.

One issue is when and where context information is extracted from the sensed data and how it is shared. The analysis in this study was performed offline on a personal computer. To share context information directly on wearable devices, we need to design the overall architecture of the service system, including its communication, data processing, and user interactions. Designing an efficient architecture under the limited computational resources of wearable devices is future work.

Considering privacy, the system is not designed to record raw sound data, so it is difficult to validate the correctness of the extracted context information against raw recordings. To scale the prototype up to a practical service, more sophisticated unsupervised (or semi-supervised) approaches will be required.

5 Conclusion

We have shown the prototype of our wearable ambient sound sensing system based on smart watches. We experimentally obtained ambient sound data sensed by the system and analyzed it to extract context information. In the analysis, we formalized context extraction as unsupervised segmentation of multi-dimensional time-series data and applied NMF and k-means clustering to the segmentation. We confirmed that the segmented periods roughly corresponded to actual contexts.