DOI: 10.1145/3606038.3616155
research-article

Personalised Speech-Based Heart Rate Categorisation Using Weighted-Instance Learning

Published: 29 October 2023

Abstract

Running, one of the most popular sports, has many positive effects but also carries risks, and most injuries are caused by overexertion. To optimise training and prevent injuries, approaches are needed that make it easy to monitor training behaviour. Previous research has shown that heart rate (HR) can be classified automatically from speech data. Real-world applications, however, are challenged by the heterogeneity of individuals, which is why we introduce a personalised HR classification approach in this work. In particular, we first identify the runners in the training set whose acoustic patterns (x-vectors) are similar to those of a runner in the test set. We further extract deep representations and hand-crafted features from the input data. Subsequently, using the computed similarity values, we adapt a Support Vector Machine (SVM) to each individual: we select the training runners with the lowest Euclidean distances to the test runner and weight their training samples more heavily when training the SVM. Our personalised approach yields a best relative improvement of 20.8% over a non-personalised model in a 5-class HR classification task. These results demonstrate the effectiveness of our approach and pave the way for real-world, personalised applications.
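The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how embeddings are pooled per runner, how many similar runners are selected, or how large the weights are. The function names, the mean-pooling of each runner's embeddings, and the parameters `k` and `boost` below are therefore assumptions made purely for illustration, and the two-dimensional toy vectors stand in for real x-vectors.

```python
import math


def runner_centroid(vectors):
    """Mean-pool one runner's embeddings into a single vector (assumed pooling)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def instance_weights(train_embeddings, test_embedding, sample_runner_ids,
                     k=2, boost=3.0):
    """Return one weight per training sample plus the per-runner distances.

    train_embeddings:  dict mapping runner id -> list of embeddings
    test_embedding:    pooled embedding of the test runner
    sample_runner_ids: runner id of every training sample, in training order
    k, boost:          hypothetical hyperparameters (number of similar runners
                       and the weight given to their samples)
    """
    centroids = {r: runner_centroid(v) for r, v in train_embeddings.items()}
    distances = {r: euclidean(c, test_embedding) for r, c in centroids.items()}
    # Runners with the lowest distance to the test runner get the higher weight.
    nearest = set(sorted(distances, key=distances.get)[:k])
    weights = [boost if r in nearest else 1.0 for r in sample_runner_ids]
    return weights, distances


# Toy data: three training runners with 2-dimensional stand-in embeddings.
demo_train = {"A": [[0.0, 0.0], [0.2, 0.0]], "B": [[1.0, 1.0]], "C": [[5.0, 5.0]]}
demo_test = [0.0, 0.0]
demo_ids = ["A", "A", "B", "C"]
demo_weights, demo_distances = instance_weights(
    demo_train, demo_test, demo_ids, k=1, boost=3.0
)
```

The resulting list holds one weight per training sample and could be handed to any SVM implementation that accepts per-sample weights, for example through the `sample_weight` argument of scikit-learn's `SVC.fit`. A separate weight vector, and hence a separately trained SVM, would be computed for each test runner.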


Cited By

  • (2024) Automatic Speech-Based Charisma Recognition and the Impact of Integrating Auxiliary Characteristics. 2024 IEEE Conference on Telepresence, 148-153. DOI: 10.1109/Telepresence63209.2024.10841640. Online publication date: 16-Nov-2024
  • (2024) Personalised Speech-Based PTSD Prediction Using Weighted-Instance Learning. 2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 1-4. DOI: 10.1109/EMBC53108.2024.10782220. Online publication date: 15-Jul-2024


      Published In

      MMSports '23: Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports
      October 2023
      174 pages
      ISBN:9798400702693
      DOI:10.1145/3606038

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. heart rate categorisation
      2. signal processing
      3. speech processing

      Qualifiers

      • Research-article


      Conference

      MM '23

      Acceptance Rates

      Overall Acceptance Rate 29 of 49 submissions, 59%
