DOI: 10.1145/3512731.3534212
short-paper

A Hybrid Transformer Network for Detection of Risk Situations on Multimodal Life-Log Health Data

Published: 27 June 2022

ABSTRACT

This paper develops hybrid transformer architectures for detecting risk events in multimodal data recorded on a person with visual and signal sensors. The proposed two-stream architecture consists of a visual transformer and a linear transformer for time series. The linear transformer is benchmarked on the publicly available UCI-HAR dataset, and the full architecture is evaluated on the in-the-wild BIRDS dataset. The hybrid transformer architecture empirically outperforms the 3D CNNs and RNNs used in previous work, and its accuracy in detecting risk situations improves by 10% over single-stream transformers.
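
The abstract does not say how the linear transformer reduces attention cost or how the two streams are combined. The sketch below is therefore only an illustration under stated assumptions: it assumes a Linformer-style low-rank projection of keys and values for the time-series stream, and a simple late fusion (concatenation followed by a linear classifier) of the two streams. All function names, shapes, and the number of classes are hypothetical, and the visual stream is replaced by a random stand-in vector.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def linear_attention(x, w_q, w_k, w_v, proj_e, proj_f):
    """Single-head attention with keys/values projected from n to k tokens.

    Assumption: Linformer-style projection; the abstract does not confirm it.
    x: (n, d) sensor tokens; w_q, w_k, w_v: (d, d); proj_e, proj_f: (k, n).
    The score matrix is (n, k) instead of (n, n), so cost grows linearly in n.
    """
    q = x @ w_q                  # (n, d)
    k = proj_e @ (x @ w_k)       # (k, d) compressed keys
    v = proj_f @ (x @ w_v)       # (k, d) compressed values
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (n, k)
    return softmax(scores) @ v   # (n, d)


def fuse_streams(visual_feat, signal_feat, w_out):
    """Hypothetical late fusion: concatenate stream features, then classify."""
    joint = np.concatenate([visual_feat, signal_feat])
    return softmax(joint @ w_out)


rng = np.random.default_rng(0)
n, d, k, n_classes = 128, 16, 8, 2   # hypothetical sizes

x = rng.normal(size=(n, d))          # stand-in for a window of sensor tokens
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
proj_e, proj_f = (rng.normal(size=(k, n)) * 0.1 for _ in range(2))

signal_tokens = linear_attention(x, w_q, w_k, w_v, proj_e, proj_f)
signal_feat = signal_tokens.mean(axis=0)   # pool tokens into one vector
visual_feat = rng.normal(size=d)           # stand-in for visual transformer output

w_out = rng.normal(size=(2 * d, n_classes)) * 0.1
probs = fuse_streams(visual_feat, signal_feat, w_out)
print(probs.shape)
```

With untrained random weights the output probabilities carry no meaning; the point is only the data flow, in which each stream yields a fixed-size feature vector and the classifier sees both.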

Supplemental Material

282_ICMR_ICDAR.mp4 (mp4, 36 MB)

  • Published in

    ICDAR '22: Proceedings of the 3rd ACM Workshop on Intelligent Cross-Data Analysis and Retrieval
    June 2022
    80 pages
    ISBN: 9781450392419
    DOI: 10.1145/3512731

    Copyright © 2022 ACM

    Publisher

    Association for Computing Machinery

    New York, NY, United States