
GlobalFusion: A Global Attentional Deep Learning Framework for Multisensor Information Fusion

Published: 18 March 2020

Abstract

This paper enhances deep-neural-network-based inference in sensing applications by introducing a lightweight attention mechanism, the global attention module, for multi-sensor information fusion. The mechanism uses information collected from higher layers of the neural network to selectively amplify the influence of informative features and suppress unrelated noise at the fusion layer. We integrate this mechanism into a new end-to-end learning framework, called GlobalFusion, in which two global attention modules are deployed for spatial fusion and sensing-modality fusion, respectively. Through an extensive evaluation on four public human activity recognition (HAR) datasets, we demonstrate the effectiveness of GlobalFusion at improving information fusion quality. The new approach outperforms state-of-the-art algorithms on all four datasets by a clear margin, and the learned attention weights agree well with human intuition. We then validate the efficiency of GlobalFusion by measuring its inference time and energy consumption on commodity IoT devices; the global attention modules induce only negligible overhead.
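The fusion idea described in the abstract can be sketched in a few lines. The following is a minimal illustrative example, not the authors' released implementation: it assumes each of K sensors (or body positions) contributes a d-dimensional feature vector, and that a global context vector summarizing higher-layer activations scores each sensor's features before a softmax-weighted fusion. All names here (`global_attention_fuse`, `W`, the toy sizes) are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

def global_attention_fuse(sensor_feats, global_ctx, W):
    """Fuse K per-sensor feature vectors using a global context.

    sensor_feats: (K, d) array, one d-dim feature vector per sensor/position
    global_ctx:   (g,) summary of higher-layer network activations
    W:            (d, g) projection matrix (learned in a real model;
                  random here purely for illustration)
    """
    # Score each sensor by compatibility with the projected global context.
    scores = sensor_feats @ (W @ global_ctx)   # shape (K,)
    # Attention weights over sensors: amplify informative ones,
    # suppress the rest.
    weights = softmax(scores)                  # shape (K,), sums to 1
    # Weighted fusion of the per-sensor features.
    fused = weights @ sensor_feats             # shape (d,)
    return fused, weights

rng = np.random.default_rng(0)
K, d, g = 5, 16, 32                            # toy sizes: 5 sensors
feats = rng.normal(size=(K, d))
ctx = rng.normal(size=g)
W = rng.normal(size=(d, g))
fused, weights = global_attention_fuse(feats, ctx, W)
```

In the paper's framework one such module would reweight body positions (spatial fusion) and a second would reweight sensing modalities, but this sketch only shows the shared attention-then-fuse pattern.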




      Published In

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 4, Issue 1
March 2020, 1006 pages
EISSN: 2474-9567
DOI: 10.1145/3388993

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 18 March 2020
      Published in IMWUT Volume 4, Issue 1


      Author Tags

      1. Internet of Things (IoT)
      2. multisensor information fusion
      3. neural networks

      Qualifiers

      • Research-article
      • Research
      • Refereed


Cited By

• (2025) Eff-WHAR: A Lightweight Design for Efficient Wearable Sensor-Based Human Activity Recognition. IEEE Sensors Journal 25(2), 3935-3948. DOI: 10.1109/JSEN.2024.3509961. Online: 15-Jan-2025
• (2024) Temporal Action Localization for Inertial-based Human Activity Recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8(4), 1-19. DOI: 10.1145/3699770. Online: 21-Nov-2024
• (2024) Spatial-Temporal Masked Autoencoder for Multi-Device Wearable Human Activity Recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7(4), 1-25. DOI: 10.1145/3631415. Online: 12-Jan-2024
• (2024) DiamondNet: A Neural-Network-Based Heterogeneous Sensor Attentive Fusion for Human Activity Recognition. IEEE Transactions on Neural Networks and Learning Systems 35(11), 15321-15331. DOI: 10.1109/TNNLS.2023.3285547. Online: Nov-2024
• (2024) GT-WHAR: A Generic Graph-Based Temporal Framework for Wearable Human Activity Recognition With Multiple Sensors. IEEE Transactions on Emerging Topics in Computational Intelligence 8(6), 3912-3924. DOI: 10.1109/TETCI.2024.3378331. Online: Dec-2024
• (2024) Enhancing Efficiency in HAR Models: NAS Meets Pruning. 2024 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops), 33-38. DOI: 10.1109/PerComWorkshops59983.2024.10502894. Online: 11-Mar-2024
• (2024) iMove: Exploring Bio-Impedance Sensing for Fitness Activity Recognition. 2024 IEEE International Conference on Pervasive Computing and Communications (PerCom), 194-205. DOI: 10.1109/PerCom59722.2024.10494489. Online: 11-Mar-2024
• (2024) A Lightweight Deep Human Activity Recognition Algorithm Using Multiknowledge Distillation. IEEE Sensors Journal 24(19), 31495-31511. DOI: 10.1109/JSEN.2024.3443308. Online: 1-Oct-2024
• (2024) EarDA: Towards Accurate and Data-Efficient Earable Activity Sensing. 2024 IEEE Coupling of Sensing & Computing in AIoT Systems (CSCAIoT), 1-7. DOI: 10.1109/CSCAIoT62585.2024.00005. Online: 13-May-2024
• (2024) A Systematic Review of Human Activity Recognition Based on Mobile Devices: Overview, Progress and Trends. IEEE Communications Surveys & Tutorials 26(2), 890-929. DOI: 10.1109/COMST.2024.3357591. Online: 23-Jan-2024
