Hierarchical Spatial–Temporal Window Transformer for Pose-Based Rodent Behavior Recognition | IEEE Journals & Magazine | IEEE Xplore

Abstract:

In the fields of neuroscience and pharmacology, understanding rodent behavior is of vital importance for studying the effects of genetic manipulations and pharmacological therapies. Conventional behavior recognition methods based on raw images often struggle with noise, such as changes in lighting conditions and image backgrounds. Pose-based approaches, on the other hand, have demonstrated robustness against these challenges. However, existing methods rely on manually engineered features, which are time-consuming to design and may not fully exploit the potential of the pose data. In this work, we propose the hierarchical spatial–temporal window transformer network (HSTWFormer), a novel approach that efficiently extracts multiscale and cross-spacetime features from rodent pose data. By adopting a pure Transformer structure, HSTWFormer not only avoids the need for a predefined skeletal topology but also enables adaptive recognition of interactive behaviors between multiple rodents. By merging the features of temporal neighbors, we construct a hierarchical structure with different receptive fields that retains essential information at all scales, enabling the extraction of semantic features from low level to high level. Furthermore, a spatial–temporal window attention (STWA) block is introduced to capture correlations between different key points across frames. The STWA blocks facilitate the extraction of both short-term and long-term cross-spacetime features by enabling interactions between windows through window shifting, enhancing the network's modeling performance. The effectiveness of the proposed HSTWFormer is demonstrated on two datasets, CRIM13 and CalMS21. We achieved accuracies of 79.3% and 69.8% for interactive and overall behaviors on the CRIM13 dataset, and 76.4% accuracy on the CalMS21 dataset.
Our method harnesses the wealth of information embedded in key points, showcasing robust modeling capabilities for accurate rodent behavior recognition, and provid...
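The two core ideas described above, attention restricted to spatial–temporal windows with half-window shifting, followed by merging the features of temporal neighbors to build the hierarchy, can be sketched in a toy NumPy form. Everything here is illustrative: the function names, window sizes, and the single-head, projection-free attention are assumptions for the sketch, not the paper's actual implementation, which would use learned query/key/value projections, multiple heads, and additional network components.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def merge_temporal_neighbors(x):
    """Halve the temporal resolution by concatenating the features of
    adjacent frame pairs -- one hierarchical merging step."""
    T, V, C = x.shape
    return x.reshape(T // 2, 2, V, C).transpose(0, 2, 1, 3).reshape(T // 2, V, 2 * C)

def window_attention(x, win_t, win_v, shift=False):
    """Self-attention restricted to (win_t x win_v) spatial-temporal windows.

    x: (T, V, C) array -- T frames, V key points, C feature channels.
    With shift=True the grid is rolled by half a window before
    partitioning, so consecutive blocks exchange information across
    window borders (the window-shifting idea)."""
    T, V, C = x.shape
    if shift:
        x = np.roll(x, (-(win_t // 2), -(win_v // 2)), axis=(0, 1))
    out = np.empty_like(x)
    for t0 in range(0, T, win_t):
        for v0 in range(0, V, win_v):
            blk = x[t0:t0 + win_t, v0:v0 + win_v]
            tokens = blk.reshape(-1, C)
            # scaled dot-product attention among the tokens of one window
            attn = softmax(tokens @ tokens.T / np.sqrt(C))
            out[t0:t0 + win_t, v0:v0 + win_v] = (attn @ tokens).reshape(blk.shape)
    if shift:
        out = np.roll(out, (win_t // 2, win_v // 2), axis=(0, 1))
    return out

# One hierarchical stage: plain window attention, shifted window
# attention, then temporal merging to enlarge the receptive field.
feats = np.random.default_rng(0).normal(size=(16, 7, 8))  # 16 frames, 7 key points
feats = window_attention(feats, win_t=4, win_v=7)
feats = window_attention(feats, win_t=4, win_v=7, shift=True)
feats = merge_temporal_neighbors(feats)
print(feats.shape)  # (8, 7, 16)
```

Alternating unshifted and shifted blocks is what lets information cross window boundaries without paying the cost of global attention over all frames and key points at once.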
Article Sequence Number: 2512914
Date of Publication: 18 March 2024

