ABSTRACT
Digitization of human motion using 2D or 3D skeleton representations offers exciting possibilities for many applications but, at the same time, requires scalable content-based retrieval techniques to make such data reusable. Although much research effort focuses on extracting content-preserving motion features, techniques that support efficient similarity search on a large scale are lacking. In this paper, we introduce a new indexing scheme for organizing large collections of spatio-temporal skeleton sequences. Specifically, we apply the motion-word concept to transform skeleton sequences into structured text-like motion documents, and index such documents using an extended inverted-file approach. Over this index, we design a new similarity search algorithm that exploits the properties of the motion-word representation and provides efficient retrieval with a variable level of approximation, potentially reaching constant search costs regardless of the collection size. Experimental results confirm the usefulness of the proposed approach.
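The following is a minimal sketch of the general idea of indexing text-like motion documents with an inverted file. It is not the paper's extended index or its search algorithm. It assumes that each motion word is a plain string token and that similarity is the number of shared motion words; the names `build_index`, `search`, and `max_postings` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_index(documents):
    """Map each motion word to a postings list of (doc_id, term_frequency)."""
    index = defaultdict(list)
    for doc_id, words in documents.items():
        for word, tf in Counter(words).items():
            index[word].append((doc_id, tf))
    return index


def search(index, query_words, k=3, max_postings=None):
    """Rank documents by the number of motion words shared with the query.

    max_postings is a hypothetical knob: it caps the total number of
    postings read, so the cost stays bounded however large the
    collection is, at the price of an approximate answer.
    """
    scores = Counter()
    budget = max_postings
    query_tf = Counter(query_words)
    # Visit the rarest query words first (shortest postings lists),
    # since they are the cheapest to read.
    by_rarity = sorted(query_tf.items(), key=lambda kv: len(index.get(kv[0], ())))
    for word, qtf in by_rarity:
        for doc_id, tf in index.get(word, ()):
            if budget is not None:
                if budget == 0:
                    return scores.most_common(k)
                budget -= 1
            scores[doc_id] += min(qtf, tf)
    return scores.most_common(k)


# Toy motion documents (assumed data, for illustration only).
docs = {
    "walk": ["w1", "w2", "w3", "w2"],
    "run": ["w2", "w4", "w5"],
    "jump": ["w6", "w7"],
}
index = build_index(docs)
print(search(index, ["w1", "w2", "w2"]))
print(search(index, ["w1", "w2", "w2"], max_postings=1))
```

In this sketch, capping the number of postings read is one simple way to trade accuracy for a search cost that does not grow with the collection; the variable level of approximation described in the abstract may be achieved quite differently.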
Index Terms
- Efficient Indexing of 3D Human Motions