ABSTRACT
Deep learning's success has come with ever-larger models and huge amounts of data, making efficiency one of the field's central concerns. Many methods have been proposed to reduce model complexity and have achieved promising results. In this paper, we approach the problem from the data perspective. By combining the strengths of tensor methods in data processing with the efficiency of deep learning models, we aim to reduce costs in several respects, such as storage and computation. We present a data-driven deep learning approach for high-dimensional data classification. Specifically, we use Tucker decomposition, a tensor decomposition method, to factorize large, complex raw data into small factors, which then serve as the input to a lightweight deep model architecture. We evaluate our approach on high-dimensional, complex tasks: video classification on the Jester dataset and 3D object classification on the ModelNet dataset. Our proposal achieves competitive results at a reasonable cost.
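To make the preprocessing step concrete, here is a minimal NumPy sketch of truncated Tucker decomposition via higher-order SVD (HOSVD), compressing a raw tensor into a small core plus per-mode factor matrices. This is an illustrative stand-in, not the paper's implementation; the function name `hosvd`, the example shape `(16, 32, 32)` (e.g. a short video clip), and the ranks `(4, 8, 8)` are assumptions chosen for the demo.

```python
import numpy as np

def hosvd(X, ranks):
    """Truncated Tucker decomposition via higher-order SVD.

    Returns a core tensor of shape `ranks` and one orthonormal
    factor matrix per mode, computed from mode-n unfolding SVDs.
    """
    factors = []
    for mode, r in enumerate(ranks):
        # Mode-n unfolding: bring `mode` to the front, flatten the rest.
        unfolded = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
        factors.append(U[:, :r])  # keep the r leading left singular vectors
    # Core tensor: project X onto each factor's column space, mode by mode.
    core = X
    for mode, U in enumerate(factors):
        contracted = np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1)
        core = np.moveaxis(contracted, 0, mode)
    return core, factors

# A hypothetical raw input: 16 frames of 32x32 "video" data.
X = np.random.rand(16, 32, 32)
core, factors = hosvd(X, ranks=(4, 8, 8))
print(core.shape)  # (4, 8, 8)
```

The compressed representation (256 core entries plus 576 factor entries) is roughly 20x smaller than the original 16,384-entry tensor; under the approach described in the abstract, factors like these would be fed to a lightweight classifier instead of the raw data.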
Index Terms
- High Dimensional Data Classification Approach with Deep Learning and Tucker Decomposition