ABSTRACT
Mobile-centric AI applications place high demands on the resource efficiency of model inference. Input filtering is a promising approach that eliminates redundancy in the input to reduce the cost of inference. Previous efforts have tailored effective solutions for many applications, but left two essential questions unanswered: (1) the theoretical filterability of an inference workload, which would guide the application of input filtering techniques and thereby avoid trial-and-error costs for resource-constrained mobile applications; and (2) the robust discriminability of feature embeddings, which would allow input filtering to be widely effective across diverse inference tasks and input content. To answer these questions, we first provide a generic formalization of the input filtering problem and theoretically compare the hypothesis complexity of inference models and their input filters to understand the optimization potential of applying input filtering. We then propose the first end-to-end learnable input filtering framework, which covers most state-of-the-art methods and surpasses them in feature embedding with robust discriminability. Based on this framework, we design and implement InFi, an input filtering system that supports six input modalities. InFi is the first to support text and sensor signal inputs, as well as the model partitioning deployments widely adopted by under-resourced mobile systems. Comprehensive evaluations confirm our theoretical results and show that InFi outperforms strong baselines in applicability, accuracy, and efficiency, owing to its generality and end-to-end learnability. For a video analytics application on mobile platforms, InFi can achieve 8.5× throughput and save 95% of bandwidth while keeping accuracy above 90%.