Generalized self-cueing real-time attention scheduling with intermittent inspection and image resizing

Real-Time Systems

Abstract

This paper proposes a generalized self-cueing real-time attention scheduling framework for DNN-based visual machine perception pipelines on resource-limited embedded platforms. Self-cueing means that subframe-level regions of interest in a scene are identified internally, by exploiting temporal correlations among successive video frames, rather than externally via a cueing sensor. One limitation of our original self-cueing-and-inspection strategy (Liu et al. in Proceedings of the 28th IEEE real-time and embedded technology and applications symposium (RTAS), 2022b) is its lack of computational efficiency under high workloads, such as busy traffic scenarios in which a large number of objects are identified and separately inspected. We extend the conference publication by integrating image resizing with intermittent inspection and task batching in attention scheduling. The extension accelerates the processing of large objects by reducing their resolution, at the cost of only a negligible degradation in accuracy, and thereby achieves a higher overall object inspection throughput. After extracting partial regions around objects of interest using an optical flow-based tracking algorithm, we allocate computation resources (i.e., DNN inspection) to them in a criticality-aware manner using a generalized batched proportional balancing (GBPB) algorithm, which minimizes a generalized notion of system uncertainty. GBPB saves computational resources by inspecting low-priority regions intermittently at low frequencies and inspecting large objects at low resolutions. We implement the system on an NVIDIA Jetson Xavier platform and extensively evaluate its performance using a real-world driving dataset from Waymo. The proposed GBPB algorithm consistently outperforms both the previous BPB algorithm, which uses only intermittent inspection, and a set of baselines. The performance gain of GBPB is larger under more significant resource constraints (i.e., lower sampling intervals or busy traffic scenarios) because its multi-dimensional scheduling strategy achieves better resource allocation for machine perception.
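
The two resource-saving ideas named in the abstract (inspecting low-priority regions less often and inspecting large objects at lower resolution) can be illustrated with a toy sketch. This is not the GBPB algorithm, whose definition is not part of this page; the `Region` type, the `plan_inspections` function, the period-doubling rule, and the `max_side` threshold are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A tracked region of interest (hypothetical structure)."""
    name: str
    weight: float  # criticality weight; larger means more critical (assumed)
    width: int     # region width in pixels
    height: int    # region height in pixels


def plan_inspections(regions, max_side=128):
    """Return {name: (period, scale)} for each region.

    Assumed toy rules, not taken from the paper:
    - period: the most critical region is inspected every frame; the
      period doubles each time a region's weight is at most half of
      the weight that the previous period would require.
    - scale: regions whose longer side exceeds max_side are shrunk so
      that the longer side equals max_side; smaller regions keep
      their resolution.
    """
    if not regions:
        return {}
    if any(r.weight <= 0 for r in regions):
        raise ValueError("weights must be positive")
    top = max(r.weight for r in regions)
    plan = {}
    for r in regions:
        period = 1
        while period * 2 * r.weight <= top:
            period *= 2
        scale = min(1.0, max_side / max(r.width, r.height))
        plan[r.name] = (period, scale)
    return plan


if __name__ == "__main__":
    demo = [
        Region("pedestrian", 4.0, 64, 128),
        Region("car", 2.0, 256, 128),
        Region("sign", 1.0, 32, 32),
    ]
    for name, (period, scale) in plan_inspections(demo).items():
        print(f"{name}: inspect every {period} frame(s) at scale {scale}")
```

In this sketch the inspection period and the resolution are chosen independently per region; the abstract describes a joint, criticality-aware allocation, so the example only conveys the direction of the two trade-offs.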

Notes

  1. This operation is used to align the inspection times among objects to trigger more batching opportunities.

  2. \(L/x_i\) is an integer since both L and \(x_i\) are power-of-2 multiples of the minimum non-zero element in \({\mathcal {C}}\), and \(x_i\le L\).

  3. We assume that it is beneficial to slice the image and run the inspection tasks at the sub-frame level.

  4. Without loss of generality, we assume that \(\left\lfloor \frac{w_N {\tilde{x}}^*_{{\hat{i}}}}{w_{{\hat{i}}}} \right\rfloor \ge 1\) and that \(\frac{{\tilde{x}}^*_{{\hat{i}}}}{w_{{\hat{i}}}}\) is an integer; otherwise, we can simply take the largest i for which this expression is non-zero and leave out the remaining objects.

  5. The specific definition of the metric will be given later.

  6. Note that the object size is not identical to the object target size, because the target size depends not only on the object size but also on the object motion.

  7. https://github.com/ultralytics/yolov5.

  8. https://github.com/Cartucho/mAP.
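
Notes 1 and 2 can be made concrete with a small sketch. It assumes, purely for illustration, that each \(x_i\) is an integer inspection period measured in units of a common base and that L is the largest such period; the precise meaning of these symbols is defined in the full article, not on this page. The function name `batches_over_hyperperiod` and the example objects are hypothetical.

```python
from collections import defaultdict


def batches_over_hyperperiod(periods):
    """Group objects by the time slots in which they are inspected.

    periods maps an object name to an integer period x_i (assumed to be
    a power-of-2 multiple of a common base, so the largest period L is
    divisible by every x_i). All objects start at slot 0, which aligns
    their inspection times and lets coinciding inspections be batched.
    """
    if not periods:
        return {}
    L = max(periods.values())
    for name, x in periods.items():
        if x <= 0 or L % x != 0:
            raise ValueError(f"period of {name!r} does not divide L={L}")
    slots = defaultdict(list)
    for name, x in periods.items():
        for t in range(0, L, x):  # exactly L // x inspections per window
            slots[t].append(name)
    return dict(slots)


if __name__ == "__main__":
    schedule = batches_over_hyperperiod({"a": 1, "b": 2, "c": 4})
    for t in sorted(schedule):
        print(t, schedule[t])
```

Because every period divides L, each object is inspected a whole number of times (L // x) per window, and with a shared starting slot the less frequent objects always land on slots that more frequent objects already occupy.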

References

  • Amert T, Otterness N, Yang M, et al (2017) GPU scheduling on the nvidia tx2: hidden details revealed. In: 2017 IEEE real-time systems symposium (RTSS), IEEE, pp 104–115

  • Amert T, Tong Z, Voronov S, et al (2021) Timewall: enabling time partitioning for real-time multicore+ accelerator platforms. In: 2021 IEEE real-time systems symposium (RTSS), IEEE, pp 455–468

  • Bastani F, Madden S (2021) Multiscope: efficient video pre-processing for exploratory video analytics. CoRR abs/2103.14695. arXiv:2103.14695

  • Bateni S, Liu C (2018) Apnet: approximation-aware real-time neural network. In: 2018 IEEE real-time systems symposium (RTSS), IEEE, pp 67–79

  • Bewley A, Ge Z, Ott L, et al (2016) Simple online and realtime tracking. In: 2016 IEEE international conference on image processing (ICIP), IEEE, pp 3464–3468

  • Buckler M, Bedoukian P, Jayasuriya S, et al (2018) Eva$^2$: Exploiting temporal redundancy in live computer vision. In: 2018 ACM/IEEE 45th annual international symposium on computer architecture (ISCA), IEEE, pp 533–546

  • Capodieci N, Cavicchioli R, Bertogna M, et al (2018) Deadline-based scheduling for gpu with preemption support. In: 2018 IEEE real-time systems symposium (RTSS), IEEE, pp 119–130

  • Cavigelli L, Degen P, Benini L (2017) Cbinfer: change-based inference for convolutional neural networks on video data. In: Proceedings of the 11th international conference on distributed smart cameras, pp 1–8

  • Chin T, Ding R, Marculescu D (2019) Adascale: Towards real-time video object detection using adaptive scaling. In: Talwalkar A, Smith V, Zaharia M (eds) Proceedings of machine learning and systems 2019, MLSys 2019, Stanford, CA, USA, March 31–April 2, 2019. mlsys.org

  • Grana C, Borghesani D, Cucchiara R (2010) Optimized block-based connected components labeling with decision trees. IEEE Trans Image Process 19(6):1596–1609

  • Heo S, Cho S, Kim Y, et al (2020) Real-time object detection system with multi-path neural networks. In: 2020 IEEE real-time and embedded technology and applications symposium (RTAS), IEEE, pp 174–187

  • Heo S, Jeong S, Kim H (2022) Rtscale: Sensitivity-aware adaptive image scaling for real-time object detection. In: 34th euromicro conference on real-time systems (ECRTS 2022), Schloss Dagstuhl-Leibniz-Zentrum für Informatik

  • Holte R, Mok A, Rosier L, et al (1989) The pinwheel: A real-time scheduling problem. In: Proceedings of the 22nd Hawaii international conference of system science, pp 693–702

  • Hu Y, Liu S, Abdelzaher T, et al (2021) On exploring image resizing for optimizing criticality-based machine perception. In: 2021 IEEE 27th international conference on embedded and real-time computing systems and applications (RTCSA), IEEE, pp 169–178

  • Hu Y, Liu S, Abdelzaher T, et al (2022) Real-time task scheduling with image resizing for criticality-based machine perception. Real-Time Systems pp 1–26

  • Jang W, Jeong H, Kang K, et al (2020) R-tod: Real-time object detector with minimized end-to-end delay for autonomous driving. In: Proc. IEEE real-time systems symposium (RTSS)

  • Ji M, Yi S, Koo C, et al (2022) Demand layering for real-time dnn inference with minimized memory usage. In: 2022 IEEE real-time systems symposium (RTSS), IEEE, pp 291–304

  • Kang W, Lee K, Lee J, et al (2021) Lalarand: Flexible layer-by-layer cpu/gpu scheduling for real-time dnn tasks. In: 2021 IEEE real-time systems symposium (RTSS), IEEE, pp 329–341

  • Kang D, Lee S, Chwa HS, et al (2022a) Rt-mot: Confidence-aware real-time scheduling framework for multi-object tracking tasks. In: 2022 IEEE real-time systems symposium (RTSS), IEEE, pp 318–330

  • Kang W, Chung S, Kim JY, et al (2022b) Dnn-sam: Split-and-merge dnn execution for real-time object detection. In: 2022 IEEE 28th real-time and embedded technology and applications symposium (RTAS), IEEE, pp 160–172

  • Kannan T, Hoffmann H (2021) Budget rnns: Multi-capacity neural networks to improve in-sensor inference under energy budgets. In: 2021 IEEE 27th real-time and embedded technology and applications symposium (RTAS), IEEE, pp 143–156

  • Kroeger T, Timofte R, Dai D, et al (2016) Fast optical flow using dense inverse search. In: European conference on computer vision, Springer, pp 471–488

  • Kumar AR, Ravindran B, Raghunathan A (2019) Pack and detect: Fast object detection in videos using region-of-interest packing. In: Proceedings of the ACM India joint international conference on data science and management of data, pp 150–156

  • Lee S, Nirjon S (2020a) Fast and scalable in-memory deep multitask learning via neural weight virtualization. In: Proceedings of the 18th international conference on mobile systems, applications, and services, pp 175–190

  • Lee S, Nirjon S (2020b) Subflow: A dynamic induced-subgraph strategy toward real-time dnn inference and training. In: 2020 IEEE real-time and embedded technology and applications symposium (RTAS), IEEE, pp 15–29

  • Li X, Yin F, Zhang X, et al (2021) Adaptive scaling for archival table structure recognition. In: Lladós J, Lopresti D, Uchida S (eds) 16th International Conference on Document Analysis and Recognition, ICDAR 2021, Lausanne, Switzerland, September 5-10, 2021, Proceedings, Part I, Lecture Notes in Computer Science, vol 12821. Springer, pp 80–95

  • Lin TY, Maire M, Belongie S, et al (2014) Microsoft coco: Common objects in context. In: European conference on computer vision, Springer, pp 740–755

  • Liu S, Yao S, Fu X, et al (2020a) On removing algorithmic priority inversion from mission-critical machine inference pipelines. In: Proc. IEEE real-time systems symposium (RTSS)

  • Liu S, Yao S, Li J, et al (2020b) Globalfusion: a global attentional deep learning framework for multisensor information fusion. Proc ACM Interactive Mob Wearable Ubiquitous Technol 4(1):1–27

  • Liu S, Yao S, Fu X, et al (2021) Real-time task scheduling for machine perception in intelligent cyber-physical systems. IEEE Trans Comput

  • Liu L, Dong Z, Wang Y, et al (2022a) Prophet: Realizing a predictable real-time perception pipeline for autonomous vehicles. In: 2022 IEEE real-time systems symposium (RTSS), IEEE, pp 305–317

  • Liu S, Fu X, Wigness M, et al (2022b) Self-cueing real-time attention scheduling in criticality-aware visual machine perception. In: Proceedings of the 28th IEEE real-time and embedded technology and applications symposium (RTAS)

  • Liu S, Wang T, Guo H, et al (2022c) Multi-view scheduling of onboard live video analytics to minimize frame processing latency. In: 2022 IEEE 42nd international conference on distributed computing systems (ICDCS), pp 503–514

  • Liu S, Wang T, Li J, et al (2022d) Adamask: Enabling machine-centric video streaming with adaptive frame masking for dnn inference offloading. In: Proceedings of the 30th ACM international conference on multimedia, pp 3035–3044

  • Mao H, Kong T, Dally WJ (2018) Catdet: cascaded tracked detector for efficient object detection from video. arXiv:1810.00434

  • Minnehan B, Savakis A (2019) Cascaded projection: End-to-end network compression and acceleration. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 10715–10724

  • Najibi M, Singh B, Davis L (2019) Autofocus: Efficient multi-scale inference. In: 2019 IEEE/CVF international conference on computer vision, ICCV 2019, Seoul, Korea (South), October 27–November 2, 2019. IEEE, pp 9744–9754

  • Razavi K, Luthra M, Koldehofe B, et al (2022) Fa2: fast, accurate autoscaling for serving deep learning inference with sla guarantees. In: 2022 IEEE 28th real-time and embedded technology and applications symposium (RTAS), IEEE, pp 146–159

  • Redmon J, Divvala S, Girshick R, et al (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779–788

  • Restuccia F, Biondi A (2021) Time-predictable acceleration of deep neural networks on FPGA SOC platforms. In: 2021 IEEE real-time systems symposium (RTSS), IEEE, pp 441–454

  • Song Z, Fu B, Wu F, et al (2020) Drq: dynamic region-based quantization for deep neural network acceleration. In: 2020 ACM/IEEE 47th annual international symposium on computer architecture (ISCA), IEEE, pp 1010–1021

  • Soyyigit A, Yao S, Yun H (2022) Anytime-lidar: deadline-aware 3D object detection. In: 2022 IEEE 28th international conference on embedded and real-time computing systems and applications (RTCSA), IEEE, pp 31–40

  • Sun P, Kretzschmar H, Dotiwalla X, et al (2020) Scalability in perception for autonomous driving: Waymo open dataset. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 2446–2454

  • Torralba A (2009) How many pixels make an image? Vis Neurosci 26(1):123–131

  • Wang S, Lu H, Deng Z (2019) Fast object detection in compressed video. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 7104–7113

  • Wu J, Subasharan V, Tran T, et al (2022) MRIM: enabling mixed-resolution imaging for low-power pervasive vision tasks. In: IEEE international conference on pervasive computing and communications, PerCom 2022, Pisa, Italy, March 21–25, 2022. IEEE, pp 44–53

  • Xiang Y, Kim H (2019) Pipelined data-parallel CPU/GPU scheduling for multi-DNN real-time inference. In: 2019 IEEE real-time systems symposium (RTSS), IEEE, pp 392–405

  • Xu M, Zhu M, Liu Y, et al (2018) Deepcache: principled cache for mobile deep vision. In: Proceedings of the 24th annual international conference on mobile computing and networking, pp 129–144

  • Yang Z, Nahrstedt K, Guo H, et al (2021) Deeprt: a soft real time scheduler for computer vision applications on the edge. arXiv:2105.01803

  • Yao S, Zhao Y, Shao H, et al (2018) Fastdeepiot: towards understanding and optimizing neural network execution time on mobile and embedded devices. In: Proceedings of the 16th ACM conference on embedded networked sensor systems, pp 278–291

  • Yao S, Hao Y, Zhao Y, et al (2020a) Scheduling real-time deep learning services as imprecise computations. In: Proc. IEEE international conference on embedded and real-time computing systems and applications (RTCSA)

  • Yao S, Li J, Liu D, et al (2020b) Deep compressive offloading: Speeding up neural network inference by trading edge computation for network latency. In: Proceedings of the international conference on embedded networked sensor systems (SenSys)

  • Zhang S, Lin W, Lu P, et al (2017) Kill two birds with one stone: boosting both object detection accuracy and speed with adaptive patch-of-interest composition. In: 2017 IEEE international conference on multimedia & expo workshops (ICMEW), IEEE, pp 447–452

  • Zhou Y, Moosavi-Dezfooli SM, Cheung NM, et al (2018) Adaptive quantization for deep neural network. In: Thirty-Second AAAI conference on artificial intelligence

  • Zhu X, Wang Y, Dai J, et al (2017a) Flow-guided feature aggregation for video object detection. In: Proceedings of the IEEE international conference on computer vision, pp 408–417

  • Zhu X, Xiong Y, Dai J, et al (2017b) Deep feature flow for video recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2349–2358

Acknowledgements

Research reported in this paper was sponsored in part by the U.S. DEVCOM Army Research Laboratory under Cooperative Agreement W911NF-17-20196, NSF CNS 20-38817, IBM (IIDAI), and the Boeing Company. The views and conclusions contained in this document are those of the author(s) and should not be interpreted as representing the official policies of the U.S. DEVCOM Army Research Laboratory or the U.S. government. The U.S. government is authorized to reproduce and distribute reprints for government purposes notwithstanding any copyright notation hereon.

Author information

Corresponding author

Correspondence to Tarek Abdelzaher.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, S., Fu, X., Hu, Y. et al. Generalized self-cueing real-time attention scheduling with intermittent inspection and image resizing. Real-Time Syst 59, 302–343 (2023). https://doi.org/10.1007/s11241-023-09396-z

  • DOI: https://doi.org/10.1007/s11241-023-09396-z
