
QAOVDetect: A Novel Syllogistic Model with Quantized and Anchor Optimized Approach to Assist Visually Impaired for Animal Detection using 3D Vision


Abstract

In developing countries, stray animals are frequently encountered on roads, pathways, campuses, and other public places. This puts the visually impaired (VI) at greater risk than sighted people, so they need a solution that improves their safety and security. This work aims to develop a hardware–software integrated solution that detects stray animals, estimates their distance from the user, and alerts the user when an animal is getting closer. A Jetson Nano and a ZED Mini stereo camera, both leading edge devices, were chosen for processing and image capture to keep the solution mobile and accurate. A novel approach involving quantization and anchor optimization is proposed, using a Single-Shot Detector (SSD) ResNet50 FPN as the base model. The model is compressed by quantization to reduce inference time, and anchor optimization is applied to compensate for the accuracy loss incurred by quantization. We trained the original model, the anchor-optimized model, and the quantized plus anchor-optimized model with batch sizes 64 and 8, in order to understand both the effect of anchor optimization and quantization on the base model and the effect of training batch size on each model variant. The performance of all model variants, for both batch sizes, is reported as mean average precision (mAP). The quantized plus anchor-optimized model trained with batch size 64 achieved the highest mAP, 93.5%. We conclude that by balancing quantization and anchor optimization with batch size 64, a lightweight model with the best performance can be obtained, making it suitable for an edge device.
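The abstract does not spell out the authors' quantization pipeline. As a minimal sketch only, the code below shows full-integer post-training quantization with TensorFlow Lite, a common route for deploying detectors such as SSD ResNet50 v1 FPN on edge devices like the Jetson Nano. The saved-model path, the 640×640 input size, and the random calibration generator are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (not the authors' pipeline): full-integer post-training
# quantization of an exported detector with TensorFlow Lite. Weights and
# activations are both mapped to 8-bit integers, matching the "wt quant",
# "act quant", and "uint" terms listed under Abbreviations.
import numpy as np
import tensorflow as tf

def representative_data_gen():
    # Calibration data for activation quantization. Random tensors are
    # placeholders; real calibration should feed a few hundred
    # preprocessed frames from the training set.
    for _ in range(100):
        yield [np.random.rand(1, 640, 640, 3).astype(np.float32)]

# "exported_ssd_model/saved_model" is a hypothetical export path.
converter = tf.lite.TFLiteConverter.from_saved_model("exported_ssd_model/saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open("ssd_resnet50_fpn_quant.tflite", "wb") as f:
    f.write(tflite_model)
```

An int8 model of this kind trades a small amount of accuracy for lower latency and memory, which is precisely the loss that the paper's anchor optimization is meant to recover.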




Abbreviations

SSD: Single-Shot Detector
FPN: Feature Pyramid Network
L: Loss function
σ: Sigmoid activation function
ao: Anchor optimization
q: Quantization
act quant: Activation quantization
wt quant: Weight quantization
uint: Unsigned integer
Org + BS64: SSD ResNet50 v1 FPN, batch size 64
Org + AO + BS64: SSD ResNet50 v1 FPN + anchor optimization, batch size 64
Org + AO + Q + BS64: SSD ResNet50 v1 FPN + anchor optimization + quantization, batch size 64
Org + BS8: SSD ResNet50 v1 FPN, batch size 8
Org + AO + BS8: SSD ResNet50 v1 FPN + anchor optimization, batch size 8
Org + AO + Q + BS8: SSD ResNet50 v1 FPN + anchor optimization + quantization, batch size 8
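For context on the "ao" entries above: anchor optimization tunes the detector's anchor-box shapes to the target dataset. The paper does not state its exact procedure, so the sketch below shows one standard approach, k-means clustering of ground-truth box shapes under a 1 − IoU distance (popularized by YOLOv2); the `boxes_wh` input and the choice of k = 6 are illustrative assumptions, not the authors' method.

```python
# Hedged sketch of one common anchor-optimization technique: cluster the
# (width, height) of ground-truth boxes with k-means under a 1 - IoU
# distance and use the centroids as anchor shapes. Not the authors' code.
import numpy as np

def kmeans_anchors(boxes_wh: np.ndarray, k: int = 6, iters: int = 100) -> np.ndarray:
    """Cluster (width, height) pairs into k anchor shapes."""
    boxes_wh = boxes_wh.astype(np.float64)
    anchors = boxes_wh[np.random.choice(len(boxes_wh), k, replace=False)]
    for _ in range(iters):
        # IoU of every box against every anchor, with boxes treated as
        # co-centered so that only their shapes matter.
        inter = (np.minimum(boxes_wh[:, None, 0], anchors[None, :, 0]) *
                 np.minimum(boxes_wh[:, None, 1], anchors[None, :, 1]))
        union = ((boxes_wh[:, 0] * boxes_wh[:, 1])[:, None] +
                 (anchors[:, 0] * anchors[:, 1])[None, :] - inter)
        assign = np.argmax(inter / union, axis=1)  # best-matching anchor per box
        for j in range(k):
            if np.any(assign == j):  # move each centroid to its cluster median
                anchors[j] = np.median(boxes_wh[assign == j], axis=0)
    return anchors

# Usage with hypothetical annotation data:
# boxes_wh = np.array([[w, h] for (w, h) in dataset_box_sizes])
# anchors = kmeans_anchors(boxes_wh, k=6)
```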


Acknowledgements

The authors would like to thank The Blind Relief Association, Delhi, for providing the opportunity to interact with and receive valuable feedback from blind persons. We would also like to thank the Department for International Development (DFID), UK, for organizing the Assistive Technology Exhibition in collaboration with the Skill Council for Persons with Disability (ScPWD) and Assistech, IIT Delhi, which helped us interact with blind, partially blind, and sighted persons.

Author information


Contributions

This paper is the result of the work of all the authors. All authors jointly defined the research problem, then contributed to developing the model, testing it, and analysing the results obtained from experimentation. All authors contributed to the writing of the paper, and all checked and approved the final manuscript.

Corresponding author

Correspondence to Kanak Manjari.

Ethics declarations

Ethical Approval

All applicable international, national, and/or institutional guidelines for the care and use of animals were followed. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards.

Informed Consent

Informed consent was obtained from all participants included in the study.

Conflicts of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Manjari, K., Verma, M., Singal, G. et al. QAOVDetect: A Novel Syllogistic Model with Quantized and Anchor Optimized Approach to Assist Visually Impaired for Animal Detection using 3D Vision. Cogn Comput 14, 1269–1286 (2022). https://doi.org/10.1007/s12559-022-10020-8


Keywords

Navigation