An Approach for Object Recognition in Videos for Vocabulary Extraction

  • Conference paper
Nature of Computation and Communication (ICTCC 2023)

Abstract

English is the most common language globally, and its importance continues to grow. Most online documents, information, and content are written in English. However, its large vocabulary makes English difficult for many learners to memorize. Many technologies have therefore been proposed to support English learning, such as word-matching games that keep children engaged and introduce them to English from an early age. In addition, translation tools help users look up vocabulary, antonyms, synonyms, and examples. This study presents a method to support English learning via real-time object detection in videos, images, and live streams, using deep learning architectures such as You Only Look Once (YOLO), one of the leading families of object detection models with state-of-the-art performance. The method obtains an mAP of 55.6 at 17 GFLOPs. For each detected object, the system outputs the corresponding vocabulary word, its meaning, and an example sentence using that word. Our method achieves good accuracy on a dataset of 2786 images belonging to 59 classes.
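The step from detections to vocabulary can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a detector (for example a YOLO model) has already produced per-frame lists of labels with confidence scores, and the names `Detection`, `GLOSSARY`, `extract_vocabulary`, and `build_flashcards`, as well as the threshold values, are hypothetical choices made for illustration only.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object in a frame (hypothetical structure)."""
    label: str
    confidence: float


# Hypothetical glossary: label -> (meaning, example sentence).
# A real system would presumably use a dictionary resource instead.
GLOSSARY = {
    "dog": ("a domesticated animal often kept as a pet",
            "The dog is sleeping on the sofa."),
    "cup": ("a small container used for drinking",
            "She filled the cup with tea."),
}


def extract_vocabulary(frames, min_confidence=0.5, min_frames=2):
    """Return labels detected confidently in at least `min_frames` frames.

    Requiring several frames is an assumed heuristic to suppress
    one-off false positives in video; it is not taken from the paper.
    """
    frame_counts = Counter()
    for detections in frames:
        # Count each label at most once per frame.
        labels = {d.label for d in detections if d.confidence >= min_confidence}
        frame_counts.update(labels)
    return sorted(label for label, n in frame_counts.items() if n >= min_frames)


def build_flashcards(words, glossary=GLOSSARY):
    """Attach a meaning and an example sentence to each word."""
    cards = []
    for word in words:
        meaning, example = glossary.get(word, ("(no entry)", "(no example)"))
        cards.append({"word": word, "meaning": meaning, "example": example})
    return cards


if __name__ == "__main__":
    frames = [
        [Detection("dog", 0.9), Detection("cup", 0.4)],
        [Detection("dog", 0.8), Detection("cup", 0.7)],
        [Detection("cup", 0.6), Detection("bird", 0.95)],
    ]
    for card in build_flashcards(extract_vocabulary(frames)):
        print(card)
```

In this sketch a label counts once per frame, so an object that appears only briefly (here `bird`) is dropped, while labels missing from the glossary fall back to placeholder text rather than raising an error.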

Acknowledgement

This study is funded in part by Can Tho University (code: THS2022-15).

Author information

Corresponding author

Correspondence to Hai Thanh Nguyen.

Copyright information

© 2024 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Le, A.B.N. et al. (2024). An Approach for Object Recognition in Videos for Vocabulary Extraction. In: Cong Vinh, P., Mahfooz Ul Haque, H. (eds) Nature of Computation and Communication. ICTCC 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 586. Springer, Cham. https://doi.org/10.1007/978-3-031-59462-5_3

  • DOI: https://doi.org/10.1007/978-3-031-59462-5_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-59461-8

  • Online ISBN: 978-3-031-59462-5

  • eBook Packages: Computer Science, Computer Science (R0)
