DOI: 10.1145/3704522.3704557

Short Paper: Detecting and Decoding: A YOLO-Transformer Hybrid Model for Bangla License Plate Recognition

Published: 03 January 2025

Abstract

Automatic Number Plate Recognition (ANPR) systems have become critical for traffic management and law enforcement, especially in regions with unique script challenges such as Bangladesh. This paper presents a comprehensive approach to detecting and extracting text from Bangladeshi license plates using a combination of state-of-the-art object detection models and text extraction techniques. We employ three versions of the YOLO (You Only Look Once) model—YOLOv5, YOLOv8, and YOLOv10—to detect Bangladeshi license plates with accuracy rates of 96.5%, 96.8%, and 96.87%, respectively. For text extraction from the detected plates, we leverage a Transformer-based model, achieving an overall text recognition accuracy of 89.96%. Our results demonstrate that YOLOv10 marginally outperforms its predecessors in detection accuracy, while text extraction performance remains consistent across all detection models. This study offers a robust solution for Bangladeshi license plate recognition, paving the way for further improvements in regional ANPR systems. We also developed a dataset by merging previously available datasets with our own collection of 40,000 vehicle images, and to address class imbalance we supplemented it with 60,000 synthetic images.

1 Introduction

Automatic Number Plate Recognition (ANPR) systems have gained significant attention, particularly in the context of intelligent transportation systems, with widespread implementation in various countries. These systems play a pivotal role in tasks such as traffic law enforcement, traffic monitoring, and vehicle park management. Beyond conventional applications, ANPR systems are instrumental in facilitating tasks like toll collection, entrance and exit management in vehicle parks, and enforcing security measures in restricted areas such as military campsites and protected sanctuaries. Their versatile utility extends to fraud prevention and heightened security measures in specific regions, aiding in locating missing vehicles or those associated with criminal activities.
The deployment of ANPR systems significantly reduces the need for extensive human labor, time, and resources that would otherwise be required for similar tasks. Moreover, manual intervention in such activities introduces the risk of erroneous interpretations, while reading license plates of moving vehicles efficiently poses practical challenges for human operators.
The unique challenges in the Bangladeshi ANPR landscape stem from the variability in license plate designs and the scarcity of labeled data. Traditional approaches often fall short in delivering consistent and accurate results in such dynamic and diverse scenarios.
Major contributions of this research:
Hybrid Architecture: We combine YOLO-based license plate detection with a Transformer-based text extractor, integrating the strengths of object detection and sequence modeling into a single ANPR pipeline.
Dataset Enrichment: We collected 40,000 real-world images from various locations across Bangladesh to address the data scarcity of Bangladeshi vehicle images. Despite this, we encountered certain edge cases for which real-world data was unavailable. To cover these cases, we generated 60,000 synthetic images.

2 Related Work

The Automatic Number Plate Recognition (ANPR) system has been a focus of research for many years, with researchers around the world exploring various methods to enhance its development. Abdullah et al. [1] utilized YOLOv3 for license plate detection and ResNet-20 for character recognition. Their dataset consisted of 1,500 license plate images and 6,400 character images for training the localization and recognition models, respectively. They reported an accuracy of 92.7%. However, their approach only targeted plates from the Dhaka Metropolitan Area, limiting its ability to generalize to other cities. Dhar et al. [11] proposed a Shape Validation Technique for license plate detection, followed by tilt correction and Connected Component Analysis to segment text, characters, and digits. For recognition, they employed an Adaboost Classifier using two key features: Histogram of Oriented Gradients (HOG) and Local Binary Pattern (LBP). Their dataset included 2,800 images across 14 different classes, achieving an accuracy of 97.2%. Sarif et al. [27] proposed a system that uses YOLOv3 for license plate localization and a custom segmentation algorithm to extract text, characters, and digits from the plates. These segmented elements were then fed into a CNN model for recognition, achieving a 97.5% accuracy. However, their model was tested on only 16 different classes, which is insufficient for real-world scenarios involving Bangladeshi vehicle license plates. Additionally, the dataset primarily consisted of private vehicles from Dhaka, making their claims less robust when applied to license plates of commercial vehicles or those from other regions. Saif et al. [26] proposed using the YOLOv3 model for both number plate localization and recognition. Their dataset, however, was limited to just 1,050 images of private vehicles. While they reported an accuracy of 99.5%, this claim does not hold for commercial vehicle license plates, which were not included in their dataset. Additionally, their accuracy measurement was based on a binary evaluation of the entire license plate, rather than a more granular, character-level approach.
Kumari et al. [18] proposed an approach that applies image preprocessing techniques followed by Contour Tracing and Edge Detection for license plate localization. For character segmentation and recognition, they utilized neural network models, aiming to enhance the accuracy of the overall system. Ahmed et al. [3] and Choudhary et al. [9] primarily focused on the recognition aspect of license plates. In [3], Ahmed et al. employed Horizontal and Vertical Projection along with a Gray Level Co-occurrence Matrix to extract readable text from plates. In contrast, Choudhary et al. [9] used a combined CNN-LSTM model for character segmentation and recognition, achieving a claimed success rate of 99.64%. Venkateswari et al. [31] focused on license plate localization, utilizing the highest Horizontal and Vertical histogram values to extract the Region of Interest (ROI) for accurate plate detection. In [30], Surekha et al. reported achieving an accuracy of 97%. They performed several image preprocessing operations and compared Morphological Processing with Edge Processing for license plate area extraction. For character extraction, they utilized Connected Component Analysis and recognized the characters using a supervised learning model.
Most of the proposed systems are not well-suited for Bangladeshi vehicle license plates, as many are tailored to specific regions, languages, and types of license plates. However, some prior work has been conducted specifically for Bangladeshi license plates. For instance, Nooruddin et al. [21] proposed utilizing color features in conjunction with MinPool and MaxPool features to enhance license plate detection. Amin et al. [5] proposed a system that combines Edge Detection, Binary Thresholding, and Hough Transformation for license plate localization, followed by Optical Character Recognition (OCR) for recognizing text in the Bangla language. However, their approach has not achieved notable accuracy and lacks generalizability across different contexts. In their paper, Baten et al. [8] proposed a method that leverages a unique feature of the Bangla language known as "Matra" along with Connected Component Analysis for text detection and segmentation. They then employed Template Matching for the recognition phase. However, they provided limited information regarding their dataset and the accuracy of their approach. Abedin et al. [2] proposed using Contour Properties for both license plate detection and character segmentation, followed by a CNN model for character recognition. They reported an overall accuracy of 92% with a processing time of 0.11 seconds. However, their dataset primarily consisted of private vehicles, and they did not account for all vehicle categories or focus on performance under night conditions. Rahman et al. [23] concentrated solely on the recognition task, requiring manual extraction of license plates and individual characters from the images. They then utilized a CNN model to recognize the characters. Their dataset consisted of 1,750 images, which involved considerable effort to compile.
In [7], Azam et al. focused primarily on noise removal from images to enhance the detection of license plate regions, achieving a detection accuracy of 94%. Their approach included the use of a frequency domain mask to eliminate rain strokes, a contrast enhancement method, Radon transform for tilt correction, and an image entropy-based technique to filter license plate regions. Hossain et al. [13] developed a system based on various image processing operations, utilizing the Sobel edge operator, dilation, erosion, boundary features, and horizontal and vertical projection to extract license plate regions. They then divided the extracted plate region into two halves, using boundary features for character segmentation and Template Matching for recognition. However, their system struggles with ambiguous character recognition and images tilted beyond 10 degrees. They claimed 90% accuracy. Chowdhury et al. [10] extracted the license plate region using color information and segmented it into two halves based on centroid data, followed by character extraction using bounding box parameters. They used a Support Vector Machine (SVM) for character recognition and claimed a 99.3% accuracy rate. However, their system was limited to private vehicle images and struggled when the license plate was out of focus or when the image quality was not ideal. Furthermore, their testing was restricted to only 14 classes, limiting its applicability. In [15], Islam et al. used Horizontal and Vertical projections along with geometric properties to extract license plate regions after preprocessing. Character localization was performed using Connected Component Analysis and bounding box techniques. For character recognition, they employed an SVM model using features extracted with Histogram of Oriented Gradients (HOG). While they achieved high recognition accuracy, their system did not account for non-ideal conditions. It failed when image resolution was low and struggled to detect license plates from commercial vehicles. Ahsan et al. [4] proposed a system that uses Template Matching to localize the license plate region, employs Spatial Super Resolution techniques to enhance image quality, and utilizes the Bounding Box method for character segmentation. They used AlexNet for character recognition, achieving a high accuracy of 98.2%. However, they did not provide details about the number of classes AlexNet was trained on. Additionally, the Template Matching technique often struggles to detect targets when the license plate is tilted in the image.
Qadri et al. [22] employed a Smearing algorithm to extract the license plate region, followed by row and column segmentation for Optical Character Recognition (OCR) to recognize the text from the plate. Shidore et al. [28] utilized the Sobel Filter, Morphological Operations, Connected Component Analysis, and Vertical Projection Analysis for license plate detection, and employed an SVM for character recognition. Lekhana et al. [19] presented an approach that combines Spectral Analysis with Connected Component Analysis for detecting license plate regions, followed by the use of an SVM for character recognition. Ashtari et al. [6] reported achieving significant accuracy in their paper, where they proposed a system utilizing color features and a hybrid classifier combining a Decision Tree and an SVM for license plate detection and recognition. Wang et al. [33] employed Image Processing techniques for the license plate localization and segmentation stages, and used a Convolutional Neural Network (CNN) model for character recognition. Jain et al. [16] utilized Image Processing techniques with Sobel Edge Detection for license plate localization, followed by Optical Character Recognition (OCR) to recognize the characters on the license plate. Lin et al. [20] employed the YOLOv2 model for vehicle and license plate localization, used classic Image Processing operations for segmentation, and implemented a custom LPR-CNN model for character recognition.

3 Dataset

The Bangladesh Road Transport Authority (BRTA) serves as the regulatory agency tasked with overseeing, managing, and enforcing discipline and safety in the country’s road transport sector. In 2012, BRTA launched a new vehicle license plate system called the Retro-Reflective License Plate, widely known as the digital license plate, as part of its digitalization efforts. Since its rollout, it has become mandatory for vehicles to display this license plate on their rear.
The digital license plates are classified into two categories: one for private vehicles and the other for trading vehicles. Private vehicle plates have a white background with black text (Fig. 1a), while trading vehicle plates feature a green background with black text (Fig. 1b). Each plate contains two separate rows of text, characters, and numbers.
Figure 1: BRTA Standard Vehicle Registration Plate Structure
In the top row, the first word indicates the district where the vehicle was registered. The optional second word identifies the area if the vehicle is registered in a metropolitan zone. The only character in this row, separated by a hyphen, denotes the category of the vehicle.
In the bottom row, the first two digits represent the vehicle’s class registration number, followed by four additional digits separated by a hyphen, which together constitute the vehicle’s serial number. It is mandatory for the license plates to display information in the Bangla language.
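To make this layout concrete, the following minimal sketch parses an already-recognized plate string into its fields according to the structure described above. The regular expression, helper name, and example string are illustrative assumptions, not part of our pipeline.

import re

# Hypothetical parser for a BRTA-style plate string such as "ঢাকা মেট্রো-গ ১১-২২৩৩".
# Top row: district, optional metropolitan area, hyphen, vehicle-category character.
# Bottom row: two-digit class registration number, hyphen, four-digit serial number.
PLATE_PATTERN = re.compile(
    r"^(?P<district>[^\s-]+)"                   # registration district
    r"(?:\s+(?P<area>[^\s-]+))?"                # optional metropolitan area
    r"-(?P<category>\S)\s+"                     # vehicle category character
    r"(?P<class_no>\S{2})-(?P<serial>\S{4})$"   # class number and serial number
)

def parse_plate(text: str) -> dict:
    """Split a recognized plate string into its BRTA fields (sketch only)."""
    match = PLATE_PATTERN.match(text.strip())
    return match.groupdict() if match else {}

print(parse_plate("ঢাকা মেট্রো-গ ১১-২২৩৩"))
# {'district': 'ঢাকা', 'area': 'মেট্রো', 'category': 'গ', 'class_no': '১১', 'serial': '২২৩৩'}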
We collected a comprehensive dataset of vehicle and license plate images specific to Bangladeshi vehicles, along with their corresponding annotations.
The dataset was significantly enriched by contributions from Hossain et al. [14], which includes a combination of images sourced from Nooruddin et al. [21] and additional images collected by the authors. The first subset of this dataset comprises approximately 2,800 images designed for vehicle localization, while the second subset contains around 4,000 license plate images, which were cropped from the initial dataset for focused analysis.
Figure 2: Image for License Plate Detection
Another dataset was introduced by Shomee et al. [29], who compiled a detailed dataset comprising 1,928 images for vehicle localization (Fig. 2) and an additional 2,662 license plate images. The second subset includes 720 synthetic images and 1,942 manually cropped images, which were derived from the localization dataset.
We combined these two datasets, along with their annotations, and integrated them with our own collected images to create a more comprehensive dataset for vehicle and license plate recognition tasks. For localization, both datasets included bounding box annotations for license plates. However, text extraction posed a greater challenge due to a mismatch in the number of annotation classes across the datasets.
Figure 3: Synthetic License Plate Images
We collected 40,000 vehicle images from various regions of Bangladesh, each annotated for license plate detection and text extraction. Upon merging all available data, the dataset revealed a significant imbalance, with approximately 75% of the vehicles registered in the Dhaka metropolitan area. Addressing this imbalance with real-world data proved challenging, so we generated 70,000 synthetic license plate images (Fig. 3) to ensure a more representative distribution from other districts, improving the overall dataset diversity. For synthetic data generation, we primarily adhered to the BRTA’s standard vehicle registration plate structure. However, recognizing that many vehicles in Bangladesh do not comply with the proper BRTA format (Fig. 4), we also generated a subset of synthetic images featuring irregular license plates to better reflect real-world variations.
Figure 4: Some Non-compliant License Plates
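As a rough illustration of how such synthetic plates can be rendered, the sketch below draws a two-row BRTA-style plate with Pillow. The font path, canvas size, and text values are placeholder assumptions; the actual generator additionally varies fonts, noise, blur, and plate geometry to mimic real captures.

from PIL import Image, ImageDraw, ImageFont

FONT_PATH = "fonts/NotoSansBengali-Regular.ttf"  # placeholder path to a Bangla font

def render_plate(top_row: str, bottom_row: str, private: bool = True) -> Image.Image:
    """Draw a two-row plate: white/black for private vehicles, green/black for trading vehicles."""
    background = (255, 255, 255) if private else (0, 128, 0)
    plate = Image.new("RGB", (440, 220), background)
    draw = ImageDraw.Draw(plate)
    font = ImageFont.truetype(FONT_PATH, 56)
    draw.text((220, 65), top_row, font=font, fill=(0, 0, 0), anchor="mm")
    draw.text((220, 155), bottom_row, font=font, fill=(0, 0, 0), anchor="mm")
    return plate

render_plate("ঢাকা মেট্রো-গ", "১১-২২৩৩").save("synthetic_plate.png")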

4 Methodology

Our model is designed with two key components. First, it detects the license plate within an image, and then it extracts the text from the detected area.

4.1 License Plate Detection

For the detection task, numerous image processing techniques have been utilized. However, a significant limitation of these approaches, including edge detection methods, is their sensitivity to specific conditions. Our objective was to design a more robust solution capable of functioning effectively across diverse scenarios, including variations in lighting, weather conditions, and license plate deformations.
In our approach, we explored and implemented various deep learning models, including state-of-the-art object detection frameworks. Specifically, we fine-tuned R-CNN [12] and Faster R-CNN [25], as well as multiple versions of the YOLO (You Only Look Once) algorithm [24], including versions 5, 8, and 10 [32]. After extensive evaluation, we selected YOLOv5 for deployment, primarily because it offers the best trade-off between detection accuracy and inference speed (see Section 5).
Figure 5: License plate detection
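For reference, a minimal inference sketch with a fine-tuned YOLOv5 checkpoint is shown below; it loads the detector through torch.hub and crops the highest-confidence plate from an input image. The weight and image paths are placeholders.

from typing import Optional

import torch
from PIL import Image

# Load a YOLOv5 model fine-tuned for license plate detection (placeholder weight path).
model = torch.hub.load("ultralytics/yolov5", "custom", path="weights/lp_yolov5.pt")
model.conf = 0.5  # confidence threshold

def crop_plate(image_path: str) -> Optional[Image.Image]:
    """Return the highest-confidence license plate crop, or None if no plate is detected."""
    results = model(image_path)
    boxes = results.xyxy[0]  # tensor of shape (N, 6): x1, y1, x2, y2, confidence, class
    if boxes.shape[0] == 0:
        return None
    x1, y1, x2, y2, _, _ = boxes[boxes[:, 4].argmax()].tolist()
    return Image.open(image_path).crop((int(x1), int(y1), int(x2), int(y2)))

plate_crop = crop_plate("samples/vehicle.jpg")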

4.2 Text Extraction

Extracting text from the license plate region is essential for the effective development of an Automatic Number Plate Recognition (ANPR) system. Many prior studies have concentrated on segmenting individual characters in the license plate using different techniques, followed by separate recognition of each segment. However, this method is often inefficient and introduces unnecessary complexity to the process.
Figure 6: OCR Error
We evaluated several OCR engines, but their performance proved insufficient for real-world implementation within our pipeline (Fig. 6). As an alternative, we employed an object detection-based approach, where each digit and district name is treated as a separate class. While this approach shows significant promise, its performance still falls short of the desired standards (Fig. 7).
Figure 7: Error in Object Detection-based Text Extraction
We subsequently explored a different approach, given the recent advancements in document understanding models that have demonstrated exceptional performance in real-world applications. Models such as LayoutLM [34] have significantly transformed the landscape of information and text extraction from highly distorted and deformed documents. Among these advancements, the most groundbreaking innovation is the DONUT [17] model, which has shown remarkable capabilities in this domain.
The Document Understanding Transformer (DONUT) is a model based on the Swin Transformer architecture, designed to perform three key document-related tasks:
Document classification
Information extraction
Visual question-answering
In this study, we focus on the second task—information extraction—to specifically address the challenge of extracting text from detected license plates. To this end, we have adapted the information extraction component of the DONUT model and fine-tuned it using our curated dataset. This approach enables the architecture to effectively extract license plate information from images.
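A minimal inference sketch for the fine-tuned extractor, following the standard Hugging Face Transformers interface for DONUT, is given below. The checkpoint name and task prompt are placeholders standing in for our fine-tuned model.

import re

from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

CKPT = "our-org/donut-bangla-lp"  # hypothetical fine-tuned checkpoint
processor = DonutProcessor.from_pretrained(CKPT)
model = VisionEncoderDecoderModel.from_pretrained(CKPT)

def extract_plate_text(plate_crop: Image.Image) -> str:
    """Run DONUT information extraction on a cropped license plate image."""
    pixel_values = processor(plate_crop.convert("RGB"), return_tensors="pt").pixel_values
    task_prompt = "<s_bangla_lp>"  # assumed task start token used during fine-tuning
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids
    outputs = model.generate(
        pixel_values,
        decoder_input_ids=decoder_input_ids,
        max_length=64,
        pad_token_id=processor.tokenizer.pad_token_id,
        eos_token_id=processor.tokenizer.eos_token_id,
    )
    text = processor.batch_decode(outputs)[0]
    return re.sub(r"<.*?>", "", text).strip()  # drop task and special tokens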

5 Results

To conduct the experiments, we used NVIDIA A4000 and NVIDIA A5000 GPUs on a machine with 32 GB of RAM and an Intel(R) Core(TM) i7-14700K CPU.
Table 1: Model Evaluation Metrics

Model | Accuracy | Precision | Recall | F1 Score | IoU
R-CNN | 91.56 | 90.34 | 91.22 | 90.78 | 78.32
Faster R-CNN | 91.37 | 89.8 | 90.5 | 90.15 | 78.56
YOLOv5 | 96.5 | 98.64 | 97.72 | 98.17 | 85.62
YOLOv8 | 96.8 | 99.26 | 97.4 | 98.32 | 85.63
YOLOv10 | 96.87 | 99.01 | 97.8 | 98.49 | 86.7
Table 1 presents a comparison of various object detection models fine-tuned for the specific task of license plate detection. The results indicate that YOLOv10 outperformed the other models in terms of accuracy. Therefore, if accuracy is the primary consideration, YOLOv10 emerges as the optimal choice for this application.
However, we used YOLOv5 in deployment because its inference time is better; the comparison is shown in Table 2.
Table 2: Inference Time

Model | Time (ms)
R-CNN | 41.25
Faster R-CNN | 37.33
YOLOv5 | 19.3
YOLOv8 | 24.7
YOLOv10 | 22.46
In object detection tasks, the performance metric utilized is Intersection over Union (IoU). A prediction is considered a positive result if the IoU exceeds 50%, while predictions with an IoU below this threshold are classified as negative results. From these results, we calculated the precision, recall, and F1 score.
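For clarity, the IoU between a predicted box and a ground-truth box is the area of their intersection divided by the area of their union; a small sketch of this computation (boxes given as corner coordinates) follows.

def iou(box_a, box_b):
    """Intersection over Union for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A detection counts as a positive result only when iou(prediction, ground_truth) > 0.5.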
Table 3: Accuracy of Text Extraction

Model | Accuracy
OCR (EasyOCR) | 34.24
OCR (Tesseract) | 32.33
Object Detection (YOLOv5) | 49.65
Object Detection (YOLOv8) | 53.06
DONUT [17] | 89.96
As demonstrated in Table 3, the DONUT model significantly outperformed other approaches in real-world scenarios. Additionally, the DONUT model exhibits impressive speed, requiring only 200 ms to extract license plate numbers from images when using an NVIDIA A5000 GPU, while it takes approximately 1.5 to 2 seconds on an average CPU.

6 Discussion

The results of this study highlight the effectiveness of the YOLOv5 + DONUT hybrid model for automatic number plate recognition (ANPR) in the context of Bangladeshi vehicles. By leveraging the strengths of both models—YOLOv5’s robust object detection capabilities and the DONUT model’s efficiency in extracting textual information—we were able to achieve high levels of accuracy and speed in image-based license plate recognition tasks.
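Using the hypothetical helpers sketched earlier (crop_plate, extract_plate_text, and parse_plate), the two stages combine into a pipeline along these lines:

plate_crop = crop_plate("samples/vehicle.jpg")   # stage 1: YOLOv5 plate detection
if plate_crop is not None:
    plate_text = extract_plate_text(plate_crop)  # stage 2: DONUT text extraction
    fields = parse_plate(plate_text)             # optional: split into BRTA fields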
Table 4: Comparison With Other Approaches. The table compares [23], [1], [11], [13], [10], [4], [29], [14], and our YOLO+Donut model across four criteria: Night operation, R/G W (Rainy/Gloomy Weather), D/A LP (Deformed/Abnormal License Plate), and DD (Diverse Dataset).
Table 4 demonstrates that our hybrid model consistently outperforms other approaches across all challenging aspects. While the accuracy of our model is slightly lower in comparison, it excels in challenging real-world conditions. Specifically, it performs robustly under varying lighting conditions (both day and night), and adverse weather, including rainy or overcast environments. Moreover, our model effectively detects and extracts text from deformed or non-compliant license plates. A key strength of our approach lies in its extensive use of real-world data, complemented by synthetic data to ensure balanced representation. This comprehensive design has proven superior to alternative methods in real-world scenarios.

7 Conclusion

This paper primarily seeks to identify the optimal solution for an automatic license plate recognition system specifically designed for vehicles in Bangladesh. To achieve this objective, we conducted a series of experiments with several state-of-the-art models, assessing their performance in various scenarios.
Among the models evaluated, we were pleasantly surprised by the exceptional results yielded by the DONUT model, which demonstrated significant efficacy in this domain. Consequently, we developed a hybrid system that integrates YOLOv5 with the DONUT model. This hybrid approach strikes an optimal balance between accuracy and inference speed, making it particularly suitable for our application.
Currently, our model operates exclusively on still images, effectively extracting license plate information. However, we envision future enhancements that will enable our system to process video feeds, allowing for real-time recognition and display of results. This advancement would significantly enhance the practical applicability of our automatic license plate recognition system in real-world settings.

References

[1]
Sohaib Abdullah, Md Mahedi Hasan, and Sheikh Muhammad Saiful Islam. 2018. YOLO-based three-stage network for Bangla license plate recognition in Dhaka metropolitan city. In 2018 International Conference on Bangla Speech and Language Processing (ICBSLP). IEEE, 1–6.
[2]
Md Zainal Abedin, Atul Chandra Nath, Prashengit Dhar, Kaushik Deb, and Mohammad Shahadat Hossain. 2017. License plate recognition system based on contour properties and deep learning model. In 2017 IEEE region 10 humanitarian technology conference (R10-HTC). IEEE, 590–593.
[3]
Abdullah Khalid Ahmed, Mohammed Qasim Taha, and Ahmed Shamil Mustafa. 2018. On-road automobile license plate recognition using co-occurrence matrix. (2018).
[4]
Nur-A Alam, Mominul Ahsan, Md Abdul Based, and Julfikar Haider. 2021. Intelligent system for vehicles number plate detection and recognition using convolutional neural networks. Technologies 9, 1 (2021), 9.
[5]
Md Ruhul Amin, Noor Mohammad, and Md Abu Naser Bikas. 2014. An automatic number plate recognition of Bangladeshi vehicles. International Journal of Computer Applications 93, 15 (2014), 16293–5999.
[6]
Amir Hossein Ashtari, Md Jan Nordin, and Mahmood Fathy. 2014. An Iranian license plate recognition system based on color features. IEEE transactions on intelligent transportation systems 15, 4 (2014), 1690–1705.
[7]
Samiul Azam and Md Monirul Islam. 2016. Automatic license plate detection in hazardous condition. Journal of Visual Communication and Image Representation 36 (2016), 172–186.
[8]
Raiyan Abdul Baten, Zunaid Omair, and Urmita Sikder. 2014. Bangla license plate reader for metropolitan cities of Bangladesh using template matching. In 8th International Conference on Electrical and Computer Engineering. IEEE, 776–779.
[9]
Namrata Choudhary and Kirti Jain. 2019. License plate recognition using combination of CNN-LSTM. The International journal of analytical and experimental modal analysis 11, IX (2019).
[10]
Md Burhan Uddin Chowdhury, Prashengit Dhar, and Sunanda Guha. 2020. Detection and recognition of Bangladeshi license plate. International Journal 9, 3 (2020).
[11]
Prashengit Dhar, Md Zainal Abedin, Razuan Karim, Mohammad Shahadat Hossain, et al. 2019. Bangladeshi license plate recognition using adaboost classifier. In 2019 Joint 8th International Conference on Informatics, Electronics & Vision (ICIEV) and 2019 3rd International Conference on Imaging, Vision & Pattern Recognition (icIVPR). IEEE, 342–347.
[12]
Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 580–587.
[13]
Mohammad Jaber Hossain, Md Hasan Uzzaman, and AFMS Saif. 2018. Bangla digital number plate recognition using template matching for higher accuracy and less time complexity. International Journal of Computer Applications 975 (2018), 8887.
[14]
Syed Nahin Hossain, Md Zahim Hassan, and Md Masum Al Masba. 2022. Automatic license plate recognition system for bangladeshi vehicles using deep neural network. In Proceedings of the International Conference on Big Data, IoT, and Machine Learning: BIM 2021. Springer, 91–102.
[15]
Rashedul Islam, Md Rafiqul Islam, and Kamrul Hasan Talukder. 2020. An efficient method for extraction and recognition of bangla characters from vehicle license plates. Multimedia Tools and Applications 79, 27 (2020), 20107–20132.
[16]
Kartikeya Jain, Tanupriya Choudhury, and Nirbhay Kashyap. 2017. Smart vehicle identification system using OCR. In 2017 3rd international conference on computational intelligence & communication technology (CICT). IEEE, 1–6.
[17]
Geewook Kim, Teakgyu Hong, Moonbin Yim, JeongYeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, and Seunghyun Park. 2022. Ocr-free document understanding transformer. In European Conference on Computer Vision. Springer, 498–517.
[18]
Sweta Kumari, Leeza Gupta, and Prena Gupta. 2017. Automatic license plate recognition using OpenCV and neural network. International Journal of Computer Science Trends and Technology (IJCST) 5, 3 (2017), 114–118.
[19]
GC Lekhana and R Srikantaswamy. 2012. Real time license plate recognition system. International Journal of Advanced Technology & Engineering Research 2, 4 (2012), 5–9.
[20]
Cheng-Hung Lin, Yong-Sin Lin, and Wei-Chen Liu. 2018. An efficient license plate recognition system using convolution neural networks. In 2018 IEEE International Conference on Applied System Invention (ICASI). IEEE, 224–227.
[21]
Sheikh Nooruddin, Falguni Ahmed Sharna, and Sk Md Masudul Ahsan. 2020. A bangladeshi license plate detection system based on extracted color features. In 2020 23rd international conference on computer and information technology (ICCIT). IEEE, 1–6.
[22]
Muhammad Tahir Qadri and Muhammad Asif. 2009. Automatic number plate recognition system for vehicle identification using optical character recognition. In 2009 international conference on education technology and computer. IEEE, 335–338.
[23]
MM Shaifur Rahman, Moin Mostakim, Mst Shamima Nasrin, and Md Zahangir Alom. 2019. Bangla license plate recognition using convolutional neural networks (CNN). In 2019 22nd International Conference on Computer and Information Technology (ICCIT). IEEE, 1–6.
[24]
J Redmon. 2016. You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition.
[25]
Shaoqing Ren. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. arXiv preprint arXiv:1506.01497 (2015).
[26]
Nazmus Saif, Nazir Ahmmed, Sayem Pasha, Md Saif Khan Shahrin, Md Mahmudul Hasan, Salekul Islam, and Abu Shafin Mohammad Mahdee Jameel. 2019. Automatic license plate recognition system for bangla license plates using convolutional neural network. In TENCON 2019-2019 IEEE Region 10 Conference (TENCON). IEEE, 925–930.
[27]
Md Mesbah Sarif, Tanmoy Sarkar Pias, Tanjina Helaly, Md Sohel Rana Tutul, and Md Nymur Rahman. 2020. Deep learning-based bangladeshi license plate recognition system. In 2020 4th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT). IEEE, 1–6.
[28]
MM Shidore and SP Narote. 2011. Number plate recognition for indian vehicles. IJCSNS International Journal of Computer Science and Network Security 11, 2 (2011), 143–146.
[29]
Homaira Huda Shomee and Ataher Sams. 2021. License plate detection and recognition system for all types of bangladeshi vehicles using multi-step deep learning model. In 2021 Digital Image Computing: Techniques and Applications (DICTA). IEEE, 01–07.
[30]
P Surekha, Pavan Gurudath, R Prithvi, and VG Ananth. 2018. Automatic license plate recognition using image processing and neural network. ICTACT Journal on Image & Video Processing 8, 4 (2018).
[31]
P Venkateswari, E Jebitha Steffy, and N Muthukumaran. 2018. License Plate Cognizance by Ocular Character Perception. International Research Journal of Engineering and Technology 5, 2 (2018), 536–542.
[32]
Ao Wang, Hui Chen, Lihao Liu, Kai Chen, Zijia Lin, Jungong Han, and Guiguang Ding. 2024. YOLOv10: Real-time end-to-end object detection. arXiv preprint arXiv:2405.14458 (2024).
[33]
Chuin-Mu Wang and Jian-Hong Liu. 2015. License plate recognition system. In 2015 12th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD). IEEE, 1708–1710.
[34]
Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, and Ming Zhou. 2020. Layoutlm: Pre-training of text and layout for document image understanding. In Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. 1192–1200.

Keywords: ANPR, Bangla License Plate, YOLO, Donut, Transformers
