This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
In recent years, neural networks have been studied extensively. Among them, the convolutional neural network (CNN), which has high adaptability and excellent recognition ability, has been widely used in fields such as classification, recognition, and target detection [1–3]. Examples include Italian wine and iris classification [4], malicious code variant detection [5], and operational target threat assessment [6]. Because of its abundant hardware resources, powerful parallel computing capability, and configurability, the FPGA has become an ideal platform for neural network implementation. Many researchers have therefore studied methods of accelerating neural networks with FPGAs [7–9].
Deep learning target detection algorithms are being studied both domestically and internationally. This research not only reduces the computational cost of the algorithms and improves the efficiency of target detection in practical applications but also promotes the development of development platforms.
The rapid development of deep learning has advanced research on target detection algorithms. In [10], Yi et al. introduced an adaptive strategy into the basic probabilistic neural network (PNN) model and proposed an adaptive neural network that better solved the transformer fault diagnosis problem. In [11], learning-based intelligent optimization algorithms (LIOAs) are studied; these have a certain learning ability and achieve better optimization behavior. Having studied target detection algorithms, Safaei found that the training time of Fast-RCNN detection was reduced by 9.5 h, which opened up a new research path for target detection algorithms [12]. The advantages of the YOLO algorithm have promoted its use in license plate recognition systems [13, 14].
The rapid development of deep learning is inseparable from the support of engineering implementation technology. As deep learning networks have grown, the amount of computation needed for data processing has increased, and traditional CPUs can no longer meet the demand. Abd El-Maksoud et al. selected ASICs, FPGAs, and GPUs as research objects. By comparing and analyzing the characteristics and functions of these processors, they concluded that FPGAs are more suitable for developing target detection applications [15].
In [16], a neural network monitoring method is used to supervise and control a maglev train. In [4], an adaptive control method based on a neural network is proposed to stabilize the air gap of a nonlinear maglev train. Both combine neural networks with modern transportation systems. License plate recognition is one of the important research topics in applying target detection to intelligent transportation, and it has a wide range of practical applications [17]. Traditional license plate recognition algorithms have low accuracy and slow speed, and their recognition rate is easily affected by the environment. By contrast, networks such as CNN and RPNet achieve high accuracy and fast speed in license plate recognition systems [18, 19]. These results show that license plate recognition systems based on deep neural networks have clear advantages.
The research of this paper provides the following contributions:
(i) A license plate recognition algorithm, Fast-LPRNet, based on a convolutional neural network is proposed
(ii) The algorithm completes license plate detection and segmentation-free recognition simultaneously
(iii) The neural network hardware environment was successfully deployed on an FPGA, and the experiment was completed
(iv) A large number of experimental results show that the method is a fast and accurate license plate recognition algorithm
2. Overview of Convolutional Neural Networks
2.1. Network Structure
The core structure of the convolutional neural network consists of three parts: the convolutional layer, the pooling layer, and the fully connected layer. The structure of the convolutional neural network is shown in Figure 1. Each part can be a single layer or multiple layers, and the network can be trained on images or on data converted into image format. With an image as input, the main function of the convolutional layer is to extract features, the function of the pooling layer is to reduce the amount of data to be processed while retaining useful information, and the function of the fully connected layer is to transform the feature map into the final desired output.
[figure omitted; refer to PDF]
The hyperparameters, including the learning rate and momentum, are set based on experience and experiment. In the algorithm, 3 × 3 convolutions with a stride of 2 are used because they maximize computational density and run fast while retaining an appropriate receptive field [23]. The 8 × 1 convolution with a stride of 1 and the 1 × 8 convolution with a stride of 4 are used to change the layout of the feature map, because the feature map is 8 × 32 after the 3 × 3 convolutions. To simplify and accelerate the computation, only one 8 × 1 convolution and one 1 × 8 convolution are used to reach a 1 × 7 feature map layout.
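The layout change from 8 × 32 to 1 × 7 can be checked with the standard output-size formula for a convolution. The following is a minimal sketch, assuming no padding (the text does not state the padding used); the helper names `conv_out` and `conv2d_out` are illustrative and do not come from the paper.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output length along one axis: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def conv2d_out(hw, kernel_hw, stride, padding=0):
    """Output (height, width) of a 2-D convolution with the same stride on both axes."""
    h, w = hw
    kh, kw = kernel_hw
    return conv_out(h, kh, stride, padding), conv_out(w, kw, stride, padding)


# Feature map size after the 3 x 3 stride-2 convolutions, as stated in the text.
feature = (8, 32)

# An 8 x 1 kernel with stride 1 collapses the height: 8 x 32 -> 1 x 32.
after_8x1 = conv2d_out(feature, (8, 1), stride=1)

# A 1 x 8 kernel with stride 4 then shortens the width: 1 x 32 -> 1 x 7.
after_1x8 = conv2d_out(after_8x1, (1, 8), stride=4)

print(after_8x1, after_1x8)
```

Under the no-padding assumption, the width follows from (32 − 8) / 4 + 1 = 7, which matches the seven character positions on the plate.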
The convolutional layer extracts feature information from the input and is composed of several convolution units. The parameters of each convolution unit are optimized by the backpropagation algorithm. The receptive field moves regularly across the input image, and a convolution operation is performed at each position to extract the features of the corresponding region. The lower convolutional layers extract low-level features, and the higher layers extract deep features. The convolutional layer has local receptive fields and shares weights, which reduces the number of parameters in the network. The pseudocode of the Fast-LPRNet algorithm is shown in Figure 4.
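The sliding receptive field and weight sharing described above can be illustrated with a small, dependency-free sketch. It is not the Fast-LPRNet implementation; the function `conv2d`, the toy image, and the kernel values are assumptions chosen for illustration. As in most CNN frameworks, the operation computed is a cross-correlation without padding.

```python
def conv2d(image, kernel, stride=1):
    """Slide one shared kernel over the image (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            # The same kernel weights are reused at every position (weight sharing).
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# Toy 4 x 4 image with a vertical edge between the second and third columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked 2 x 2 kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

features = conv2d(image, edge_kernel)
for row in features:
    print(row)
```

The kernel has only four weights regardless of the image size, and the response is nonzero only in the column where the edge lies, which is the low-level feature this kernel detects.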
[figure omitted; refer to PDF]
The image features extracted by the convolution operation are linear, but real samples are often nonlinear. The activation function is introduced so that each pixel can be represented by any value from 0 to 1, which allows more subtle changes to be modeled. Activation functions are generally nonlinear, continuously differentiable, and monotonic. The network uses the ReLU function as the activation function. Compared with the sigmoid and tanh functions, the ReLU function has the characteristics of unilateral suppression and sparse activation [24]. That is, when x > 0, the gradient is constant at 1, so there is no vanishing gradient problem; when x < 0, the output is 0. After training, the more neurons that output 0, the sparser the network and the more representative the extracted features, which alleviates overfitting to some extent. The disadvantage of this activation function is that the forced sparsity can prevent the model from learning effective features, so the learning rate must be set carefully during training to prevent too many neurons from dying.
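The contrast between ReLU and the sigmoid function can be made concrete with a short sketch. The input values below are arbitrary illustrative numbers, not data from the paper.

```python
import math


def relu(x):
    """Unilateral suppression: negative inputs are set to 0."""
    return x if x > 0 else 0.0


def relu_grad(x):
    """Gradient of ReLU: constant 1 for x > 0, otherwise 0."""
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Gradient of the sigmoid, which shrinks toward 0 for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)


inputs = [-2.0, -0.5, 0.5, 2.0, 6.0]  # arbitrary example pre-activations
outputs = [relu(x) for x in inputs]

# Fraction of units that output exactly 0 (sparse activation).
sparsity = outputs.count(0.0) / len(outputs)

for x in inputs:
    print(x, relu(x), relu_grad(x), round(sigmoid_grad(x), 5))
print("sparsity:", sparsity)
```

For the largest input the ReLU gradient is still 1, while the sigmoid gradient has fallen below 0.01, which is the vanishing gradient effect the text refers to. The units with negative input receive zero gradient, which is also why a poorly chosen learning rate can leave many neurons permanently inactive.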
3.2. Implementation of Fast-LPRNet
3.2.1. Shared-Weight Classifier
Two shared-weight classifiers, each with a 1 × 8 convolution kernel and a stride of 4, perform a convolution on the feature map and produce two feature maps of size 1 × 7. Their channel numbers are 33 and 35, respectively. The 1 × 7 layout matches the layout of the seven characters on the license plate. The 33 channels correspond to 32 Chinese characters plus one class for an unidentified character. The 35 channels correspond to 24 letters, 10 digits, and one class for an unidentified character. Classifier1 is responsible for recognizing the Chinese character of the license plate, and classifier2 is responsible for recognizing the remaining characters; the two results are then combined to give the final recognition result.
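One way to combine the two classifier outputs is sketched below. Several details are assumptions that the text does not specify: the Chinese character is taken from position 0 of classifier1 and the remaining six characters from positions 1 to 6 of classifier2, each position is decoded by a simple argmax over channels, the 24 letters are assumed to be the alphabet without I and O, and the province labels are placeholders (`P0` to `P31`) rather than the actual characters.

```python
# Placeholder labels for the 32 Chinese characters, plus one "unidentified" class.
PROVINCES = [f"P{i}" for i in range(32)] + ["?"]

# Assumption: the 24 letters are A-Z without I and O.
LETTERS = [c for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" if c not in "IO"]
ALNUM = LETTERS + list("0123456789") + ["?"]


def argmax_per_position(scores):
    """scores[c][p] is the score of class c at position p; return the best class per position."""
    n_pos = len(scores[0])
    return [max(range(len(scores)), key=lambda c: scores[c][p]) for p in range(n_pos)]


def decode_plate(scores_cn, scores_alnum):
    """Combine classifier1 (33 x 7) and classifier2 (35 x 7) into seven characters."""
    cn = argmax_per_position(scores_cn)
    an = argmax_per_position(scores_alnum)
    return [PROVINCES[cn[0]]] + [ALNUM[i] for i in an[1:]]


def one_hot_scores(n_classes, class_ids):
    """Build a fake n_classes x len(class_ids) score map for demonstration."""
    return [[1.0 if c == cid else 0.0 for cid in class_ids] for c in range(n_classes)]


# Fake network outputs: each column is one of the seven character positions.
scores_cn = one_hot_scores(33, [5, 32, 32, 32, 32, 32, 32])
scores_alnum = one_hot_scores(35, [34, 0, 24, 25, 26, 27, 28])

plate = decode_plate(scores_cn, scores_alnum)
print(plate)
```

The class counts in the sketch follow the text: 32 + 1 = 33 channels for classifier1 and 24 + 10 + 1 = 35 channels for classifier2.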
The fully connected (FC) layer classifier is a commonly used classifier. It summarizes the results of the convolutional layers, pooling layers, activation functions, and so on, and performs an operation similar to template matching: one FC layer abstracts the probability that each feature is present, with as many features as the layer has neurons, and a further FC layer then classifies the features output by the previous one. Generally, features are extracted from the original license plate image by convolutional layers and then classified by FC layers. However, the FC classifier has drawbacks: the balance of the dataset affects its performance, and its large number of parameters makes it slow. These drawbacks can be avoided by using a weight-shared convolutional layer instead of the FC classifier, because each convolutional classifier performs a complete convolution over the license plate and treats the seven regions on the plate equally. In addition, because of weight sharing, the number of parameters is smaller than in an FC layer; a single convolution outputs six-character prediction results, and inference is much faster than with an FC layer.
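The parameter saving from weight sharing can be estimated with a rough count. This is a sketch under stated assumptions: the number of input channels `C_IN = 64` is hypothetical (the text does not give it), the classifier input is taken to be a 1 × 32 feature map, and the FC alternative is assumed to map the flattened input directly to 7 positions × number of classes.

```python
def conv_params(c_in, c_out, kh, kw, bias=True):
    """Parameters of a convolutional layer: one kh x kw x c_in kernel per output channel."""
    return c_in * c_out * kh * kw + (c_out if bias else 0)


def fc_params(n_in, n_out, bias=True):
    """Parameters of a fully connected layer: one weight per input-output pair."""
    return n_in * n_out + (n_out if bias else 0)


C_IN = 64          # hypothetical number of input channels
WIDTH = 32         # feature map width before the 1 x 8 classifiers
POSITIONS = 7      # character positions on the plate

# Two shared-weight 1 x 8 convolutional classifiers with 33 and 35 output channels.
conv_total = conv_params(C_IN, 33, 1, 8) + conv_params(C_IN, 35, 1, 8)

# Two FC classifiers on the flattened 1 x 32 x C_IN input.
fc_total = (fc_params(C_IN * WIDTH, POSITIONS * 33)
            + fc_params(C_IN * WIDTH, POSITIONS * 35))

print(conv_total, fc_total, round(fc_total / conv_total, 1))
```

With these assumed sizes the convolutional classifiers need roughly 35 thousand parameters against close to one million for the FC alternative. The exact ratio depends on the channel count, but the kernel is always reused at all seven positions instead of being duplicated per position.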
3.2.2. Convolution Layer instead of Pooling Layer
A convolution kernel with a stride of 2 is used for downsampling, replacing the pooling layer. Following ResNet, a convolutional layer with a stride of 2 replaces a pooling layer of size 2 to perform the downsampling operation [25]. The main value of the pooling layer is invariance, which includes translation, scale, and rotation invariance. Pooling also gives high-level features a larger receptive field. Experiments show that invariance has only a small effect on performance, while the enlargement of the receptive field can be achieved by a convolution with stride = 2 followed by ReLU, with performance that is essentially the same or even slightly better. The difference between the two is that pooling is a prior downsampling method, that is, its downsampling rule is fixed in advance, whereas the parameters of a stride-2 convolutional layer are learned, so its sampling rule is not fixed and it can learn more important features. The convolution diagram is shown in Figure 5.
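The relationship between a fixed pooling rule and a stride-2 convolution can be seen in a small sketch. It uses 2 × 2 average pooling as the fixed rule because a stride-2 convolution with constant weights of 0.25 reproduces it exactly; the image and the second kernel are illustrative assumptions, and the second kernel merely stands in for weights that training might produce.

```python
def conv2d(image, kernel, stride):
    """Plain 2-D convolution (cross-correlation) without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    return [
        [
            sum(image[i * stride + di][j * stride + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def avg_pool_2x2(image):
    """Fixed downsampling rule: mean of each non-overlapping 2 x 2 block."""
    return [
        [
            (image[i][j] + image[i][j + 1] + image[i + 1][j] + image[i + 1][j + 1]) / 4
            for j in range(0, len(image[0]), 2)
        ]
        for i in range(0, len(image), 2)
    ]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

# Stride-2 convolution with constant weights behaves exactly like average pooling.
fixed_kernel = [[0.25, 0.25], [0.25, 0.25]]
conv_as_pool = conv2d(image, fixed_kernel, stride=2)
pooled = avg_pool_2x2(image)

# Different weights give a different downsampling rule with the same output size.
other_kernel = [[0, 0], [0, 1]]
conv_other = conv2d(image, other_kernel, stride=2)

print(pooled, conv_as_pool, conv_other)
```

Both operations halve the spatial size, but only the convolution has weights that training can adjust, which is the sense in which its sampling rule is learned rather than fixed.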
[figure omitted; refer to PDF]
In addition, different kinds of interference were added at the upper, lower, and diagonal positions of the characters in the license plate image. As shown in Figure 11, testing confirms that the network can still accurately identify the license plate characters.
[figure omitted; refer to PDF]
Deep-LPRNet is trained on the CCPD data. The CCPD dataset is a public dataset of Chinese automobile license plates that is commonly used in license plate recognition research [27, 28]. At present, it contains more than 250,000 license plate images in a total of nine categories, as shown in Figure 13. Figure 14 shows sample images from the CCPD dataset.
[figure omitted; refer to PDF]
CCPD-Base images were cropped and resized (Figure 15) and then used for network training. CCPD-Base has nearly 200,000 images, of which 120,000 were selected as the training set and 80,000 as the test set; training used the PyTorch framework. The other hyperparameters are basically the same as those used to train Fast-LPRNet. When the network is trained to convergence, the recognition accuracy on the test set exceeds 90% (Figure 16), indicating that a network based on this structure has strong scalability and robustness and strong practical value.
[figure omitted; refer to PDF]
[figure omitted; refer to PDF]
5. Conclusion
This paper studies an FPGA-based license plate recognition network algorithm. Through combined software and hardware design, the Fast-LPRNet network is proposed on the basis of the CNN, and license plate segmentation is improved and optimized. Finally, the feasibility of the algorithm is verified: the average frame rate is 5.45 frames/second, and the accuracy in recognizing 30 license plate images is 100%. In addition, the deeper Deep-LPRNet network is introduced and tested on the CCPD-Base dataset, where the recognition accuracy exceeds 90%.
The innovation of this study is the combination of a convolutional neural network with FPGA development, but there is still room for improvement in the FPGA hardware. In future work, we hope to optimize the algorithm structure and build an optimized hardware structure on the FPGA. In addition, the method will be combined with other new metaheuristic algorithms, such as monarch butterfly optimization (MBO) [29], the earthworm optimization algorithm (EWA) [30], elephant herding optimization (EHO) [31], the moth search (MS) algorithm [32], the slime mould algorithm (SMA) [33], and Harris hawks optimization (HHO) [34]. This combination is expected to further improve the performance of Fast-LPRNet.
Acknowledgments
This study was funded by Changsha Science and Technology Project (no. kq1901139) and Science and Technology Project of Hunan Province (no. 2016WK2023).
References
[1] M. W. Lung, C. W. Chun, C. T. Wang, Y. H. Lin, "Recycling waste classification using optimized convolutional neural network," Resources, Conservation and Recycling, vol. 164, 2021.
[2] S. T. Sara, M. M. Hasan, A. Ahmad, S. Shatabda, "Convolutional neural networks with image representation of amino acid sequences for protein function prediction," Computational Biology and Chemistry, vol. 92, DOI: 10.1016/j.compbiolchem.2021.107494, 2021.
[3] N. Nahla, E. Mohammed, V. Serestina, "Face expression recognition using convolution neural network (CNN) models," International Journal of Grid Computing & Applications, vol. 11 no. 4, 2020.
[4] Y. Sun, J. Xu, G. Lin, N. Sun, "Adaptive neural network control for maglev vehicle systems with time-varying mass and external disturbance," Neural Computing and Applications, DOI: 10.1007/s00521-021-05874-2, 2021.
[5] Z. Cui, F. Xue, X. Cai, "Detection of malicious code variants based on deep learning," IEEE Transactions on Industrial Informatics, vol. 14 no. 7, pp. 3187-3196, DOI: 10.1109/tii.2018.2822680, 2018.
[6] L. Wang, L. Guo, H. Duan, "A threat assessment model and algorithm based on Elman_Adaboost strong predictor," Acta Electronica Sinica, vol. 40 no. 5, pp. 901-906, 2012.
[7] X. Wang, C. Li, J. Song, "Motion image processing system based on multi core FPGA processor and convolutional neural network," Microprocessors and Microsystems, vol. 82, DOI: 10.1016/j.micpro.2021.103923, 2021.
[8] R. A. E. El-Din, O. Marwa, M. H. El-Din, "Accelerating DNA pairwise sequence alignment using FPGA and a customized convolutional neural network," Computers and Electrical Engineering, vol. 92, 2021.
[9] A. Ghani, "Accelerating retinal fundus image classification using artificial neural networks (ANNs) and reconfigurable hardware (FPGA)," Electronics, vol. 8 no. 8, DOI: 10.3390/electronics8121522, 2019.
[10] J.-H. Yi, J. Wang, G.-G. Wang, "Improved probabilistic neural networks with self-adaptive strategies for transformer fault diagnosis problem," Advances in Mechanical Engineering, vol. 8 no. 1, pp. 109-118, DOI: 10.1177/1687814015624832, 2016.
[11] W. Li, G.-G. Wang, A. H. Gandomi, "A survey of learning-based intelligent optimization algorithms," Archives of Computational Methods in Engineering, vol. 28, 2021.
[12] A. Safaei, "System-on-a-Chip (SoC)-Based hardware acceleration for an online sequential extreme learning machine (OS-elm)," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38 no. 11, pp. 2127-2138, 2019.
[13] A. Hendry, R.-C. Chen, "Automatic License Plate Recognition via sliding-window darknet-YOLO deep learning," Image and Vision Computing, vol. 87, pp. 47-56, DOI: 10.1016/j.imavis.2019.04.007, 2019.
[14] L. Rayson, A. Zanlorensi Luiz, R. Gonçalves Gabriel, T. Eduardo, S. W. Robson, M. David, "An efficient and layout‐independent automatic license plate recognition system based on the YOLO detector," IET Intelligent Transport Systems, vol. 15 no. 4, pp. 483-503, 2021.
[15] A. J. Abd El-Maksoud, A. A. Abd El-Kader, B. G. Hassan, N. G. Rihan, M. F. Tolba, L. A. Said, A. G. Radwan, M. F. Abu-Elyazeed, "FPGA implementation of sound encryption system based on fractional-order chaotic systems," Microelectronics Journal, vol. 90, pp. 323-335, DOI: 10.1016/j.mejo.2019.05.005, 2019.
[16] Z. Cui, F. Xue, X. Cai, "Detection of malicious code variants based on deep learning," IEEE Transactions on Industrial Informatics, vol. 14 no. 7, DOI: 10.1109/tii.2018.2822680, 2018.
[17] Y. Yang, D. Li, Z. Duan, "Chinese vehicle license plate recognition using kernel‐based extreme learning machine with deep convolutional features," IET Intelligent Transport Systems, vol. 12 no. 3, pp. 213-219, DOI: 10.1049/iet-its.2017.0136, 2018.
[18] P. Marzuki, A. R. Syafeeza, Y. C. Wong, N. A. Hamid, A. N. Alisa, M. M. Ibrahim, "A design of license plate recognition system using convolutional neural network," International Journal of Electrical and Computer Engineering (IJECE), vol. 9 no. 3, pp. 2196-2204, DOI: 10.11591/ijece.v9i3.pp2196-2204, 2019.
[19] Z. Xu, W. Yang, A. Meng, N. Lu, H. Huang, C. Ying, L. Huang, "Towards end-to-end license plate detection and recognition: a large dataset and baseline," Proceedings of the European Conference on Computer Vision, DOI: 10.1007/978-3-030-01261-8_16.
[20] R. Pires de Lima, K. Marfurt, "Convolutional neural network for remote-sensing scene classification: transfer learning analysis," Remote Sensing, vol. 12 no. 1, DOI: 10.3390/rs12010086, 2019.
[21] Y. Yang, C. Xu, F. Dong, X. Wang, "A new multi-scale convolutional model based on multiple attention for image classification," Applied Sciences, vol. 10 no. 1, DOI: 10.3390/app10010101, 2019.
[22] H. Li, P. Wang, C. Shen, "Toward end-to-end car license plate detection and recognition with deep neural networks," Journal of Robotics & Machine Learning, DOI: 10.1109/tits.2018.2847291, 2019.
[23] X. Ding, X. Zhang, N. Ma, "RepVGG: Making VGG-style ConvNets great again," 2021. arXiv
[24] X. Liu, D. Guo, L. Cong, "An improvement of activation function in convolutional neural networks," Journal of Testing Technology, vol. 33 no. 2, pp. 121-125, 2019.
[25] S. Liu, G. Tian, Y. Xu, "A novel scene classification model combining ResNet based transfer learning and data augmentation with a filter," Neurocomputing, vol. 338, pp. 191-206, DOI: 10.1016/j.neucom.2019.01.090, 2019.
[26] H. Phan-Xuan, T. Le-Tien, S. Nguyen-Tan, "FPGA platform applied for facial expression recognition system using convolutional neural networks," Procedia Computer Science, vol. 151, pp. 651-658, DOI: 10.1016/j.procs.2019.04.087, 2019.
[27] S. Wu, W. Zhai, Y. Cao, "PixTextGAN: structure aware text image synthesis for license plate recognition," IET Image Processing, vol. 13 no. 14, pp. 2744-2752, DOI: 10.1049/iet-ipr.2018.6588, 2019.
[28] X. Zhou, Y. Gao, C. Li, C. Yang, "An end-to-end license plate recognition method based on multi-objective optimization and multi-task learning," Control Theory and Applications, vol. 38 no. 5, pp. 676-688, 2021.
[29] Y. Feng, S. Deb, G.-G. Wang, H. Alavi Amir, "Monarch butterfly optimization: a comprehensive review," Expert Systems with Applications, vol. 168, DOI: 10.1016/j.eswa.2020.114418, 2021.
[30] I. Ghosh, P. K. Roy, "Application of earthworm optimization algorithm for solution of optimal power flow," Proceedings of the 2019 International Conference on Opto-Electronics and Applied Optics (Optronix), DOI: 10.1109/optronix.2019.8862335.
[31] J. Li, H. Lei, A. H. Alavi, G.-G. Wang, "Elephant herding optimization: variants, hybrids, and applications," Mathematics, vol. 8 no. 9, DOI: 10.3390/math8091415, 2020.
[32] G.-G. Wang, "Moth search algorithm: a bio-inspired metaheuristic algorithm for global optimization problems," Memetic Computing, vol. 10 no. 2, pp. 151-164, DOI: 10.1007/s12293-016-0212-3, 2018.
[33] K. Sirote, S. Apirat, P. Suttichai, "Multi-Objective optimal power flow problems based on Slime mould algorithm," Sustainability, vol. 13 no. 13, 2021.
[34] A. A. Heidari, S. Mirjalili, H. Faris, I. Aljarah, M. Mafarja, H. Chen, "Harris hawks optimization: algorithm and applications," Future Generation Computer Systems, vol. 97, pp. 849-872, DOI: 10.1016/j.future.2019.02.028, 2019.
Copyright © 2021 Zhichao Wang et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. https://creativecommons.org/licenses/by/4.0/
Abstract
License plate recognition is an important part of intelligent traffic management systems, and applying deep learning to license plate recognition can effectively improve the speed and accuracy of recognition. To address the problems of traditional license plate recognition algorithms, such as low accuracy, slow speed, and a recognition rate that is easily affected by the environment, a convolutional neural network (CNN)-based license plate recognition algorithm, Fast-LPRNet, is proposed. The algorithm uses a segmentation-free recognition method, removes the fully connected layer, and reduces the number of parameters. The algorithm, which has strong generalization ability, scalability, and robustness, performs license plate recognition on FPGA hardware. By increasing the depth of the network on the basis of the Fast-LPRNet structure, the Chinese City Parking Dataset (CCPD) can be recognized with an accuracy above 90%. The experimental results show that the license plate recognition algorithm has high recognition accuracy, strong generalization ability, and good robustness.