Abstract. Traffic sign detection and recognition can be divided into three
main problems: automatic location, detection and categorization of traffic
signs. Most approaches to locating and detecting traffic signs are based
on color information extraction. The main issue in using color information
is selecting the most suitable color space to ensure robust color analysis
that is not influenced by the exterior environment. Given the strong
dependence of color on weather conditions, shadows and time of day, some
authors focus on shape-based sign detection (e.g. Hough transform, ad-hoc
models based on Canny edges or convex hulls). Recognition of traffic signs
has been addressed by a number of popular classification techniques, ranging
from simple template matching (e.g. cross-correlation similarity) to
sophisticated machine learning techniques (e.g. support vector machines,
boosting, random forest, etc.), which are implemented to ensure the
straightforward outcome necessary for a real end-user system. Moreover,
extending the traffic sign analysis from isolated frames to videos can
significantly reduce the number of false alarms as well as increase the
precision and accuracy of the detection and recognition process.
Keywords: Traffic sign detection and recognition (TSDR), Color-based
description, Shape-based description, Uncontrolled environments, Multiclass
Embedded and intelligent automated systems for vehicle safety in transportation
have been in the limelight of research in the Computer Vision and Pattern
Recognition community for more than three decades. In Pacilik [1], a timeline
is outlined from the currently popular methods of Traffic Sign Detection and
Recognition back to the pioneering study in Japan in 1984. For
years, researchers have been addressing the difficulties of detecting and recognizing
traffic signs. The most common automated systems for traffic
sign detection and recognition comprise one or two video cameras mounted
on the front of the vehicle (e.g. a geo van). Recently, some geo vans also have
another camera at the rear end and/or the side of the vehicle, recording the signs
behind or alongside the vehicle. The cars are retrofitted with a PC system for
acquiring the videos, or with specialized hardware for driving assistance applications.
Road signs have specific properties that distinguish them from other outdoor objects.
Systems for the automated recognition of road signs are
designed to identify these properties. Traffic sign recognition systems have three
main parts: (1) location of the region of interest and color segmentation; (2) detection
by verification of the hypothesis of the presence of the sign (e.g. equilateral triangles,
circles, etc.); and (3) categorization/recognition of the type of traffic sign.
Detection of the signs from outdoor images is the most complex step in
the automated traffic sign recognition system [2]. Many issues make the
automatic detection of traffic signs difficult, such as changeable light
conditions that are difficult to control (lighting varies according to the time
of the day, season, cloud cover and other weather conditions) and the presence of other
objects on the road (traffic signs are often surrounded by other objects producing
partial occlusions, shadows, etc.). The problem becomes harder still when
considering everything that can cause false positives, since the algorithm
has to take into account the camera distance and view angle of the recorded image,
along with possible deformation of the sign's appearance caused by external
factors. Hence, any robust automatic detection and recognition system must
provide straightforward results that are not affected by perspective distortion,
lighting changes, partial occlusions or shadows [3]. Ideally, the system should
also provide additional information on the lack of visibility, poor condition and
poor placement of traffic signs. This paper discusses the research progress of real
time Traffic Sign Detection and Recognition (TSDR). It also reviews the papers
based on three main parts of TSDR namely color segmentation, shape detection
2 Core Background Study
2.1 Color Segmentation
Sign detection using color is based on the five typical colors defined in standard
traffic signs (red, blue, yellow, white and black). Robust color segmentation, especially
under non-homogeneous illumination, is given priority, since errors
in segmentation may be propagated to the following steps of the system.
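
To illustrate why a hue-based color space can help under non-homogeneous illumination, the following minimal Python sketch flags red pixels by thresholding HSV values. The function names (`is_red`, `red_mask`) and all threshold values are assumptions chosen for illustration; they are not taken from any of the reviewed papers.

```python
import colorsys


def is_red(r, g, b, hue_tol_deg=15.0, min_sat=0.5, min_val=0.2):
    """Return True if an 8-bit RGB pixel looks red in HSV space.

    All thresholds are illustrative assumptions, not published values.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue_deg = h * 360.0
    # Red sits at both ends of the hue circle, so the test wraps around 0.
    near_red = hue_deg <= hue_tol_deg or hue_deg >= 360.0 - hue_tol_deg
    return near_red and s >= min_sat and v >= min_val


def red_mask(image):
    """Binary mask for an image given as rows of (r, g, b) tuples."""
    return [[is_red(*pixel) for pixel in row] for row in image]


# A bright red and a shadowed red share the same hue, so both pass,
# whereas a fixed threshold on the R channel alone would drop the second.
row = [(200, 20, 30), (90, 10, 10), (20, 30, 200), (255, 255, 255)]
print(red_mask([row]))
```

The saturation and value floors reject white and very dark pixels, whose hue is unreliable; in practice such thresholds would have to be tuned on real data.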
2.2 Shape Detection
Detection of traffic signs via their shape follows the usual shape detection
algorithm, i.e. finding the contours and approximating them to reach a final decision
based on the number of vertices of each approximated contour. There are, however,
some difficulties in shape based detection. Notable shape based detection problems for traffic
signs are discussed below. Traffic signs are mostly designed in basic shapes like
circle, triangle, pentagon etc. to make them easily visible, but similar shaped objects
that are not traffic signs also exist in the surroundings. Traffic signs are
prone to physical damage and to being obstructed from view. The apparent size of a traffic
sign in the image depends on factors such as the distance between
the camera lens and the sign. The camera view might also be tilted vertically
or horizontally. Moreover, factors like the small size of traffic signs in
images and a slanted angle of view create difficulties in the detection phase due to
the change in aspect ratio. Variation in daylight or colors does not affect shape based
detection. Small, barely distinguishable traffic signs in images make it rather
difficult to approximate contours, so robust edge detection and recognition
algorithms are necessary.
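
The contour approximation step described above can be sketched as follows. This is a minimal, dependency-free illustration that simplifies a closed contour with the Ramer-Douglas-Peucker algorithm and maps the remaining vertex count to a shape name. The function names, the tolerance `epsilon` and the vertex-count rules are assumptions made for this example, and contour extraction itself is assumed to have been done already.

```python
import math


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / length


def rdp(points, epsilon):
    """Ramer-Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_line_distance(points[i], points[0], points[-1])
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= epsilon:
        return [points[0], points[-1]]
    left = rdp(points[:index + 1], epsilon)
    right = rdp(points[index:], epsilon)
    return left[:-1] + right  # the split point appears only once


def approximate_closed_contour(contour, epsilon):
    """Simplify a closed contour by splitting it into two open chains."""
    start = contour[0]
    far = max(range(len(contour)),
              key=lambda i: math.hypot(contour[i][0] - start[0],
                                       contour[i][1] - start[1]))
    first = rdp(contour[:far + 1], epsilon)
    second = rdp(contour[far:] + [start], epsilon)
    approx = first[:-1] + second[:-1]
    # The arbitrary start point is always kept; drop it if it lies on an edge.
    if len(approx) > 3 and point_line_distance(
            approx[0], approx[-1], approx[1]) <= epsilon:
        approx = approx[1:]
    return approx


def classify_shape(contour, epsilon=0.5):
    """Map the vertex count of the simplified contour to a shape name."""
    n = len(approximate_closed_contour(contour, epsilon))
    if n > 8:
        return "circle"
    return {3: "triangle", 4: "quadrilateral",
            5: "pentagon", 8: "octagon"}.get(n, "unknown")


def sample_polygon(vertices, per_edge=20):
    """Helper for the demo: densely sample the outline of a polygon."""
    points = []
    for i, (x0, y0) in enumerate(vertices):
        x1, y1 = vertices[(i + 1) % len(vertices)]
        for k in range(per_edge):
            t = k / per_edge
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points


print(classify_shape(sample_polygon([(0, 0), (10, 0), (5, 8.66)])))
```

Because the decision depends only on the vertex count, this sketch shares the weakness noted above: any triangular or circular object in the scene is accepted, so a later recognition stage is still needed.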
2.3 Learning Based Detection
Detection of traffic signs based on deep learning is a fairly new approach,
motivated by conventional means of detection being far too static for real time detection.
Artificial Neural Network algorithms are widely used; they are trained on a large data set
of preprocessed traffic signs in order to detect objects accurately. More
training data generally results in a further increase in accuracy. A drawback of
learning based detection, notably with fuzzy logic and neural networks, is that it
requires a large amount of resources during the learning process. But the benefits of
fast and accurate detection and recognition are well worth the time consuming
learning process.
The detection phase outputs a number of detected shapes that
are referred to as “candidate objects” in some research papers. These candidate
objects contain the detected traffic sign shapes. Candidate objects are sent to
a recognizer and then to the classifier, which decides whether each
input is a rejected object, a false positive or an actual traffic sign. The detected objects are
identified according to their sign codes. A good recognition system must meet
some criteria to be called efficient. Some are mentioned below:
1. Ability to differentiate false positives and rejected objects from real ones in
a short amount of time
2. Robustness in determining the size, position and geometrical status, i.e. vertical or
horizontal orientation, of the traffic sign in the image
3. Robustness to noise
4. Low computational cost and time for real time applications
5. Ability to be trained on large data sets with prior information
on road signs to match against
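
As a concrete illustration of a recognizer that either assigns a sign code to a candidate object or rejects it, the sketch below uses the simplest technique mentioned in the abstract, template matching by cross-correlation similarity. The function names, the toy templates and the rejection threshold of 0.7 are assumptions made for this example, and candidates are assumed to be already resized and flattened to the template length.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    numerator = sum(x * y for x, y in zip(da, db))
    denominator = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # A flat patch has no structure to correlate, so score it as 0.
    return numerator / denominator if denominator > 0 else 0.0


def classify(candidate, templates, reject_below=0.7):
    """Return (label, score); label is None when the candidate is rejected."""
    best_label, best_score = None, -1.0
    for label, template in templates.items():
        score = ncc(candidate, template)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < reject_below:
        return None, best_score
    return best_label, best_score


# Toy 3x3 "pictograms" flattened row by row (hypothetical sign codes).
templates = {
    "bar": [0, 0, 0, 1, 1, 1, 0, 0, 0],
    "cross": [0, 1, 0, 1, 1, 1, 0, 1, 0],
}
# The same bar pattern with different brightness and contrast.
candidate = [10, 10, 10, 90, 90, 90, 10, 10, 10]
print(classify(candidate, templates))
```

Subtracting the mean and normalizing makes the score insensitive to uniform brightness and contrast changes, while the threshold gives the rejection path required by the first criterion. The cost grows linearly with the number of templates, which matters for the real time criterion.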
3 Previous work
3.1 Review Based on Methods
Color information is the staple method used in image segmentation [7-12].
Poor lighting, strong illumination and adverse weather conditions reduce the performance
of color information based road sign detection. These problems were
addressed by using color models such as HSV [14, 9, 11, 25, 26], YUV [13]
and CIECAM97 [15]. Shadeed et al. [13] performed segmentation using
the U and V chrominance channels of YUV space, where U is regarded
as positive and V as negative for red colors. The hue channel of HSV color space,
in combination with YUV space information, was used to segment red colored
traffic signs. Gao et al. [15] applied a quad-tree histogram method to segment
the image based on the hue and chroma values of the CIECAM97 color model.
Thresholding is a common practice in segmentation. The hue channel of HSV
color space was thresholded by Malik et al. [9] for color segmentation of red traffic
signs. Some research papers prioritize shape information over color information.
Grayscale images are used in shape information methods. Loy and Zelinsky [17]
proposed a method that highlights points of interest to detect octagonal, square
and triangular traffic signs using local radial symmetry.
Extraction of feature vectors from the segmented region of interest (ROI) is necessary
for the recognition process. Scale Invariant Feature Transform (SIFT),
SURF and Binary Robust Invariant Scalable Keypoints (BRISK) feature descriptors
were implemented by the authors in [19], who also compared
these feature vectors with others. Histogram of Oriented Gradients
(HOG) feature vectors are another option, used in [20] to classify traffic signs.
Classifier models are trained on feature vectors so that they can distinguish between
different traffic signs in the supervised learning paradigm. Classifiers such as
ANN [6], Adaboost [7, 24] and SVM [12, 21, 23] are prominent candidates in the
recognition of traffic signs. A novel ROI extraction method called High Contrast
Extraction, in combination with an occlusion robust recognition method based
on Extended Sparse Representation Classification (ESRC), was introduced in the
recognition process in [20].
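
To make the idea of a gradient-based feature vector concrete, the sketch below computes a single-cell orientation histogram in the spirit of HOG. It is a deliberately simplified illustration: the function name and the choice of 9 bins are assumptions, and the cell grid, block normalization and bin interpolation of a full HOG descriptor are omitted.

```python
import math


def orientation_histogram(patch, bins=9):
    """L2-normalized histogram of unsigned gradient orientations (0-180 deg).

    `patch` is a grayscale image given as a list of rows of numbers.
    """
    height, width = len(patch), len(patch[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[int(angle // bin_width) % bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# A vertical edge (dark left, bright right) and a horizontal edge.
vertical_edge = [[0, 0, 1, 1, 1] for _ in range(5)]
horizontal_edge = [[0] * 5, [0] * 5, [1] * 5, [1] * 5, [1] * 5]
print(orientation_histogram(vertical_edge))
print(orientation_histogram(horizontal_edge))
```

The two edges vote into different bins, which is the property a classifier such as an SVM can exploit when trained on these vectors.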
3.2 Review Based on Frameworks
Even though the Colour Feature and Neural Networks method [24] uses HSV, noise reduction
etc. in pre-processing to minimize error, the color of
the physical object can change, i.e. the color of a traffic sign might change after the transition
to HSV and color segmentation. The Color Probability Map and Artificial Neural
Networks method [27] uses minimal computational resources, but relies on high
quality images for detection and validation, which require a large amount of memory. In Auto
Associative Neural Networks [28], changes of orientation, weather conditions and lighting,
as well as the speed of the vehicle, decrease accuracy in real time conditions.
For Deep Convolutional Neural Networks with real time detection and recognition [29],
accuracy depends on weather and illumination conditions along with
the amount of data used to train the algorithm. The Selective Search based
Convolutional Neural Network [30] accepts data of different sizes; after color segmentation
it enables specific region searching with faster computation times, but it
decreases the speed of the deep learning CNN algorithm, as there are too many regions
to train on and most of them are not real signs but false positives of the detection stage, which
needs to be further improved. Region Based Convolutional Neural Networks (R-CNN)
with a Region Proposal Network [31] are highly efficient since convolutions are shared
across individual proposals. The method also performs bounding box regression
to further enhance the quality of the proposed regions, and it enables rapid
detection and recognition. Even though it is very fast, training
R-CNN is a three stage process and it requires a large amount of storage space
along with sufficient GPU power. The Random Forests and multiple features extraction
method [34] is robust and more tolerant to noise than other classifiers because it uses
random forests, but its main drawback is its basic thresholding.
The Census Transform and Multilevel Support Vector Machine method [35] displays
high illumination-invariant accuracy in detection and recognition, but its accuracy
decreases in urban areas. The Principal Component Analysis (PCA) and
Multi-Layer Perceptron (MLP) network with morphological classification [38]
is disadvantageous because it cannot detect damaged signs. The adaptive neuro
fuzzy inference system (ANFIS) method [43] is independent of the color segmentation
process. It reduces the computational cost and also produces a higher recognition
success rate, but it is extremely vulnerable to illumination change. Bilateral
Chinese Transform (BCT) and Vertex and Bisector Transform (VBT) [36] reduce the
ROI, but their accuracy rate declines considerably.
3.3 Review Based on Experimental Results
In [28] the color segmentation accuracy is 93.3% and the detection rate for the test data
sets is 14 out of 15 in daylight and 19 out of 20 in shadow. Auto
Associative Neural Networks (AANN) reach 100% accuracy in daylight, but 4 false
positives must be added to the 14 test samples to recognize all 14. In shadow the
accuracy is 94.7%, with 5 false positives added to the data set of 19 to finally detect 18.
In [29], training first uses only positive samples, i.e. real signs, and then mixed
training data with 25,000 samples of real signs and 78,000 false positives. The
learning rate is 0.01 per 100000 iterations and the accuracy is 92.63%, where 39209
training samples were used and 1000 random test cases were selected as input. In [30],
1918 images were used as the data set for detection with 91.69% accuracy, and 684 false
positives out of 20 million window frames in the 1918 images were observed. For
classification, 2520 images of size 32×32 were taken from detection and randomly
used for training. Accuracy was observed at 93.77%. In [31] the proposed method
had a 98.76% success rate. Compared to the proposed Multi fusion Multi Classifier
method, Complementary features had a success rate of 98.65%, Multi-scale CNN
98.31%, SRGE 98.19%, Random forests 96.14% and LDA on HOG2 95.68%.
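
Several of the figures quoted above are plain ratios of counts, which the short sketch below recomputes. The variable names are illustrative, and the reading of each pair of counts as hits over total is an assumption based on the wording of the reported results.

```python
def percentage(part, whole):
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


daylight_detection = percentage(14, 15)   # 14 of 15 daylight signs
shadow_detection = percentage(19, 20)     # 19 of 20 shadowed signs
shadow_recognition = percentage(18, 19)   # 18 of 19 recognized in shadow
# 684 false positives over roughly 20 million scanned windows.
false_positives_per_window = 684 / 20_000_000

print(f"{daylight_detection:.1f}% {shadow_detection:.1f}% "
      f"{shadow_recognition:.1f}%")       # prints "93.3% 95.0% 94.7%"
print(f"{false_positives_per_window:.2e}")  # prints "3.42e-05"
```

The test sets behind these figures are small (15 to 20 samples), so a single additional miss shifts the rate by about five percentage points.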
The environment in which a Traffic Sign Detection and Recognition system operates is an adverse
one, with many unaccounted variables, most of them physical. Some of the problems and
difficulties that arise while the system is running are discussed below:
1. Recognition becomes difficult when traffic signs are exposed to sunlight
and air, which causes the color of the signs to fade
2. Adverse conditions and natural disasters that affect visibility namely fog,
rain, storms etc.
3. Constant change in sunlight brightness depending on the time of day, the change of
seasons and lack of light due to shadows created by other objects in the
4. Changing light conditions along with viewing geometry, illuminant color
and illumination geometry affect color information which is very sensitive
5. Presence of obstacles blocking the view of the camera such as buildings,
vehicles, pedestrians, vegetation etc.
6. Presence of objects shaped similar to traffic sign shapes like circular, triangular,
7. Physical damage, image distortion etc. may create false positives or negative
8. Size of the sign varies according to the distance between the camera and the
sign. Traffic signs may appear at a different angle of view due to the imaging
orientation and cause discrepancies to the calculation of the size
9. Acquired images often suffer from motion blur due to vibration and movement
of a running vehicle [24]. Prediction of this motion blur beyond a certain
threshold is not feasible because vehicular movement has varying speed and
acceleration which is unknown to the recognition process. It is possible to
make an assertion about the movement of objects in the future if the motion
is continuous and unchanged.
10. Sign boards can appear to have bright white or near-white spots due to first
surface reflection from the light sources. In first surface reflection the light
is reflected prior to penetrating to a depth where certain wavelengths are
absorbed, thereby imparting a color associated with the sign. This is called
11. Real time application makes it hard to maintain accuracy
with a constantly changing frame rate (fps), as the ROI changes every second.
12. At night, real time application without a high resolution camera leads to
the presence of noise
13. Vandalism of sign boards by people who put stickers or write on them, or
damage the signs by changing the pictograms within them, making them unrecognizable.
14. Different countries use different colors and different pictograms, and a standard
database for evaluation of existing classification methods is unavailable
It is extremely hard for road and traffic sign detection and recognition
frameworks to combine highly robust color segmentation, high insensitivity
to noise and brightness variations, and invariance to geometrical effects such as
translation, in-plane and out-of-plane rotations and scaling changes in the real time
image. Because the frame changes continuously, it is quite difficult to produce a
result that is both accurate and has a low computational time.
The primary purpose of the research is to develop a TSDR system that works in real time.
The main objective is to select a proper threshold for color segmentation and shape
based detection, and then to develop a fast classifier that makes the system
applicable in real time.