JCSTS 6(2): 30-36
autonomous cars, critical sensors include radar, sonar, and cameras. Long-range vehicle detection normally needs radar, but local
automobile detection can be done with sonar. Computer vision may play an important role in lane detection as well as redundant
object detection at intermediate distances. Radar works quite well for detecting automobiles but has difficulties differentiating
between different metal items and hence can report false positives on objects such as tin cans. Also, radar offers minimal
orientation information and exhibits greater variance in the lateral position of objects, making localization problematic on sharp
bends.
Sonar's efficacy degrades at high speeds and, even at modest speeds, is limited to a working range of about 2 meters.
Compared to sonar and radar, cameras provide a richer collection of characteristics for a fraction of the expense. By advancing
computer vision, cameras might serve as a dependable, redundant sensor for autonomous driving. Despite its potential, computer
vision has yet to occupy a substantial role in today’s self-driving automobiles. Classic computer vision systems just have not
delivered the robustness necessary for production-grade automotive; these techniques require substantial manual engineering,
road modeling, and special case handling. Considering the apparently unlimited variety of driving situations, environments, and
unanticipated impediments, the effort of scaling classic computer vision to robust, human-level performance would be enormous
and likely unachievable.
Deep learning, based on neural networks, is an alternative approach to computer vision. It offers tremendous potential as a remedy for the
inadequacies of standard computer vision. Recent research in the field has enhanced the practicality of deep learning
applications for tackling complicated, real-world problems, and industry has responded by expanding the use of such technologies.
Deep learning is data-focused, requiring extensive computing but little hand-engineering. In the last several years, an increase in
accessible storage and computation capabilities has enabled deep learning to achieve success in supervised perception tasks, such
as image detection. A neural network, after training for days or even weeks on a large data set, can then run inference in real
time with a model no larger than a few hundred MB [1]. State-of-the-art neural networks for computer vision require
huge training sets paired with large networks capable of modeling such immense amounts of data. For example, the ILSVRC
data set, on which neural networks achieve top performance, comprises 1.2 million images across 1,000 categories. By leveraging
expensive existing sensors that are already employed for self-driving applications, such as LIDAR and precise GPS [2], and calibrating
them with cameras, we may produce a video data set comprising labeled lane markings and annotated cars with location and
relative speed. By constructing a labeled data set in all sorts of driving scenarios (rain, snow, night, day, etc.), we can test neural
networks on this data to see if they are resilient in every driving environment and situation for which we have training data. In this
study, we give an empirical assessment of the data set we collected. In addition, we discuss the neural network that we employed
for identifying lanes and automobiles, as illustrated in Figure 1.
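The auto-labeling pipeline described above hinges on projecting calibrated LIDAR returns into the camera image so that vehicle clusters become pixel bounding boxes. A minimal sketch of that projection step is shown below; the function names, the toy calibration matrices, and the two-point "cluster" are illustrative assumptions, not the paper's actual calibration code.

```python
import numpy as np

def project_lidar_to_image(points_xyz, K, R, t):
    """Project 3-D LIDAR points (N x 3, sensor frame) into pixel
    coordinates using camera intrinsics K and extrinsics (R, t).
    Points behind the camera plane are discarded."""
    # Transform into the camera frame: X_cam = R @ X_lidar + t
    cam = points_xyz @ R.T + t
    in_front = cam[:, 2] > 0
    cam = cam[in_front]
    # Perspective projection: u = fx*X/Z + cx, v = fy*Y/Z + cy
    homo = cam @ K.T
    uv = homo[:, :2] / homo[:, 2:3]
    return uv, in_front

def bbox_from_points(uv, width, height):
    """Axis-aligned pixel bounding box of a projected point cluster,
    clipped to the image; a crude auto-generated vehicle label."""
    u0, v0 = np.clip(uv.min(axis=0), 0, [width - 1, height - 1])
    u1, v1 = np.clip(uv.max(axis=0), 0, [width - 1, height - 1])
    return float(u0), float(v0), float(u1), float(v1)

if __name__ == "__main__":
    # Toy calibration: identity rotation, camera at the LIDAR origin,
    # 640x480 image with a 500 px focal length (assumed values).
    K = np.array([[500.0, 0.0, 320.0],
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])
    R, t = np.eye(3), np.zeros(3)
    # A point cluster 20 m ahead, roughly car-sized (2 m wide, 1.5 m tall)
    cluster = np.array([[-1.0, -0.75, 20.0],
                        [1.0, 0.75, 20.0]])
    uv, _ = project_lidar_to_image(cluster, K, R, t)
    print(bbox_from_points(uv, 640, 480))
```

Repeating this projection for every synchronized LIDAR/camera frame, and attaching the tracked position and relative speed from the GPS-referenced track, yields the kind of labeled video data set the paragraph describes.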
2. Related Work
In the rapidly evolving landscape of autonomous driving, Computer Vision plays a pivotal role, albeit with certain limitations
necessitating complementary sensor fusion and road models for enhanced precision. Noteworthy studies have employed diverse
approaches, such as reinforcement learning in highway scenarios, where S. Nageshrao et al. demonstrated autonomous vehicles'
decision-making prowess. P. Chuan-Hsian and C. -S. Sea's research showcased Dark net outperforming Tensor Flow in vehicle
detection accuracy. J. Wang et al. addressed highway driving challenges through supervised and reinforcement learning,
incorporating LSTM for improved performance. G. Prabhakar et al. developed a deep learning system for obstacle detection, while
A. A. Hasanaath proposed a real-time road condition monitoring mechanism achieving high accuracy. Z. Wei's computer vision
system excelled in lane change detection, and K. Muhammad's survey offered insights into deep learning architectures' reliability
in autonomous driving. Additionally, studies by Yang et al., Dhawan et al., and Yi et al. focused on workload detection, traffic sign
classification, and personalized driving state recognition, respectively. Emphasizing the importance of road infrastructure, a study
targeted road markings' damage detection using computer vision, utilizing deep learning for improved F1-scores in Japanese and
Spanish images, albeit with a call for more extensive image collection for further advancements in the field.
3. Methodology
3.1 Real-time vehicle detection
Convolutional neural networks (CNNs) have had the largest success in image recognition in the previous 3 years. From these image
recognition systems, several detection networks were developed, leading to further advances in image detection. While the
advances have been startling, not much emphasis has been paid to the real-time detection speed necessary for applications. In
this study, we demonstrate a detection system capable of running at better than 10 Hz using nothing but a laptop GPU. Due to
the needs of highway driving, we need to verify that the system utilized can identify automobiles more than 100m away and can
work at rates greater than 10 Hz; this distance demands higher picture resolutions than are typically used, which in our instance
are 640 × 480. We employ the OverFeat CNN detector, which is highly scalable and replicates a sliding-window detector in a
single forward pass in the network by efficiently recycling convolutional findings on each layer. Other detection techniques, such
as R-CNN, rely on selecting as many as 1000 candidate windows, where each is evaluated independently and does not reuse