Camera-Based Road Lane
Detection by Deep Learning II
Yu Huang
Yu.huang07@gmail.com
Sunnyvale, California
Outline
• Ultra Fast Structure-aware Deep Lane Detection
• Learning Lightweight Lane Detection CNNs by
Self Attention Distillation
• Key Points Estimation and Point Instance
Segmentation Approach for Lane Detection
• Learning to Cluster for Proposal-Free Instance
Segmentation
• Lane Detection and Classification using
Cascaded CNNs
• PolyLaneNet: Lane Estimation via Deep
Polynomial Regression
• Lane Detection: Light Conditions Style Transfer
• End-to-end Lane Detection through
Differentiable Least-Squares Fitting
• Robust Lane Detection from Continuous Driving
Scenes Using Deep Neural Networks
• LineNet: a Zoomable CNN for Crowdsourced
HD Maps Modeling in Urban Environments
• Efficient Road Lane Marking Detection with
Deep Learning
• End to End Video Segmentation for Driving:
Lane Detection For Autonomous Car
• 3D-LaneNet: E2E 3D multiple lane detection
• End-to-End Lane Marker Detection via Row-
wise Classification
Ultra Fast Structure-aware Deep Lane Detection
• arXiv 2004.11757
• Inspired by human perception, the recognition of lanes under severe occlusion and extreme
lighting conditions is mainly based on contextual and global information.
• Motivated by this observation, propose a novel, simple, yet effective formulation aiming at
extremely fast speed and challenging scenarios.
• Specifically, treat the process of lane detection as a row-based selecting problem using
global features.
• With the help of row-based selecting, the formulation significantly reduces the
computational cost.
• Using a large receptive field on global features, it can also handle challenging scenarios.
• Moreover, based on the formulation, also propose a structural loss to explicitly model the
structure of lanes.
• A light-weight version could even achieve 300+ frames per second with the same resolution,
which is at least 4x faster than previous state-of-the-art methods.
• code is available at https://github.com/cfzd/Ultra-Fast-Lane-Detection.
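To make the row-based selecting concrete, below is a minimal PyTorch sketch of such a classification head; the layer sizes (feature dimension, 100 gridding cells, 18 row anchors, 4 lanes) are illustrative assumptions, not the paper's exact configuration. For each lane and each row anchor, the head picks one gridding cell, with an extra class for "no lane in this row":

import torch
import torch.nn as nn
import torch.nn.functional as F

class RowSelectionHead(nn.Module):
    # For each of num_lanes lanes and num_rows row anchors, classify which of
    # num_cells horizontal grid cells the lane passes through; one extra class
    # means "no lane in this row" (the background gridding cell).
    def __init__(self, feat_dim=1800, num_cells=100, num_rows=18, num_lanes=4):
        super().__init__()
        self.shape = (num_cells + 1, num_rows, num_lanes)
        self.fc = nn.Linear(feat_dim, (num_cells + 1) * num_rows * num_lanes)

    def forward(self, global_feat):                  # (B, feat_dim) pooled feature
        return self.fc(global_feat).view(-1, *self.shape)

head = RowSelectionHead()
logits = head(torch.randn(2, 1800))                  # (B, cells+1, rows, lanes)
labels = torch.randint(0, 101, (2, 18, 4))           # ground-truth cell per row/lane
loss = F.cross_entropy(logits, labels)               # row-wise classification loss
cells = logits.argmax(dim=1)                         # selected cell per row anchor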
Ultra Fast Structure-aware Deep Lane Detection
Illustration of selecting on the left and right lane. In the right
part, the selecting of a row is shown in detail. Row anchors
are the predefined row locations, and the formulation is defined
as horizontally selecting on each row anchor. On the
right of the image, a background gridding cell is introduced
to indicate no lane in this row.
Ultra Fast Structure-aware Deep Lane Detection
Illustration of our formulation and conventional segmentation. Our formulation is
selecting locations (grids) on rows, while segmentation is classifying every pixel. The
dimensions used for classifying are also different, which is marked with red. The
proposed formulation significantly reduces the computational cost. Besides, the
proposed formulation uses the global feature as input, which has a larger receptive field
than segmentation, thus addressing the no-visual-clue problem.
(Left panel: formulation; right panel: segmentation.)
Ultra Fast Structure-aware Deep Lane Detection
Overall architecture. The auxiliary branch is shown in the upper part, which is only valid when
training. The feature extractor is shown in the blue box. The classification-based prediction and
auxiliary segmentation task are illustrated in the green and orange boxes, respectively. The group
classification is conducted on each row anchor.
Ultra Fast Structure-aware Deep Lane Detection
Learning Lightweight Lane Detection CNNs by
Self Attention Distillation
• https://github.com/cardwing/Codes-for-Lane-Detection
• Without learning from much richer context, lane detection models often fail in challenging
scenarios, e.g., severe occlusion, ambiguous lanes, and poor lighting conditions.
• present a novel knowledge distillation approach, i.e., Self Attention Distillation (SAD), which
allows a model to learn from itself and gains substantial improvement without any additional
supervision or labels.
• Specifically, observe that attention maps extracted from a model trained to a reasonable
level would encode rich contextual information.
• The valuable contextual information can be used as a form of ‘free’ supervision for
further representation learning through performing top down and layer-wise attention
distillation within the network itself.
• SAD can be easily incorporated in any feedforward convolutional neural network (CNN) and
does not increase the inference time.
• validate SAD on three popular lane detection benchmarks (TuSimple, CULane and BDD100K)
using lightweight models such as ENet, ResNet-18 and ResNet-34.
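A sketch of the distillation step, assuming one common choice of attention mapping function (the channel-wise mean of squared activations); the layer-wise, top-down scheme trains each block to mimic the attention map of the deeper block that follows it:

import torch
import torch.nn.functional as F

def attention_map(feat, out_size):
    # AT-GEN sketch: channel-wise mean of squared activations, resized to a
    # common resolution and L2-normalized over the spatial dimensions.
    amap = feat.pow(2).mean(dim=1, keepdim=True)                   # (B,1,H,W)
    amap = F.interpolate(amap, size=out_size, mode='bilinear',
                         align_corners=False)
    return F.normalize(amap.flatten(1), p=2, dim=1)

def sad_loss(block_feats, out_size=(36, 100)):
    # Each block mimics the attention of its successor (targets detached),
    # so the loss needs no extra labels and adds nothing at inference time.
    loss = 0.0
    for lo, hi in zip(block_feats[:-1], block_feats[1:]):
        loss = loss + F.mse_loss(attention_map(lo, out_size),
                                 attention_map(hi, out_size).detach())
    return loss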
Learning Lightweight Lane Detection CNNs by
Self Attention Distillation
Attention maps of the ENet [17] before and after applying self attention distillation. Here,
we extract the attention maps from the four stages/blocks following the design of the ENet
model. Note that self attention distillation is added at 40K episodes.
Learning Lightweight Lane Detection CNNs by
Self Attention Distillation
Attention maps of block 4 of the ENet model
using different mapping functions.
An instantiation of using SAD. E1∼E4 comprise the encoder of ENet; D1 and D2 comprise the decoder of
ENet. A small network, denoted P1, is added to predict the existence of lanes. AT-GEN is the attention generator.
Learning Lightweight Lane Detection CNNs by
Self Attention Distillation
Attention maps of ENet with and without self attention distillation. Both networks with and
without SAD are trained up to 60K episodes. SAD is applied to ENet at 40K training episodes.
Learning Lightweight Lane Detection CNNs by
Self Attention Distillation
Key Points Estimation and Point Instance
Segmentation Approach for Lane Detection
• https://github.com/koyeongmin/PINet
• Current methods have critical deficiencies such as the limited number of detectable lanes
and high false positive rates.
• In particular, high false positives can cause wrong and dangerous control decisions.
• In this paper, propose a lane detection method for an arbitrary number of lanes using
deep learning, which has a lower number of false positives than other recent lane
detection methods.
• The architecture of the proposed method has the shared feature extraction layers and
several branches for detection and embedding to cluster lanes.
• The proposed method can generate exact points on the lanes, and cast a clustering problem
for the generated points as a point cloud instance segmentation problem.
• The proposed method is more compact because it generates fewer points than the original
image pixel size.
• The proposed post processing method eliminates outliers successfully and increases the
performance notably.
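A rough sketch of how the three outputs could be decoded into lane key points (the grid stride and tensor layout are assumptions for illustration):

import torch

def decode_points(confidence, offset, feature, thresh=0.5, cell=8):
    # confidence: (H, W)    per-cell probability that a lane point is present
    # offset:     (2, H, W) sub-cell (x, y) offsets
    # feature:    (D, H, W) embedding used to cluster points into instances
    # cell: grid-to-pixel stride (hypothetical value)
    ys, xs = torch.nonzero(confidence > thresh, as_tuple=True)
    px = (xs.float() + offset[0, ys, xs]) * cell           # pixel x coordinate
    py = (ys.float() + offset[1, ys, xs]) * cell           # pixel y coordinate
    emb = feature[:, ys, xs].t()                           # (N, D) for clustering
    return torch.stack([px, py], dim=1), emb

conf = torch.rand(32, 64); off = torch.rand(2, 32, 64); feat = torch.randn(4, 32, 64)
points, embeddings = decode_points(conf, off, feat)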
Key Points Estimation and Point Instance
Segmentation Approach for Lane Detection
The proposed framework. Given an input image, PINet
predicts three values: confidence, offset, and feature. From
the confidence and offset outputs, exact points on the lanes can
be predicted, and the feature output distinguishes the
predicted points into instances. Finally, the post
processing module is applied, and it generates smooth lanes.
Key Points Estimation and Point Instance
Segmentation Approach for Lane Detection
The detailed network training procedure. It has three main parts. The 512x256 input is compressed
by the resizing layer, and the compressed input is passed to the feature extraction layers. Three output
branches are applied at the end of each hourglass block, and they predict the confidence, offset, and instance
feature for each grid. The loss function can be calculated from the outputs of each hourglass block.
Key Points Estimation and Point Instance
Segmentation Approach for Lane Detection
The hourglass block and
bottleneck layer architecture.
The hourglass block consists of
three types of bottleneck
layers: same, up-sampling, and
down-sampling. Output
branches are applied at the end
of each hourglass layer, and the
confidence output is
forwarded to the next block.
Key Points Estimation and Point Instance
Segmentation Approach for Lane Detection
The result of the post processing. (a)
is the input image, and (b) is the raw output
of PINet. In (b), the blue lane contains
some outliers, and the other lanes can
be distinguished. In (c), the result of
the proposed post processing
method, outliers are eliminated and
only the smooth, longest lanes remain.
Key Points Estimation and Point Instance
Segmentation Approach for Lane Detection
The explanation of the post processing. There is no
other point in the margin made by the straight line
connecting points S and A, but the margin of points S and B
contains 2 other points. As a result, point B is selected.
Key Points Estimation and Point Instance
Segmentation Approach for Lane Detection
Learning to Cluster for Proposal-Free Instance Segmentation
• https://github.com/GT-RIPL/L2C
• This work proposed a novel learning objective to train a deep neural network to perform
end-to-end image pixel clustering.
• applied the approach to instance segmentation, which is at the intersection of image
semantic segmentation and object detection.
• utilize the most fundamental property of instance labeling – the pairwise relationship
between pixels – as the supervision to formulate the learning objective, then apply it to train
a fully convolutional network (FCN) for learning to perform pixel-wise clustering.
• The resulting clusters can be used as the instance labeling directly.
• To support labeling of an unlimited number of instances, further formulate ideas from graph
coloring theory into the proposed learning objective.
• The evaluation on the Cityscapes dataset demonstrates strong performance and thus provides
a proof of concept.
• Moreover, approach won the second place in the lane detection competition of 2017 CVPR
Autonomous Driving Challenge and was the top performer without using external data.
Learning to Cluster for Proposal-Free Instance Segmentation
address the labeling problem by formulating a novel learning objective. It guides the
fully convolutional networks to learn to perform instance labeling
Learning to Cluster for Proposal-Free Instance Segmentation
The example outputs of lane detection.
The colors represent different instance IDs.
The output for each pixel is a (6 + 1)-
dimensional vector, which represents the
probability distribution of this pixel being
assigned to a certain ID. The learning
objective guides the distribution function to
output similar distributions for the pixels
on the same lane line, and vice versa.
At testing time, each pixel is assigned
to the ID with the highest probability.
Given a pair of pixels p_i and p_j, their corresponding
output distributions are denoted as P_i = f(p_i) =
[t_{i,1}, ..., t_{i,n}] and P_j = f(p_j) = [t_{j,1}, ..., t_{j,n}], where n is the
number of indices available for labeling.
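The slides do not spell out the exact objective; one simple instantiation of pairwise-supervised clustering, sketched below under that assumption, uses the inner product P_i · P_j as the probability that both pixels receive the same ID and pushes it toward the pairwise label with binary cross-entropy:

import torch
import torch.nn.functional as F

def pairwise_cluster_loss(P, pair_idx, same):
    # P:        (N, n) per-pixel distributions over n instance IDs
    # pair_idx: (M, 2) sampled pixel pairs
    # same:     (M,)   1 if both pixels lie on the same instance, else 0
    Pi, Pj = P[pair_idx[:, 0]], P[pair_idx[:, 1]]
    s = (Pi * Pj).sum(dim=1).clamp(1e-7, 1 - 1e-7)   # P(same ID) for the pair
    return F.binary_cross_entropy(s, same.float())

P = F.softmax(torch.randn(1000, 7), dim=1)           # 6 + 1 IDs as in the slides
idx = torch.randint(0, 1000, (256, 2))
labels = torch.randint(0, 2, (256,))
loss = pairwise_cluster_loss(P, idx, labels)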
Learning to Cluster for Proposal-Free Instance Segmentation
The concept of how graph coloring is related to instance ID assignment
Learning to Cluster for Proposal-Free Instance Segmentation
The network architecture
Learning to Cluster for Proposal-Free Instance Segmentation
The visualization of the lane detection on the TuSimple dataset (validation split). The red lines in the top row are
predictions, while the green lines are the ground-truth. The second row shows the raw outputs from the
network. The colors represent the assigned IDs.
Lane Detection and Classification using
Cascaded CNNs
• https://github.com/fabvio/Cascade-LD
• https://github.com/fabvio/TuSimple-lane-classes
• As in many other computer vision tasks, convolutional neural networks
(CNNs) represent the state-of-the-art technology to identify lane
boundaries.
• However, the position of the lane boundaries w.r.t. the vehicle may not
suffice for reliable positioning: for path planning or localization,
information regarding lane types may also be needed.
• In this work, present an end-to-end system for lane boundary identification,
clustering and classification, based on two cascaded neural networks, that
runs in real-time.
• To build the system, 14,336 lane boundary instances of the TuSimple
dataset for lane detection have been labelled using 8 different classes.
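A structural sketch of the cascade (the two sub-networks are placeholders; the real architectures are described in the paper): the first network produces instance masks of lane boundaries, and the second classifies each masked boundary into one of the 8 lane-type classes:

import torch
import torch.nn as nn

class CascadedLaneSystem(nn.Module):
    def __init__(self, seg_net: nn.Module, cls_net: nn.Module):
        super().__init__()
        self.seg_net = seg_net      # instance segmentation of boundaries
        self.cls_net = cls_net      # per-boundary 8-class lane-type classifier

    def forward(self, image):                        # (B, 3, H, W)
        inst_masks = self.seg_net(image)             # (B, K, H, W) instances
        lane_types = []
        for k in range(inst_masks.size(1)):
            mask = inst_masks[:, k:k + 1]            # isolate one boundary
            lane_types.append(self.cls_net(image * mask))
        return inst_masks, torch.stack(lane_types, dim=1)   # (B, K, 8)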
Lane Detection and Classification using
Cascaded CNNs
Lane Detection and Classification using
Cascaded CNNs
From top to bottom: original image, instance segmentation,
classification. For instance segmentation, different colors
represent different boundaries. For classification, green
represents dashed lanes, yellow double-dashed, red
continuous.
PolyLaneNet: Lane Estimation via Deep
Polynomial Regression
• https://github.com/lucastabelini/PolyLaneNet
• Since methods for lane detection have to work in real time (30+ FPS), they
not only have to be effective (i.e., have high accuracy) but they also have to
be efficient (i.e., fast).
• In this work, present a novel method for lane detection that uses as input an
image from a forward-looking camera mounted in the vehicle and outputs
polynomials representing each lane marking in the image, via deep
polynomial regression.
• The proposed method is shown to be competitive with existing state-of-the-
art methods in the TuSimple dataset, while maintaining its efficiency (115
FPS).
• Additionally, extensive qualitative results on two additional public datasets
are presented, along with a discussion of limitations in the evaluation metrics used by
recent works for lane detection.
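Decoding a PolyLaneNet-style prediction is just polynomial evaluation; the sketch below samples one lane from a set of coefficients plus the predicted vertical start/end limits (this output parameterization is a simplified assumption):

import numpy as np

def decode_lane(coeffs, s, e, img_h, n_pts=50):
    # coeffs: polynomial coefficients, highest order first (e.g. a cubic)
    # s, e:   normalized vertical start/end of the lane marking
    ys = np.linspace(s * img_h, e * img_h, n_pts)    # evenly spaced image rows
    xs = np.polyval(coeffs, ys / img_h)              # x as a polynomial of y
    return np.stack([xs, ys], axis=1)                # (n_pts, 2) lane points

pts = decode_lane(coeffs=[0.2, -0.1, 0.05, 0.4], s=0.55, e=1.0, img_h=720)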
PolyLaneNet: Lane Estimation via Deep
Polynomial Regression
Overview of the proposed method. From left to right: the model
receives as input an image from a forward-looking camera and
outputs information about each lane marking in the image
PolyLaneNet: Lane Estimation via Deep
Polynomial Regression
Lane Detection in Low-light Conditions Using an Efficient
Data Enhancement: Light Conditions Style Transfer
• https://github.com/Chenzhaowei13/Light-Condition-Style-Transfer
• Although multi-task learning and contextual-information-based methods have
been proposed to solve lane detection, they either require additional manual
annotations or introduce extra inference overhead.
• In this paper, propose a style-transfer-based data enhancement method, which
uses Generative Adversarial Networks (GANs) to generate images in low-light
conditions, increasing the environmental adaptability of the lane detector.
• solution consists of three parts: the proposed SIM-CycleGAN, light conditions style
transfer and lane detection network.
• It does not require additional manual annotations nor extra inference overhead.
• validated the method on the lane detection benchmark CULane using ERFNet.
• Empirically, the lane detection model trained with this method demonstrates adaptability
in low-light conditions and robustness in complex scenarios.
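The enhancement plugs in as ordinary data augmentation; a minimal sketch, assuming a trained generator GA (suitable-light to low-light), translates a random subset of training images while reusing the original lane labels:

import random
import torch

def augment_batch(images, labels, generator_GA, p=0.5):
    # Translate each image to low-light with probability p; lane labels are
    # unchanged, so no extra annotation or inference overhead is introduced.
    with torch.no_grad():
        for i in range(images.size(0)):
            if random.random() < p:
                images[i] = generator_GA(images[i].unsqueeze(0)).squeeze(0)
    return images, labels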
Lane Detection in Low-light Conditions Using an Efficient
Data Enhancement: Light Conditions Style Transfer
The main framework of our method. The proposed SIM-CycleGAN is shown on the left.
The generator GA transfers images from suitable light conditions to low-light conditions,
while the generator GB transfers in the opposite way. The discriminators DA and DB feed
a single scalar value (real or fake) back to the generators. The middle section shows light
condition style transfer from suitable light conditions to low-light conditions by the
trained SIM-CycleGAN. The lane detection model is shown on the right, whose baseline is
ERFNet. We add a lane-existence branch for better performance.
Lane Detection in Low-light Conditions Using an Efficient
Data Enhancement: Light Conditions Style Transfer
Generator architecture, composed of convolution layers,
residual blocks and deconvolution layers. Convolution layers
record the changing scale information in the encoding
process and map it to the corresponding operation in the
decoding process.
Lane Detection in Low-light Conditions Using an Efficient
Data Enhancement: Light Conditions Style Transfer
The lane detection model architecture. The decoder outputs
probability maps of different lane markings, and the second
branch predicts the existence of lanes.
Lane Detection in Low-light Conditions Using an Efficient
Data Enhancement: Light Conditions Style Transfer
The probability maps from our method and other methods. The brightness of a pixel indicates
the probability of this pixel belonging to lanes. It can be clearly seen from this figure that, in low-light
conditions, the probability maps generated by our method are more pronounced and more accurate.
End-to-end Lane Detection
through Differentiable Least-Squares Fitting
• A method to train a lane detector in an e2e manner, directly regressing the lane parameters.
• The architecture consists of two components: a deep network that predicts a segmentation-like
weight map for each lane line, and a differentiable least-squares fitting module that returns for
each map the parameters of the best-fitting curve in the weighted least-squares sense.
• These parameters can subsequently be supervised with a loss function of choice.
• It relies on the fact that it is possible to backpropagate through a least-squares fitting procedure.
• This leads to an end-to-end method where the features are optimized for the true task of
interest: the network implicitly learns to generate features that prevent instabilities during the
model fitting step, as opposed to two-step pipelines that need to handle outliers with heuristics.
• Additionally, the system is not just a black box but offers a degree of interpretability because the
intermediately generated segmentation-like weight maps can be inspected and visualized.
• Code: http://github.com/wvangansbeke/LaneDetection_End2End.
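The core of the method is that the weighted least-squares solution is a closed-form, differentiable function of the weights; a minimal sketch (normalized coordinates and a small ridge term are implementation assumptions):

import torch

def fit_curve_from_weight_map(wmap, deg=2):
    # wmap: (H, W) nonnegative weights for one lane line (network output).
    # Fit x = beta_0 + beta_1*y + ... + beta_deg*y^deg over all pixels,
    # weighting each pixel by wmap. The closed form
    #     beta = (Y^T W Y)^{-1} Y^T W x
    # is differentiable in wmap, so a loss on beta reaches the backbone.
    H, W = wmap.shape
    ys, xs = torch.meshgrid(torch.linspace(0, 1, H),
                            torch.linspace(0, 1, W), indexing='ij')
    y, x, w = ys.reshape(-1), xs.reshape(-1), wmap.reshape(-1)
    Y = torch.stack([y ** d for d in range(deg + 1)], dim=1)   # design matrix
    Yw = Y * w.unsqueeze(1)
    beta = torch.linalg.solve(Yw.t() @ Y + 1e-6 * torch.eye(deg + 1),
                              Yw.t() @ x)
    return beta                                     # curve parameters

wmap = torch.rand(32, 64, requires_grad=True)
beta = fit_curve_from_weight_map(wmap)
beta.sum().backward()        # gradients flow back into the weight map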
End-to-end Lane Detection
through Differentiable Least-Squares Fitting
• Lane detection is typically tackled with a two-step pipeline in which a segmentation mask of
the lane markings is predicted first, and a lane line model (like a parabola or spline) is fitted
to the post-processed mask next.
• The problem with such a two-step approach is that the parameters of the network are not
optimized for the true task of interest (estimating the lane curvature parameters) but for a
proxy task (segmenting the lane markings), resulting in sub-optimal performance.
Overview of the architecture
End-to-end Lane Detection
through Differentiable Least-Squares Fitting
Least-squares fitting.
Weighted least-squares fitting
End-to-end Lane Detection
through Differentiable Least-Squares Fitting
Robust Lane Detection from Continuous Driving
Scenes Using Deep Neural Networks
• https://github.com/qinnzou/Robust-Lane-Detection
• Most methods focus on detecting the lane from one single image, and often lead to
unsatisfactory performance in handling some extremely bad situations such as
heavy shadow, severe mark degradation, serious vehicle occlusion, and so on.
• In fact, lanes are continuous line structures on the road.
• Consequently, a lane that cannot be accurately detected in the current frame may
potentially be inferred by incorporating information from previous frames.
• To this end, investigate lane detection by using multiple frames of a continuous
driving scene, and propose a hybrid deep architecture by combining the
convolutional neural network (CNN) and the recurrent neural network (RNN).
• Specifically, information of each frame is abstracted by a CNN block, and the CNN
features of multiple continuous frames, holding the property of time-series, are then
fed into the RNN block for feature learning and lane prediction.
• Extensive experiments on two large-scale datasets demonstrate that the proposed
method outperforms the competing methods in lane detection.
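A compact sketch of the CNN + ConvLSTM hybrid (the cell and the wiring are simplified relative to the paper's UNet/SegNet variants):

import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    # Minimal ConvLSTM cell: all four gates from one 3x3 convolution.
    def __init__(self, ch):
        super().__init__()
        self.gates = nn.Conv2d(2 * ch, 4 * ch, kernel_size=3, padding=1)

    def forward(self, x, h, c):
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

def detect_from_sequence(encoder, decoder, cell, frames):
    # frames: (B, N, 3, H, W) continuous driving frames; the ConvLSTM fuses
    # the per-frame CNN features, and the decoder predicts lanes for the
    # current (last) frame from the final hidden state.
    feats = [encoder(frames[:, t]) for t in range(frames.size(1))]
    h = torch.zeros_like(feats[0])
    c = torch.zeros_like(feats[0])
    for f in feats:
        h, c = cell(f, h, c)
    return decoder(h)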
Robust Lane Detection from Continuous Driving
Scenes Using Deep Neural Networks
Architecture
Robust Lane Detection from Continuous Driving
Scenes Using Deep Neural Networks
Encoder network in (a) UNet-ConvLSTM and
(b) SegNet-ConvLSTM. Skip connections exist
between convolutional layers in the encoder and
their matching layers in the decoder.
Robust Lane Detection from Continuous Driving
Scenes Using Deep Neural Networks
Visual comparison of the lane-detection results. Row 1: ground truth. Row 2: SegNet. Row
3: U-Net. Row 4: SegNet-ConvLSTM. Row 5: U-Net-ConvLSTM. Row 6: original image.
LineNet: a Zoomable CNN for Crowdsourced High
Definition Maps Modeling in Urban Environments
• HD maps play an important role in modern traffic scenes.
• HD map coverage grows slowly
because of cost limitations.
• To model HD maps, a CNN with a prediction layer and a
zoom module, called LineNet, is designed for SoA lane
detection in an unordered crowdsourced image dataset.
• TTLane is a dataset for efficient lane detection in urban
road modeling applications.
• Combining LineNet and TTLane yields a pipeline to model HD
maps with crowdsourced data.
• The maps can be constructed precisely even with
inaccurate crowdsourced data. Annotation of (a) dashed lanes, (b) double lanes,
(c) occlusion segments, (d) road boundaries.
LineNet: a Zoomable CNN for Crowdsourced High
Definition Maps Modeling in Urban Environments
• Use a pre-trained ResNet model
with dilated convolution as the
feature extractor.
• A dilated convolution strategy
helps to increase the receptive field,
which is essential when
detecting dashed lanes.
• The Line Prediction (LP) layer is
designed for accurate lane
positioning and classification.
LineNet: a Zoomable CNN for Crowdsourced High
Definition Maps Modeling in Urban Environments
Different branches’ outputs of the LP layer, with two samples.
LineNet: a Zoomable CNN for Crowdsourced High
Definition Maps Modeling in Urban Environments
• The Zoom Module is the second feature of LineNet.
• With this, LineNet can alter the FoV to an arbitrary
size without changing the network structure.
• It splits the data flow through the CNN into two
streams: (i) a thumbnail CNN; and (ii) a high-
resolution cropped CNN.
This figure illustrates the zooming process. Three
columns represent three different zoom levels
(more zoom levels can be added if necessary).
LineNet: a Zoomable CNN for Crowdsourced High
Definition Maps Modeling in Urban Environments
LineNet: a Zoomable CNN for Crowdsourced High
Definition Maps Modeling in Urban Environments
• To achieve smooth lines, points were clustered together and fitted into lines.
• The clustering algorithm DBSCAN was used with a hierarchical distance (HDis).
• The line position from the LP layer was collected and combined with a zoom level.
• The combination is denoted as a tuple a = (x, y, z), where (x, y) is the image coordinate from
line position outputs, and z is the stage’s zoom ratio used to predict the line position.
Line points are gradually clustered
together from near to far.
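The paper's exact HDis definition is not reproduced on the slides; the sketch below assumes a zoom-aware distance (image distance rescaled by the coarser zoom ratio of the pair) plugged into scikit-learn's DBSCAN as a custom metric:

import numpy as np
from sklearn.cluster import DBSCAN

def hdis(a, b):
    # Hypothetical stand-in for the hierarchical distance: each point is a
    # tuple (x, y, z) of image coordinates plus the zoom ratio of the stage
    # that predicted it; spatial distance is rescaled by the coarser zoom.
    spatial = np.hypot(a[0] - b[0], a[1] - b[1])
    return spatial / max(a[2], b[2])

pts = np.array([[120., 400., 1.], [124., 405., 1.],
                [300., 80., 4.], [302., 82., 4.]])
labels = DBSCAN(eps=10.0, min_samples=1, metric=hdis).fit_predict(pts)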
LineNet: a Zoomable CNN for Crowdsourced High
Definition Maps Modeling in Urban Environments
road modeling
LineNet: a Zoomable CNN for Crowdsourced High
Definition Maps Modeling in Urban Environments
road modeling
LineNet: a Zoomable CNN for Crowdsourced High
Definition Maps Modeling in Urban Environments
road modeling
Efficient Road Lane Marking Detection with
Deep Learning
• A Lane Marking Detector (LMD) using a deep CNN to extract robust lane marking features.
• To improve its performance at lower complexity, dilated convolution is adopted.
• A shallower and thinner structure is designed to decrease the computational cost.
• A post-processing algo constructs 3rd-order polynomial models to fit the curved lanes.
Flowchart of the proposed LMD system.
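The curve-fitting step of the post-processing reduces to an ordinary least-squares polynomial fit; a sketch with NumPy (fitting x as a function of y, which suits near-vertical lanes in the image):

import numpy as np

def fit_lane_poly(points):
    # points: (N, 2) marking pixels of one lane as (x, y); fit a 3rd-order
    # polynomial x = f(y) to model the curved lane.
    ys, xs = points[:, 1], points[:, 0]
    return np.poly1d(np.polyfit(ys, xs, deg=3))

lane = fit_lane_poly(np.array([[200, 719], [230, 600],
                               [270, 480], [320, 360]], dtype=float))
x_at_500 = lane(500)     # lane x-position at image row 500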
Efficient Road Lane Marking Detection with
Deep Learning
Efficient Road Lane Marking Detection with
Deep Learning
End to End Video Segmentation for Driving:
Lane Detection For Autonomous Car
• Statistics show that unintended lane departure is a leading cause of worldwide motor vehicle
collisions, making lane detection a promising and challenging task for self-driving.
• People are combining deep learning with computer vision to solve self-driving problems.
• A Global Convolution Network (GCN) model is used to address both classification and
localization issues for semantic segmentation of lanes.
• Color-based segmentation is presented, and the usability of the model is evaluated.
• A residual-based boundary refinement and Adam optimization are also used to achieve state-
of-the-art performance.
• As normal cars cannot afford on-board GPUs, the training session for a particular road
can be shared by several cars.
• A real-time video transfer system gets video from the car, trains the model on an edge
server (equipped with GPUs), and sends the trained model back to the car.
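The GCN and boundary-refinement blocks named above follow the large-kernel design of Peng et al.; below is a sketch of both (channel counts are illustrative):

import torch
import torch.nn as nn

class GCNBlock(nn.Module):
    # A large k x k kernel approximated by two separable branches
    # (k x 1 then 1 x k, and 1 x k then k x 1), summed; this gives a wide
    # receptive field for joint classification and localization.
    def __init__(self, in_ch, out_ch, k=7):
        super().__init__()
        p = k // 2
        self.branch_a = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, (k, 1), padding=(p, 0)),
            nn.Conv2d(out_ch, out_ch, (1, k), padding=(0, p)))
        self.branch_b = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, (1, k), padding=(0, p)),
            nn.Conv2d(out_ch, out_ch, (k, 1), padding=(p, 0)))

    def forward(self, x):
        return self.branch_a(x) + self.branch_b(x)

class BoundaryRefine(nn.Module):
    # Residual-based boundary refinement: a small residual block that
    # sharpens lane boundaries without changing the feature resolution.
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)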
End to End Video Segmentation for Driving:
Lane Detection For Autonomous Car
An overview of the whole pipeline.
End to End Video Segmentation for Driving:
Lane Detection For Autonomous Car
3D-LaneNet: E2E 3D multiple lane detection
• This network directly predicts the 3D layout of lanes in a road scene from a single image.
• It is a first attempt to address this task with on-board sensing instead of relying on pre-
mapped environments.
• 3D-LaneNet applies two new concepts: intra-network inverse-perspective mapping (IPM)
and anchor-based lane representation.
• The intra-network IPM projection facilitates a dual-representation info. flow in both regular
image-view and top-view.
• An anchor-per-column output representation enables an e2e approach, replacing common heuristics
such as clustering and outlier rejection.
• It outputs, for each longitudinal road slice, the confidence that a lane passes through the slice
and its 3D curve in camera coordinates.
• Each output is associated to an anchor in analogy to single-shot, anchor-based object
detection methods such as SSD and YOLO.
• It explicitly handles complex situations such as lane merges and splits.
3D-LaneNet: E2E 3D multiple lane detection
(a) Schematic illustration of
the end-to-end approach
and lane detection result
example on top-view. (b)
Projection of the result on
the original image.
3D-LaneNet: E2E 3D multiple lane detection
Camera position and road projection plane.
Assume known intrinsic camera parameters
(e.g. focal length, center of projection).
Also assume that the camera is installed at zero degrees roll
relative to the local ground plane.
Lane centerlines are marked in blue and
delimiters in yellow dotted curves.
The task is defined as detecting the set
of lane centerlines and/or lane delimiters
given the image.
3D-LaneNet: E2E 3D multiple lane detection
The dual context module.
A main building block in the architecture is the projective
transformation layer. This layer is a specific realization,
with slight variations, of the spatial transformer module.
It performs a differentiable sampling of an input feature
map UI, corresponding spatially to the image plane, to an
output feature map UT corresponding spatially to a virtual
top view of the scene. The differentiable sampling is achieved
through a grid for transforming an image to top-view.
The dual context module uses the projective transformation
layer to create highly descriptive feature maps. Info. flows
from multi-channel feature maps UI and VT corresponding to
image-view and top-view respectively.
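A sketch of such a projective transformation layer using grid_sample; the homography handling (normalized coordinates, a known top-view-to-image mapping from camera height and pitch) is an assumption for illustration:

import torch
import torch.nn.functional as F

def ipm_sample(feat, H_top2img, out_hw):
    # feat:      (B, C, Hi, Wi) image-view feature map
    # H_top2img: (3, 3) homography mapping normalized top-view coordinates
    #            to normalized image coordinates (assumed known)
    # out_hw:    (Ht, Wt) top-view resolution
    Ht, Wt = out_hw
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, Ht),
                            torch.linspace(-1, 1, Wt), indexing='ij')
    tv = torch.stack([xs, ys, torch.ones_like(xs)], dim=-1).reshape(-1, 3)
    img = tv @ H_top2img.t()                        # project to image plane
    img = img[:, :2] / img[:, 2:3].clamp(min=1e-6)  # perspective divide
    grid = img.reshape(1, Ht, Wt, 2).expand(feat.size(0), -1, -1, -1)
    # bilinear sampling is differentiable, so gradients flow through the IPM
    return F.grid_sample(feat, grid, align_corners=False)   # (B, C, Ht, Wt)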
3D-LaneNet: E2E 3D multiple lane detection
3D-LaneNet network architecture.
VGG16
3D-LaneNet: E2E 3D multiple lane detection
Output representation. Note that the number of
anchors (N) equals the output layer width.
Note: per anchor, the network outputs 3 types (t) of lane
descriptors (confidence and geometry); the first two (c1, c2)
represent lane centerlines and the third type (d) a lane
delimiter. Assigning 2 possible centerlines per anchor gives
the network support for merges and splits, which often
result in the centerlines of two lanes coinciding at Yref
and separating at different road positions. The topology of
lane delimiters is generally more complicated compared to
centerlines, and it cannot capture all situations.
The anchors are defined by equally spaced vertical (longitudinal)
lines at fixed x-positions.
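A decoding sketch for the anchor-per-column outputs; the tensor layout (confidence followed by K lateral offsets and K heights at fixed longitudinal positions) is an assumed simplification of the paper's three descriptor types:

import torch

def decode_anchors(net_out, anchor_x, y_steps, conf_thresh=0.5):
    # net_out:  (N, 1 + 2K) per-anchor outputs: confidence, K lateral
    #           offsets dx, and K heights z at longitudinal positions y_steps
    # anchor_x: (N,) x-position of each vertical anchor line
    K = len(y_steps)
    y = torch.as_tensor(y_steps, dtype=net_out.dtype)
    conf = torch.sigmoid(net_out[:, 0])
    lanes = []
    for i in torch.nonzero(conf > conf_thresh).flatten():
        dx = net_out[i, 1:1 + K]
        z = net_out[i, 1 + K:1 + 2 * K]
        lanes.append(torch.stack([anchor_x[i] + dx, y, z], dim=1))
    return lanes      # list of (K, 3) 3D lane curves in camera coordinates

out = torch.randn(16, 1 + 2 * 6)
lanes = decode_anchors(out, anchor_x=torch.linspace(-8, 8, 16),
                       y_steps=[5., 10., 20., 40., 60., 80.])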
3D-LaneNet: E2E 3D multiple lane detection
Random synthetic data generation. (a) Surface (b) Road topology and curvature
(c) Road on surface (d) Rendered scenes.
3D-LaneNet: E2E 3D multiple lane detection
Examples of 3D lane centerline estimation results (with confidence > 0.5) on test images from the synthetic-3D-
lanes dataset. Ground truth (blue) and method result (red) shown in each image alongside a 3D visualization.
End-to-End Lane Marker Detection via Row-wise Classification
• The conventional approaches for the lane marker detection problem perform a pixel-
level dense prediction task followed by sophisticated post-processing that is inevitable
since lane markers are typically represented by a collection of line segments without
thickness.
• In this paper, propose a method performing direct lane marker vertex prediction in an
end-to-end manner, i.e., without any post-processing step that is required in the pixel-
level dense prediction task.
• Specifically, translate the lane marker detection problem into a row-wise classification
task, which takes advantage of the innate shape of lane markers but, surprisingly, has
not been explored well.
• In order to compactly extract sufficient information about lane markers, which spread
from the left to the right of an image, devise a novel layer that successively
compresses horizontal components, enabling an end-to-end lane marker detection
system where the final lane marker positions are simply obtained via argmax operations
at testing time.
• Experimental results are demonstrated on two popular lane marker detection benchmarks,
i.e., TuSimple and CULane.
End-to-End Lane Marker Detection via Row-wise Classification
The E2E-LMD framework for lane marker detection
End-to-End Lane Marker Detection via Row-wise Classification
The E2E-LMD architecture for lane marker detection. We extend general encoder-decoder architectures by adding successive
horizontal reduction modules for end-to-end lane marker detection. Numbers under each block denote spatial resolution and
channels. (a) Arrows with HRM denote a horizontal reduction module of (b). Arrows with Conv are output convolutions with 1 × 1
kernels. Dashed arrows denote global average pooling with a fully connected layer. (b) HRM is utilized to compress the horizontal
representation. r denotes the pooling ratio for the width. The conv kernel size k is set as 3, except in the last HRM layer, where it is set as 1.
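A sketch of one HRM and the final argmax readout (the block internals are simplified relative to the paper):

import torch
import torch.nn as nn

class HRM(nn.Module):
    # Horizontal reduction module sketch: a conv layer followed by
    # width-only pooling with ratio r, compressing the horizontal axis
    # while the height (row) axis is preserved.
    def __init__(self, in_ch, out_ch, r=2, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.pool = nn.AvgPool2d(kernel_size=(1, r), stride=(1, r))

    def forward(self, x):           # (B, C, H, W) -> (B, C', H, W // r)
        return self.pool(torch.relu(self.conv(x)))

# Stacking HRMs squeezes the width down; a 1x1 conv then yields per-row
# logits over the original image columns, and at test time the lane
# position in each row is simply an argmax (one branch per lane marker
# in the real network; a single branch is shown here).
x = torch.randn(1, 64, 72, 8)
x = HRM(64, 128)(x); x = HRM(128, 256)(x); x = HRM(256, 256)(x)  # width 8 -> 1
logits = nn.Conv2d(256, 128, 1)(x).squeeze(3)   # (B, 128 column bins, 72 rows)
cols = logits.argmax(dim=1)                     # lane column index per row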
End-to-End Lane Marker Detection via Row-wise Classification
End-to-End Lane Marker Detection via Row-wise Classification
Camera-Based Road Lane Detection by Deep Learning II

Weitere ähnliche Inhalte

Was ist angesagt?

Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
Sanghamitra Deb
 
[G4]image deblurring, seeing the invisible
[G4]image deblurring, seeing the invisible[G4]image deblurring, seeing the invisible
[G4]image deblurring, seeing the invisible
NAVER D2
 

Was ist angesagt? (20)

YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection review
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning Framework
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural Networks
 
U-Net (1).pptx
U-Net (1).pptxU-Net (1).pptx
U-Net (1).pptx
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
 
CVPR 2018 Paper Reading MobileNet V2
CVPR 2018 Paper Reading MobileNet V2CVPR 2018 Paper Reading MobileNet V2
CVPR 2018 Paper Reading MobileNet V2
 
Stereo Matching by Deep Learning
Stereo Matching by Deep LearningStereo Matching by Deep Learning
Stereo Matching by Deep Learning
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basics
 
camera-based Lane detection by deep learning
camera-based Lane detection by deep learningcamera-based Lane detection by deep learning
camera-based Lane detection by deep learning
 
Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)Tutorial on Object Detection (Faster R-CNN)
Tutorial on Object Detection (Faster R-CNN)
 
Semantic Segmentation Methods using Deep Learning
Semantic Segmentation Methods using Deep LearningSemantic Segmentation Methods using Deep Learning
Semantic Segmentation Methods using Deep Learning
 
[G4]image deblurring, seeing the invisible
[G4]image deblurring, seeing the invisible[G4]image deblurring, seeing the invisible
[G4]image deblurring, seeing the invisible
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
 
Hough Transform By Md.Nazmul Islam
Hough Transform By Md.Nazmul IslamHough Transform By Md.Nazmul Islam
Hough Transform By Md.Nazmul Islam
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep Learning
 
Object detection
Object detectionObject detection
Object detection
 
Mobilenetv1 v2 slide
Mobilenetv1 v2 slideMobilenetv1 v2 slide
Mobilenetv1 v2 slide
 
Machine Learning - Object Detection and Classification
Machine Learning - Object Detection and ClassificationMachine Learning - Object Detection and Classification
Machine Learning - Object Detection and Classification
 
Object tracking presentation
Object tracking  presentationObject tracking  presentation
Object tracking presentation
 
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic SegmentationSemantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
 

Ähnlich wie Camera-Based Road Lane Detection by Deep Learning II

Ähnlich wie Camera-Based Road Lane Detection by Deep Learning II (20)

Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A survey
 
Lane Detection and Traffic Sign Recognition using OpenCV and Deep Learning fo...
Lane Detection and Traffic Sign Recognition using OpenCV and Deep Learning fo...Lane Detection and Traffic Sign Recognition using OpenCV and Deep Learning fo...
Lane Detection and Traffic Sign Recognition using OpenCV and Deep Learning fo...
 
Unsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingUnsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object tracking
 
IRJET- Automatic Traffic Sign Detection and Recognition using CNN
IRJET- Automatic Traffic Sign Detection and Recognition using CNNIRJET- Automatic Traffic Sign Detection and Recognition using CNN
IRJET- Automatic Traffic Sign Detection and Recognition using CNN
 
TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING-- Part 3
TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING-- Part 3TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING-- Part 3
TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING-- Part 3
 
IRJET- Identification of Scene Images using Convolutional Neural Networks - A...
IRJET- Identification of Scene Images using Convolutional Neural Networks - A...IRJET- Identification of Scene Images using Convolutional Neural Networks - A...
IRJET- Identification of Scene Images using Convolutional Neural Networks - A...
 
Identifying Parking Spots from Surveillance Cameras using CNN
Identifying Parking Spots from Surveillance Cameras using CNNIdentifying Parking Spots from Surveillance Cameras using CNN
Identifying Parking Spots from Surveillance Cameras using CNN
 
ANALYSIS OF INSTANCE SEGMENTATION APPROACH FOR LANE DETECTION
ANALYSIS OF INSTANCE SEGMENTATION APPROACH FOR LANE DETECTIONANALYSIS OF INSTANCE SEGMENTATION APPROACH FOR LANE DETECTION
ANALYSIS OF INSTANCE SEGMENTATION APPROACH FOR LANE DETECTION
 
Automatism System Using Faster R-CNN and SVM
Automatism System Using Faster R-CNN and SVMAutomatism System Using Faster R-CNN and SVM
Automatism System Using Faster R-CNN and SVM
 
project final ppt.pptx
project final ppt.pptxproject final ppt.pptx
project final ppt.pptx
 
Computer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureComputer Vision Landscape : Present and Future
Computer Vision Landscape : Present and Future
 
IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...
IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...
IMPROVEMENT IN IMAGE DENOISING OF HANDWRITTEN DIGITS USING AUTOENCODERS IN DE...
 
深度學習在AOI的應用
深度學習在AOI的應用深度學習在AOI的應用
深度學習在AOI的應用
 
Intelligent Traffic light detection for individuals with CVD
Intelligent Traffic light detection for individuals with CVDIntelligent Traffic light detection for individuals with CVD
Intelligent Traffic light detection for individuals with CVD
 
IEEE 2014 Matlab Projects
IEEE 2014 Matlab ProjectsIEEE 2014 Matlab Projects
IEEE 2014 Matlab Projects
 
IEEE 2014 Matlab Projects
IEEE 2014 Matlab ProjectsIEEE 2014 Matlab Projects
IEEE 2014 Matlab Projects
 
slide-171212080528.pptx
slide-171212080528.pptxslide-171212080528.pptx
slide-171212080528.pptx
 
20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks20191107 deeplearningapproachesfornetworks
20191107 deeplearningapproachesfornetworks
 
kanimozhi2019.pdf
kanimozhi2019.pdfkanimozhi2019.pdf
kanimozhi2019.pdf
 
AaSeminar_Template.pptx
AaSeminar_Template.pptxAaSeminar_Template.pptx
AaSeminar_Template.pptx
 

Mehr von Yu Huang

Mehr von Yu Huang (20)

Application of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous DrivingApplication of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous Driving
 
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
 
Data Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous DrivingData Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous Driving
 
Techniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingTechniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous Driving
 
BEV Joint Detection and Segmentation
BEV Joint Detection and SegmentationBEV Joint Detection and Segmentation
BEV Joint Detection and Segmentation
 
BEV Object Detection and Prediction
BEV Object Detection and PredictionBEV Object Detection and Prediction
BEV Object Detection and Prediction
 
Fisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VIFisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VI
 
Fisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VFisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving V
 
Fisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVFisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IV
 
Prediction,Planninng & Control at Baidu
Prediction,Planninng & Control at BaiduPrediction,Planninng & Control at Baidu
Prediction,Planninng & Control at Baidu
 
Cruise AI under the Hood
Cruise AI under the HoodCruise AI under the Hood
Cruise AI under the Hood
 
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
 
Scenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous DrivingScenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous Driving
 
How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?
 
Annotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous DrivingAnnotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous Driving
 
Simulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgSimulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atg
 
Multi sensor calibration by deep learning
Multi sensor calibration by deep learningMulti sensor calibration by deep learning
Multi sensor calibration by deep learning
 
Prediction and planning for self driving at waymo
Prediction and planning for self driving at waymoPrediction and planning for self driving at waymo
Prediction and planning for self driving at waymo
 
Jointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planningJointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planning
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous driving
 

Kürzlich hochgeladen

AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
ankushspencer015
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Christo Ananth
 

Kürzlich hochgeladen (20)

Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
 
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptxBSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
BSides Seattle 2024 - Stopping Ethan Hunt From Taking Your Data.pptx
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
 

Camera-Based Road Lane Detection by Deep Learning II

  • 1. Camera-Based Road Lane Detection by Deep Learning II Yu Huang Yu.huang07@gmail.com Sunnyvale, California
  • 2. Outline • Ultra Fast Structure-aware Deep Lane Detection • Learning Lightweight Lane Detection CNNs by Self Attention Distillation • Key Points Estimation and Point Instance Segmentation Approach for Lane Detection • Learning to Cluster for Proposal-Free Instance Segmentation • Lane Detection and Classification using Cascaded CNNs • PolyLaneNet: Lane Estimation via Deep Polynomial Regression • Lane Detection: Light Conditions Style Transfer • End-to-end Lane Detection through Differentiable Least-Squares Fitting • Robust Lane Detection from Continuous Driving Scenes Using Deep Neural Networks • LineNet: a Zoomable CNN for Crowdsourced HD Maps Modeling in Urban Environments • Efficient Road Lane Marking Detection with Deep Learning • End to End Video Segmentation for Driving: Lane Detection For Autonomous Car • 3D-LaneNet: E2E 3D multiple lane detection • End-to-End Lane Marker Detection via Row- wise Classification
  • 3. Ultra Fast Structure-aware Deep Lane Detection • ArXiv 2004.11757 • Inspired by human perception, the recognition of lanes under severe occlusion and extreme lighting conditions is mainly based on contextual and global information. • Motivated by this observation, propose a novel, simple, yet effective formulation aiming at extremely fast speed and challenging scenarios. • Specifically, treat the process of lane detection as a row-based selecting problem using global features. • With the help of row-based selecting, formulation could significantly reduce the computational cost. • Using a large receptive field on global features, could also handle the challenging scenarios. • Moreover, based on the formulation, also propose a structural loss to explicitly model the structure of lanes. • A light-weight version could even achieve 300+ frames per second with the same resolution, which is at least 4x faster than previous state-of-the-art methods. • code is available at https://github.com/cfzd/Ultra-Fast-Lane-Detection.
  • 4. Ultra Fast Structure-aware Deep Lane Detection Illustration of selecting on the left and right lane. In the right part, the selecting of a row is shown in detail. Row anchors are the predefined row locations, and formulation is defined as horizontally selecting on each of row anchor. On the right of the image, a background gridding cell is introduced to indicate no lane in this row.
  • 5. Ultra Fast Structure-aware Deep Lane Detection Illustration of our formulation and conventional segmentation. Our formulation is selecting locations (grids) on rows, while segmentation is classifying every pixel. The dimensions used for classifying are also different, which is marked with red. The proposed formulation significantly reduces the computational cost. Besides, the proposed formulation uses global feature as input, which has larger receptive field than segmentation, thus addressing the no-visual-clue problem formulation Segmentation
  • 6. Ultra Fast Structure-aware Deep Lane Detection Overall architecture. The auxiliary branch is shown in the upper part, which is only valid when training. The feature extractor is shown in the blue box. The classification-based prediction and auxiliary segmentation task are illustrated in the green and orange boxes, respectively. The group classification is conducted on each row anchor.
  • 7. Ultra Fast Structure-aware Deep Lane Detection
  • 8. Learning Lightweight Lane Detection CNNs by Self Attention Distillation • https://github.com/cardwing/Codes-for-Lane-Detection • Without learning from much richer context, lane detection models often fail in challenging scenarios, e.g., severe occlusion, ambiguous lanes, and poor lighting conditions. • present a novel knowledge distillation approach, i.e., Self Attention Distillation (SAD), which allows a model to learn from itself and gains substantial improvement without any additional supervision or labels. • Specifically, observe that attention maps extracted from a model trained to a reasonable level would encode rich contextual information. • The valuable contextual information can be used as a form of ‘free’ supervision for further representation learning through performing top down and layer-wise attention distillation within the network itself. • SAD can be easily incorporated in any feedforward convolutional neural networks (CNN) and does not increase the inference time. • validate SAD on three popular lane detection benchmarks (TuSimple, CULane and BDD100K) using lightweight models such as ENet, ResNet18 and ResNet-34.
  • 9. Learning Lightweight Lane Detection CNNs by Self Attention Distillation Attention maps of the ENet [17] before and after applying self attention distillation. Here, we extract the attention maps from the four stages/blocks following the design of ENet model. Note that self attention distillation is added in the 40 K episodes.
  • 10. Learning Lightweight Lane Detection CNNs by Self Attention Distillation Attention maps of the block 4 of the ENet model using different mapping functions. An instantiation of using SAD. E1∼E4 comprise the encoder of ENet, D1 and D2 comprise the decoder of ENet. add a small network to predict the existence of lanes, denoted as P1. AT-GEN is the attention generator.
  • 11. Learning Lightweight Lane Detection CNNs by Self Attention Distillation Attention maps of ENet with and without self attention distillation. Both networks with and without SAD are trained up to 60K episodes. SAD is applied to ENet at 40K training episodes.
  • 12. Learning Lightweight Lane Detection CNNs by Self Attention Distillation
  • 13. Key Points Estimation and Point Instance Segmentation Approach for Lane Detection • https://github.com/koyeongmin/PINet • Current methods have critical deficiencies such as the limited number of detectable lanes and high false positive. • In especial, high false positive can cause wrong and dangerous control. • In this paper, propose a lane detection method for the arbitrary number of lanes using the deep learning method, which has the lower number of false positives than other recent lane detection methods. • The architecture of the proposed method has the shared feature extraction layers and several branches for detection and embedding to cluster lanes. • The proposed method can generate exact points on the lanes, and cast a clustering problem for the generated points as a point cloud instance segmentation problem. • The proposed method is more compact because it generates fewer points than the original image pixel size. • proposed post processing method eliminates outliers successfully and increases the performance notably.
  • 14. Key Points Estimation and Point Instance Segmentation Approach for Lane Detection The proposed framework. Given an input image, PINet predict three value, confidence, offset, and feature. From confidence and offset outputs, exact points on the lanes can be predicted, and the feature output distinguishes the predicted points into each instance. Finally, the post processing module is applied, and it generates smooth lane.
  • 15. Key Points Estimation and Point Instance Segmentation Approach for Lane Detection The detailed network training procedure. It has three main parts. 512x256 size input data is compressed by the resizing layer, and the compressed input is passed to feature extraction layer. Three output branches are applied at end of each hourglass block, and they predict confidence, offset, and instance feature for each grid. Loss function can be calculated from outputs of each hourglass block.
  • 16. Key Points Estimation and Point Instance Segmentation Approach for Lane Detection The hourglass block and bottleneck layer architecture. The hourglass block consist three types of bottleneck layers, same, up-sampling, down-sampling. Output branches are applied at end of hourglass layer, and the confidence output is forwarded to next block.
  • 17. Key Points Estimation and Point Instance Segmentation Approach for Lane Detection The result of the post processing. (a) is input image, and (b) is raw output out PINet. In (b), the blue lane consist of some outliers and other lane can be distinguished. In (c), the result of the proposed post processing method, outliers are eliminated, and only smooth longest lanes remain.
  • 18. Key Points Estimation and Point Instance Segmentation Approach for Lane Detection The explanation about the post processing. There are no other point in the margin that is made by the straight line connecting point S and A, but margin of point S and B consist of 2 other points. As a result, point B is selected.
  • 19. Key Points Estimation and Point Instance Segmentation Approach for Lane Detection
• 20. Learning to Cluster for Proposal-Free Instance Segmentation • https://github.com/GT-RIPL/L2C • This work proposes a novel learning objective to train a deep neural network to perform end-to-end image pixel clustering. • The approach is applied to instance segmentation, which lies at the intersection of image semantic segmentation and object detection. • It utilizes the most fundamental property of instance labeling – the pairwise relationship between pixels – as the supervision to formulate the learning objective, then applies it to train a fully convolutional network (FCN) to perform pixel-wise clustering. • The resulting clusters can be used directly as the instance labeling. • To support labeling of an unlimited number of instances, ideas from graph coloring theory are further formulated into the proposed learning objective. • The evaluation on the Cityscapes dataset demonstrates strong performance and thereby a proof of concept. • Moreover, the approach won second place in the lane detection competition of the 2017 CVPR Autonomous Driving Challenge and was the top performer without using external data.
• 21. Learning to Cluster for Proposal-Free Instance Segmentation The labeling problem is addressed by formulating a novel learning objective that guides fully convolutional networks to learn to perform instance labeling.
• 22. Learning to Cluster for Proposal-Free Instance Segmentation Example outputs of lane detection. The colors represent different instance IDs. The output for each pixel is a (6 + 1)-dimensional vector representing the probability distribution of this pixel being assigned to each ID. The learning objective guides the distribution function to output similar distributions for pixels on the same lane line, and vice versa. At test time, each pixel is assigned the ID with the highest probability. Given a pair of pixels pi and pj, their corresponding output distributions are denoted Pi = f(pi) = [ti,1, ..., ti,n] and Pj = f(pj) = [tj,1, ..., tj,n], where n is the number of indices available for labeling.
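The pairwise supervision can be sketched as follows: pairs on the same instance are pulled toward identical ID distributions, pairs on different instances are pushed apart. The symmetric-KL pull term and hinge push term below are one plausible instantiation of that objective, assumed for illustration rather than taken verbatim from the paper.

```python
# Pairwise pixel-clustering objective sketch (form of the loss is assumed).
import torch
import torch.nn.functional as F

def pairwise_cluster_loss(P_i, P_j, same_instance, margin=2.0):
    """P_i, P_j: (N, n) probability distributions for N sampled pixel pairs.
    same_instance: (N,) bool, True if the pair lies on the same instance."""
    eps = 1e-8
    kl_ij = (P_i * (torch.log(P_i + eps) - torch.log(P_j + eps))).sum(dim=1)
    kl_ji = (P_j * (torch.log(P_j + eps) - torch.log(P_i + eps))).sum(dim=1)
    sym_kl = kl_ij + kl_ji
    pull = sym_kl                       # same instance: minimize divergence
    push = F.relu(margin - sym_kl)      # different instance: hinge on divergence
    return torch.where(same_instance, pull, push).mean()
```

Because only pairwise relations are supervised, the network is free to assign any of the n IDs to an instance, which is exactly where the graph-coloring view of the next slide comes in.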
  • 23. Learning to Cluster for Proposal-Free Instance Segmentation The concept of how graph coloring is related to instance ID assignment
  • 24. Learning to Cluster for Proposal-Free Instance Segmentation The network architecture
• 25. Learning to Cluster for Proposal-Free Instance Segmentation Visualization of lane detection on the TuSimple dataset (validation split). The red lines in the top row are predictions, while the green lines are the ground truth. The second row shows the raw outputs from the network. The colors represent the assigned IDs.
• 26. Lane Detection and Classification using Cascaded CNNs • https://github.com/fabvio/Cascade-LD • https://github.com/fabvio/TuSimple-lane-classes • As in many other computer vision tasks, convolutional neural networks (CNNs) represent the state-of-the-art technology to identify lane boundaries. • However, the position of the lane boundaries w.r.t. the vehicle may not suffice for reliable positioning: for path planning or localization, information regarding lane types may also be needed. • This work presents an end-to-end system for lane boundary identification, clustering and classification, based on two cascaded neural networks, that runs in real time. • To build the system, 14336 lane boundary instances of the TuSimple lane detection dataset have been labelled using 8 different classes.
  • 27. Lane Detection and Classification using Cascaded CNNs
  • 28. Lane Detection and Classification using Cascaded CNNs From top to bottom: original image, instance segmentation, classification. For instance segmentation, different colors represent different boundaries. For classification, green represents dashed lanes, yellow double-dashed, red continuous.
• 29. PolyLaneNet: Lane Estimation via Deep Polynomial Regression • https://github.com/lucastabelini/PolyLaneNet • Since methods for lane detection have to work in real time (30+ FPS), they not only have to be effective (i.e., have high accuracy) but also efficient (i.e., fast). • This work presents a novel method for lane detection that takes as input an image from a forward-looking camera mounted in the vehicle and outputs polynomials representing each lane marking in the image, via deep polynomial regression. • The proposed method is shown to be competitive with existing state-of-the-art methods on the TuSimple dataset, while maintaining its efficiency (115 FPS). • Additionally, extensive qualitative results on two additional public datasets are presented, along with a discussion of limitations in the evaluation metrics used by recent works on lane detection.
• 30. PolyLaneNet: Lane Estimation via Deep Polynomial Regression Overview of the proposed method. From left to right: the model receives as input an image from a forward-looking camera and outputs information about each lane marking in the image.
  • 31. PolyLaneNet: Lane Estimation via Deep Polynomial Regression
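A minimal sketch of how such a polynomial-regression head can be decoded at test time: each lane is assumed to be a cubic in normalized image coordinates plus a confidence score and a vertical extent. The field layout and names are assumptions for illustration, not the exact PolyLaneNet output format.

```python
# Decoding one regressed lane into pixel coordinates (layout assumed).
import numpy as np

def decode_lane(coeffs, v_top, v_bottom, conf,
                img_h, img_w, conf_thr=0.5, n_pts=32):
    """coeffs: (a, b, c, d) of x = a*y^3 + b*y^2 + c*y + d, with x, y in [0, 1].
    v_top/v_bottom: normalized vertical extent of the lane; conf: lane score."""
    if conf < conf_thr:
        return None                               # suppress low-confidence lanes
    a, b, c, d = coeffs
    ys = np.linspace(v_top, v_bottom, n_pts)      # sample the vertical range
    xs = a * ys**3 + b * ys**2 + c * ys + d
    return np.stack([xs * img_w, ys * img_h], axis=1)   # (n_pts, 2) pixel points
```

Because the whole lane is four coefficients plus two bounds, no per-pixel decoding or clustering is needed, which is where the 115 FPS figure comes from.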
• 32. Lane Detection in Low-light Conditions Using an Efficient Data Enhancement: Light Conditions Style Transfer • https://github.com/Chenzhaowei13/Light-Condition-Style-Transfer • Although multi-task learning and contextual-information-based methods have been proposed for lane detection, they either require additional manual annotations or introduce extra inference overhead. • This paper proposes a style-transfer-based data enhancement method that uses Generative Adversarial Networks (GANs) to generate images in low-light conditions, increasing the environmental adaptability of the lane detector. • The solution consists of three parts: the proposed SIM-CycleGAN, light conditions style transfer, and the lane detection network. • It requires neither additional manual annotations nor extra inference overhead. • The method is validated on the lane detection benchmark CULane using ERFNet. • Empirically, the lane detection model trained with this method demonstrates adaptability in low-light conditions and robustness in complex scenarios.
• 33. Lane Detection in Low-light Conditions Using an Efficient Data Enhancement: Light Conditions Style Transfer The main framework of the method. The proposed SIM-CycleGAN is shown on the left. The generator GA transfers images from suitable light conditions to low-light conditions, while the generator GB transfers in the opposite direction. The discriminators DA and DB feed a single scalar value (real or fake) back to the generators. The middle section shows light conditions style transfer from suitable to low-light conditions by the trained SIM-CycleGAN. The lane detection model, whose baseline is ERFNet, is shown on the right; a lane-existence branch is added for better performance.
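Below is a hedged sketch of the CycleGAN-style objective implied by this framework: adversarial terms in both translation directions plus a cycle-consistency term. SIM-CycleGAN's specific modifications are omitted, and the LSGAN loss form and the weight lam are assumptions, not values from the paper.

```python
# Generator-side CycleGAN-style losses (sketch; models passed in as callables).
import torch
import torch.nn.functional as F

def cycle_gan_losses(G_A, G_B, D_A, D_B, real_suitable, real_lowlight, lam=10.0):
    fake_lowlight = G_A(real_suitable)    # suitable light -> low light
    fake_suitable = G_B(real_lowlight)    # low light -> suitable light

    # LSGAN adversarial terms: each generator tries to make the corresponding
    # discriminator output "real" (1) for its fakes.
    pred_low, pred_suit = D_A(fake_lowlight), D_B(fake_suitable)
    adv = F.mse_loss(pred_low, torch.ones_like(pred_low)) \
        + F.mse_loss(pred_suit, torch.ones_like(pred_suit))

    # Cycle consistency: translating there and back should reproduce the input.
    cyc = F.l1_loss(G_B(fake_lowlight), real_suitable) \
        + F.l1_loss(G_A(fake_suitable), real_lowlight)
    return adv + lam * cyc
```

The key point for lane detection is that the translated low-light images reuse the original lane annotations, so no new labels are needed.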
• 34. Lane Detection in Low-light Conditions Using an Efficient Data Enhancement: Light Conditions Style Transfer Generator architecture, composed of convolution layers, residual blocks and deconvolution layers. The convolution layers record the scale changes in the encoding process and map them to the corresponding operations in the decoding process.
• 35. Lane Detection in Low-light Conditions Using an Efficient Data Enhancement: Light Conditions Style Transfer The lane detection model architecture. The decoder outputs probability maps of the different lane markings, and the second branch predicts the existence of lanes.
• 36. Lane Detection in Low-light Conditions Using an Efficient Data Enhancement: Light Conditions Style Transfer The probability maps from this method and other methods. The brightness of a pixel indicates the probability of that pixel belonging to a lane. As can be clearly seen, in low-light conditions the probability maps generated by this method are more pronounced and more accurate.
• 37. End-to-end Lane Detection through Differentiable Least-Squares Fitting • A method to train a lane detector in an e2e manner, directly regressing the lane parameters. • The architecture consists of two components: a deep network that predicts a segmentation-like weight map for each lane line, and a differentiable least-squares fitting module that returns for each map the parameters of the best-fitting curve in the weighted least-squares sense. • These parameters can subsequently be supervised with a loss function of choice. • The method relies on the fact that it is possible to backpropagate through a least-squares fitting procedure. • This leads to an end-to-end method where the features are optimized for the true task of interest: the network implicitly learns to generate features that prevent instabilities during the model fitting step, as opposed to two-step pipelines that need to handle outliers with heuristics. • Additionally, the system is not just a black box but offers a degree of interpretability, because the intermediately generated segmentation-like weight maps can be inspected and visualized. • Code: http://github.com/wvangansbeke/LaneDetection_End2End.
  • 38. End-to-end Lane Detection through Differentiable Least-Squares Fitting • Lane detection is typically tackled with a two-step pipeline in which a segmentation mask of the lane markings is predicted first, and a lane line model (like a parabola or spline) is fitted to the post-processed mask next. • The problem with such a two-step approach is that the parameters of the network are not optimized for the true task of interest (estimating the lane curvature parameters) but for a proxy task (segmenting the lane markings), resulting in sub-optimal performance. Overview of the architecture
• 39. End-to-end Lane Detection through Differentiable Least-Squares Fitting Least-squares fitting and weighted least-squares fitting, i.e. the closed-form solutions β = (XᵀX)⁻¹Xᵀy and β = (XᵀWX)⁻¹XᵀWy, where X is the design matrix built from the candidate coordinates and W is the diagonal matrix of predicted per-pixel weights.
  • 40. End-to-end Lane Detection through Differentiable Least-Squares Fitting
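A minimal differentiable weighted least-squares sketch in PyTorch, following the closed form above: since torch.linalg.solve is differentiable, gradients flow from the fitted curve parameters back into the predicted weight map. The parabola model is one of the curve choices mentioned in the text; the coordinate convention is an assumption.

```python
# Differentiable weighted least-squares fit of a parabola (sketch).
import torch

def weighted_lsq_fit(xs, ys, w):
    """xs, ys, w: (N,) tensors; w is the flattened predicted weight map.
    Returns beta = (a, b, c) of the best fit y = a*x^2 + b*x + c
    in the weighted least-squares sense: beta = (X^T W X)^{-1} X^T W y."""
    X = torch.stack([xs**2, xs, torch.ones_like(xs)], dim=1)   # (N, 3) design matrix
    W = w.unsqueeze(1)                                         # (N, 1), diag(W) implicitly
    A = X.t() @ (W * X)                                        # X^T W X  -> (3, 3)
    b = X.t() @ (w * ys).unsqueeze(1)                          # X^T W y  -> (3, 1)
    return torch.linalg.solve(A, b).squeeze(1)                 # differentiable solve
```

Supervising beta directly (rather than the weight map) is exactly what lets the network learn weights that keep the fit stable.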
• 41. Robust Lane Detection from Continuous Driving Scenes Using Deep Neural Networks • https://github.com/qinnzou/Robust-Lane-Detection • Most methods focus on detecting the lane from a single image, and often perform unsatisfactorily in extremely bad situations such as heavy shadow, severe mark degradation, serious vehicle occlusion, and so on. • In fact, lanes are continuous line structures on the road. • Consequently, a lane that cannot be accurately detected in the current frame may potentially be inferred by incorporating information from previous frames. • To this end, lane detection is investigated using multiple frames of a continuous driving scene, and a hybrid deep architecture is proposed combining a convolutional neural network (CNN) and a recurrent neural network (RNN). • Specifically, information of each frame is abstracted by a CNN block, and the CNN features of multiple continuous frames, holding the property of a time series, are then fed into the RNN block for feature learning and lane prediction. • Extensive experiments on two large-scale datasets demonstrate that the proposed method outperforms the competing methods in lane detection.
  • 42. Robust Lane Detection from Continuous Driving Scenes Using Deep Neural Networks Architecture
• 43. Robust Lane Detection from Continuous Driving Scenes Using Deep Neural Networks Encoder network in (a) UNet-ConvLSTM and (b) SegNet-ConvLSTM. Skip connections exist between convolutional layers in the encoder and their matching layers in the decoder.
• 44. Robust Lane Detection from Continuous Driving Scenes Using Deep Neural Networks Visual comparison of the lane-detection results. Row 1: ground truth. Row 2: SegNet. Row 3: U-Net. Row 4: SegNet-ConvLSTM. Row 5: U-Net-ConvLSTM. Row 6: original image.
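To illustrate the CNN-plus-RNN fusion, a compact ConvLSTM cell sketch is given below: per-frame encoder features are fed through the cell in temporal order, and the final hidden state goes to the decoder. The single-cell setup, channel sizes, and feature resolution are placeholders rather than the exact UNet-ConvLSTM configuration.

```python
# Minimal ConvLSTM cell fusing per-frame CNN features over time (sketch).
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        # One conv produces all four gates from the [input, hidden] concatenation.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        c = f * c + i * g          # cell state carries lane evidence across frames
        h = o * c.tanh()
        return h, c

# Fuse encoder features of T consecutive frames; the last hidden state feeds
# the decoder that predicts the lane map of the current frame.
cell = ConvLSTMCell(in_ch=128, hid_ch=128)
feats = [torch.randn(1, 128, 16, 32) for _ in range(5)]   # per-frame CNN features
h = torch.zeros(1, 128, 16, 32)
c = torch.zeros_like(h)
for f_t in feats:
    h, c = cell(f_t, (h, c))
```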
• 45. LineNet: a Zoomable CNN for Crowdsourced High Definition Maps Modeling in Urban Environments • HD maps play an important role in modern traffic scenes. • Coverage of HD maps grows slowly because of cost limitations. • To model HD maps, a CNN with a prediction layer and a zoom module, called LineNet, is designed for state-of-the-art lane detection on an unordered crowdsourced image dataset. • TTLane is a dataset for efficient lane detection in urban road modeling applications. • Combining LineNet and TTLane yields a pipeline to model HD maps with crowdsourced data. • The maps can be constructed precisely even with inaccurate crowdsourced data. Annotation of (a) dashed lanes, (b) double lanes, (c) occlusion segments, (d) road boundaries.
• 46. LineNet: a Zoomable CNN for Crowdsourced High Definition Maps Modeling in Urban Environments • A pre-trained ResNet model with dilated convolution is used as the feature extractor. • The dilated convolution strategy helps to increase the receptive field, which is essential when detecting dashed lanes. • The Line Prediction (LP) layer is designed for accurate lane positioning and classification.
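A small illustration of the dilated-convolution choice: dilation widens the receptive field at no extra parameter cost, which helps bridge the gaps of dashed markings. The channel sizes below are placeholders, not LineNet's configuration.

```python
# Same parameter count, wider context: a 3x3 kernel with dilation d covers
# an effective (2d+1)x(2d+1) window.
import torch.nn as nn

conv_plain   = nn.Conv2d(64, 64, kernel_size=3, padding=1)              # 3x3 field
conv_dilated = nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2)  # 5x5 field
# Both preserve spatial resolution; the dilated one sees a wider context,
# which is what helps connect the segments of a dashed lane marking.
```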
  • 47. LineNet: a Zoomable CNN for Crowdsourced High Definition Maps Modeling in Urban Environments Different branches’ outputs of the LP layer, with two samples.
• 48. LineNet: a Zoomable CNN for Crowdsourced High Definition Maps Modeling in Urban Environments • The Zoom Module is the second feature of LineNet. • With it, LineNet can alter the FoV to an arbitrary size without changing the network structure. • It splits the data flow through the CNN into two streams: (i) a thumbnail CNN; and (ii) a high-resolution cropped CNN. This figure illustrates the zooming process. Three columns represent three different zoom levels (more zoom levels can be added if necessary).
  • 49. LineNet: a Zoomable CNN for Crowdsourced High Definition Maps Modeling in Urban Environments
• 50. LineNet: a Zoomable CNN for Crowdsourced High Definition Maps Modeling in Urban Environments • To achieve nice and smooth lines, points are clustered together and fitted into lines. • The clustering algorithm DBSCAN is used with a hierarchical distance (HDis). • The line positions from the LP layer are collected and combined with a zoom level. • The combination is denoted as a tuple a = (x, y, z), where (x, y) is the image coordinate from the line position outputs, and z is the stage's zoom ratio used to predict the line position. Line points are gradually clustered together from near to far.
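A hedged sketch of this clustering step: DBSCAN over the (x, y, z) tuples with a custom distance. The exact HDis definition is not reproduced here, so the zoom term is simply folded into a weighted Euclidean distance for illustration; eps, min_samples, and the zoom weight are assumptions.

```python
# Clustering LP-layer line points with DBSCAN under a zoom-aware distance.
import numpy as np
from sklearn.cluster import DBSCAN

def hdis(a, b, zoom_weight=5.0):
    """a, b: arrays (x, y, z) of image position and zoom ratio."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    dz = zoom_weight * (a[2] - b[2])       # penalize zoom-level mismatch
    return np.sqrt(dx * dx + dy * dy + dz * dz)

points = np.array([[100, 200, 1],
                   [102, 204, 1],
                   [400,  80, 2]], dtype=float)       # (x, y, zoom) tuples
labels = DBSCAN(eps=10.0, min_samples=2, metric=hdis).fit_predict(points)
# Points labeled -1 are noise; every other label is one lane-line cluster,
# which is then fitted into a smooth line.
```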
  • 51. LineNet: a Zoomable CNN for Crowdsourced High Definition Maps Modeling in Urban Environments road modeling
• 54. Efficient Road Lane Marking Detection with Deep Learning • A Lane Marking Detector (LMD) using a deep CNN to extract robust lane marking features. • To improve performance at lower complexity, dilated convolution is adopted. • A shallower and thinner structure is designed to decrease the computational cost. • A post-processing algorithm constructs 3rd-order polynomial models to fit the curved lanes. Flowchart of the proposed LMD system.
  • 55. Efficient Road Lane Marking Detection with Deep Learning
  • 56. Efficient Road Lane Marking Detection with Deep Learning
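The 3rd-order polynomial post-processing mentioned above can be sketched with NumPy as below. Fitting x as a function of y is assumed because lanes are near-vertical in image space; the minimum-pixel threshold is illustrative.

```python
# Cubic polynomial fit of one lane marking mask (sketch).
import numpy as np

def fit_lane_polynomial(mask, min_pixels=50):
    """mask: (H, W) binary map of one lane marking. Returns np.poly1d or None."""
    ys, xs = np.nonzero(mask)
    if len(xs) < min_pixels:               # too few pixels for a stable fit
        return None
    coeffs = np.polyfit(ys, xs, deg=3)     # cubic x = f(y)
    return np.poly1d(coeffs)

# Usage: sample the fitted curve at evenly spaced rows to draw the lane, e.g.
#   f = fit_lane_polynomial(mask)
#   xs = f(np.arange(0, mask.shape[0], 10))
```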
• 57. End to End Video Segmentation for Driving: Lane Detection For Autonomous Car • Statistics show that unintended lane departure is a leading cause of motor vehicle collisions worldwide, making lane detection one of the most promising and challenging tasks for self-driving. • People are combining deep learning with computer vision to solve self-driving problems. • A Global Convolution Networks (GCN) model is used to address both the classification and localization issues of semantic segmentation of lanes. • Color-based segmentation is presented and the usability of the model is evaluated. • Residual-based boundary refinement and Adam optimization are also used to achieve state-of-the-art performance. • Since ordinary cars cannot afford on-board GPUs, the training session for a particular road can be shared by several cars. • A real-time video transfer system obtains video from the car, trains the model on an edge server (equipped with GPUs), and sends the trained model back to the car.
• 58. End to End Video Segmentation for Driving: Lane Detection For Autonomous Car An overview of the whole pipeline.
• 59. End to End Video Segmentation for Driving: Lane Detection For Autonomous Car
• 60. 3D-LaneNet: E2E 3D multiple lane detection • This network directly predicts the 3D layout of lanes in a road scene from a single image. • It is a first attempt to address this task with on-board sensing instead of relying on pre-mapped environments. • 3D-LaneNet applies two new concepts: intra-network inverse-perspective mapping (IPM) and an anchor-based lane representation. • The intra-network IPM projection facilitates a dual-representation information flow in both the regular image view and the top view. • An anchor-per-column output representation enables an e2e approach, replacing common heuristics such as clustering and outlier rejection. • It outputs, for each longitudinal road slice, the confidence that a lane passes through the slice and its 3D curve in camera coordinates. • Each output is associated with an anchor, in analogy to single-shot, anchor-based object detection methods such as SSD and YOLO. • It explicitly handles complex situations such as lane merges and splits.
  • 61. 3D-LaneNet: E2E 3D multiple lane detection (a) Schematic illustration of the end-to-end approach and lane detection result example on top-view. (b) Projection of the result on the original image.
• 62. 3D-LaneNet: E2E 3D multiple lane detection Camera position and road projection plane. Known intrinsic camera parameters (e.g. focal length, center of projection) are assumed. It is also assumed that the camera is installed at zero degrees roll relative to the local ground plane. Lane centerlines are marked in blue and delimiters in yellow dotted curves. The task is defined as detecting the set of lane centerlines and/or lane delimiters given the image.
• 63. 3D-LaneNet: E2E 3D multiple lane detection The dual context module. A main building block in the architecture is the projective transformation layer. This layer is a specific realization, with slight variations, of the spatial transformer module. It performs a differentiable sampling of an input feature map UI, corresponding spatially to the image plane, to an output feature map UT corresponding spatially to a virtual top view of the scene. The differentiable sampling is achieved through a grid transforming the image to the top view. The dual context module uses the projective transformation layer to create highly descriptive feature maps. Information flows from the multi-channel feature maps UI and VT, corresponding to the image view and top view respectively.
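A hedged sketch of such a projective transformation layer using grid_sample: a fixed homography maps top-view grid coordinates into the image-view feature map, and the sampling stays differentiable end to end. The homography argument is a placeholder that would in practice be derived from the camera parameters; the normalized-coordinate convention is an assumption.

```python
# Differentiable IPM-style feature warping via a homography (sketch).
import torch
import torch.nn.functional as F

def project_to_top_view(feat_img, H_top2img, out_h, out_w):
    """feat_img: (B, C, Hi, Wi) image-view features.
    H_top2img: (3, 3) homography mapping normalized top-view coords ([-1, 1])
    to normalized image-view coords ([-1, 1])."""
    B = feat_img.size(0)
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, out_h),
                            torch.linspace(-1, 1, out_w), indexing='ij')
    ones = torch.ones_like(xs)
    grid = torch.stack([xs, ys, ones], dim=-1).reshape(-1, 3)  # homogeneous coords
    mapped = grid @ H_top2img.t()                              # apply homography
    mapped = mapped[:, :2] / mapped[:, 2:].clamp(min=1e-6)     # perspective divide
    mapped = mapped.reshape(1, out_h, out_w, 2).expand(B, -1, -1, -1)
    # grid_sample performs the differentiable lookup (x, y order, zeros outside).
    return F.grid_sample(feat_img, mapped, align_corners=False)
```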
• 64. 3D-LaneNet: E2E 3D multiple lane detection 3D-LaneNet network architecture (VGG16 backbone).
• 65. 3D-LaneNet: E2E 3D multiple lane detection Output representation. Note that the number of anchors (N) equals the output layer width. Per anchor, the network outputs 3 types (t) of lane descriptors (confidence and geometry); the first two (c1, c2) represent lane centerlines and the third (d) a lane delimiter. Assigning 2 possible centerlines per anchor gives the network support for merges and splits, which often result in the centerlines of two lanes coinciding at Yref and separating at different road positions. The topology of lane delimiters is generally more complicated than that of centerlines, and the representation cannot capture all situations. The anchors are defined by equally spaced vertical (longitudinal) lines in x-positions.
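A minimal sketch of decoding this anchor representation at test time: per anchor, a confidence plus lateral offsets (x) and heights (z) at K fixed longitudinal slices. The field names and shapes are assumptions consistent with the description above, not the exact 3D-LaneNet tensor layout.

```python
# Decoding anchor-per-column lane outputs into 3D polylines (sketch).
import numpy as np

def decode_anchors(conf, dx, z, anchor_x, y_positions, conf_thr=0.5):
    """conf: (N,) per-anchor confidences; dx, z: (N, K) offsets and heights
    relative to each anchor line; anchor_x: (N,) lateral anchor positions;
    y_positions: (K,) fixed longitudinal slices (camera coordinates)."""
    lanes = []
    for i in range(len(conf)):
        if conf[i] < conf_thr:
            continue                               # anchor carries no lane
        xs = anchor_x[i] + dx[i]                   # lateral position along the lane
        pts = np.stack([xs, y_positions, z[i]], axis=1)   # (K, 3) 3D points
        lanes.append(pts)
    return lanes
```

Because each lane is read off one anchor column, no clustering or outlier-rejection heuristics are needed, mirroring SSD/YOLO-style decoding.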
  • 66. 3D-LaneNet: E2E 3D multiple lane detection Random synthetic data generation. (a) Surface (b) Road topology and curvature (c) Road on surface (d) Rendered scenes.
• 67. 3D-LaneNet: E2E 3D multiple lane detection Examples of 3D lane centerline estimation results (with confidence > 0.5) on test images from the synthetic-3D-lanes dataset. Ground truth (blue) and method result (red) shown in each image alongside a 3D visualization.
• 68. End-to-End Lane Marker Detection via Row-wise Classification • Conventional approaches to the lane marker detection problem perform a pixel-level dense prediction task followed by sophisticated post-processing, which is inevitable since lane markers are typically represented by a collection of line segments without thickness. • This paper proposes a method performing direct lane marker vertex prediction in an end-to-end manner, i.e., without any post-processing step as required by the pixel-level dense prediction task. • Specifically, the lane marker detection problem is translated into a row-wise classification task, which takes advantage of the innate shape of lane markers but, surprisingly, has not been explored well. • In order to compactly extract sufficient information about lane markers, which spread from left to right in an image, a novel layer is devised to successively compress horizontal components, enabling an end-to-end lane marker detection system where the final lane marker positions are simply obtained via argmax operations at test time. • Experimental results are demonstrated on two popular lane marker detection benchmarks, i.e., TuSimple and CULane.
  • 69. End-to-End Lane Marker Detection via Row-wise Classification The E2E-LMD framework for lane marker detection
• 70. End-to-End Lane Marker Detection via Row-wise Classification The E2E-LMD architecture for lane marker detection. General encoder-decoder architectures are extended by adding successive horizontal reduction modules for end-to-end lane marker detection. Numbers under each block denote spatial resolution and channels. (a) Arrows with HRM denote a horizontal reduction module as in (b). Arrows with Conv are 1x1 output convolutions. Dashed arrows denote global average pooling with a fully connected layer. (b) The HRM is utilized to compress the horizontal representation; r denotes the pooling ratio for the width, and the conv kernel size k is set to 3 except in the last HRM layer, where it is set to 1.
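A hedged sketch of one horizontal reduction module: width-only pooling by ratio r followed by a convolution. The internal details of the paper's HRM (e.g., residual connections) are not reproduced; this only illustrates the successive width compression.

```python
# One horizontal reduction module: compress W by ratio r, keep H (sketch).
import torch.nn as nn

class HRM(nn.Module):
    def __init__(self, ch, r=2, k=3):
        super().__init__()
        self.pool = nn.AvgPool2d(kernel_size=(1, r), stride=(1, r))  # width only
        self.conv = nn.Sequential(
            nn.Conv2d(ch, ch, k, padding=k // 2),
            nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):       # (B, C, H, W) -> (B, C, H, W // r)
        return self.conv(self.pool(x))

# Stacking HRMs compresses W toward 1, so at test time a row-wise argmax over
# the horizontal-bin axis directly yields the lane marker position per row.
```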
  • 71. End-to-End Lane Marker Detection via Row-wise Classification
  • 72. End-to-End Lane Marker Detection via Row-wise Classification