Driving Behaviors for ADAS
and Autonomous Driving XII
Yu Huang
Yu.huang07@gmail.com
Sunnyvale, California
Outline
• SCALE-Net: Scalable Vehicle Trajectory Prediction Network under Random Number of Interacting Vehicles via
Edge-enhanced Graph CNN (2)
• MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird’s Eye View Maps
(3.15)
• PiP: Planning-informed Trajectory Prediction for Autonomous Driving (3.25)
• Shared Cross-Modal Trajectory Prediction for Autonomous Driving (4.1)
• TPNet: Trajectory Proposal Network for Motion Prediction (4.26)
• VTGNet: A Vision-based Trajectory Generation Network for Autonomous Vehicles in Urban Environments
(4.27)
• UST: Unifying Spatio-Temporal Context for Trajectory Prediction in Autonomous Driving (5.6)
• Robust Trajectory Forecasting for Multiple Intelligent Agents in Dynamic Scene (5.27)
• PnPNet: End-to-End Perception and Prediction with Tracking in the Loop (5.29)
• The Importance of Prior Knowledge in Precise Multimodal Prediction (6.4)
SCALE-Net: Scalable Vehicle Trajectory Prediction Network
under Random Number of Interacting Vehicles via Edge-
enhanced GCNN
• Predicting the future trajectory of surrounding vehicles in a randomly varying traffic level is one of
the most challenging problems in developing an autonomous vehicle.
• Since there is no pre-defined number of interacting vehicles, the prediction network has to be scalable with
respect to the number of vehicles in order to guarantee consistency in terms of both accuracy and
computational load.
• The fully scalable trajectory prediction network, SCALE-Net, can ensure both higher prediction
performance and consistent computational load regardless of the number of surrounding vehicles.
• SCALE-Net employs the Edge-enhanced Graph Convolutional Neural Network (EGCN) as the
inter-vehicular interaction embedding network.
• Since the EGCN is inherently scalable with respect to the number of graph nodes (agents in this study), the
model can operate independently of the total number of vehicles considered.
• Experiments show that both the computation time and the prediction performance of the
SCALE-Net consistently outperform those of previous models, regardless of the level of traffic
complexity.
SCALE-Net: Scalable Vehicle Trajectory Prediction Network
under Random Number of Interacting Vehicles via Edge-
enhanced GCNN
Comparison between state-input-based and scene-input-based prediction models in terms of the variation of
computation time and accuracy per driving scene with respect to the number of surrounding vehicles.
SCALE-Net: Scalable Vehicle Trajectory Prediction Network
under Random Number of Interacting Vehicles via Edge-
enhanced GCNN
Overall architecture of the SCALE-Net for the interactive, scalable trajectory prediction algorithm. Historical states of
the ego and surrounding vehicles, illustrated with dotted red lines, are used as inputs to the proposed
architecture. After passing through the EGCN-based scene embedding layer and the LSTM-based trajectory predictor,
the future trajectories of the surrounding vehicles are generated, as shown in the right-most figure with blue dotted lines.
SCALE-Net: Scalable Vehicle Trajectory Prediction Network
under Random Number of Interacting Vehicles via Edge-
enhanced GCNN
Overall flow diagram of the EGCN layer for interaction embedding. Node number 4, indexed with 5, is updated
using the blue-colored elements. Left: in the edge-enhanced attention process, the weights of the vehicles around
vehicle number 4 are calculated from the relative states of all vehicles in order to generate the weighted adjacency
matrix A_adj. Right: using A_adj, the node information of vehicle 4 is updated with the GCN weight W_gcn.
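To make the two-step update concrete, the following is a minimal PyTorch sketch of an edge-enhanced graph convolution: edge features are scored to form the weighted adjacency matrix A_adj, which then mixes the node features before a shared linear update W_gcn. The layer sizes and the attention form are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeEnhancedGCNLayer(nn.Module):
    """Toy edge-enhanced graph convolution: edge features -> attention weights
    (A_adj), then a shared linear update (W_gcn) of the aggregated node features."""
    def __init__(self, node_dim, edge_dim, out_dim):
        super().__init__()
        self.edge_scorer = nn.Sequential(          # scores each directed edge
            nn.Linear(edge_dim, 32), nn.ReLU(), nn.Linear(32, 1))
        self.w_gcn = nn.Linear(node_dim, out_dim)  # shared node-update weight

    def forward(self, node_feats, edge_feats):
        # node_feats: (N, node_dim); edge_feats: (N, N, edge_dim) relative states
        scores = self.edge_scorer(edge_feats).squeeze(-1)  # (N, N)
        a_adj = F.softmax(scores, dim=-1)                  # weighted adjacency matrix
        aggregated = a_adj @ node_feats                    # attention-weighted aggregation
        return F.relu(self.w_gcn(aggregated)), a_adj

# works for any number N of surrounding vehicles (the scalability property)
N, node_dim, edge_dim = 5, 16, 4
layer = EdgeEnhancedGCNLayer(node_dim, edge_dim, 16)
updated_nodes, a_adj = layer(torch.randn(N, node_dim), torch.randn(N, N, edge_dim))
```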
SCALE-Net: Scalable Vehicle Trajectory Prediction Network
under Random Number of Interacting Vehicles via Edge-
enhanced GCNN
SCALE-Net: Scalable Vehicle Trajectory Prediction Network
under Random Number of Interacting Vehicles via Edge-
enhanced GCNN
Examples of trajectory prediction results at various
traffic levels, where the green and transparent blue lines are
predicted by SCALE-Net and V-LSTM, respectively.
Critically interacting scene where the maneuvers of the vehicles
highly depend on the interaction with adjacent vehicles.
MotionNet: Joint Perception and Motion Prediction for
Autonomous Driving Based on Bird’s Eye View Maps
• The ability to reliably perceive the environmental states, particularly the existence of objects and their
motion behavior, is crucial for autonomous driving.
• MotionNet is an efficient deep model that jointly performs perception and motion prediction from 3D
point clouds.
• MotionNet takes a sequence of LiDAR sweeps as input and outputs a bird’s eye view (BEV) map, which
encodes the object category and motion information in each grid cell.
• The backbone of MotionNet is a novel spatio-temporal pyramid network, which extracts deep spatial
and temporal features in a hierarchical fashion.
• To enforce the smoothness of predictions over both space and time, the training of MotionNet is
further regularized with novel spatial and temporal consistency losses.
• Extensive experiments show that the method overall outperforms the state-of-the-arts, including the
latest scene-flow- and 3D-object-detection-based methods.
• This indicates the potential value of the proposed method serving as a backup to the bounding-box-based
system, and providing complementary information to the motion planner in autonomous driving.
• Code is available at https://github.com/pxiangwu/MotionNet.
MotionNet: Joint Perception and Motion Prediction for
Autonomous Driving Based on Bird’s Eye View Maps
Top: MotionNet is a system based on bird’s eye view (BEV) maps,
and performs perception and motion prediction jointly without
using bounding boxes.
It can potentially serve as a backup to the standard bounding-box-based
system and provide complementary information for
motion planning.
Bottom: During testing, with (a) LiDAR data (BEV), given an
object (e.g., a disabled person in a wheelchair, as illustrated in
(d)) that never appears in the training data, 3D object detection
tends to fail; see plots (b) and (c).
In contrast, MotionNet is still able to perceive the object and
forecast its motion; see plots (e) and (f), where the color
represents the category and the arrow denotes the future
displacement.
MotionNet: Joint Perception and Motion Prediction for
Autonomous Driving Based on Bird’s Eye View Maps
Overview of MotionNet. Given a sequence of LiDAR sweeps, the raw point clouds are first represented as BEV maps,
which are essentially 2D images with multiple channels. Each pixel (cell) in a BEV map is associated with a
feature vector along the height dimension. The BEV maps are then fed into the spatio-temporal pyramid network
(STPN) for feature extraction. The output of STPN is finally delivered to three heads: (1) cell classification,
which perceives the category of each cell, such as vehicle, pedestrian or background; (2) motion prediction,
which predicts the future trajectory of each cell; (3) state estimation, which estimates the current motion
status of each cell, such as static or moving. The final output is a BEV map, which includes both perception and
motion prediction information.
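A rough sketch of how three such per-cell heads could sit on the shared BEV feature map; the channel counts, class list, and prediction horizon here are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MotionNetHeads(nn.Module):
    """Three per-cell heads on a shared BEV feature map of shape (C, H, W)."""
    def __init__(self, in_ch=32, num_classes=5, horizon=20):
        super().__init__()
        self.cls_head = nn.Conv2d(in_ch, num_classes, kernel_size=1)     # cell category
        self.motion_head = nn.Conv2d(in_ch, 2 * horizon, kernel_size=1)  # (dx, dy) per future step
        self.state_head = nn.Conv2d(in_ch, 2, kernel_size=1)             # static vs. moving

    def forward(self, bev_feat):                      # bev_feat: (B, C, H, W)
        cls_logits = self.cls_head(bev_feat)          # (B, num_classes, H, W)
        motion = self.motion_head(bev_feat)           # (B, 2*horizon, H, W)
        state_logits = self.state_head(bev_feat)      # (B, 2, H, W)
        return cls_logits, motion, state_logits

heads = MotionNetHeads()
cls_logits, motion, state_logits = heads(torch.randn(1, 32, 256, 256))
```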
MotionNet: Joint Perception and Motion Prediction for
Autonomous Driving Based on Bird’s Eye View Maps
Spatio-temporal pyramid network. Each STC block
consists of two consecutive 2D convolutions
followed by one pseudo-1D convolution. The
temporal pooling is applied to the temporal
dimension and squeezes it to length 1.
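A schematic PyTorch version of one STC block under stated assumptions (kernel sizes and channel widths are chosen only for illustration); the pseudo-1D convolution is realized here as a 3D convolution with a (k, 1, 1) kernel over the time axis.

```python
import torch
import torch.nn as nn

class STCBlock(nn.Module):
    """Two 2D spatial convolutions per frame followed by one pseudo-1D temporal convolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.spatial = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU())
        # a 3D conv with a (k, 1, 1) kernel acts as a 1D conv over the time axis
        self.temporal = nn.Conv3d(out_ch, out_ch, kernel_size=(3, 1, 1), padding=(1, 0, 0))

    def forward(self, x):                        # x: (B, T, C, H, W)
        b, t, c, h, w = x.shape
        y = self.spatial(x.reshape(b * t, c, h, w))
        y = y.reshape(b, t, -1, h, w).permute(0, 2, 1, 3, 4)   # (B, C', T, H, W)
        y = torch.relu(self.temporal(y))
        return y.permute(0, 2, 1, 3, 4)          # back to (B, T, C', H, W)

x = torch.randn(2, 5, 13, 64, 64)                # 5 BEV maps (LiDAR sweeps), 13 height channels
feat = STCBlock(13, 32)(x)
pooled = feat.max(dim=1, keepdim=True).values    # temporal pooling squeezes T to length 1
```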
MotionNet: Joint Perception and Motion Prediction for
Autonomous Driving Based on Bird’s Eye View Maps
PiP: Planning-informed Trajectory Prediction for
Autonomous Driving
• It is critical to predict the motion of surrounding vehicles for self-driving planning,
especially in a socially compliant and flexible way.
• However, future prediction is challenging due to the interaction and uncertainty in
driving behaviors.
• Planning-informed trajectory prediction (PiP) tackles the prediction problem in the
multi-agent setting.
• It is differentiated from the traditional manner of prediction, which is based only on historical
information and is decoupled from planning.
• By informing the prediction process with the planning of the ego vehicle, it achieves
state-of-the-art performance in multi-agent forecasting on highway datasets.
• Moreover, it enables a novel pipeline which couples the prediction and planning, by
conditioning PiP on multiple candidate trajectories of the ego vehicle, which is highly
beneficial for autonomous driving in interactive scenarios.
PiP: Planning-informed Trajectory Prediction for
Autonomous Driving
Comparison between the traditional prediction approach (left) and PiP (right) under a lane merging scenario. Assume the
ego vehicle (red) intends to merge into the left lane. It is required to predict the trajectories of the surrounding vehicles (blue). To
alleviate the uncertainty caused by future interaction, PiP incorporates the future plans (dotted red curve) of the ego vehicle in
addition to the history tracks (grey curve). While the traditional prediction result is produced independently of the ego’s
future, PiP produces predictions that correspond one-to-one to the candidate future trajectories through the novel
planning-prediction-coupled pipeline. Therefore, PiP evaluates planning safety more precisely and achieves more
flexible driving behavior (solid red curve) compared with the traditional pipeline.
PiP: Planning-informed Trajectory Prediction for
Autonomous Driving
The overview of the PiP architecture: PiP consists of 3 key modules, the planning-coupled, target-fusion, and maneuver-based
decoding modules. Each predicted target is first encoded in the planning-coupled module by aggregating all information
within the target-centric area (blue square). A target tensor is then set up within the ego-vehicle-centric area (red square) by
placing the target encodings into the spatial grid based on their locations. Afterward, the target tensors are passed through
the target-fusion module to learn the interdependency between targets, and eventually a fused target tensor is generated.
Finally, the prediction of each target is decoded from the corresponding fused target encoding in the maneuver-based decoding
module. The marked target vehicle exemplifies planning-coupled encoding and multi-modal trajectory decoding.
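A heavily simplified sketch of the planning-coupled idea only: each target's history and one ego candidate plan are encoded and fused before decoding, so the prediction is conditioned on the plan. The grid-based target tensor and maneuver-based decoding are omitted, and all module sizes are assumptions.

```python
import torch
import torch.nn as nn

class PlanningCoupledPredictor(nn.Module):
    """Condition per-target trajectory prediction on the ego's planned trajectory."""
    def __init__(self, hid=64, horizon=25):
        super().__init__()
        self.hist_enc = nn.LSTM(2, hid, batch_first=True)   # target history (x, y)
        self.plan_enc = nn.LSTM(2, hid, batch_first=True)   # ego candidate plan (x, y)
        self.decoder = nn.Linear(2 * hid, horizon * 2)
        self.horizon = horizon

    def forward(self, target_hist, ego_plan):
        # target_hist: (B, T_past, 2), ego_plan: (B, T_future, 2)
        _, (h_hist, _) = self.hist_enc(target_hist)
        _, (h_plan, _) = self.plan_enc(ego_plan)
        fused = torch.cat([h_hist[-1], h_plan[-1]], dim=-1)
        return self.decoder(fused).view(-1, self.horizon, 2)  # predicted future (x, y)

model = PlanningCoupledPredictor()
# one prediction per candidate ego plan: re-run the model with each plan
pred = model(torch.randn(4, 16, 2), torch.randn(4, 25, 2))
```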
PiP: Planning-informed Trajectory Prediction for
Autonomous Driving
Shared Cross-Modal Trajectory Prediction for
Autonomous Driving
• A framework for predicting future trajectories of traffic agents in highly interactive
environments.
• Since autonomous driving vehicles are equipped with various types of sensors (e.g., LiDAR scanner,
RGB camera, etc.), this work aims to benefit from the use of multiple input modalities that are
complementary to each other.
• The proposed approach is composed of two stages. (i) Feature encoding, which discovers the motion
behavior of the target agent with respect to other directly and indirectly observable influences; such
behaviors are extracted from multiple perspectives, such as the top-down and frontal views. (ii) Cross-modal
embedding, where a set of learned behavior representations is embedded into a single cross-modal
latent space.
• Construct a generative model and formulate the objective functions with an additional
regularizer specifically designed for future prediction.
• An extensive evaluation is conducted to show the efficacy of the proposed framework
using two benchmark driving datasets.
Shared Cross-Modal Trajectory Prediction for
Autonomous Driving
Given a sequence of images and past positions, the feature encoder analyzes internal, external, and social stimuli of
agents. The features generated from multiple sensory data (e.g., top-down view LiDAR and frontal view RGB) are used
to condition the generative model that aims to embed different input modalities into a single cross-modal latent
space. The following decoder predicts the future trajectory in top-down or frontal view using the latent variable sampled
from the learned embedding space. Note that the dotted shapes and arrows are only visible at training time.
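A minimal sketch of the cross-modal embedding step in a CVAE style, assuming per-modality encoders that map each feature to the parameters of a shared Gaussian latent space; the dimensions, modality names, and decoder form are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class CrossModalEmbedding(nn.Module):
    """Embed features from different modalities into one shared latent space (CVAE-style)."""
    def __init__(self, feat_dim=128, latent_dim=32, horizon=12):
        super().__init__()
        self.to_latent = nn.ModuleDict({                 # one encoder head per modality
            'lidar_bev': nn.Linear(feat_dim, 2 * latent_dim),
            'rgb_front': nn.Linear(feat_dim, 2 * latent_dim),
        })
        self.decoder = nn.Linear(latent_dim + feat_dim, horizon * 2)
        self.horizon = horizon

    def forward(self, feat, modality):
        mu, logvar = self.to_latent[modality](feat).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        traj = self.decoder(torch.cat([z, feat], dim=-1))
        return traj.view(-1, self.horizon, 2), mu, logvar

model = CrossModalEmbedding()
traj, mu, logvar = model(torch.randn(4, 128), 'lidar_bev')     # predict from the LiDAR branch
```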
Shared Cross-Modal Trajectory Prediction for
Autonomous Driving
The detailed illustration of the feature encoder. Using the past image sequence, spatio-temporal factors
given by the external environment are modeled. The internal factors of the target agent are encoded from its past motion as well
as the surrounding local perceptual context. In addition, the relative motion between the target and every
other interacting agent is considered to construct the social interactions.
Shared Cross-Modal Trajectory Prediction for
Autonomous Driving
TPNet: Trajectory Proposal Network for
Motion Prediction
• Making accurate motion prediction of the surrounding traffic agents such as pedestrians, vehicles,
and cyclists is crucial for autonomous driving.
• Recent data-driven motion prediction methods have attempted to learn to directly regress the
exact future position or its distribution from massive amounts of trajectory data.
• However, it remains difficult for these methods to provide multimodal predictions as well as
integrate physical constraints such as traffic rules and movable areas.
• This work is a two-stage motion prediction framework, Trajectory Proposal Network (TPNet).
• TPNet first generates a candidate set of future trajectories as hypothesis proposals, then makes
the final predictions by classifying and refining the proposals that meet the physical constraints.
• By steering the proposal generation process, safe and multimodal predictions are realized.
• Thus this framework effectively mitigates the complexity of motion prediction problem while
ensuring the multimodal output.
• Experiments on four large-scale trajectory prediction datasets, i.e. the ETH, UCY, Apollo and
Argoverse datasets, show that TPNet achieves state-of-the-art results.
TPNet: Trajectory Proposal Network for
Motion Prediction
The movement of traffic agents is often constrained by the
movable areas (white areas for vehicles and gray areas for
pedestrians), while there might be multiple plausible future
paths for the agents. Thus, motion prediction
systems are required to incorporate the traffic constraints and
output multimodal predictions. This framework generates
predictions with different intentions under physical
constraints for both vehicles and pedestrians.
TPNet: Trajectory Proposal Network for
Motion Prediction
Framework of the Trajectory Proposal Network (TPNet). In the first stage, a rough end point is
regressed to reduce the search space and then proposals are generated. In the second
stage, proposals are classified and refined to generate the final predictions. The dotted proposals
are the ones that lie outside of the movable area, which are further penalized.
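A toy sketch of the two-stage idea: proposals are laid out around the regressed end point and a second-stage head scores and refines each one. The grid layout, straight-line proposal shape, and head sizes are all assumptions for illustration (the paper uses curve proposals shaped by a parameter γ).

```python
import torch
import torch.nn as nn

def generate_proposals(end_point, radius=2.0, horizon=30):
    """Stage 1 (toy): lay out candidate end points on a grid around the regressed end
    point and interpolate straight-line trajectories toward them."""
    offsets = torch.stack(torch.meshgrid(
        torch.linspace(-radius, radius, 3),
        torch.linspace(-radius, radius, 3), indexing='ij'), dim=-1).reshape(-1, 2)
    ends = end_point + offsets                            # (9, 2) candidate end points
    t = torch.linspace(0, 1, horizon).view(1, horizon, 1)
    return ends.unsqueeze(1) * t                          # (9, horizon, 2) proposals

class ProposalHead(nn.Module):
    """Stage 2 (toy): classify each proposal and regress a refinement offset."""
    def __init__(self, feat_dim=64, horizon=30):
        super().__init__()
        self.score = nn.Linear(feat_dim + horizon * 2, 1)
        self.refine = nn.Linear(feat_dim + horizon * 2, horizon * 2)

    def forward(self, agent_feat, proposals):             # (P, feat_dim), (P, horizon, 2)
        x = torch.cat([agent_feat, proposals.flatten(1)], dim=-1)
        return self.score(x).squeeze(-1), proposals + self.refine(x).view_as(proposals)

proposals = generate_proposals(torch.tensor([10.0, 2.0]))
scores, refined = ProposalHead()(torch.randn(proposals.size(0), 64), proposals)
```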
TPNet: Trajectory Proposal Network for
Motion Prediction
Illustration of proposal generation. Proposals
are generated around the end point predicted
in the first stage. γ is used to control the shape
of the proposals.
Illustration of multimodal proposal generation using
road information. The reference lines indicate the
possible center lane lines that the vehicle could drive in.
TPNet: Trajectory Proposal Network for
Motion Prediction
VTGNet: A Vision-based Trajectory Generation Network
for Autonomous Vehicles in Urban Environments
• Reliable navigation like expert human drivers in urban environments is a critical capability for
autonomous vehicles.
• Traditional methods for autonomous driving are implemented with many building blocks from
perception, planning and control, making them difficult to generalize to varied scenarios due to
complex assumptions and interdependencies.
• An end-to-end trajectory generation method based on imitation learning.
• It can extract spatiotemporal features from the front-view camera images for scene
understanding, then generate collision-free trajectories several seconds into the future.
• The network consists of three sub-networks, which are selectively activated for three common
driving tasks: keep straight, turn left and turn right.
• The experimental results suggest that under various weather and lighting conditions, the network
can reliably generate trajectories in different urban environments, such as turning at intersections
and slowing down for collision avoidance.
• Furthermore, by integrating the network into a navigation system, good generalization
performance is presented in an unseen simulated world for autonomous driving on different
types of vehicles, such as cars and trucks.
VTGNet: A Vision-based Trajectory Generation Network
for Autonomous Vehicles in Urban Environments
Different approaches for trajectory planning and decision-making for autonomous vehicles.
VTGNet: A Vision-based Trajectory Generation Network
for Autonomous Vehicles in Urban Environments
The architecture of VTGNet, which consists of a feature extractor and a trajectory generator. MobileNet V2 is used
as the feature extractor, with 17 bottleneck convolutional layers, and a long short-term memory (LSTM) network is used in
the decoder to process the spatiotemporal information. The output of VTGNet is a vector of size 22×3 indicating
the trajectory in the future 22 frames (velocity and x,y positions in the body frame). Note that the width of the
network layers indicates the number of output channels.
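A compact sketch of this pipeline under stated assumptions: a torchvision MobileNetV2 backbone applied per frame, an LSTM over the frame features, and a linear head producing the 22×3 output. The hidden size and pooling step are illustrative, not the paper's exact layers.

```python
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

class VTGNetSketch(nn.Module):
    """Front-view image sequence -> per-frame CNN features -> LSTM -> 22x3 trajectory."""
    def __init__(self, hid=256, steps=22):
        super().__init__()
        self.backbone = mobilenet_v2().features        # MobileNet V2 feature extractor
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.lstm = nn.LSTM(1280, hid, batch_first=True)
        self.head = nn.Linear(hid, steps * 3)          # (x, y, velocity) per future frame
        self.steps = steps

    def forward(self, imgs):                           # imgs: (B, T, 3, H, W)
        b, t = imgs.shape[:2]
        f = self.pool(self.backbone(imgs.flatten(0, 1))).flatten(1)   # (B*T, 1280)
        _, (h, _) = self.lstm(f.view(b, t, -1))
        return self.head(h[-1]).view(b, self.steps, 3)

net = VTGNetSketch()
traj = net(torch.randn(1, 4, 3, 224, 224))             # a 4-frame front-view clip
```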
VTGNet: A Vision-based Trajectory Generation Network
for Autonomous Vehicles in Urban Environments
Different baselines in this work. The feature extractor for these networks is the same as the one in
the proposed VTGNet. The size of the output features is shown above the layers.
VTGNet: A Vision-based Trajectory Generation Network
for Autonomous Vehicles in Urban Environments
UST: Unifying Spatio-Temporal Context for
Trajectory Prediction in Autonomous Driving
• Trajectory prediction has always been a challenging problem for autonomous driving, since it
needs to infer the latent intention from the behaviors and interactions of traffic participants.
• This problem is intrinsically hard, because each participant may behave differently under different
environments and interactions.
• The key is to effectively model the interlaced influence of both spatial and temporal context.
• Existing work usually encodes these two types of context separately, which would lead to inferior
modeling of the scenarios.
• A unified approach to treat time-space dimensions equally for modeling spatio-temporal context.
• The module is simple and easy to implement within several lines of code.
• In contrast to existing methods which heavily rely on RNN for temporal context and hand-crafted
structure for spatial context, it could auto-partition the spatio-temporal space to adapt the data.
• Tested on two recently proposed trajectory prediction datasets, ApolloScape and Argoverse.
• The encouraging results further validate the superiority of the approach.
UST: Unifying Spatio-Temporal Context for
Trajectory Prediction in Autonomous Driving
Illustration and representations of the trajectory prediction task. Blue, green, red colors show
trajectories for vehicles, bicycles, pedestrians respectively. (b) shows the common representation,
which represents the surrounding agents as sequences of positions in 2D spatial space. (c) shows
the proposed trajectory representation in a unified spatio-temporal space.
UST: Unifying Spatio-Temporal Context for
Trajectory Prediction in Autonomous Driving
• Design spatio-temporal point sets to represent the raw input;
• Treat the snapshot of status of agent n at time step t as a single point with metadata in a 3D space
spanned by 2D location and time;
• By this uniform representation, unify space and time into one representation, which eases the
subsequent context modeling task.
• To deal with such unordered and variable length data, the structure and operations of this feature
extractor should be deliberately designed to fit the nature of the data.
• Inspired by PointNet, the context extraction takes the following key components (see the sketch below).
• Embedding: maps S-T points into a hidden representation, in which the spatial context and
temporal context are unified;
• Permutation-Invariant Aggregator: forms the global context feature; by default, max
pooling is used as the aggregator.
• Recursive Refinement: concatenates the global context feature to every individual feature, and
recursively applies the aforementioned steps. In the second round, the embedding is aware of
both the status of each individual agent and the global context, and thus can capture the interactions.
• Finally, the encoded spatio-temporal feature is fed into a standard LSTM.
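A minimal PointNet-style sketch of this encoder, assuming each spatio-temporal point is (x, y, t); the layer widths and the number of refinement rounds are illustrative, and the final LSTM decoder is omitted.

```python
import torch
import torch.nn as nn

class USTContextEncoder(nn.Module):
    """PointNet-style encoder over unordered spatio-temporal points (x, y, t)."""
    def __init__(self, in_dim=3, hid=64, rounds=2):
        super().__init__()
        self.embed = nn.ModuleList(
            [nn.Linear(in_dim if i == 0 else 2 * hid, hid) for i in range(rounds)])

    def forward(self, points):                          # points: (B, N, in_dim)
        x = points
        for layer in self.embed:
            x = torch.relu(layer(x))                    # per-point embedding
            g = x.max(dim=1, keepdim=True).values       # permutation-invariant aggregation
            x = torch.cat([x, g.expand_as(x)], dim=-1)  # recursive refinement: point + global
        return x.max(dim=1).values                      # global spatio-temporal context feature

enc = USTContextEncoder()
ctx = enc(torch.randn(8, 120, 3))                       # e.g. 12 agents x 10 past steps per scene
```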
UST: Unifying Spatio-Temporal Context for
Trajectory Prediction in Autonomous Driving
(a) and (f) show activation patterns of two typical neurons in the pooled spatio-temporal features.
The numbers in the other subfigures indicate the activation value of this neuron for each case.
UST: Unifying Spatio-Temporal Context for
Trajectory Prediction in Autonomous Driving
UST: Unifying Spatio-Temporal Context for
Trajectory Prediction in Autonomous Driving
Robust Trajectory Forecasting for Multiple
Intelligent Agents in Dynamic Scene
• Trajectory forecasting, or trajectory prediction, of multiple interacting agents in dynamic scenes,
is an important problem for many applications, such as robotic systems and autonomous driving.
• The problem is a great challenge because of the complex interactions among the agents and their
interactions with the surrounding scenes.
• A method for the robust trajectory forecasting of multiple intelligent agents in dynamic scenes.
• The method consists of three major interrelated components: an interaction net for global
spatiotemporal interactive feature extraction, an environment net for decoding dynamic scenes
(i.e., the surrounding road topology of an agent), and a prediction net that combines the
spatiotemporal feature, the scene feature, the past trajectories of agents and some random noise for
the robust trajectory prediction of agents (sketched below).
• Experiments on pedestrian-walking and vehicle-pedestrian heterogeneous datasets demonstrate
that the method outperforms state-of-the-art prediction methods in terms of prediction accuracy.
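A bare-bones sketch of the prediction net's inputs under these assumptions: the interaction feature, scene feature, and history encoding are concatenated with Gaussian noise and decoded into a future trajectory. The feature dimensions and the MLP decoder are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class PredictionNet(nn.Module):
    """Combine interaction, scene, and past-trajectory features with noise to sample futures."""
    def __init__(self, interact_dim=64, scene_dim=64, hist_dim=32, noise_dim=16, horizon=12):
        super().__init__()
        self.noise_dim = noise_dim
        self.horizon = horizon
        self.mlp = nn.Sequential(
            nn.Linear(interact_dim + scene_dim + hist_dim + noise_dim, 128),
            nn.ReLU(), nn.Linear(128, horizon * 2))

    def forward(self, interact_feat, scene_feat, hist_feat):
        z = torch.randn(interact_feat.size(0), self.noise_dim)   # noise -> diverse samples
        x = torch.cat([interact_feat, scene_feat, hist_feat, z], dim=-1)
        return self.mlp(x).view(-1, self.horizon, 2)

net = PredictionNet()
pred = net(torch.randn(4, 64), torch.randn(4, 64), torch.randn(4, 32))   # (4, 12, 2)
```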
Robust Trajectory Forecasting for Multiple
Intelligent Agents in Dynamic Scene
The method contains three components, a spatio-temporal interaction network, an
environment feature extraction network, and a trajectory prediction network.
Robust Trajectory Forecasting for Multiple
Intelligent Agents in Dynamic Scene
PnPNet: End-to-End Perception and Prediction
with Tracking in the Loop
• The problem of joint perception and motion forecasting in the context of self-driving
vehicles.
• Towards this goal, PnPNet is an end-to-end model that takes sequential sensor data as input and
outputs object tracks and their future trajectories at each time step.
• The key component is a tracking module that generates object tracks online from
detections and exploits trajectory level features for motion forecasting.
• Specifically, the object tracks get updated at each time step by solving both the data
association problem and the trajectory estimation problem.
• Importantly, the whole model is end-to-end trainable and benefits from joint
optimization of all tasks.
• PnPNet is validated on two large-scale driving datasets, showing improvements over the state-of-the-art
with better occlusion recovery and more accurate future prediction.
PnPNet: End-to-End Perception and Prediction
with Tracking in the Loop
Three paradigms for perception and prediction. Traditional approach (a) adopts the modular design that
decomposes the stack into subtasks and solves them with individual models. End-to-end method (b) uses a
joint model to solve detection and prediction simultaneously, but performs tracking as post-processing. As
a result, the full temporal history contained in tracks is not used by detection and prediction. This
approach (c) brings tracking into the loop so that all tasks benefit from rich temporal context.
PnPNet: End-to-End Perception and Prediction
with Tracking in the Loop
PnPNet for end-to-end perception and prediction. The model consists of three modules that perform 3D object
detection, discrete-continuous tracking, and motion forecasting sequentially. To extract trajectory-level actor
representations used for tracking and prediction, the model is also equipped with two explicit memories: one for
global sensor feature maps, and one for past object trajectories. Both memories get updated at each time step
with up-to-date sensor features and tracking results.
PnPNet: End-to-End Perception and Prediction
with Tracking in the Loop
The trajectory-level object representation. Given an object trajectory, its sensor observation and motion features
are first extracted at each time step, and then an LSTM network is applied to model the temporal dynamics.
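A small sketch of such a trajectory-level encoder, assuming per-step sensor features (e.g., BEV features sampled at the object) and simple motion features are fused and passed through an LSTM; the dimensions are illustrative.

```python
import torch
import torch.nn as nn

class TrajectoryLevelEncoder(nn.Module):
    """Per-step sensor + motion features of a track, fused by an LSTM over time."""
    def __init__(self, sensor_dim=128, motion_dim=4, hid=128):
        super().__init__()
        self.step_mlp = nn.Linear(sensor_dim + motion_dim, hid)   # fuse features per time step
        self.lstm = nn.LSTM(hid, hid, batch_first=True)           # temporal dynamics over the track

    def forward(self, sensor_feats, motion_feats):
        # sensor_feats: (B, T, sensor_dim) e.g. BEV features sampled at the object
        # motion_feats: (B, T, motion_dim) e.g. position and velocity per step
        x = torch.relu(self.step_mlp(torch.cat([sensor_feats, motion_feats], dim=-1)))
        _, (h, _) = self.lstm(x)
        return h[-1]                                              # track-level representation

enc = TrajectoryLevelEncoder()
track_feat = enc(torch.randn(6, 10, 128), torch.randn(6, 10, 4))  # 6 tracks, 10 time steps each
```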
PnPNet: End-to-End Perception and Prediction
with Tracking in the Loop
The Importance of Prior Knowledge in Precise
Multimodal Prediction
• Roads have well defined geometries, topologies, and traffic rules.
• While this has been widely exploited in motion planning methods to produce maneuvers that obey the
law, little work has been devoted to utilizing these priors in perception and motion forecasting methods.
• This is a method to incorporate these structured priors as a loss function.
• In contrast to imposing hard constraints, this approach allows the model to handle non-compliant
maneuvers when those happen in the real world.
• Safe motion planning is the end goal, and thus a probabilistic characterization of the possible future
developments of the scene is key to choosing the plan with the lowest expected cost.
• Towards this goal, a framework is designed that leverages REINFORCE to incorporate non-differentiable
priors over sample trajectories from a probabilistic model, thus optimizing the whole distribution (see the sketch after this list).
• Evaluated on real-world self-driving datasets containing complex road topologies and multi-agent interactions.
• Despite the importance of this evaluation, it has been often overlooked by previous perception and
motion forecasting works.
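A toy score-function (REINFORCE) sketch of how a non-differentiable prior can shape a trajectory distribution; the Gaussian waypoint model, the lane-membership reward, and the baseline choice are all illustrative assumptions, not the paper's formulation.

```python
import torch
from torch.distributions import Normal

def reinforce_prior_loss(mu, sigma, prior_reward, n_samples=16):
    """Score-function (REINFORCE) loss pushing a trajectory distribution toward
    samples that satisfy a non-differentiable prior (e.g., staying on the road).
    mu, sigma: (H, 2) mean/std of future waypoints; prior_reward: sample -> scalar."""
    dist = Normal(mu, sigma)
    samples = dist.sample((n_samples,))                      # (n_samples, H, 2), no gradient
    rewards = torch.stack([prior_reward(s) for s in samples])
    baseline = rewards.mean()                                # variance-reduction baseline
    log_prob = dist.log_prob(samples).sum(dim=(1, 2))        # log p(sample) per sample
    return -((rewards - baseline) * log_prob).mean()

# toy prior: reward the fraction of waypoints with |y| < 2 m (inside a straight lane)
mu = torch.zeros(10, 2, requires_grad=True)
sigma = torch.ones(10, 2)
loss = reinforce_prior_loss(mu, sigma, lambda s: (s[:, 1].abs() < 2.0).float().mean())
loss.backward()                                              # gradients flow into mu
```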
The Importance of Prior Knowledge in Precise
Multimodal Prediction
• Human driving behavior is highly structured: in the majority of scenarios, drivers will follow the
road topology and traffic rules.
• To leverage this informative prior, but not overly penalize non-compliant behavior, define a
flexible traffic-rule informed loss that is conditioned on ground-truth behavior.
• To this end, leverage a lane-graph representation where the nodes encode lane segments and the
edges represent relationships between lane segments such as adjacency, predecessor, and
successor (taking into account direction of traffic flow).
The Importance of Prior Knowledge in Precise
Multimodal Prediction
• It is more important to precisely characterize motion of vehicles that might interact with the SDV
(self-driving vehicle), rather than other traffic participants that do not influence the SDV behavior.
• The area of interest is approximated with the SDV’s route (i.e. the high-level command), which is defined as the
union of all lane segments that the SDV can travel on to reach a preset goal, given the lane-graph.
• The horizon is set to be equal to the prediction horizon (5 s), with the target lane given by a route planner.
• This gives a safe approximation over its future possible locations.
• Positive trajectories are defined as those with more than one waypoint falling within the SDV route, and negative otherwise (see the sketch below).
• High precision and high recall are achieved under this definition, taking into account whether the ground-truth
trajectory intersects the route (positive) or not (negative).
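A small illustrative check of this labeling rule, assuming the route is available as lane-segment polygons; shapely is used here only as a convenient geometry library, not as part of the paper's pipeline.

```python
from shapely.geometry import Point, Polygon
from shapely.ops import unary_union

def label_trajectory(waypoints, lane_polygons, min_hits=2):
    """Label a predicted trajectory positive if more than one waypoint falls inside
    the SDV route (the union of its lane-segment polygons), negative otherwise."""
    route = unary_union([Polygon(p) for p in lane_polygons])
    hits = sum(route.contains(Point(x, y)) for x, y in waypoints)
    return 'positive' if hits >= min_hits else 'negative'

# toy route: two lane segments straight ahead of the SDV
lanes = [[(0, -2), (30, -2), (30, 2), (0, 2)],
         [(30, -2), (60, -2), (60, 2), (30, 2)]]
print(label_trajectory([(5, 0), (15, 1), (25, 10)], lanes))   # -> 'positive'
```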
The Importance of Prior Knowledge in Precise
Multimodal Prediction
Driving behaviors for adas and autonomous driving XII

Weitere ähnliche Inhalte

Was ist angesagt?

Deep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataDeep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataYu Huang
 
Deep VO and SLAM IV
Deep VO and SLAM IVDeep VO and SLAM IV
Deep VO and SLAM IVYu Huang
 
Pedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VIPedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VIYu Huang
 
Driving Behavior for ADAS and Autonomous Driving X
Driving Behavior for ADAS and Autonomous Driving XDriving Behavior for ADAS and Autonomous Driving X
Driving Behavior for ADAS and Autonomous Driving XYu Huang
 
Driving Behavior for ADAS and Autonomous Driving III
Driving Behavior for ADAS and Autonomous Driving IIIDriving Behavior for ADAS and Autonomous Driving III
Driving Behavior for ADAS and Autonomous Driving IIIYu Huang
 
Camera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning IIICamera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning IIIYu Huang
 
Driving behaviors for adas and autonomous driving XI
Driving behaviors for adas and autonomous driving XIDriving behaviors for adas and autonomous driving XI
Driving behaviors for adas and autonomous driving XIYu Huang
 
3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IV3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IVYu Huang
 
Driving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xivDriving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xivYu Huang
 
Prediction and planning for self driving at waymo
Prediction and planning for self driving at waymoPrediction and planning for self driving at waymo
Prediction and planning for self driving at waymoYu Huang
 
Depth Fusion from RGB and Depth Sensors IV
Depth Fusion from RGB and Depth Sensors  IVDepth Fusion from RGB and Depth Sensors  IV
Depth Fusion from RGB and Depth Sensors IVYu Huang
 
Stereo Matching by Deep Learning
Stereo Matching by Deep LearningStereo Matching by Deep Learning
Stereo Matching by Deep LearningYu Huang
 
3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving IIYu Huang
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IIYu Huang
 
Driving Behavior for ADAS and Autonomous Driving VII
Driving Behavior for ADAS and Autonomous Driving VIIDriving Behavior for ADAS and Autonomous Driving VII
Driving Behavior for ADAS and Autonomous Driving VIIYu Huang
 
Multi sensor calibration by deep learning
Multi sensor calibration by deep learningMulti sensor calibration by deep learning
Multi sensor calibration by deep learningYu Huang
 
Driving behaviors for adas and autonomous driving XIII
Driving behaviors for adas and autonomous driving XIIIDriving behaviors for adas and autonomous driving XIII
Driving behaviors for adas and autonomous driving XIIIYu Huang
 
Simulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgSimulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgYu Huang
 
Depth Fusion from RGB and Depth Sensors III
Depth Fusion from RGB and Depth Sensors  IIIDepth Fusion from RGB and Depth Sensors  III
Depth Fusion from RGB and Depth Sensors IIIYu Huang
 
Fisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving IIFisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving IIYu Huang
 

Was ist angesagt? (20)

Deep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataDeep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal Data
 
Deep VO and SLAM IV
Deep VO and SLAM IVDeep VO and SLAM IV
Deep VO and SLAM IV
 
Pedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VIPedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VI
 
Driving Behavior for ADAS and Autonomous Driving X
Driving Behavior for ADAS and Autonomous Driving XDriving Behavior for ADAS and Autonomous Driving X
Driving Behavior for ADAS and Autonomous Driving X
 
Driving Behavior for ADAS and Autonomous Driving III
Driving Behavior for ADAS and Autonomous Driving IIIDriving Behavior for ADAS and Autonomous Driving III
Driving Behavior for ADAS and Autonomous Driving III
 
Camera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning IIICamera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning III
 
Driving behaviors for adas and autonomous driving XI
Driving behaviors for adas and autonomous driving XIDriving behaviors for adas and autonomous driving XI
Driving behaviors for adas and autonomous driving XI
 
3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IV3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IV
 
Driving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xivDriving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xiv
 
Prediction and planning for self driving at waymo
Prediction and planning for self driving at waymoPrediction and planning for self driving at waymo
Prediction and planning for self driving at waymo
 
Depth Fusion from RGB and Depth Sensors IV
Depth Fusion from RGB and Depth Sensors  IVDepth Fusion from RGB and Depth Sensors  IV
Depth Fusion from RGB and Depth Sensors IV
 
Stereo Matching by Deep Learning
Stereo Matching by Deep LearningStereo Matching by Deep Learning
Stereo Matching by Deep Learning
 
3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning II
 
Driving Behavior for ADAS and Autonomous Driving VII
Driving Behavior for ADAS and Autonomous Driving VIIDriving Behavior for ADAS and Autonomous Driving VII
Driving Behavior for ADAS and Autonomous Driving VII
 
Multi sensor calibration by deep learning
Multi sensor calibration by deep learningMulti sensor calibration by deep learning
Multi sensor calibration by deep learning
 
Driving behaviors for adas and autonomous driving XIII
Driving behaviors for adas and autonomous driving XIIIDriving behaviors for adas and autonomous driving XIII
Driving behaviors for adas and autonomous driving XIII
 
Simulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgSimulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atg
 
Depth Fusion from RGB and Depth Sensors III
Depth Fusion from RGB and Depth Sensors  IIIDepth Fusion from RGB and Depth Sensors  III
Depth Fusion from RGB and Depth Sensors III
 
Fisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving IIFisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving II
 

Ähnlich wie Driving behaviors for adas and autonomous driving XII

Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...Journal Papers
 
An Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image ProcessingAn Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image Processingvivatechijri
 
Driving Behavior for ADAS and Autonomous Driving II
Driving Behavior for ADAS and Autonomous Driving IIDriving Behavior for ADAS and Autonomous Driving II
Driving Behavior for ADAS and Autonomous Driving IIYu Huang
 
Identification and classification of moving vehicles on road
Identification and classification of moving vehicles on roadIdentification and classification of moving vehicles on road
Identification and classification of moving vehicles on roadAlexander Decker
 
Volkova_DICTA_robust_feature_based_visual_navigation
Volkova_DICTA_robust_feature_based_visual_navigationVolkova_DICTA_robust_feature_based_visual_navigation
Volkova_DICTA_robust_feature_based_visual_navigationAnastasiia Volkova
 
Inter vehicular communication using packet network theory
Inter vehicular communication using packet network theoryInter vehicular communication using packet network theory
Inter vehicular communication using packet network theoryeSAT Publishing House
 
Jointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planningJointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planningYu Huang
 
Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...Conference Papers
 
Adaptive Feature Fusion Networks for Origin-Destination Passenger Flow Predic...
Adaptive Feature Fusion Networks for Origin-Destination Passenger Flow Predic...Adaptive Feature Fusion Networks for Origin-Destination Passenger Flow Predic...
Adaptive Feature Fusion Networks for Origin-Destination Passenger Flow Predic...Shakas Technologies
 
Enhancing Autonomous Vehicle Applications with Advanced Lane Detection and Tr...
Enhancing Autonomous Vehicle Applications with Advanced Lane Detection and Tr...Enhancing Autonomous Vehicle Applications with Advanced Lane Detection and Tr...
Enhancing Autonomous Vehicle Applications with Advanced Lane Detection and Tr...IRJET Journal
 
Driving Behavior for ADAS and Autonomous Driving V
Driving Behavior for ADAS and Autonomous Driving VDriving Behavior for ADAS and Autonomous Driving V
Driving Behavior for ADAS and Autonomous Driving VYu Huang
 
A cost-effective GPS-aided autonomous guided vehicle for global path planning
A cost-effective GPS-aided autonomous guided vehicle for global path planningA cost-effective GPS-aided autonomous guided vehicle for global path planning
A cost-effective GPS-aided autonomous guided vehicle for global path planningjournalBEEI
 
Performance Evaluation of GPSR Routing Protocol for VANETs using Bi-direction...
Performance Evaluation of GPSR Routing Protocol for VANETs using Bi-direction...Performance Evaluation of GPSR Routing Protocol for VANETs using Bi-direction...
Performance Evaluation of GPSR Routing Protocol for VANETs using Bi-direction...CSCJournals
 
IRJET - A Review on Pedestrian Behavior Prediction for Intelligent Transport ...
IRJET - A Review on Pedestrian Behavior Prediction for Intelligent Transport ...IRJET - A Review on Pedestrian Behavior Prediction for Intelligent Transport ...
IRJET - A Review on Pedestrian Behavior Prediction for Intelligent Transport ...IRJET Journal
 
A computer vision-based lane detection technique using gradient threshold and...
A computer vision-based lane detection technique using gradient threshold and...A computer vision-based lane detection technique using gradient threshold and...
A computer vision-based lane detection technique using gradient threshold and...IJECEIAES
 
TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING -- Part 1
TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING -- Part 1TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING -- Part 1
TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING -- Part 1NanubalaDhruvan
 
Taxi Demand Prediction using Machine Learning.
Taxi Demand Prediction using Machine Learning.Taxi Demand Prediction using Machine Learning.
Taxi Demand Prediction using Machine Learning.IRJET Journal
 
Trajectory improves data delivery in urban vehicular networks
Trajectory improves data delivery in urban vehicular networks Trajectory improves data delivery in urban vehicular networks
Trajectory improves data delivery in urban vehicular networks Papitha Velumani
 
Real time path planning based on
Real time path planning based onReal time path planning based on
Real time path planning based onjpstudcorner
 
Path Planning And Navigation
Path Planning And NavigationPath Planning And Navigation
Path Planning And Navigationguest90654fd
 

Ähnlich wie Driving behaviors for adas and autonomous driving XII (20)

Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...
 
An Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image ProcessingAn Analysis of Various Deep Learning Algorithms for Image Processing
An Analysis of Various Deep Learning Algorithms for Image Processing
 
Driving Behavior for ADAS and Autonomous Driving II
Driving Behavior for ADAS and Autonomous Driving IIDriving Behavior for ADAS and Autonomous Driving II
Driving Behavior for ADAS and Autonomous Driving II
 
Identification and classification of moving vehicles on road
Identification and classification of moving vehicles on roadIdentification and classification of moving vehicles on road
Identification and classification of moving vehicles on road
 
Volkova_DICTA_robust_feature_based_visual_navigation
Volkova_DICTA_robust_feature_based_visual_navigationVolkova_DICTA_robust_feature_based_visual_navigation
Volkova_DICTA_robust_feature_based_visual_navigation
 
Inter vehicular communication using packet network theory
Inter vehicular communication using packet network theoryInter vehicular communication using packet network theory
Inter vehicular communication using packet network theory
 
Jointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planningJointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planning
 
Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...Real time vehicle counting in complex scene for traffic flow estimation using...
Real time vehicle counting in complex scene for traffic flow estimation using...
 
Adaptive Feature Fusion Networks for Origin-Destination Passenger Flow Predic...
Adaptive Feature Fusion Networks for Origin-Destination Passenger Flow Predic...Adaptive Feature Fusion Networks for Origin-Destination Passenger Flow Predic...
Adaptive Feature Fusion Networks for Origin-Destination Passenger Flow Predic...
 
Enhancing Autonomous Vehicle Applications with Advanced Lane Detection and Tr...
Enhancing Autonomous Vehicle Applications with Advanced Lane Detection and Tr...Enhancing Autonomous Vehicle Applications with Advanced Lane Detection and Tr...
Enhancing Autonomous Vehicle Applications with Advanced Lane Detection and Tr...
 
Driving Behavior for ADAS and Autonomous Driving V
Driving Behavior for ADAS and Autonomous Driving VDriving Behavior for ADAS and Autonomous Driving V
Driving Behavior for ADAS and Autonomous Driving V
 
A cost-effective GPS-aided autonomous guided vehicle for global path planning
A cost-effective GPS-aided autonomous guided vehicle for global path planningA cost-effective GPS-aided autonomous guided vehicle for global path planning
A cost-effective GPS-aided autonomous guided vehicle for global path planning
 
Performance Evaluation of GPSR Routing Protocol for VANETs using Bi-direction...
Performance Evaluation of GPSR Routing Protocol for VANETs using Bi-direction...Performance Evaluation of GPSR Routing Protocol for VANETs using Bi-direction...
Performance Evaluation of GPSR Routing Protocol for VANETs using Bi-direction...
 
IRJET - A Review on Pedestrian Behavior Prediction for Intelligent Transport ...
IRJET - A Review on Pedestrian Behavior Prediction for Intelligent Transport ...IRJET - A Review on Pedestrian Behavior Prediction for Intelligent Transport ...
IRJET - A Review on Pedestrian Behavior Prediction for Intelligent Transport ...
 
A computer vision-based lane detection technique using gradient threshold and...
A computer vision-based lane detection technique using gradient threshold and...A computer vision-based lane detection technique using gradient threshold and...
A computer vision-based lane detection technique using gradient threshold and...
 
TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING -- Part 1
TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING -- Part 1TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING -- Part 1
TRAFFIC MANAGEMENT THROUGH SATELLITE IMAGING -- Part 1
 
Taxi Demand Prediction using Machine Learning.
Taxi Demand Prediction using Machine Learning.Taxi Demand Prediction using Machine Learning.
Taxi Demand Prediction using Machine Learning.
 
Trajectory improves data delivery in urban vehicular networks
Trajectory improves data delivery in urban vehicular networks Trajectory improves data delivery in urban vehicular networks
Trajectory improves data delivery in urban vehicular networks
 
Real time path planning based on
Real time path planning based onReal time path planning based on
Real time path planning based on
 
Path Planning And Navigation
Path Planning And NavigationPath Planning And Navigation
Path Planning And Navigation
 

Mehr von Yu Huang

Application of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous DrivingApplication of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous DrivingYu Huang
 
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...Yu Huang
 
Data Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous DrivingData Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous DrivingYu Huang
 
Techniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingTechniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingYu Huang
 
BEV Joint Detection and Segmentation
BEV Joint Detection and SegmentationBEV Joint Detection and Segmentation
BEV Joint Detection and SegmentationYu Huang
 
BEV Object Detection and Prediction
BEV Object Detection and PredictionBEV Object Detection and Prediction
BEV Object Detection and PredictionYu Huang
 
Fisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VIFisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VIYu Huang
 
Fisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VFisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VYu Huang
 
Fisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVFisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVYu Huang
 
Prediction,Planninng & Control at Baidu
Prediction,Planninng & Control at BaiduPrediction,Planninng & Control at Baidu
Prediction,Planninng & Control at BaiduYu Huang
 
Cruise AI under the Hood
Cruise AI under the HoodCruise AI under the Hood
Cruise AI under the HoodYu Huang
 
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)Yu Huang
 
Scenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous DrivingScenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous DrivingYu Huang
 
How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?Yu Huang
 
Annotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous DrivingAnnotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous DrivingYu Huang
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingYu Huang
 
Open Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planningOpen Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planningYu Huang
 
Lidar in the adverse weather: dust, fog, snow and rain
Lidar in the adverse weather: dust, fog, snow and rainLidar in the adverse weather: dust, fog, snow and rain
Lidar in the adverse weather: dust, fog, snow and rainYu Huang
 
Autonomous Driving of L3/L4 Commercial trucks
Autonomous Driving of L3/L4 Commercial trucksAutonomous Driving of L3/L4 Commercial trucks
Autonomous Driving of L3/L4 Commercial trucksYu Huang
 
3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image V3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image VYu Huang
 

Mehr von Yu Huang (20)

Application of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous DrivingApplication of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous Driving
 
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
 
Data Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous DrivingData Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous Driving
 
Techniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingTechniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous Driving
 
BEV Joint Detection and Segmentation
BEV Joint Detection and SegmentationBEV Joint Detection and Segmentation
BEV Joint Detection and Segmentation
 
BEV Object Detection and Prediction
BEV Object Detection and PredictionBEV Object Detection and Prediction
BEV Object Detection and Prediction
 
Fisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VIFisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VI
 
Fisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VFisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving V
 
Fisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVFisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IV
 
Prediction,Planninng & Control at Baidu
Prediction,Planninng & Control at BaiduPrediction,Planninng & Control at Baidu
Prediction,Planninng & Control at Baidu
 
Cruise AI under the Hood
Cruise AI under the HoodCruise AI under the Hood
Cruise AI under the Hood
 
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
 
Scenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous DrivingScenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous Driving
 
How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?
 
Annotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous DrivingAnnotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous Driving
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous driving
 
Open Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planningOpen Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planning
 
Lidar in the adverse weather: dust, fog, snow and rain
Lidar in the adverse weather: dust, fog, snow and rainLidar in the adverse weather: dust, fog, snow and rain
Lidar in the adverse weather: dust, fog, snow and rain
 
Autonomous Driving of L3/L4 Commercial trucks
Autonomous Driving of L3/L4 Commercial trucksAutonomous Driving of L3/L4 Commercial trucks
Autonomous Driving of L3/L4 Commercial trucks
 
3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image V3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image V
 

Driving Behaviors for ADAS and Autonomous Driving XII

  • 7. SCALE-Net: Scalable Vehicle Trajectory Prediction Network under Random Number of Interacting Vehicles via Edge-enhanced GCNN
  • 8. SCALE-Net: Scalable Vehicle Trajectory Prediction Network under Random Number of Interacting Vehicles via Edge-enhanced GCNN Examples of trajectory prediction results at various traffic levels, where the green and transparent blue lines are predicted by SCALE-Net and V-LSTM, respectively. Critically interacting scene where the maneuver of the vehicles highly depends on the interaction effect from adjacent vehicles.
  • 9. MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird’s Eye View Maps • The ability to reliably perceive the environmental states, particularly the existence of objects and their motion behavior, is crucial for autonomous driving. • An efficient deep model, called MotionNet, jointly performs perception and motion prediction from 3D point clouds. • MotionNet takes a sequence of LiDAR sweeps as input and outputs a bird’s eye view (BEV) map, which encodes the object category and motion information in each grid cell. • The backbone of MotionNet is a novel spatio-temporal pyramid network, which extracts deep spatial and temporal features in a hierarchical fashion. • To enforce the smoothness of predictions over both space and time, the training of MotionNet is further regularized with novel spatial and temporal consistency losses. • Extensive experiments show that the method overall outperforms the state of the art, including the latest scene-flow- and 3D-object-detection-based methods. • This indicates the potential value of the proposed method serving as a backup to the bounding-box-based system, and providing complementary information to the motion planner in autonomous driving. • Code is available at https://github.com/pxiangwu/MotionNet.
  • 10. MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird’s Eye View Maps Top: MotionNet is a system based on bird’s eye view (BEV) maps, and performs perception and motion prediction jointly without using bounding boxes. It can potentially serve as a backup to the standard bounding-box-based system and provide complementary information for motion planning. Bottom: During testing, with (a) LiDAR data (BEV), given an object (e.g., a disabled person on a wheelchair, as illustrated in (d)) that never appears in the training data, 3D object detection tends to fail; see plots (b) and (c). In contrast, MotionNet is still able to perceive the object and forecast its motion; see plots (e) and (f), where the color represents the category and the arrow denotes the future displacement.
  • 11. MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird’s Eye View Maps Overview of MotionNet. Given a sequence of LiDAR sweeps, first represent the raw point clouds as BEV maps, which are essentially 2D images with multiple channels. Each pixel (cell) in a BEV map is associated with a feature vector along the height dimension. Then feed the BEV maps into the spatio-temporal pyramid network (STPN) for feature extraction. The output of STPN is finally delivered to three heads: (1) cell classification, which perceives the category of each cell, such as vehicle, pedestrian or background; (2) motion prediction, which predicts the future trajectory of each cell; (3) state estimation, which estimates the current motion status of each cell, such as static or moving. The final output is a BEV map, which includes both perception and motion prediction information.
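As a rough illustration of the BEV rasterization step described above, here is a minimal NumPy sketch that bins one LiDAR sweep into a multi-channel occupancy grid along the height dimension. The grid extent, cell resolution and number of height bins are illustrative assumptions, not values taken from the paper or its released code.

```python
# Minimal sketch: one LiDAR sweep -> BEV occupancy map with height-sliced channels.
import numpy as np

def lidar_to_bev(points, x_range=(-32.0, 32.0), y_range=(-32.0, 32.0),
                 z_range=(-3.0, 2.0), resolution=0.25, n_height_bins=13):
    """points: (N, 3) array of x, y, z in the ego frame -> (n_height_bins, H, W)."""
    H = int((y_range[1] - y_range[0]) / resolution)
    W = int((x_range[1] - x_range[0]) / resolution)
    bev = np.zeros((n_height_bins, H, W), dtype=np.float32)

    # Keep only points inside the grid volume.
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]) &
            (points[:, 2] >= z_range[0]) & (points[:, 2] < z_range[1]))
    pts = points[mask]

    # Discretize x, y into cells and z into height bins.
    xi = ((pts[:, 0] - x_range[0]) / resolution).astype(np.int64)
    yi = ((pts[:, 1] - y_range[0]) / resolution).astype(np.int64)
    zi = ((pts[:, 2] - z_range[0]) / (z_range[1] - z_range[0]) * n_height_bins).astype(np.int64)
    zi = np.clip(zi, 0, n_height_bins - 1)

    bev[zi, yi, xi] = 1.0  # binary occupancy per height slice
    return bev

# A sequence of T sweeps (synchronized to the current ego frame) would be stacked
# into a (T, n_height_bins, H, W) tensor before being fed to the network.
bev = lidar_to_bev(np.random.uniform(-30, 30, size=(1000, 3)))
print(bev.shape)  # (13, 256, 256)
```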
  • 12. MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird’s Eye View Maps Spatio-temporal pyramid network. Each STC block consists of two consecutive 2D convolutions followed by one pseudo-1D convolution. The temporal pooling is applied to the temporal dimension and squeezes it to length 1.
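A minimal PyTorch sketch of such an STC block is given below: two per-frame 2D convolutions followed by a pseudo-1D convolution over time, with a final temporal max-pooling. Channel sizes, kernel sizes and the tensor layout are illustrative assumptions, not the released MotionNet implementation.

```python
import torch
import torch.nn as nn

class STCBlock(nn.Module):
    def __init__(self, in_ch, out_ch, temporal_kernel=3):
        super().__init__()
        self.spatial = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        )
        # "Pseudo-1D" convolution: only the temporal dimension is convolved.
        self.temporal = nn.Conv3d(out_ch, out_ch,
                                  kernel_size=(temporal_kernel, 1, 1),
                                  padding=(temporal_kernel // 2, 0, 0))

    def forward(self, x):
        # x: (B, C, T, H, W)
        b, c, t, h, w = x.shape
        y = x.permute(0, 2, 1, 3, 4).reshape(b * t, c, h, w)   # fold time into batch
        y = self.spatial(y)
        y = y.reshape(b, t, -1, h, w).permute(0, 2, 1, 3, 4)   # back to (B, C', T, H, W)
        return torch.relu(self.temporal(y))

# Temporal pooling at the end of the pyramid squeezes the time axis to length 1.
feats = STCBlock(13, 32)(torch.randn(2, 13, 5, 64, 64))
pooled = feats.max(dim=2)[0]   # (B, C, H, W)
print(feats.shape, pooled.shape)
```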
  • 13. MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird’s Eye View Maps
  • 14. PiP: Planning-informed Trajectory Prediction for Autonomous Driving • It is critical to predict the motion of surrounding vehicles for self-driving planning, especially in a socially compliant and flexible way. • However, future prediction is challenging due to the interaction and uncertainty in driving behaviors. • Planning-informed trajectory prediction (PiP) tackles the prediction problem in the multi-agent setting. • It is differentiated from the traditional manner of prediction, which is based only on historical information and decoupled from planning. • By informing the prediction process with the planning of the ego vehicle, it achieves state-of-the-art performance in multi-agent forecasting on highway datasets. • Moreover, it enables a novel pipeline which couples prediction and planning by conditioning PiP on multiple candidate trajectories of the ego vehicle, which is highly beneficial for autonomous driving in interactive scenarios.
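The following is a conceptual Python sketch of the planning-prediction-coupled pipeline described above: the predictor is queried once per candidate ego plan and the safest plan is kept. The interfaces `predictor`, `candidate_plans` and `collision_cost` are hypothetical placeholders, not names from the paper or its code, and the toy components below only demonstrate the control flow.

```python
import numpy as np

def plan_with_coupled_prediction(predictor, history, candidate_plans, collision_cost):
    """Condition prediction on each candidate ego plan and keep the safest plan.

    predictor(history, ego_plan) -> dict of surrounding-vehicle id -> (T, 2) future,
        conditioned on that ego plan.
    candidate_plans: list of candidate ego trajectories, each of shape (T, 2).
    collision_cost(ego_plan, predictions) -> scalar expected cost of executing the plan.
    """
    best_plan, best_cost = None, np.inf
    for plan in candidate_plans:
        predictions = predictor(history, plan)          # one prediction set per plan
        cost = collision_cost(plan, predictions)
        if cost < best_cost:
            best_plan, best_cost = plan, cost
    return best_plan, best_cost

# Toy usage with dummy components, just to show the control flow.
dummy_predictor = lambda hist, plan: {0: np.stack([np.linspace(1, 41, 20),
                                                   np.full(20, 3.5)], axis=1)}
dummy_cost = lambda plan, preds: float(min(np.linalg.norm(plan - p, axis=1).min()
                                           for p in preds.values()) < 2.0)
plans = [np.stack([np.linspace(0, 40, 20), np.full(20, y)], axis=1) for y in (0.0, 3.5)]
best, cost = plan_with_coupled_prediction(dummy_predictor, None, plans, dummy_cost)
print(cost, best[0], best[-1])   # the keep-lane plan is preferred over the unsafe merge
```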
  • 15. PiP: Planning-informed Trajectory Prediction for Autonomous Driving Comparison between the traditional prediction approach (left) and PiP (right) under a lane merging scenario. Assume the ego vehicle (red) intends to merge to the left lane. It is required to predict the trajectories of surrounding vehicles (blue). To alleviate the uncertainty caused by future interaction, PiP incorporates the future plans (dotted red curve) of the ego vehicle in addition to the history tracks (grey curve). While the traditional prediction result is produced independently of the ego’s future, PiP produces predictions corresponding one-to-one with the candidate future trajectories by enabling the novel planning-prediction-coupled pipeline. Therefore, PiP evaluates the planning safety more precisely and achieves more flexible driving behavior (solid red curve) compared with the traditional pipeline.
  • 16. PiP: Planning-informed Trajectory Prediction for Autonomous Driving The overview of the PiP architecture: PiP consists of three key modules: the planning coupled, target fusion, and maneuver-based decoding modules. Each predicted target is first encoded in the planning coupled module by aggregating all information within the target-centric area (blue square). A target tensor is then set up within the ego-vehicle-centric area (red square) by placing the target encodings into the spatial grid based on their locations. Afterward, the target tensors are passed through the target fusion module to learn the interdependency between targets, and eventually a fused target tensor is generated. Finally, the prediction of each target is decoded from the corresponding fused target encoding in the maneuver-based decoding module. The marked target vehicle is used as an example for planning-coupled encoding and multi-modal trajectory decoding.
  • 17. PiP: Planning-informed Trajectory Prediction for Autonomous Driving
  • 18. Shared Cross-Modal Trajectory Prediction for Autonomous Driving • A framework for predicting future trajectories of traffic agents in highly interactive environments. • On the basis of the fact that autonomous driving vehicles are equipped with various types of sensors (e.g., LiDAR scanner, RGB camera, etc.), this work aims to benefit from the use of multiple input modalities that are complementary to each other. • The proposed approach is composed of two stages: (i) feature encoding, which discovers the motion behavior of the target agent with respect to other directly and indirectly observable influences; such behaviors are extracted from multiple perspectives, e.g., the top-down and frontal views; (ii) cross-modal embedding, which embeds a set of learned behavior representations into a single cross-modal latent space. • Construct a generative model and formulate the objective functions with an additional regularizer specifically designed for future prediction. • An extensive evaluation is conducted to show the efficacy of the proposed framework using two benchmark driving datasets.
  • 19. Shared Cross-Modal Trajectory Prediction for Autonomous Driving Given a sequence of images and past positions, the feature encoder analyzes internal, external, and social stimuli of agents. The features generated from multiple sensory data (e.g., top-down view LiDAR and frontal view RGB) are used to condition the generative model that aims to embed different input modalities into a single cross-modal latent space. The following decoder predicts future trajectory in top-down or frontal view using the latent variable sampled from the learned embedding space. Note that the dotted shapes and arrows are only visible at training time.
  • 20. Shared Cross-Modal Trajectory Prediction for Autonomous Driving The detailed illustration of the feature encoder. Using the past image sequence, model the spatio-temporal factors given by external environments. The internal factors of the target agent are encoded from its past motion as well as the surrounding local perceptual context. In addition, the relative motion between the target and every other interactive agent is considered to construct the social interactions.
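To make the cross-modal embedding idea concrete, below is a generic conditional-VAE-style sketch in PyTorch: one encoder head per modality maps features into a shared Gaussian latent space, regularized by a KL term, and a single decoder produces the future trajectory. The layer sizes, the single-Gaussian latent and the loss weighting are illustrative assumptions, not the exact model from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalEmbedding(nn.Module):
    def __init__(self, top_down_dim, frontal_dim, latent_dim=32):
        super().__init__()
        # One encoder head per modality, both mapping into the same latent space.
        self.enc_top_down = nn.Linear(top_down_dim, 2 * latent_dim)
        self.enc_frontal = nn.Linear(frontal_dim, 2 * latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, 2 * 12))   # 12 future (x, y) steps

    def encode(self, feat, modality):
        head = self.enc_top_down if modality == "top_down" else self.enc_frontal
        mu, logvar = head(feat).chunk(2, dim=-1)
        return mu, logvar

    def forward(self, feat, modality):
        mu, logvar = self.encode(feat, modality)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.decoder(z).view(-1, 12, 2), mu, logvar

def loss_fn(pred, target, mu, logvar, beta=1.0):
    recon = F.mse_loss(pred, target)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl

model = CrossModalEmbedding(top_down_dim=128, frontal_dim=256)
pred, mu, logvar = model(torch.randn(4, 128), modality="top_down")
print(pred.shape, loss_fn(pred, torch.randn(4, 12, 2), mu, logvar).item())
```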
  • 21. Shared Cross-Modal Trajectory Prediction for Autonomous Driving
  • 22. TPNet: Trajectory Proposal Network for Motion Prediction • Making accurate motion prediction of the surrounding traffic agents such as pedestrians, vehicles, and cyclists is crucial for autonomous driving. • Recent data-driven motion prediction methods have attempted to learn to directly regress the exact future position or its distribution from massive amounts of trajectory data. • However, it remains difficult for these methods to provide multimodal predictions as well as integrate physical constraints such as traffic rules and movable areas. • This work is a two-stage motion prediction framework, Trajectory Proposal Network (TPNet). • TPNet first generates a candidate set of future trajectories as hypothesis proposals, then makes the final predictions by classifying and refining the proposals which meet the physical constraints. • By steering the proposal generation process, safe and multimodal predictions are realized. • Thus this framework effectively mitigates the complexity of the motion prediction problem while ensuring the multimodal output. • Experiments on four large-scale trajectory prediction datasets, i.e. the ETH, UCY, Apollo and Argoverse datasets, show that TPNet achieves state-of-the-art results.
  • 23. TPNet: Trajectory Proposal Network for Motion Prediction The movement of traffic agents is often constrained by the movable areas (white areas for vehicles and gray areas for pedestrians), while there might be multiple plausible future paths for the agents. Thus motion prediction systems are required to incorporate the traffic constraints and output multimodal predictions. This framework generates the predictions with different intentions under physical constraints for both vehicles and pedestrians.
  • 24. TPNet: Trajectory Proposal Network for Motion Prediction Framework of the Trajectory Proposal Network (TPNet). In the first stage, a rough end point is regressed to reduce the search space, and then proposals are generated. In the second stage, proposals are classified and refined to generate the final predictions. The dotted proposals are the proposals that lie outside of the movable area, which are further penalized.
  • 25. TPNet: Trajectory Proposal Network for Motion Prediction Illustration of proposal generation. Proposals are generated around the end point predicted in the first stage. γ is used to control the shape of the proposal. Illustration of multimodal proposal generation using road information. The reference lines indicate the possible center lane lines that the vehicle could drive in.
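A minimal NumPy sketch of this two-stage idea follows: candidate trajectories are generated around a regressed end point and then filtered against a movable area. The curve parameterization (a sinusoidal lateral bend controlled by a gamma-like parameter), the sampling grid and the toy drivable corridor are illustrative assumptions rather than the exact TPNet formulation.

```python
import numpy as np

def generate_proposals(current_xy, end_xy, n_lateral=5, n_longitudinal=3,
                       spacing=1.0, gammas=(-0.5, 0.0, 0.5), horizon=30):
    """Return candidate trajectories (M, horizon, 2) around the regressed end point.
    gamma bends the curve laterally, loosely playing the role of the shape parameter."""
    proposals = []
    direction = (end_xy - current_xy) / (np.linalg.norm(end_xy - current_xy) + 1e-6)
    normal = np.array([-direction[1], direction[0]])            # lateral unit vector
    t = np.linspace(0.0, 1.0, horizon)[:, None]
    for dlat in (np.arange(n_lateral) - n_lateral // 2) * spacing:
        for dlon in (np.arange(n_longitudinal) - n_longitudinal // 2) * spacing:
            end = end_xy + dlat * normal + dlon * direction      # perturbed end point
            for g in gammas:
                straight = current_xy + t * (end - current_xy)
                bend = g * np.sin(np.pi * t) * normal            # bow the curve sideways
                proposals.append(straight + bend)
    return np.stack(proposals)

def inside_movable_area(proposals, is_movable):
    """is_movable(xy) -> bool; keep proposals whose points all stay in the movable area."""
    keep = [all(is_movable(p) for p in traj) for traj in proposals]
    return proposals[np.array(keep)]

props = generate_proposals(np.array([0.0, 0.0]), np.array([20.0, 2.0]))
lane = lambda xy: -2.0 <= xy[1] <= 6.0                            # a toy drivable corridor
print(props.shape, inside_movable_area(props, lane).shape)
```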
  • 26. TPNet: Trajectory Proposal Network for Motion Prediction
  • 27. VTGNet: A Vision-based Trajectory Generation Network for Autonomous Vehicles in Urban Environments • Reliable navigation like expert human drivers in urban environments is a critical capability for autonomous vehicles. • Traditional methods for autonomous driving are implemented with many building blocks from perception, planning and control, making them difficult to generalize to varied scenarios due to complex assumptions and interdependencies. • An end-to-end trajectory generation method based on imitation learning. • It can extract spatiotemporal features from the front-view camera images for scene understanding, then generate collision-free trajectories several seconds into the future. • The network consists of three sub-networks, which are selectively activated for three common driving tasks: keep straight, turn left and turn right. • The experimental results suggest that under various weather and lighting conditions, the network can reliably generate trajectories in different urban environments, such as turning at intersections and slowing down for collision avoidance. • Furthermore, by integrating the network into a navigation system, good generalization performance is presented in an unseen simulated world for autonomous driving on different types of vehicles, such as cars and trucks.
  • 28. VTGNet: A Vision-based Trajectory Generation Network for Autonomous Vehicles in Urban Environments Different approaches for trajectory planning and decision-making for autonomous vehicles.
  • 29. VTGNet: A Vision-based Trajectory Generation Network for Autonomous Vehicles in Urban Environments The architecture of VTGNet, which consists of a feature extractor and a trajectory generator. MobileNet V2 is used as the feature extractor with 17 bottleneck convolutional layers. And the long short-term memory (LSTM) is used in the decoder to process the spatiotemporal information. The output of the VTGNet is a vector of size 22×3 indicating the trajectory in the future 22 frames (velocity and x,y positions in the body frame). Note that the width of the network layers indicates the number of output channels.
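The structure described above (a MobileNet V2 feature extractor, an LSTM over the image sequence, and a 22×3 output) can be sketched in a few lines of PyTorch. The hidden size, the pooling of backbone features and the way the decoder is unrolled are illustrative assumptions, not the released VTGNet implementation.

```python
import torch
import torch.nn as nn
from torchvision import models

class VTGNetSketch(nn.Module):
    def __init__(self, hidden=256, future_steps=22):
        super().__init__()
        self.future_steps = future_steps
        self.backbone = models.mobilenet_v2().features        # feature extractor
        self.pool = nn.AdaptiveAvgPool2d(1)                    # (B*T, 1280, 1, 1)
        self.encoder = nn.LSTM(1280, hidden, batch_first=True)
        self.head = nn.Linear(hidden, future_steps * 3)        # velocity, x, y per step

    def forward(self, frames):
        # frames: (B, T, 3, H, W) past front-view images
        b, t = frames.shape[:2]
        feats = self.pool(self.backbone(frames.flatten(0, 1))).flatten(1)  # (B*T, 1280)
        feats = feats.view(b, t, -1)
        _, (h, _) = self.encoder(feats)                        # summarize the sequence
        return self.head(h[-1]).view(b, self.future_steps, 3)

out = VTGNetSketch()(torch.randn(1, 4, 3, 224, 224))
print(out.shape)  # (1, 22, 3): future velocity and x, y positions in the body frame
```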
  • 30. VTGNet: A Vision-based Trajectory Generation Network for Autonomous Vehicles in Urban Environments Different baselines in this work. The feature extractor for these networks is the same as the one in the proposed VTGNet. The size of the output features is shown above the layers.
  • 31. VTGNet: A Vision-based Trajectory Generation Network for Autonomous Vehicles in Urban Environments
  • 32. UST: Unifying Spatio-Temporal Context for Trajectory Prediction in Autonomous Driving • Trajectory prediction has always been a challenging problem for autonomous driving, since it needs to infer the latent intention from the behaviors and interactions of traffic participants. • This problem is intrinsically hard, because each participant may behave differently under different environments and interactions. • The key is to effectively model the interlaced influence from both spatial and temporal context. • Existing work usually encodes these two types of context separately, which would lead to inferior modeling of the scenarios. • A unified approach to treat time-space dimensions equally for modeling spatio-temporal context. • The module is simple and easy to implement within several lines of code. • In contrast to existing methods, which heavily rely on RNNs for temporal context and hand-crafted structure for spatial context, it could auto-partition the spatio-temporal space to adapt to the data. • Tested on two recently proposed trajectory prediction datasets, ApolloScape and Argoverse. • These encouraging results further validate the superiority of the approach.
  • 33. UST: Unifying Spatio-Temporal Context for Trajectory Prediction in Autonomous Driving Illustration and representations of the trajectory prediction task. Blue, green, red colors show trajectories for vehicles, bicycles, pedestrians respectively. (b) shows the common representation, which represents the surrounding agents as sequences of positions in 2D spatial space. (c) shows our proposed trajectory representation in a unified spatio-temporal space.
  • 34. UST: Unifying Spatio-Temporal Context for Trajectory Prediction in Autonomous Driving • Design spatio-temporal point sets to represent the raw input; • Treat the snapshot of the status of agent n at time step t as a single point with metadata in a 3D space spanned by 2D location and time; • By this uniform representation, unify space and time into one representation, which eases the subsequent context modeling task. • To deal with such unordered and variable-length data, the structure and operations of this feature extractor should be deliberately designed to fit the nature of the data. • Inspired by PointNet, take two key components for context extraction. • Embedding: map S-T points into a hidden representation, in which the spatial context and temporal context are unified; • Permutation Invariant Aggregator: form the global context feature; by default, max pooling is used as the aggregator. • Recursive Refinement: concatenate the global context feature to every individual feature, and recursively apply the aforementioned steps. In the second pass, the embedding is aware of the status of each individual agent and the global context, and thus could capture the interactions. • Finally, feed the encoded spatio-temporal feature into a standard LSTM.
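Since the slide notes the module fits in a few lines of code, here is a minimal PyTorch sketch of such a unified spatio-temporal context extractor: a PointNet-style shared MLP over (x, y, t) points, a permutation-invariant max-pool, and one recursive refinement pass. Layer widths are illustrative assumptions, and masking invalid points by zeroing their features is a simplification.

```python
import torch
import torch.nn as nn

class USTContext(nn.Module):
    def __init__(self, point_dim=3, hidden=64):
        super().__init__()
        self.embed1 = nn.Sequential(nn.Linear(point_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, hidden), nn.ReLU())
        # Second pass sees each point feature concatenated with the global context.
        self.embed2 = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())

    def forward(self, points, mask):
        # points: (B, N, 3) spatio-temporal points (x, y, t); mask: (B, N), 1 = valid.
        f = self.embed1(points) * mask.unsqueeze(-1)
        g = f.max(dim=1, keepdim=True)[0]                      # permutation-invariant pool
        f = self.embed2(torch.cat([f, g.expand_as(f)], dim=-1)) * mask.unsqueeze(-1)
        return f.max(dim=1)[0]                                 # refined global context

# The pooled context would then be combined with the target agent's history and fed to a
# standard LSTM decoder (omitted here) to roll out the future trajectory.
ctx = USTContext()(torch.randn(2, 40, 3), torch.ones(2, 40))
print(ctx.shape)  # (2, 64)
```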
  • 35. UST: Unifying Spatio-Temporal Context for Trajectory Prediction in Autonomous Driving (a) and (f) show activation patterns of two typical neurons in the pooled spatio-temporal features. The number in other subfigures indicates the value of activation of this neuron of the case.
  • 36. UST: Unifying Spatio-Temporal Context for Trajectory Prediction in Autonomous Driving
  • 37. UST: Unifying Spatio-Temporal Context for Trajectory Prediction in Autonomous Driving
  • 38. Robust Trajectory Forecasting for Multiple Intelligent Agents in Dynamic Scene • Trajectory forecasting, or trajectory prediction, of multiple interacting agents in dynamic scenes is an important problem for many applications, such as robotic systems and autonomous driving. • The problem is a great challenge because of the complex interactions among the agents and their interactions with the surrounding scenes. • A method for the robust trajectory forecasting of multiple intelligent agents in dynamic scenes. • The method consists of three major interrelated components: an interaction net for global spatiotemporal interactive feature extraction, an environment net for decoding dynamic scenes (i.e., the surrounding road topology of an agent), and a prediction net that combines the spatiotemporal feature, the scene feature, the past trajectories of agents and some random noise for the robust trajectory prediction of agents. • Experiments on pedestrian-walking and vehicle-pedestrian heterogeneous datasets demonstrate that the method outperforms state-of-the-art prediction methods in terms of prediction accuracy.
  • 39. Robust Trajectory Forecasting for Multiple Intelligent Agents in Dynamic Scene The method contains three components, a spatio-temporal interaction network, an environment feature extraction network, and a trajectory prediction network.
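As a rough sketch of how the three components could feed the prediction net, the snippet below concatenates an interaction feature, a scene feature, an encoded past trajectory and a noise vector, and decodes a future trajectory. All feature sizes and the simple MLP decoder are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class FusionPredictor(nn.Module):
    def __init__(self, interaction_dim=64, scene_dim=64, history_dim=32,
                 noise_dim=16, horizon=12):
        super().__init__()
        in_dim = interaction_dim + scene_dim + history_dim + noise_dim
        self.noise_dim = noise_dim
        self.horizon = horizon
        self.decoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                     nn.Linear(128, horizon * 2))

    def forward(self, interaction_feat, scene_feat, history_feat):
        noise = torch.randn(interaction_feat.size(0), self.noise_dim)
        fused = torch.cat([interaction_feat, scene_feat, history_feat, noise], dim=-1)
        return self.decoder(fused).view(-1, self.horizon, 2)

# Sampling several noise vectors per agent yields a set of plausible futures.
pred = FusionPredictor()(torch.randn(3, 64), torch.randn(3, 64), torch.randn(3, 32))
print(pred.shape)  # (3, 12, 2)
```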
  • 40. Robust Trajectory Forecasting for Multiple Intelligent Agents in Dynamic Scene
  • 41. PnPNet: End-to-End Perception and Prediction with Tracking in the Loop • The problem of joint perception and motion forecasting in the context of self-driving vehicles. • Towards this goal, PnPNet is an end-to-end model that takes sequential sensor data as input and outputs, at each time step, object tracks and their future trajectories. • The key component is a tracking module that generates object tracks online from detections and exploits trajectory-level features for motion forecasting. • Specifically, the object tracks get updated at each time step by solving both the data association problem and the trajectory estimation problem. • Importantly, the whole model is end-to-end trainable and benefits from joint optimization of all tasks. • Validated on two large-scale driving datasets, PnPNet shows improvements over the state-of-the-art, with better occlusion recovery and more accurate future prediction.
  • 42. PnPNet: End-to-End Perception and Prediction with Tracking in the Loop Three paradigms for perception and prediction. Traditional approach (a) adopts the modular design that decomposes the stack into subtasks and solves them with individual models. End-to-end method (b) uses a joint model to solve detection and prediction simultaneously, but performs tracking as post-processing. As a result, the full temporal history contained in tracks is not used by detection and prediction. This approach (c) brings tracking into the loop so that all tasks benefit from rich temporal context.
  • 43. PnPNet: End-to-End Perception and Prediction with Tracking in the Loop PnPNet for end-to-end perception and prediction. The model consists of three modules that perform 3D object detection, discrete-continuous tracking, and motion forecasting sequentially. To extract trajectory level actor representations used for tracking and prediction, also equip the model with two explicit memories: one for global sensor feature maps, and one for past object trajectories. Both memories get updated at each time step with up-to-date sensor features and tracking results.
  • 44. PnPNet: End-to-End Perception and Prediction with Tracking in the Loop The trajectory-level object representation. Given an object trajectory, we first extract its sensor observation and motion features at each time step, and then apply an LSTM network to model the temporal dynamics.
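A minimal PyTorch sketch of such a trajectory-level actor representation is shown below: per-time-step sensor and motion features are concatenated and summarized by an LSTM into one embedding per track. The feature dimensions and the choice of motion attributes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TrajectoryEncoder(nn.Module):
    def __init__(self, sensor_dim=128, motion_dim=4, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(sensor_dim + motion_dim, hidden, batch_first=True)

    def forward(self, sensor_feats, motion_feats):
        # sensor_feats: (B, T, sensor_dim) observation features gathered along the track;
        # motion_feats: (B, T, 4), e.g. (x, y, heading, speed) at each past time step.
        x = torch.cat([sensor_feats, motion_feats], dim=-1)
        _, (h, _) = self.lstm(x)
        return h[-1]                       # one embedding per object track

track_embedding = TrajectoryEncoder()(torch.randn(5, 10, 128), torch.randn(5, 10, 4))
print(track_embedding.shape)               # (5, 128): used for association and forecasting
```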
  • 45. PnPNet: End-to-End Perception and Prediction with Tracking in the Loop
  • 46. The Importance of Prior Knowledge in Precise Multimodal Prediction • Roads have well-defined geometries, topologies, and traffic rules. • While this has been widely exploited in motion planning methods to produce maneuvers that obey the law, little work has been devoted to utilizing these priors in perception and motion forecasting methods. • This is a method to incorporate these structured priors as a loss function. • In contrast to imposing hard constraints, this approach allows the model to handle non-compliant maneuvers when those happen in the real world. • Safe motion planning is the end goal, and thus a probabilistic characterization of the possible future developments of the scene is key to choosing the plan with the lowest expected cost. • Towards this goal, design a framework that leverages REINFORCE to incorporate non-differentiable priors over sampled trajectories from a probabilistic model, thus optimizing the whole distribution. • The method is evaluated on real-world self-driving datasets containing complex road topologies and multi-agent interactions. • Despite the importance of this evaluation, it has often been overlooked by previous perception and motion forecasting works.
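The REINFORCE idea can be illustrated with a short PyTorch sketch: trajectories are sampled from the model's output distribution, a non-differentiable prior scores each sample, and the score gradient is pushed through the log-probability. The Gaussian output head, the hypothetical `lane_compliance` prior and the per-batch baseline are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def prior_reinforce_loss(mu, sigma, lane_compliance, n_samples=8):
    """mu, sigma: (B, T, 2) parameters of the model's trajectory distribution.
    lane_compliance(sample) -> (B,) reward in [0, 1]; not differentiable."""
    dist = torch.distributions.Normal(mu, sigma)
    losses = []
    for _ in range(n_samples):
        sample = dist.sample()                              # no gradient through the sample
        log_prob = dist.log_prob(sample).sum(dim=(1, 2))    # (B,)
        reward = lane_compliance(sample)                    # non-differentiable prior
        losses.append(-(reward - reward.mean()) * log_prob) # baseline reduces variance
    return torch.stack(losses).mean()

# Toy usage: reward trajectories whose lateral coordinate stays within a 3.5 m lane.
mu = torch.zeros(4, 12, 2, requires_grad=True)
sigma = torch.ones(4, 12, 2)
compliance = lambda s: (s[..., 1].abs() < 1.75).float().mean(dim=1)
loss = prior_reinforce_loss(mu, sigma, compliance)
loss.backward()
print(loss.item(), mu.grad.shape)   # gradients reach mu despite the discrete prior
```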
  • 47. The Importance of Prior Knowledge in Precise Multimodal Prediction • Human driving behavior is highly structured: in the majority of scenarios, drivers will follow the road topology and traffic rules. • To leverage this informative prior, but not overly penalize non-compliant behavior, define a flexible traffic-rule informed loss that is conditioned on ground-truth behavior. • To this end, leverage a lane-graph representation where the nodes encode lane segments and the edges represent relationships between lane segments such as adjacency, predecessor, and successor (taking into account direction of traffic flow).
  • 48. The Importance of Prior Knowledge in Precise Multimodal Prediction • It is more important to precisely characterize the motion of vehicles that might interact with the SDV (self-driving vehicle) than that of other traffic participants that do not influence the SDV behavior. • Approximate the area of interest with the SDV’s route (i.e., the high-level command), which is defined as the union of all lane segments that the SDV can travel on to reach a preset goal, given the lane-graph. • The horizon is set equal to the prediction horizon (5 s), and the target lane is given by a route planner. • This gives a safe approximation of its possible future locations. • Define positive trajectories as those with more than one waypoint falling within the SDV route, and negative ones otherwise. • The aim is high precision and high recall under this definition, taking into account whether the ground-truth trajectory intersects the route (positive) or not (negative).
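Below is one possible way to implement the route-based relevance labeling, using the shapely library for the point-in-polygon test: a predicted trajectory is marked positive when more than one of its waypoints falls inside the SDV route polygon. The rectangular toy route and the example trajectories are illustrative assumptions.

```python
from shapely.geometry import Point, Polygon
import numpy as np

def label_trajectory(trajectory, route_polygon, min_waypoints=2):
    """trajectory: (T, 2) predicted waypoints; returns True for 'positive' (relevant)."""
    inside = sum(route_polygon.contains(Point(x, y)) for x, y in trajectory)
    return inside >= min_waypoints

# Toy route: the union of lane segments the SDV may travel on, here a straight corridor.
route = Polygon([(0, -2), (50, -2), (50, 2), (0, 2)])
merging = np.stack([np.linspace(0, 25, 10), np.linspace(-6, 0, 10)], axis=1)
departing = np.stack([np.linspace(0, 25, 10), np.linspace(0, 12, 10)], axis=1)
print(label_trajectory(merging, route), label_trajectory(departing, route))  # True False
```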
  • 49. The Importance of Prior Knowledge in Precise Multimodal Prediction