Ting Zhang, Wenjie Song, Mengyin Fu, Yi Yang, and Meiling Wang
Abstract: Intersections are important and complex traffic scenarios, where the future motion of surrounding vehicles is an indispensable reference for the decision-making and path planning of autonomous vehicles. Considering that the motion trajectory of a vehicle at an intersection partly obeys the statistical law of historical data once its driving intention is determined, this paper proposes a long short-term memory based (LSTM-based) framework that combines intention prediction and trajectory prediction. First, we build an intersection prior trajectories model (IPTM) by clustering and statistically analyzing a large number of prior traffic flow trajectories. The prior trajectories model with fitted probability density is used to approximate the distribution of the predicted trajectory, and also serves as a reference for credibility evaluation. Second, we conduct the intention prediction through another LSTM model and regard it as a crucial cue for trajectory forecasting at the early stage. Furthermore, the predicted intention is also a key associated with the prior trajectories model. The proposed framework is validated on two publicly released datasets, next generation simulation (NGSIM) and INTERACTION. Compared with other prediction methods, our framework is able to sample a trajectory from the estimated distribution, with its accuracy improved by about 20%. Finally, the credibility evaluation, which is based on the prior trajectories model, makes the framework more practical in real-world applications.
INTERSECTIONS have always been a difficult traffic scenario, since complex regulations and heterogeneous road participants congregate there and interact implicitly. A report published by the U.S. Department of Transportation shows that about 40% of traffic accidents are intersection-related [1]. Therefore, to guide an autonomous vehicle (hereinafter referred to as the ego-vehicle) safely through a complex intersection, knowledge about the future behaviour of its surrounding vehicles has to be considered [2]. Extensive literature has combined intention prediction and motion prediction in highway scenarios [2]–[5]. Unfortunately, for intersection scenarios, most works stop at the exploration of intention prediction [6]–[8]. The lack of a dynamic motion forecast impedes direct usage in a predict-and-plan module. To this end, we propose a unified framework that predicts the trajectory with the assistance of estimated intention cues and prior trajectory boundaries. The model mainly focuses on situations where the vehicle is close to the entrance within a period of 6 s, as the approaching intention is directly linked to motion at the early stage. Specifically, when dealing with the problem of intersection trajectory prediction, the following difficulties should be noted: 1) The driver’s behaviour at an intersection is usually more complex than that on a highway, and the trajectory profile corresponds to the driver’s intention; 2) More interactions exist at an intersection, such as the interaction between the vehicle and the traffic signal, and the interactions among motor vehicles, bicycles and pedestrians; 3) The lack of explicit lanes results in ambiguous constraints on trajectories. In response to the problems raised above, we propose a framework that combines intention prediction and trajectory prediction. For 1), the introduced intention prediction indicates a probable driving pattern and trajectory profile. For 2), we employ a long short-term memory based
(LSTM-based) neural network to describe the complicated interaction. After being trained, the model can learn multiple driving patterns, including travelling straight, turning left/right and braking at the stop line. To resolve problem 3), we propose an intersection prior trajectories model (IPTM) built by collecting statistics on the historical traffic flow at the intersection. As illustrated in Fig. 1(a), the trajectory clusters are obtained statistically, with unique keys represented by different colors. Based on the clusters, the fitted Gaussian distribution parameters in each pre-partitioned grid are calculated as a substitute for lanes, as shown in Fig. 1(b) (the road geometry in Figs. 1(a) and 1(b) is drawn using the Lanelet2 library [9], which is designed for handling map data in the context of automated driving when high-definition map data is provided). Fig. 1(c) is the schematic of the intersection scenario with our proposed idea.
Fig. 1. With intention estimation, the future trajectory of vehicles is predicted. The statistical traffic flow trajectories serve as the boundary of prediction. (a) Statistical clusters of traffic flow trajectories, where each color is associated with a unique key; (b) The probabilistic distribution of statistical clusters based on the grid partition; (c) Intersection scenario.
So far, great efforts have been made on vehicle motion prediction at intersections. Generally, these methods can be categorized as discrete or continuous. A discrete method discretizes the result into an intention, strategy or behaviour, so the model essentially performs classification; a continuous method outputs continuous physical quantities such as a trajectory or a probability density, so it in fact solves a regression problem. In order to compare these methods, we present a literature review of the two types of models as follows. The comparison between our model and the common prediction models is also discussed in this section.
Predicting the driver’s intention or maneuvers at an intersection is essential for understanding the traffic situation, as 44.1% of intersection-related crashes are caused by inadequate surveillance [1]. Though the signal indicator can foreshadow the corresponding behaviour, it tends to cause misjudgements when misused. Alternatively, the speed profile and the heading angle are frequently used. Since intention prediction can be regarded as a classification problem, classifiers such as support vector machines (SVM) [10], hidden Markov models (HMM) [6], and Bayesian networks [11]–[13] have been studied in this area. However, these models are limited to particular scenarios or one-step prediction. In this respect, various LSTM-based models have been adopted for generalization. Reference [8] proposes a recurrent neural network (RNN)-based model for intention prediction at an unsignalized roundabout. Higher accuracy is achieved only after the vehicle has travelled 10 m from the entrance, so the prediction suffers from a time delay. An LSTM-based solution is presented by Phillips to infer the intention of the vehicle before it enters the intersection [7]. The authors design an ablation experiment to evaluate the effect of historical motion features, neighbour features and traffic features. In particular, lane-based regulations should be considered. For example, in a lane that allows only going straight or turning right, a left turn is illegal. Therefore, we adopt the target vehicles’ trajectories, the surrounding vehicles’ trajectories, and traffic information as the features of the LSTM for intention prediction, similar to [7].
Trajectory prediction at intersections is more difficult than that in the highway environment, because the road infrastructure is diverse and the interactions among road agents are more frequent. Parametric models are primarily characterized by road geometry and the velocity profile. Reference [14] addresses the microscopic speed profile of turning vehicles when constructing the mathematical model. Based on this, [15] modifies the model by accounting for the intersection angle and the corner, so that the proposed desired velocity model is capable of predicting the trajectory without any constraints on the initial position. Neural network-based methods include the generative adversarial nets (GAN)-series models [16], [17], the RNN-series models [18], [19] and the graph convolutional network (GCN)-based models [20], [21]. However, models based on neural networks often output the average value instead of multi-modal results, which neglects uncertainty. Some works try to model multi-modal movements by using a Gaussian distribution. References [22] and [23] output the Gaussian parameters, but they regard the mean of the distribution as the prediction result and do not evaluate its covariance. In order to acquire the distribution, the negative log likelihood (NLL) loss is widely used. Although minimizing the NLL maximizes the likelihood of the ground truth, samples drawn from the predicted distribution may deviate from the ground truth significantly and even violate dynamic feasibility. This is because the NLL may force the parameter µ close to the real value, but only imposes loose restrictions on the parameter σ. In [21], the spectrum of the constructed graph is used as part of the loss function to constrain the mean and covariance of the future states. Inspired by their model, we consider the highly regularized driving behaviour at intersections and try to approximate the real distribution by introducing the restriction from the proposed IPTM, which makes it possible to
sample accurate locations from the predicted distribution.
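The remark about µ and σ can be made concrete with a small sketch of the NLL of a two-dimensional Gaussian, written here in plain Python. The function name and argument order are illustrative assumptions, not the authors' implementation. When the predicted mean matches the observation exactly, the loss reduces to log(2π σx σy √(1−ρ²)), so widening the distribution is penalized only logarithmically.

```python
import math


def bivariate_gaussian_nll(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Negative log likelihood of the point (x, y) under a 2-D Gaussian.

    Hypothetical helper for illustration; parameters follow the usual
    (mu_x, mu_y, sigma_x, sigma_y, rho) parameterization.
    """
    dx = (x - mu_x) / sigma_x
    dy = (y - mu_y) / sigma_y
    one_minus_rho2 = 1.0 - rho * rho
    # Quadratic form of the standardized residuals.
    z = dx * dx - 2.0 * rho * dx * dy + dy * dy
    # Log of the normalization constant 2*pi*sigma_x*sigma_y*sqrt(1 - rho^2).
    log_norm = math.log(2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_rho2))
    return log_norm + z / (2.0 * one_minus_rho2)


# With a perfect mean, doubling both standard deviations only adds log(4).
tight = bivariate_gaussian_nll(0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0)
wide = bivariate_gaussian_nll(0.0, 0.0, 0.0, 0.0, 2.0, 2.0, 0.0)
```

In this reading, a model can fit the mean well while leaving σ far from the spread of real trajectories, which is the gap an additional constraint on the distribution parameters is meant to close.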
The main contributions of this paper can be summarized in the following three aspects:
1) Framework: We combine intention prediction and trajectory prediction for the specified intersection scenarios. The predicted intention is not only a one-dimensional feature for trajectory prediction, but also part of the key that is directly related to a prior trajectory boundary.
2) Distribution Parameter Constraints: In order to generate the trajectory boundary at the intersection, we propose an IPTM that collects statistics of the historical trajectories and approximates the distribution of the ground truth.
3) Evaluation Metrics: We analyse the credibility of the estimated trajectory by applying the modified Hausdorff distance criterion [24] to the predicted trajectory and the prior trajectory distribution, which does not require the ground truth. A prediction that conforms to the prior trajectory distribution is regarded as more reasonable.
First, the intersection prior trajectories model is introduced in this section to describe how we process the collected traffic flow trajectories to obtain the prior trajectory boundary. After this, the structure and design of the framework are presented in detail.
As described above, in our framework, the prior trajectory distribution boundary is a prerequisite for future trajectory forecasting and credibility evaluation. Assuming that the historical trajectories X_H of the scenarios are available, we group them into {k_i : J_i = X_{H,i}} with a unique key k_i formatted as “Intersection-Direction-Lane-Intention” (when the lane information is not available, the grouped trajectories can be obtained by a clustering algorithm such as DBSCAN [25]).
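The grouping step can be sketched as a dictionary keyed by the “Intersection-Direction-Lane-Intention” string. The record field names (`intersection`, `direction`, `lane`, `intention`, `points`) are hypothetical and chosen only for illustration; the case without lane information, which the text handles by clustering, is not covered here.

```python
from collections import defaultdict


def make_key(intersection, direction, lane, intention):
    """Build the key k_i in the 'Intersection-Direction-Lane-Intention' format."""
    return f"{intersection}-{direction}-{lane}-{intention}"


def group_trajectories(records):
    """Group historical trajectories by key.

    Each record is assumed to be a dict with hypothetical fields
    'intersection', 'direction', 'lane', 'intention' and 'points',
    where 'points' is a list of (x, y) tuples.
    """
    groups = defaultdict(list)
    for rec in records:
        key = make_key(rec["intersection"], rec["direction"],
                       rec["lane"], rec["intention"])
        groups[key].append(rec["points"])
    return dict(groups)


# Two straight-going tracks share a key; the left-turn track gets its own.
example_records = [
    {"intersection": "LB", "direction": 0, "lane": 1, "intention": 0,
     "points": [(0.0, 0.0), (1.0, 0.0)]},
    {"intersection": "LB", "direction": 0, "lane": 1, "intention": 0,
     "points": [(0.0, 0.1), (1.0, 0.1)]},
    {"intersection": "LB", "direction": 0, "lane": 0, "intention": 1,
     "points": [(0.0, 0.0), (0.5, 0.5)]},
]
example_groups = group_trajectories(example_records)
```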
Our goal is to fit the prior trajectories with polynomial curves, after which the mean and variance along the tangent and normal directions of the curve G in each grid are computed and stored. Taking one group J_i with a specific key k_i as an example, the schematic diagram of the statistical process is shown in Fig. 2, and the procedure is given in Algorithm 1. The scatters of the prior trajectory X_{H,i} are plotted in Fig. 2(a) (coloured royal blue). Because the prior trajectory is obtained from the map in a global frame, it is possible that only the coordinates (x, y) in the Cartesian frame are available, and these may not satisfy a functional relationship in which each x-coordinate corresponds to only one y-coordinate. So, it is necessary to transform the trajectories J_i = {(x_i, y_i)} into rotated coordinates that satisfy a quadratic function relationship. As in Algorithm 1, for trajectories consisting of straight lines, we make them parallel to the x-axis by rotating them by θ degrees. For curves, the process is slightly more complex, as shown in Algorithm 1 (Lines 5–9). First, the coordinates at the 4 corners of the intersection with maximum curvature, {(x_{top,p}, y_{top,p}) | p = 1, 2, 3, 4}, are required. Then, the scatters in the neighbourhood of the corner are extracted to approximate the tangent slope at the maximum or minimum of the quadratic function, and the rotation angle is equal to the tangent angle θ. The rotated trajectories (royal blue) and the fitted polynomial (red) are illustrated in Fig. 2(b), and the rotated curve is a quadratic function with its symmetry axis perpendicular to the x-axis. In Fig. 2(c), the green triangles and light grey lines represent the discrete grid M. In each grid m, we hypothesize that the tracks follow the two-dimensional Gaussian distribution of equation (1) along the tangent and normal directions, whose parameters are fitted according to equation (2) as in Fig. 2(d) (with n points in the grid). Then, the fitted distribution and the polynomial curve are stored as a dictionary.
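A simplified sketch of the rotate-then-collect-statistics idea for the straight-line case is given below. It is not Algorithm 1: the polynomial fit is omitted, the grid is a uniform partition along the rotated x-axis, and the population standard deviation is an assumption, since equation (2) is not reproduced here. In the rotated frame of a straight track, the x- and y-axes play the roles of the tangent and normal directions.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def rotate(points, theta):
    """Rotate (x, y) points by -theta (radians), so that a straight track
    heading at angle theta becomes parallel to the x-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x + s * y, -s * x + c * y) for x, y in points]


def grid_statistics(points, cell_size):
    """Per-grid mean and standard deviation of already rotated points.

    Grids are uniform intervals of width `cell_size` along x (an assumed
    partition). Returns {grid_index: {"n", "mu", "sigma"}}.
    """
    cells = defaultdict(list)
    for x, y in points:
        cells[math.floor(x / cell_size)].append((x, y))
    stats = {}
    for idx, pts in sorted(cells.items()):
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        stats[idx] = {
            "n": len(pts),
            "mu": (mean(xs), mean(ys)),          # tangent, normal means
            "sigma": (pstdev(xs), pstdev(ys)),   # population std (assumed)
        }
    return stats
```

Storing the result as a dictionary per key mirrors the statement that the fitted distribution is kept as a dictionary; for curved tracks, the tangent and normal directions would have to be taken from the fitted curve instead of the axes.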
Fig. 2. Schematic diagram that describes the form of the prior trajectory. (a) Historical scatters; (b) Rotated scatters (royal blue) and the fitted polynomial (red); (c) Rotated scatters and samples in the grid (green triangles and light grey lines); (d) The fitted Gaussian parameters before and after rotation.
Fig. 3. The proposed framework, which contains three parts: the intention prediction module, the trajectory prediction module and the IPTM. The intention module anticipates the behaviour before the vehicle enters the intersection, and the estimated intention together with other features is employed for trajectory prediction. The predicted trajectory is bounded by the prior trajectory distributions generated by the IPTM.
It is noteworthy that the statistical tracks contain driving patterns with the same intention and route profile, which represent the uncertainty of the predicted trajectory. For example, a straight-line trajectory usually has consistent trajectory variances, while a left-turning vehicle is constrained by the left-turn waiting line. The radial uncertainty in each grid is partially uniform, while the uncertainty of a right-turning vehicle turns out to be larger, as hardly any border lines restrict the trajectories. As a consequence, the IPTM is used as the boundary term of the loss function in equation (6) and also as a reference for the credibility evaluation in equation (8), which will be described in Section IV.
Our proposed framework pays special attention to the behaviour of vehicles as they approach and enter the intersection at the early stage.
1) Inputs and Outputs: The input features for the intention and trajectory prediction models derive from the collected features F_ego, F_neighbour, F_traffic and I. The motion feature of the ego-vehicle, F_ego, includes position, velocity, acceleration (optional) and heading (optional). The neighbour feature F_neighbour contains the relative positions (Δx, Δy) of the N surrounding vehicles (forward and backward vehicles in the left, middle and right lanes). The traffic feature describes the traffic situation and traffic rules: F_traffic = {Direction, Lane_type}, where Direction indicates the n+1 different entrances of the intersection, indexed from 0 to n, and Lane_type is the index of the current lane type according to permissible driving actions (left turn-only, right turn-only, straight going-only, left&straight, etc.), which is available from the map. The intention I comes from the intention prediction module. For the intention prediction module, the input data are the history features F_{T_h} (except for the intention feature) with a frequency of 5 Hz, and the outputs are the intention estimation I_i represented by an index in [0, 1, 2] (going straight: I_s = 0, turning left: I_l = 1, turning right: I_r = 2). For the trajectory prediction model, the historical features are F_{T_h} (according to the ablation experiment, the neighbour feature F_neighbour is not adopted for the final results of trajectory prediction), with T_h = 3 s, and the outputs are the parameters of the joint Gaussian distribution {(µ_{x,t}, µ_{y,t}, σ_{x,t}, σ_{y,t}, ρ_t) | t ∈ T_f} of the target vehicle over the prediction horizon T_f = 3 s.
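Given the five output parameters at one time step, a position can be drawn from the corresponding two-dimensional Gaussian as sketched below. The function name and the optional `rng` argument are illustrative assumptions; the construction itself is the standard way of generating two correlated normal variables from two independent ones.

```python
import math
import random


def sample_position(mu_x, mu_y, sigma_x, sigma_y, rho, rng=random):
    """Draw one (x, y) sample from a 2-D Gaussian with the given parameters.

    `rng` can be the `random` module or a `random.Random` instance, which
    makes the draw reproducible.
    """
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x = mu_x + sigma_x * z1
    # Mix z1 into y so that corr(x, y) equals rho.
    y = mu_y + sigma_y * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x, y


def sample_trajectory(params, rng=random):
    """Sample one position per time step from a list of
    (mu_x, mu_y, sigma_x, sigma_y, rho) tuples."""
    return [sample_position(*p, rng=rng) for p in params]
```

With T_f = 3 s at 5 Hz, `params` would hold 15 tuples, one per predicted frame.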
2) Design and Training Details: Our proposed framework is composed of three parts. Similar to other works [26], the intention module is an LSTM-based network and serves as the upper module that connects to the lower module for trajectory prediction. The difference is that we introduce the prepared prior trajectory distribution. As shown in Fig. 3, among the prior distribution candidates, the one consistent with the predicted intention is selected. The prior distribution plays two important roles in this module. One is to match the predicted intention and provide the kinematic constraint: under the restraint of the loss function (as in equation (6)), the predicted trajectory is pushed back into the reasonable domain. The other is that the prior trajectory can be regarded as the reference for credibility evaluation, which will be illustrated in Section IV-D.
The motivation of the design is to predict the behaviour of vehicles at intersections, i.e., the turning intention and the speed profile. The intention prediction module is responsible for predicting the driving direction of the vehicle before the stop line. The upper LSTM model learns the characteristics and outputs the probabilities of going straight I_s, turning left I_l and turning right I_r as in equation (3). Once the intention is obtained, the lower trajectory prediction module comes into operation with features including the intention, as in equation (4). The trajectory prediction module is also an LSTM-based model, and it outputs the position sequence at the early stage at the intersection. Due to the observation noise and the diverse driving patterns, the uncertainty should be considered, which is illustrated in equation (4). How the boundary terms are obtained from the prior trajectories has been discussed above and is not reiterated here.
The LSTM models are the same in both modules, with a hidden size of 64. The loss functions for intention prediction, L_I, and trajectory prediction, L_T, are shown in equations (5) and (6), respectively, where µ_p and σ_p are from the prior trajectories boundary. The loss is propagated backward through RMSprop with a learning rate of 0.005 that decays 95% every 5 epochs. During the training phase, the upper module and the lower one are trained separately for 300 epochs, while in the test phase, these two modules are used in tandem.
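The learning-rate schedule can be sketched as a step decay. The phrase "decays 95% every 5 epochs" is read here as multiplying the rate by 0.95 every 5 epochs; this interpretation is an assumption, and the alternative reading (a reduction by 95%) would correspond to `factor=0.05`.

```python
def learning_rate(epoch, base_lr=0.005, factor=0.95, step=5):
    """Step-decayed learning rate for a zero-based epoch index.

    base_lr and step follow the text; factor=0.95 is an assumed reading
    of "decays 95% every 5 epochs".
    """
    return base_lr * factor ** (epoch // step)


# Learning rates over the 300 training epochs mentioned in the text.
schedule = [learning_rate(e) for e in range(300)]
```

In PyTorch, the same behaviour would typically be obtained by pairing an RMSprop optimizer with a step scheduler (`step_size=5`, `gamma=0.95`), although the authors' exact training code is not given.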
We train and verify our framework with two publicly released datasets. The first one belongs to the next generation simulation (NGSIM) program [27]. This dataset is today’s largest dataset of naturalistic vehicle trajectories and is widely used for research on traffic flow and driver models [28]. It includes four different recording sites: Interstate 80 (I-80) in Emeryville, CA, US Highway 101 (US 101) in Los Angeles, CA, Lankershim Boulevard (LB) in Los Angeles, CA and Peachtree Street (PS) in Atlanta, GA [29]. Considering that the recordings at LB and PS contain urban scenes, we mainly focus on these two sites, where data was collected for about 1 hour at 10 Hz, as in Figs. 4(a) and 4(b).
Furthermore, we also make use of the INTERACTION dataset [30], which contains naturalistic motions of various traffic participants in a variety of highly interactive driving scenarios from different countries. Different from NGSIM, it mainly targets diverse, complex traffic scenarios like roundabouts and intersections. In this paper, we choose the dataset “US intersection MA”, which describes a four-way intersection with multiple lanes as in Fig. 4(c) (hereinafter referred to as INTERACTION).
The models for these two datasets are trained and tested independently with the same processing method. For both datasets, we downsample the tracks to 5 Hz and eliminate the static states caused by waiting at signals (supposing that the waiting time is t_0 = t_{0,h} + t_{0,f}, the past and prediction horizons are updated to T_h + t_{0,h} and T_f + t_{0,f}, with the still states in the t_0 period). We shuffle the processed data and split it into training data and testing data in the ratio of 3:1. For generalization, we add Gaussian measurement noise to the positions of the neighbours.
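The preprocessing steps can be sketched in plain Python as follows. The helper names, the movement threshold `eps`, the noise level `sigma` and the fixed seeds are all assumptions made for illustration; the text does not specify how a static state is detected or how large the noise is.

```python
import random


def downsample(track, factor=2):
    """Keep every `factor`-th frame, e.g. 10 Hz -> 5 Hz for factor=2."""
    return track[::factor]


def drop_static(track, eps=1e-3):
    """Remove frames whose position has not moved by more than `eps`
    (assumed threshold) from the last kept frame."""
    kept = track[:1]
    for p in track[1:]:
        q = kept[-1]
        if abs(p[0] - q[0]) > eps or abs(p[1] - q[1]) > eps:
            kept.append(p)
    return kept


def split_train_test(samples, ratio=3, seed=0):
    """Shuffle and split samples into train/test with a ratio:1 proportion."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = len(shuffled) * ratio // (ratio + 1)
    return shuffled[:cut], shuffled[cut:]


def add_position_noise(points, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `sigma`
    to each neighbour position."""
    rng = random.Random(seed)
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in points]
```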
Fig. 4. Datasets used in experiments. (a) Lankershim Boulevard; (b) Peachtree Street; (c) INTERACTION dataset.
The experimental results mainly include the accuracy and precision of intention prediction and the motion prediction of the proposed framework compared with some state-of-the-art models.
Driving intention is an important cue for vehicle trajectory prediction, so we evaluate the performance of this model in two aspects: accuracy and precision. The results of the two datasets are shown in Table I.
In the experiment, T_h is fixed at 3 s, i.e., 15 frames. However, in practical applications, it is difficult to determine when the vehicle enters the intersection. Thus, we collect data from 20 frames within a 4 s period and apply a sliding window with a width of 15 frames and a step of 1. Assuming that I_i is the result of the i-th sliding window, the results from the 5 sliding windows are gathered into the set I, which is composed of the intentions I_s, I_l, I_r with counts s, l, r ∈ N as in equation (7). Finally, the intention I_j with the most repetitions in the set is chosen as the prediction result I* (|I| represents the number of occurrences of intention I).
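The sliding-window vote can be sketched as follows. The helper names are hypothetical, and tie-breaking by first occurrence is an assumption, since the text does not say how ties are resolved. Note that counting every start position, 20 frames with a width of 15 and a step of 1 yield 6 windows, so the exact framing behind the 5 windows in the text is not reproduced here.

```python
from collections import Counter


def sliding_windows(frames, width=15, step=1):
    """Return all full windows of `width` frames, advancing by `step`."""
    return [frames[i:i + width]
            for i in range(0, len(frames) - width + 1, step)]


def vote_intention(window_predictions):
    """Pick the intention index that occurs most often among the
    per-window predictions (0: straight, 1: left, 2: right).
    Ties go to the intention seen first (assumed rule)."""
    counts = Counter(window_predictions)
    return counts.most_common(1)[0][0]


# Example: five per-window predictions, three of which are "turn left".
final_intention = vote_intention([0, 1, 1, 2, 1])
```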
For both NGSIM and INTERACTION, the behaviour of going straight is well learned and predicted. On the NGSIM data, the right-turn intention has an accuracy of 76.92%, while on the INTERACTION dataset, the prediction of the left-turn intention is relatively poor. The different outcomes can be attributed to two reasons. The first is an imbalance in the amount of data in the datasets. Due to the difference in the amount of data between classes, the model learns fewer features from the category with fewer samples. We attempted to add noise to the data in the smaller category, but due to the limited magnitude of the noise and the damage that the noise causes to the naturalistic driving data, the trained model does not perform better. The second reason for the inconsistent results of the two datasets is that we adopted the lane type as one of the traffic features, which provides the permitted legal action of the vehicle. In the NGSIM dataset, most of the left-turn vehicles drive in left-only lanes, while straight-going and right-turn vehicles usually share the same lane. Therefore, the left-turn samples are discriminated by the lane feature, while the right-turn and straight-going samples are similar in the lane feature. The same principle applies to the INTERACTION dataset, where the straight-going and left-turning vehicles share the same lane and are almost indistinguishable before entering the intersection. Even in the situation of single-lane driving, left-turning vehicles steer later than right-turning vehicles, so some of the left-turning samples are mistaken for straight travel.
TABLE I INTENTION PREDICTION OF NGSIM AND INTERACTION
For the evaluation of trajectory prediction, we compare the proposed method with the models presented in [20] and [23], which are named the Conv-social Pooling and graph-based interaction-aware trajectory prediction (GRIP) models, respectively. These models are mainly designed and used for trajectory prediction on highways. To be fair, we use the same input features as those in [20] and [23], including the same motion features of the ego vehicle, the same neighbour features and the same intention features (or maneuver in [23]).
Enc-Dec: The simple encoder-decoder network. The features are the same as those used for our proposed model, except for the intention feature.
Conv-social Pooling: Composed of an encoder, a convolutional social pooling layer that learns the spatial interdependencies of the agents, and a decoder that outputs the multi-modal motion of the vehicles.
GRIP: Uses a graph to represent the interactions of traffic neighbours, and applies graph convolutional blocks to learn the features.
Ours: The proposed model as shown in Fig. 3, using the features F = {F_ego, F_neighbour, I}.
The comparison results on the NGSIM data are shown in Table II in terms of root mean squared error (RMSE). Compared with the models proposed for highway scenarios, our model achieves an improvement of up to 20% in accuracy (the predictions of the GRIP model did not converge). We observe that the common feature of [20] and [23] is a specially designed convolution network that describes the interactions between vehicles. This is typically suitable for highways, where vehicles interact with each other by changing lanes or accelerating. At intersections, vehicles are more restricted by traffic rules. In order to verify our conjecture and find out the prediction characteristics applicable to the intersection environment, we carried out ablation experiments.
TABLE II RMSE (m) OF NGSIM USING COMPARED MODELS
In this section, we verify the effectiveness of the input features in different categories, i.e., the self features F_ego, the neighbour features F_neighbour, the traffic features F_traffic, and the intention features, as well as the proposed IPTM. Though the selected features are thought to be important for trajectory prediction, we propose that different features work for different scenarios. To evaluate the different kinds of features, we have designed four sets of experiments, each of which uses the IPTM but differs in its inputs. The basic experiment is conducted with all features, including F_ego, F_neighbour, F_traffic and I, and is defined as the reference experiment. In each of the other three experiments, we remove one corresponding feature; these are named “No intention (with IPTM)”, “No traffic (with IPTM)” and “No neighbor (with IPTM)” in the columns of Table III.
From the experimental results, we can conclude that different kinds of features have different effects on the prediction results. For the INTERACTION dataset, intention features and traffic features contribute to more accurate results, with improvements of 34.31% and 13.46% in the final displacement error (FDE), respectively. This is because the intention is not only a crucial cue for the prediction network, but also introduces the prior distributions that impose restrictions on the predicted Gaussian parameters. For the NGSIM dataset, the experimental results without intention or traffic features are almost the same as the results with these features, which could be caused by the characteristics of the data itself. In NGSIM, there is a large number of straight-going vehicles, and the intention and traffic features are effective for turning vehicles, while for straight-going vehicles these features are not discriminative enough for recognition. The neighbour features have a negative impact on prediction accuracy: without them, the FDE improves by 14.03% and 18.89% on the NGSIM and INTERACTION datasets, respectively. We propose two reasons for this situation. First, since artificial noise is applied to the observations of the neighbours, it hinders the model from identifying features. Second, we surmise that when the vehicle is approaching the intersection, its motion is mainly dominated by traffic rules and it has less interaction with vehicles in its neighbourhood, which explains the results of the two interaction-aware models. In particular, the vehicle does interact with vehicles coming from different directions in the conflict zone and with pedestrians in the crosswalk, which will be the focus of our future research.
TABLE III RMSE (m) OF MODELS WITH DIFFERENT KINDS OF FEATURES
Furthermore, in order to verify the effectiveness of the proposed IPTM, we design the experiment “No intention (without IPTM)” in contrast with the experiment “No intention (with IPTM)”. The results in Table III illustrate the effectiveness of our proposed IPTM module. When the IPTM is used as prior information to constrain the trajectory distribution, the accuracy of trajectory prediction is improved by about 12.66% in FDE on the two datasets. One advantage of using the IPTM is that it helps the model predict a reasonable probability distribution, unlike other network models that only train the mean value of the trajectory instead of the distribution. Additionally, further experimental analysis is needed on the combined effects of the IPTM and other features.
Consequently, we adopt the motion features of the ego vehicle F_ego, the traffic features F_traffic, and the predicted intention together with the IPTM for trajectory prediction. The best results of trajectory prediction are shown in the column “No neighbour” of Table III. Moreover, we visualize the results on the two datasets in Fig. 5 using the model with the best performance, where 4 kinds of behaviours are included: going straight, turning left, turning right and stopping gradually. In Fig. 5, the historical and future tracks of the ego-vehicle are marked in yellow and red, respectively, while blue dots refer to the tracks of neighbour vehicles. In addition, the predicted distribution and the prior trajectory distribution are drawn in cool and warm color series, respectively.
Most of the existing methods output the prediction results and then evaluate the deviation between the prediction and the ground truth by the RMSE metric. Generally, the smaller the prediction error is, the stronger the predictive ability of the model. However, using RMSE as an evaluation indicator has two disadvantages. The first is that it can only represent the average performance of the model, so the worst prediction result of the model is likely to be covered up. The second, and more important, is that the calculation of RMSE depends on the ground truth, which makes it infeasible for evaluating the confidence of the predicted trajectory in practical applications.
In particular, when the model gives results that do not conform to vehicle dynamics or that deviate from the lane, this must be recognized correctly to avoid misleading planning and decision-making. In our model, the modified Hausdorff distance (MHD) is adopted to compare the predicted track A and the prior trajectories distribution B from the IPTM, thus evaluating the spatial similarity between the two tracks before the ground truth is known. As shown in (8), MHD equals the maximal distance from each point a_i in set A to set B, and the distance between point a_i and set B is the minimal distance from a_i to any b_j ∈ B. Based on the results of MHD, a credibility cognition mechanism can be designed to assess the safety of the predicted trajectory quantitatively. Since a smaller MHD indicates a driving pattern similar to the usual one, and a larger MHD implies an abnormal driving pattern that deviates from the statistical tracks, the MHD can be used as important evidence for credibility evaluation, which would mitigate the severe impact of false predictions and is also an interesting topic for our future work.
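The distance described above can be sketched directly from its verbal definition. The prior B is represented here as a list of sample points, which is an assumption, since the prior is stored as a fitted curve with per-grid Gaussian parameters. The second function averages the point-to-set distances instead of taking the maximum, a common variant of the modified Hausdorff distance; which form equation (8) uses cannot be confirmed from the text alone.

```python
import math


def point_to_set(a, B):
    """Minimal Euclidean distance from point a to any point b in B."""
    return min(math.dist(a, b) for b in B)


def directed_max_distance(A, B):
    """Maximum over a in A of the distance from a to set B,
    following the verbal description in the text."""
    return max(point_to_set(a, B) for a in A)


def directed_mean_distance(A, B):
    """Mean over a in A of the distance from a to set B
    (averaging variant, shown for comparison)."""
    return sum(point_to_set(a, B) for a in A) / len(A)


# Predicted track that leaves the prior at its last point.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0)]
prior = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
worst = directed_max_distance(predicted, prior)
average = directed_mean_distance(predicted, prior)
```

The maximum reacts to a single outlying point, whereas the mean reflects the overall deviation of the track, so the choice affects how strictly an abnormal prediction is flagged.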
Fig. 5. Distribution results of our proposed model, of which (a)–(d) are results on NGSIM and (e)–(h) are from INTERACTION. Tracks of neighbour vehicles: blue dots; Historical/future tracks of ego-vehicle: yellow/red dots; Predicted distribution: cool series colors; Prior trajectories distribution: warm series colors.
Fig. 6. Boxplots of the MHD metric for the datasets in the order of “Ours on NGSIM”, “CS-LSTM on NGSIM”, “Encoder-Decoder on NGSIM” and “Ours on INTERACTION”. A smaller value means that the predicted trajectory is closer to the prior trajectory, which means that the prediction is more reasonable.
Using the testing data of NGSIM and INTERACTION, we create the boxplots in Fig. 6. Consistent with the RMSE, the MHD on the INTERACTION dataset tends to be smaller, with a mean of 1.08 m, while the MHD on NGSIM generated by our model has a mean of 3.50 m. As for the compared models, the average MHD of CS-LSTM and Enc-Dec is 4.34 m and 4.45 m, respectively. The smaller MHD value generated by our model means that the predicted trajectory is closer to the prior trajectory, which indicates that the prediction is more reasonable. Furthermore, we classify the results according to different intentions. For NGSIM, the MHD of vehicles going straight is larger than that of the others for all the models. We speculate that the reason is that, in the NGSIM data, a large proportion of the samples are straight-going vehicles, and mistaken predictions of straight travel lead to an increase in MHD. In the INTERACTION dataset, the proportion of samples of each intention is relatively balanced, so MHD is not as sensitive to different intentions as with NGSIM.
For intersection scenarios, we propose an LSTM-based framework with intention and trajectory prediction modules. The design of the upper and lower structures makes full use of the guiding effect of intention prediction on the trajectory, and improves the accuracy of prediction. The IPTM introduced in the framework not only serves as virtual lanes that approximate the distribution of trajectories, but also allows the credibility of prediction results to be evaluated when the ground truth is unknown. The validity of the model is verified on two datasets. In the future, we will introduce other features, such as vehicle turn signals, into our model, and consider the interactions with oncoming vehicles in the conflict zone and the interactions between vehicles and pedestrians.
ACKNOWLEDGMENT
The authors would like to thank the supporting organizations, including the National Natural Science Foundation of China, the China Post-doctoral Management Committee, etc. Many thanks to the open-source work of NGSIM, INTERACTION, PyTorch and other public datasets and tools.
IEEE/CAA Journal of Automatica Sinica, 2021, Issue 10