- (book) Dynamic Programming, Bellman R. (1957).
- (book) Dynamic Programming and Optimal Control, Volumes 1 and 2, Bertsekas D. (1995).
- (book) Markov Decision Processes - Discrete Stochastic Dynamic Programming, Puterman M. (1995).
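
These dynamic-programming texts all build on the Bellman optimality recursion V(s) = max_a Σ_s' P(s'|s,a) [R(s,a) + γ V(s')]. A minimal tabular value-iteration sketch, for illustration only (the transition and reward arrays are hypothetical placeholders, not taken from any cited book):

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-6):
    """Tabular value iteration.

    P: shape (S, A, S), P[s, a, s'] = transition probability.
    R: shape (S, A), expected immediate reward.
    Returns the optimal state values and a greedy policy.
    """
    n_states, n_actions, _ = P.shape
    V = np.zeros(n_states)
    while True:
        Q = R + gamma * (P @ V)          # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] V[s']
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new
```
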
- [ExpectiMinimax] Optimal strategy in games with chance nodes, Melkó E., Nagy B. (2007).
- [Sparse sampling] A sparse sampling algorithm for near-optimal planning in large Markov decision processes, Kearns M. et al. (2002).
- [MCTS] Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, Coulom R. (2006).
- [UCT] Bandit based Monte-Carlo Planning, Kocsis L., Szepesvári C. (2006).
- Bandit Algorithms for Tree Search, Coquelin P-A., Munos R. (2007).
- [OPD] Optimistic Planning for Deterministic Systems, Hren J., Munos R. (2008).
- [OLOP] Open Loop Optimistic Planning, Bubeck S., Munos R. (2010).
- [OPSS] Optimistic planning for sparsely stochastic systems, Buşoniu L., Munos R., De Schutter B., Babuska R. (2011).
- [LGP] Logic-Geometric Programming: An Optimization-Based Approach to Combined Task and Motion Planning, Toussaint M. (2015).
- [AlphaGo] Mastering the game of Go with deep neural networks and tree search, Silver D. et al. (2016).
- [AlphaGo Zero] Mastering the game of Go without human knowledge, Silver D. et al. (2017).
- [AlphaZero] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, Silver D. et al. (2017).
- [TrailBlazer] Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning, Grill J-B., Valko M., Munos R. (2017).
- [MCTSnets] Learning to search with MCTSnets, Guez A. et al. (2018).
- [ADI] Solving the Rubik's Cube Without Human Knowledge, McAleer S. et al. (2018).
- [OPC/SOPC] Continuous-action planning for discounted infinite-horizon nonlinear optimal control with Lipschitz values, Buşoniu L., Pall E., Munos R. (2018).
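
Most of the tree-search entries above, from MCTS and UCT through AlphaGo, share the UCB-style selection rule of Kocsis and Szepesvári: descend the tree by picking the child that maximises its mean value plus an exploration bonus. A sketch of that selection step only (the node representation and exploration constant are illustrative assumptions, not any paper's implementation):

```python
import math

def uct_select(children, parent_visits, c=1.414):
    """UCT child selection: argmax over children of mean value + exploration bonus.

    children: list of (visit_count, total_value) pairs, each with visit_count > 0.
    parent_visits: number of visits of the parent node.
    """
    def ucb(child):
        n, w = child
        return w / n + c * math.sqrt(math.log(parent_visits) / n)

    return max(range(len(children)), key=lambda i: ucb(children[i]))
```
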
- (book) Constrained Control and Estimation, Goodwin G. (2005).
- [PI²] A Generalized Path Integral Control Approach to Reinforcement Learning, Theodorou E. et al. (2010).
- [PI²-CMA] Path Integral Policy Improvement with Covariance Matrix Adaptation, Stulp F., Sigaud O. (2010).
- [iLQG] A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems, Todorov E. (2005).
- [iLQG+] Synthesis and stabilization of complex behaviors through online trajectory optimization, Tassa Y. (2012).
- (book) Model Predictive Control, Camacho E. (1995).
- (book) Predictive Control With Constraints, Maciejowski J. M. (2002).
- Linear Model Predictive Control for Lane Keeping and Obstacle Avoidance on Low Curvature Roads, Turri V. et al. (2013).
- [MPCC] Optimization-based autonomous racing of 1:43 scale RC cars, Liniger A. et al. (2014).
- [MIQP] Optimal trajectory planning for autonomous driving integrating logical constraints: An MIQP perspective, Qian X., Altché F., Bender P., Stiller C., de La Fortelle A. (2016).
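
Whatever the underlying optimiser (LQG, contouring control, MIQP), the MPC references above share one loop: at each step, solve a finite-horizon problem from the current state, apply only the first control, and re-plan at the next step. A schematic sketch of that receding-horizon loop; `observe_state`, `solve_finite_horizon` and `apply_control` are hypothetical placeholders standing in for a real estimator, solver and plant:

```python
def receding_horizon_loop(observe_state, solve_finite_horizon, apply_control,
                          horizon=20, n_steps=100):
    """Generic MPC loop: re-optimise over a sliding horizon, keep only the first input.

    observe_state() -> current state x
    solve_finite_horizon(x, horizon) -> planned inputs [u_0, ..., u_{horizon-1}]
    apply_control(u) -> applies one input to the plant
    """
    for _ in range(n_steps):
        x = observe_state()
        plan = solve_finite_horizon(x, horizon)   # open-loop plan from the current state
        apply_control(plan[0])                    # closed loop: apply only the first input
```
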
- Minimax analysis of stochastic problems, Shapiro A., Kleywegt A. (2002).
- [Robust DP] Robust Dynamic Programming, Iyengar G. (2005).
- Robust Planning and Optimization, Laumanns M. (2011). (lecture notes)
- Robust Markov Decision Processes, Wiesemann W., Kuhn D., Rustem B. (2012).
- Safe and Robust Learning Control with Gaussian Processes, Berkenkamp F., Schoellig A. (2015).
- [Coarse-Id] On the Sample Complexity of the Linear Quadratic Regulator, Dean S., Mania H., Matni N., Recht B., Tu S. (2017).
- [Tube-MPPI] Robust Sampling Based Model Predictive Control with Sparse Objective Information, Williams G. et al. (2018).
- A Comprehensive Survey on Safe Reinforcement Learning, García J., Fernández F. (2015).
- [RA-QMDP] Risk-averse Behavior Planning for Autonomous Driving under Uncertainty, Naghshvar M. et al. (2018).
- [ICS] Will the Driver Seat Ever Be Empty?, Fraichard T. (2014).
- [SafeOPT] Safe Controller Optimization for Quadrotors with Gaussian Processes, Berkenkamp F., Schoellig A., Krause A. (2015).
- [SafeMDP] Safe Exploration in Finite Markov Decision Processes with Gaussian Processes, Turchetta M., Berkenkamp F., Krause A. (2016).
- [RSS] On a Formal Model of Safe and Scalable Self-driving Cars, Shalev-Shwartz S. et al. (2017).
- [HJI-reachability] Safe learning for control: Combining disturbance estimation, reachability analysis and reinforcement learning with systematic exploration, Heidenreich C. (2017).
- [CPO] Constrained Policy Optimization, Achiam J., Held D., Tamar A., Abbeel P. (2017).
- [RCPO] Reward Constrained Policy Optimization, Tessler C., Mankowitz D., Mannor S. (2018).
- [BFTQ] A Fitted-Q Algorithm for Budgeted MDPs, Carrara N. et al. (2018).
- [MPC-HJI] On Infusing Reachability-Based Safety Assurance within Probabilistic Planning Frameworks for Human-Robot Vehicle Interactions, Leung K. et al. (2018).
- [LTL-RL] Reinforcement Learning with Probabilistic Guarantees for Autonomous Driving, Bouton M. et al. (2019).
- Safe Reinforcement Learning with Scene Decomposition for Navigating Complex Urban Environments, Bouton M. et al. (2019).
- Batch Policy Learning under Constraints, Le H., Voloshin C., Yue Y. (2019).
- Simulation of Controlled Uncertain Nonlinear Systems, Tibken B., Hofer E. (1995).
- Trajectory computation of dynamic uncertain systems, Adrot O., Flaus J-M. (2002).
- Simulation of Uncertain Dynamic Systems Described By Interval Models: a Survey, Puig V. et al. (2005).
- Design of interval observers for uncertain dynamical systems, Efimov D., Raïssi T. (2016).
- [TS] On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, Thompson W. (1933).
- [UCB1 / UCB2] Finite-time Analysis of the Multiarmed Bandit Problem, Auer P., Cesa-Bianchi N., Fischer P. (2002).
- [Empirical Bernstein / UCB-V] Exploration-exploitation tradeoff using variance estimates in multi-armed bandits, Audibert J-Y., Munos R., Szepesvári C. (2009).
- Empirical Bernstein Bounds and Sample Variance Penalization, Maurer A., Pontil M. (2009).
- An Empirical Evaluation of Thompson Sampling, Chapelle O., Li L. (2011).
- [kl-UCB] The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond, Garivier A., Cappé O. (2011).
- [KL-UCB] Kullback-Leibler Upper Confidence Bounds for Optimal Sequential Allocation, Cappé O. et al. (2013).
- [LUCB] PAC Subset Selection in Stochastic Multi-armed Bandits, Kalyanakrishnan S. et al. (2012).
- [Track-and-Stop] Optimal Best Arm Identification with Fixed Confidence, Garivier A., Kaufmann E. (2016).
- [M-LUCB / M-Racing] Maximin Action Identification: A New Bandit Framework for Games, Garivier A., Kaufmann E., Koolen W. (2016).
- [LUCB-micro] Structured Best Arm Identification with Fixed Confidence, Huang R. et al. (2017).
- [GP-UCB] Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design, Srinivas N., Krause A., Kakade S., Seeger M. (2009).
- [DOO/SOO] Optimistic Optimization of a Deterministic Function without the Knowledge of its Smoothness, Munos R. (2011).
- [POO] Black-box optimization of noisy functions with unknown smoothness, Grill J-B., Valko M., Munos R. (2015).
- Bayesian Optimization in AlphaGo, Chen Y. et al. (2018).
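
Most of the stochastic bandit entries above refine the template of UCB1 (Auer et al., 2002): after playing each arm once, pull the arm with the largest empirical mean plus a sqrt(2 ln t / n) confidence bonus. A minimal sketch with Bernoulli arms; the `pull` function is a hypothetical stand-in for the environment:

```python
import math
import random

def ucb1(pull, n_arms, n_rounds):
    """UCB1 for rewards in [0, 1]. pull(arm) returns one sampled reward."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1                                   # play each arm once first
        else:
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2.0 * math.log(t) / counts[a]))
        sums[arm] += pull(arm)
        counts[arm] += 1
    return counts

# Example with three Bernoulli arms of unknown means:
means = [0.2, 0.5, 0.7]
pulls = ucb1(lambda a: float(random.random() < means[a]), n_arms=3, n_rounds=2000)
```
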
- Reinforcement learning: A survey, Kaelbling L. et al. (1996).
- [NFQ] Neural fitted Q iteration - First experiences with a data efficient neural Reinforcement Learning method, Riedmiller M. (2005).
- [DQN] Playing Atari with Deep Reinforcement Learning, Mnih V. et al. (2013).
- [DDQN] Deep Reinforcement Learning with Double Q-learning, van Hasselt H., Silver D. et al. (2015).
- [DDDQN] Dueling Network Architectures for Deep Reinforcement Learning, Wang Z. et al. (2015).
- [PDDDQN] Prioritized Experience Replay, Schaul T. et al. (2015).
- [NAF] Continuous Deep Q-Learning with Model-based Acceleration, Gu S. et al. (2016).
- [Rainbow] Rainbow: Combining Improvements in Deep Reinforcement Learning, Hessel M. et al. (2017).
- [Ape-X DQfD] Observe and Look Further: Achieving Consistent Performance on Atari, Pohlen T. et al. (2018).
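
From NFQ to Rainbow, these value-based methods regress a Q-network towards a bootstrapped target: DQN uses r + γ max_a' Q_target(s', a') (cut off at terminal states), while Double DQN selects the action with the online network and evaluates it with the target network. A sketch of both target computations; the array shapes are illustrative assumptions:

```python
import numpy as np

def dqn_targets(rewards, dones, next_q_target, gamma=0.99):
    """DQN target: r + gamma * max_a' Q_target(s', a'), zeroed at terminal states.

    rewards, dones: shape (batch,); next_q_target: shape (batch, n_actions).
    """
    return rewards + gamma * (1.0 - dones) * next_q_target.max(axis=1)

def double_dqn_targets(rewards, dones, next_q_online, next_q_target, gamma=0.99):
    """Double DQN target: pick the argmax with the online net, evaluate it with the target net."""
    best = next_q_online.argmax(axis=1)
    chosen = next_q_target[np.arange(len(best)), best]
    return rewards + gamma * (1.0 - dones) * chosen
```
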
- [REINFORCE] Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, Williams R. (1992).
- [Natural Gradient] A Natural Policy Gradient, Kakade S. (2002).
- Policy Gradient Methods for Robotics, Peters J., Schaal S. (2006).
- [TRPO] Trust Region Policy Optimization, Schulman J. et al. (2015).
- [PPO] Proximal Policy Optimization Algorithms, Schulman J. et al. (2017).
- [DPPO] Emergence of Locomotion Behaviours in Rich Environments, Heess N. et al. (2017).
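
The policy-gradient line from REINFORCE to TRPO/PPO estimates ∇_θ J = E[Σ_t ∇_θ log π_θ(a_t|s_t) G_t] from sampled episodes, with baselines, trust regions or clipping added on top. A sketch of the basic return-weighted score-function estimator for a tabular softmax policy (the episode format is an illustrative assumption):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def reinforce_gradient(theta, episode, gamma=0.99):
    """REINFORCE gradient estimate for a tabular softmax policy theta[state, action].

    episode: list of (state, action, reward) tuples from one rollout under theta.
    """
    grad = np.zeros_like(theta)
    G = 0.0
    for s, a, r in reversed(episode):      # accumulate discounted returns G_t backwards
        G = r + gamma * G
        pi = softmax(theta[s])
        dlog = -pi                         # gradient of log pi(a|s) w.r.t. theta[s, :]
        dlog[a] += 1.0
        grad[s] += dlog * G
    return grad

# Ascent step (rollout() is a hypothetical function collecting one episode under theta):
# theta += learning_rate * reinforce_gradient(theta, rollout(theta))
```
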
- [AC] Policy Gradient Methods for Reinforcement Learning with Function Approximation, Sutton R. et al. (1999).
- [NAC] Natural Actor-Critic, Peters J. et al. (2005).
- [DPG] Deterministic Policy Gradient Algorithms, Silver D. et al. (2014).
- [DDPG] Continuous Control With Deep Reinforcement Learning, Lillicrap T. et al. (2015).
- [MACE] Terrain-Adaptive Locomotion Skills Using Deep Reinforcement Learning, Peng X., Berseth G., van de Panne M. (2016).
- [A3C] Asynchronous Methods for Deep Reinforcement Learning, Mnih V. et al. (2016).
- [SAC] Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, Haarnoja T. et al. (2018).
- [CEM] Learning Tetris Using the Noisy Cross-Entropy Method, Szita I., Lőrincz A. (2006).
- [CMAES] Completely Derandomized Self-Adaptation in Evolution Strategies, Hansen N., Ostermeier A. (2001).
- [NEAT] Evolving Neural Networks through Augmenting Topologies, Stanley K. (2002).
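
The cross-entropy method and CMA-ES above are black-box searches over policy parameters: sample a population from a distribution, keep the elite fraction, refit the distribution, and repeat. A minimal diagonal-Gaussian CEM sketch; the objective `f` is a hypothetical placeholder:

```python
import numpy as np

def cross_entropy_method(f, dim, n_iter=50, pop_size=64, elite_frac=0.125, sigma0=1.0):
    """Maximise f(x) by repeatedly refitting a diagonal Gaussian to the elite samples."""
    mean, std = np.zeros(dim), np.full(dim, sigma0)
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(n_iter):
        samples = mean + std * np.random.randn(pop_size, dim)
        scores = np.array([f(x) for x in samples])
        elites = samples[np.argsort(scores)[-n_elite:]]       # keep the best samples
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-8
    return mean

# Example: recover the maximiser of a simple concave objective.
best = cross_entropy_method(lambda x: -np.sum((x - 3.0) ** 2), dim=5)
```
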
- [Dyna] Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, Sutton R. (1990).
- [UCRL2] Near-optimal Regret Bounds for Reinforcement Learning, Jaksch T. (2010).
- [PILCO] PILCO: A Model-Based and Data-Efficient Approach to Policy Search, Deisenroth M., Rasmussen C. (2011). (talk)
- [DBN] Probabilistic MDP-behavior planning for cars, Brechtel S. et al. (2011).
- [GPS] End-to-End Training of Deep Visuomotor Policies, Levine S. et al. (2015).
- [DeepMPC] DeepMPC: Learning Deep Latent Features for Model Predictive Control, Lenz I. et al. (2015).
- [SVG] Learning Continuous Control Policies by Stochastic Value Gradients, Heess N. et al. (2015).
- Optimal control with learned local models: Application to dexterous manipulation, Kumar V. et al. (2016).
- [BPTT] Long-term Planning by Short-term Prediction, Shalev-Shwartz S. et al. (2016).
- Deep visual foresight for planning robot motion, Finn C., Levine S. (2016).
- [VIN] Value Iteration Networks, Tamar A. et al. (2016).
- [VPN] Value Prediction Network, Oh J. et al. (2017).
- An LSTM Network for Highway Trajectory Prediction, Altché F., de La Fortelle A. (2017).
- [DistGBP] Model-Based Planning with Discrete and Continuous Actions, Henaff M. et al. (2017).
- Prediction and Control with Temporal Segment Models, Mishra N. et al. (2017).
- [Predictron] The Predictron: End-To-End Learning and Planning, Silver D. et al. (2017).
- [MPPI] Information Theoretic MPC for Model-Based Reinforcement Learning, Williams G. et al. (2017).
- Learning Real-World Robot Policies by Dreaming, Piergiovanni A. et al. (2018).
- Coupled Longitudinal and Lateral Control of a Vehicle using Deep Learning, Devineau G., Polack P., Altché F., Moutarde F. (2018).
- [PlaNet] Learning Latent Dynamics for Planning from Pixels, Hafner et al. (2018).
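
Sutton's Dyna, the first entry of this group, interleaves real experience with planning updates drawn from a learned model; the later entries replace the tabular model with Gaussian processes, learned latent dynamics, or value and reward predictors. A tabular Dyna-Q sketch of that interleaving (the environment interface and action set are illustrative assumptions):

```python
import random

def dyna_q_step(Q, model, actions, s, a, r, s_next,
                alpha=0.1, gamma=0.95, n_planning=10):
    """One Dyna-Q step: a real Q-learning update plus n planning updates from the model.

    Q: mapping (state, action) -> value with default 0.0 (e.g. collections.defaultdict(float)).
    model: dict (state, action) -> (reward, next_state), a learned deterministic model.
    """
    def q_update(s, a, r, s_next):
        best_next = max(Q[(s_next, b)] for b in actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

    q_update(s, a, r, s_next)              # learn from the real transition
    model[(s, a)] = (r, s_next)            # record it in the model
    for _ in range(n_planning):            # planning: replay transitions sampled from the model
        (ps, pa), (pr, pn) = random.choice(list(model.items()))
        q_update(ps, pa, pr, pn)
```
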
- Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear, Lipton Z. et al. (2016).
- [HER] Hindsight Experience Replay, Andrychowicz M. et al. (2017).
- [VHER] Visual Hindsight Experience Replay, Sahni H. et al. (2019).
- [RND] Exploration by Random Network Distillation, Burda Y. et al. (OpenAI) (2018).
- [Go-Explore] Go-Explore: a New Approach for Hard-Exploration Problems, Ecoffet A. et al. (Uber) (2018).
- Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning, Sutton R. et al. (1999).
- Intrinsically motivated learning of hierarchical collections of skills, Barto A. et al. (2004).
- [OC] The Option-Critic Architecture, Bacon P-L., Harb J., Precup D. (2016).
- Learning and Transfer of Modulated Locomotor Controllers, Heess N. et al. (2016).
- Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving, Shalev-Shwartz S. et al. (2016).
- [FuNs] FeUdal Networks for Hierarchical Reinforcement Learning, Vezhnevets A. et al. (2017).
- Combining Neural Networks and Tree Search for Task and Motion Planning in Challenging Environments, Paxton C. et al. (2017).
- [DeepLoco] DeepLoco: Dynamic Locomotion Skills Using Hierarchical Deep Reinforcement Learning, Peng X. et al. (2017).
- Hierarchical Policy Design for Sample-Efficient Learning of Robot Table Tennis Through Self-Play, Mahjourian R. et al. (2018).
- [DAC] DAC: The Double Actor-Critic Architecture for Learning Options, Zhang S., Whiteson S. (2019).
- [PBVI] Point-based Value Iteration: An anytime algorithm for POMDPs, Pineau J. et al. (2003).
- [cPBVI] Point-Based Value Iteration for Continuous POMDPs, Porta J. et al. (2006).
- [POMCP] Monte-Carlo Planning in Large POMDPs, Silver D., Veness J. (2010).
- A POMDP Approach to Robot Motion Planning under Uncertainty, Du Y. et al. (2010).
- Solving Continuous POMDPs: Value Iteration with Incremental Learning of an Efficient Space Representation, Brechtel S. et al. (2013).
- Probabilistic Decision-Making under Uncertainty for Autonomous Driving using Continuous POMDPs, Brechtel S. et al. (2014).
- [MOMDP] Intention-Aware Motion Planning, Bandyopadhyay T. et al. (2013).
- The value of inferring the internal state of traffic participants for autonomous freeway driving, Sunberg Z. et al. (2017).
- Belief State Planning for Autonomously Navigating Urban Intersections, Bouton M., Cosgun A., Kochenderfer M. (2017).
- Probabilistic Decision-Making at Road Intersections: Formulation and Quantitative Evaluation, Barbier M., Laugier C., Simonin O., Ibanez J. (2018).
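
All of the POMDP entries above maintain a belief b(s) over the hidden state and update it after each action a and observation o by Bayes' rule, b'(s') ∝ O(o | s', a) Σ_s T(s' | s, a) b(s); they differ mainly in how they plan over that belief. A minimal discrete belief-update sketch (array shapes are illustrative assumptions):

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Discrete Bayes filter for a POMDP.

    b: belief over states, shape (S,).
    T: T[a, s, s'] = P(s' | s, a), shape (A, S, S).
    O: O[a, s', o] = P(o | s', a), shape (A, S, n_obs).
    """
    predicted = b @ T[a]                   # sum_s b[s] * T[a, s, s']
    posterior = O[a, :, o] * predicted     # weight by the observation likelihood
    return posterior / posterior.sum()     # normalise (assumes the observation has nonzero probability)
```
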
- [IT&E] Robots that can adapt like animals, Cully A., Clune J., Tarapore D., Mouret J-B. (2014).
- [MAML] Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Finn C., Abbeel P., Levine S. (2017).
- Virtual to Real Reinforcement Learning for Autonomous Driving, Pan X. et al. (2017).
- Sim-to-Real: Learning Agile Locomotion For Quadruped Robots, Tan J. et al. (2018).
- [ME-TRPO] Model-Ensemble Trust-Region Policy Optimization, Kurutach T. et al. (2018).
- Kickstarting Deep Reinforcement Learning, Schmitt S. et al. (2018).
- Learning Dexterous In-Hand Manipulation, OpenAI (2018).
- [GrBAL / ReBAL] Learning to Adapt in Dynamic, Real-World Environments Through Meta-Reinforcement Learning, Nagabandi A. et al. (2018).
- Learning agile and dynamic motor skills for legged robots, Hwangbo J. et al. (ETH Zurich / Intel ISL) (2019).
- Robust Recovery Controller for a Quadrupedal Robot using Deep Reinforcement Learning, Lee J., Hwangbo J., Hutter M. (ETH Zurich RSL) (2019).
- [IT&E] Learning and adapting quadruped gaits with the "Intelligent Trial & Error" algorithm, Dalin E., Desreumaux P., Mouret J-B. (2019).
- [Minimax-Q] Markov games as a framework for multi-agent reinforcement learning, Littman M. (1994).
- Autonomous Agents Modelling Other Agents: A Comprehensive Survey and Open Problems, Albrecht S., Stone P. (2017).
- [MILP] Time-optimal coordination of mobile robots along specified paths, Altché F. et al. (2016).
- [MIQP] An Algorithm for Supervised Driving of Cooperative Semi-Autonomous Vehicles, Altché F. et al. (2017).
- [SA-CADRL] Socially Aware Motion Planning with Deep Reinforcement Learning, Chen Y. et al. (2017).
- Multipolicy decision-making for autonomous driving via changepoint-based behavior prediction: Theory and experiment, Galceran E. et al. (2017).
- Online decision-making for scalable autonomous systems, Wray K. et al. (2017).
- [MAgent] MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence, Zheng L. et al. (2017).
- Cooperative Motion Planning for Non-Holonomic Agents with Value Iteration Networks, Rehder E. et al. (2017).
- [MPPO] Towards Optimally Decentralized Multi-Robot Collision Avoidance via Deep Reinforcement Learning, Long P. et al. (2017).
- [COMA] Counterfactual Multi-Agent Policy Gradients, Foerster J. et al. (2017).
- [FTW] Human-level performance in first-person multiplayer games with population-based deep reinforcement learning, Jaderberg M. et al. (2018).
- Variable Resolution Discretization in Optimal Control, Munos R., Moore A. (2002).
- [DeepDriving] DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving, Chen C. et al. (2015).
- On the Sample Complexity of End-to-end Training vs. Semantic Abstraction Training, Shalev-Shwartz S. et al. (2016).
- Learning sparse representations in reinforcement learning with sparse coding, Le L., Kumaraswamy M., White M. (2017).
- World Models, Ha D., Schmidhuber J. (2018).

- Learning to Drive in a Day, Kendall A. et al. (2018).
- [MERLIN] Unsupervised Predictive Memory in a Goal-Directed Agent, Wayne G. et al. (2018).
- Variational End-to-End Navigation and Localization, Amini A. et al. (2018).
- Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks, Lee M. et al. (2018).
- Deep Neuroevolution of Recurrent and Discrete World Models, Risi S., Stanley K.O. (2019).

- Is the Bellman residual a bad proxy?, Geist M., Piot B., Pietquin O. (2016).
- Deep Reinforcement Learning that Matters, Henderson P. et al. (2017).
- Automatic Bridge Bidding Using Deep Reinforcement Learning, Yeh C. and Lin H. (2016).
- Shared Autonomy via Deep Reinforcement Learning, Reddy S. et al. (2018).
- Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review, Levine S. (2018).
- The Value Function Polytope in Reinforcement Learning, Dadashi R. et al. (2019).
- On Value Functions and the Agent-Environment Boundary, Jiang N. (2019).
- [QMDP-RCNN] Reinforcement Learning via Recurrent Convolutional Neural Networks, Shankar T. et al. (2016). (talk)
- [DQfD] Learning from Demonstrations for Real World Reinforcement Learning, Hester T. et al. (2017).
- Find Your Own Way: Weakly-Supervised Segmentation of Path Proposals for Urban Autonomy, Barnes D., Maddern W., Posner I. (2016).
- [GAIL] Generative Adversarial Imitation Learning, Ho J., Ermon S. (2016).
- From perception to decision: A data-driven approach to end-to-end motion planning for autonomous ground robots, Pfeiffer M. et al. (2017).
- [Branched] End-to-end Driving via Conditional Imitation Learning, Codevilla F. et al. (2017). (talk)
- [UPN] Universal Planning Networks, Srinivas A. et al. (2018).
- [DeepMimic] DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills, Peng X. B. et al. (2018).
- [R2P2] Deep Imitative Models for Flexible Inference, Planning, and Control, Rhinehart N. et al. (2018).
- ALVINN, an autonomous land vehicle in a neural network, Pomerleau D. (1989).
- End to End Learning for Self-Driving Cars, Bojarski M. et al. (2016).
- End-to-end Learning of Driving Models from Large-scale Video Datasets, Xu H., Gao Y. et al. (2016).
- End-to-End Deep Learning for Steering Autonomous Vehicles Considering Temporal Dependencies, Eraqi H. et al. (2017).
- Driving Like a Human: Imitation Learning for Path Planning using Convolutional Neural Networks, Rehder E. et al. (2017).
- Imitating Driver Behavior with Generative Adversarial Networks, Kuefler A. et al. (2017).
- [PS-GAIL] Multi-Agent Imitation Learning for Driving Simulation, Bhattacharyya R. et al. (2018).
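
The simplest baseline behind this imitation-learning group, from ALVINN onward, is behavioural cloning: fit a policy to expert state-action pairs by supervised learning (GAIL-style methods instead match the expert's state-action distribution adversarially). A minimal behavioural-cloning sketch with a linear softmax policy; the expert arrays are hypothetical placeholders:

```python
import numpy as np

def behavioural_cloning(states, actions, n_actions, lr=0.1, n_epochs=200):
    """Maximum-likelihood fit of a linear softmax policy to expert demonstrations.

    states: array (N, d) of expert observations; actions: array (N,) of expert action indices.
    Returns weights W of shape (d, n_actions); the cloned policy acts with argmax(state @ W).
    """
    n, d = states.shape
    W = np.zeros((d, n_actions))
    onehot = np.eye(n_actions)[actions]
    for _ in range(n_epochs):
        logits = states @ W
        logits -= logits.max(axis=1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        W += lr * states.T @ (onehot - probs) / n    # gradient ascent on the log-likelihood
    return W
```
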
- [Projection] Apprenticeship learning via inverse reinforcement learning, Abbeel P., Ng A. (2004).
- [MMP] Maximum margin planning, Ratliff N. et al. (2006).
- [BIRL] Bayesian inverse reinforcement learning, Ramachandran D., Amir E. (2007).
- [MEIRL] Maximum Entropy Inverse Reinforcement Learning, Ziebart B. et al. (2008).
- [LEARCH] Learning to search: Functional gradient techniques for imitation learning, Ratliff N., Silver D., Bagnell A. (2009).
- [CIOC] Continuous Inverse Optimal Control with Locally Optimal Examples, Levine S., Koltun V. (2012).
- [MEDIRL] Maximum Entropy Deep Inverse Reinforcement Learning, Wulfmeier M. (2015).
- [GCL] Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, Finn C. et al. (2016).
- [RIRL] Repeated Inverse Reinforcement Learning, Amin K. et al. (2017).
- Bridging the Gap Between Imitation Learning and Inverse Reinforcement Learning, Piot B. et al. (2017).
- Apprenticeship Learning for Motion Planning, with Application to Parking Lot Navigation, Abbeel P. et al. (2008).
- Navigate like a cabbie: Probabilistic reasoning from observed context-aware behavior, Ziebart B. et al. (2008).
- Planning-based Prediction for Pedestrians, Ziebart B. et al. (2009).
- Learning for autonomous navigation, Bagnell A. et al. (2010).
- Learning Autonomous Driving Styles and Maneuvers from Expert Demonstration, Silver D. et al. (2012).
- Learning Driving Styles for Autonomous Vehicles from Demonstration, Kuderer M. et al. (2015).
- Learning to Drive using Inverse Reinforcement Learning and Deep Q-Networks, Sharifzadeh S. et al. (2016).
- Watch This: Scalable Cost-Function Learning for Path Planning in Urban Environments, Wulfmeier M. (2016).
- Planning for Autonomous Cars that Leverage Effects on Human Actions, Sadigh D. et al. (2016).
- A Learning-Based Framework for Handling Dilemmas in Urban Automated Driving, Lee S., Seo S. (2017).
- Learning Trajectory Prediction with Continuous Inverse Optimal Control via Langevin Sampling of Energy-Based Models, Xu Y. et al. (2019).
- [Dijkstra] A Note on Two Problems in Connexion with Graphs, Dijkstra E. W. (1959).
- [A*] A Formal Basis for the Heuristic Determination of Minimum Cost Paths, Hart P. et al. (1968).
- Planning Long Dynamically-Feasible Maneuvers For Autonomous Vehicles, Likhachev M., Ferguson D. (2008).
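
Dijkstra's algorithm and A*, the first two entries above, underlie most of the search-based planners that follow; A* is Dijkstra's algorithm with an admissible heuristic added to the queue priority. A compact A* sketch over an explicit graph (the graph encoding and heuristic are illustrative assumptions):

```python
import heapq
from itertools import count

def a_star(start, goal, neighbours, heuristic):
    """A* search. neighbours(n) yields (next_node, edge_cost); heuristic(n) must not overestimate."""
    tie = count()                           # tie-breaker so heap entries never compare nodes
    frontier = [(heuristic(start), next(tie), 0.0, start, None)]
    parents, closed = {}, set()
    while frontier:
        _, _, g, node, parent = heapq.heappop(frontier)
        if node in closed:
            continue
        closed.add(node)
        parents[node] = parent
        if node == goal:                    # reconstruct the path by walking parents back
            path = [node]
            while parents[path[-1]] is not None:
                path.append(parents[path[-1]])
            return list(reversed(path)), g
        for nxt, cost in neighbours(node):
            if nxt not in closed:
                heapq.heappush(frontier, (g + cost + heuristic(nxt), next(tie), g + cost, nxt, node))
    return None, float("inf")
```
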
- Optimal Trajectory Generation for Dynamic Street Scenarios in a Frenet Frame, Werling M., Kammel S. (2010).
- 3D perception and planning for self-driving and cooperative automobiles, Stiller C., Ziegler J. (2012).
- Motion Planning under Uncertainty for On-Road Autonomous Driving, Xu W. et al. (2014).
- Monte Carlo Tree Search for Simulated Car Racing, Fischer J. et al. (2015).
- [RRT*] Sampling-based Algorithms for Optimal Motion Planning, Karaman S., Frazzoli E. (2011).
- [LQG-MP] LQG-MP: Optimized Path Planning for Robots with Motion Uncertainty and Imperfect State Information, van den Berg J. et al. (2010).
- Motion Planning under Uncertainty using Differential Dynamic Programming in Belief Space, van den Berg J. et al. (2011).
- Rapidly-exploring Random Belief Trees for Motion Planning Under Uncertainty, Bry A., Roy N. (2011).
- [PRM-RL] PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning, Faust A. et al. (2017).
- Trajectory planning for Bertha - A local, continuous method, Ziegler J. et al. (2014).
- Learning Attractor Landscapes for Learning Motor Primitives, Ijspeert A. et al. (2002).
- [PF] Real-time obstacle avoidance for manipulators and mobile robots, Khatib O. (1986).
- [VFH] The Vector Field Histogram - Fast Obstacle Avoidance For Mobile Robots, Borenstein J. (1991).
- [VFH+] VFH+: Reliable Obstacle Avoidance for Fast Mobile Robots, Ulrich I., Borenstein J. (1998).
- [Velocity Obstacles] Motion planning in dynamic environments using velocity obstacles, Fiorini P., Shiller Z. (1998).
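
Khatib's potential-field method, the first entry above, drives the robot along the negative gradient of an attractive potential toward the goal plus repulsive potentials around nearby obstacles; VFH and velocity obstacles refine the same reactive idea. A minimal 2-D sketch of that gradient step (gains and the influence radius are illustrative assumptions):

```python
import numpy as np

def potential_field_force(pos, goal, obstacles, k_att=1.0, k_rep=1.0, rho0=2.0):
    """Negative gradient of U = k_att/2 * ||pos - goal||^2 + sum of repulsive obstacle terms.

    pos, goal: arrays of shape (2,); obstacles: list of arrays of shape (2,).
    """
    force = -k_att * (pos - goal)                        # attraction toward the goal
    for obs in obstacles:
        diff = pos - obs
        rho = np.linalg.norm(diff)
        if 0.0 < rho < rho0:                             # repulsion only inside the influence radius
            force += k_rep * (1.0 / rho - 1.0 / rho0) / rho**2 * (diff / rho)
    return force

# One reactive step (hypothetical integration of the controller):
# pos = pos + step_size * potential_field_force(pos, goal, obstacles)
```
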
- A Review of Motion Planning Techniques for Automated Vehicles, González D. et al. (2016).
- A Survey of Motion Planning and Control Techniques for Self-driving Urban Vehicles, Paden B. et al. (2016).
- Autonomous driving in urban environments: Boss and the Urban Challenge, Urmson C. et al. (2008).
- The MIT-Cornell collision and why it happened, Fletcher L. et al. (2008).
- Making Bertha drive - An autonomous journey on a historic route, Ziegler J. et al. (2014).