Title: "Learning a visuomotor controller by planning trajectories in simulation"

Short abstract:

The idea of my thesis is to combine Reinforcement Learning (RL) with planning. I use the V-REP robotics simulator, OMPL for motion planning, and PyTorch for deep learning. A common problem is that deep learning approaches typically require a large amount of data to train a model. In particular, collecting training experience on a real robot is time-consuming and requires human intervention. Although there have been advances in learning policies in simulation using RL, fundamental challenges remain, including sample inefficiency, slow convergence, and hyperparameter tuning. Instead of using RL, I propose learning the value function or policy directly by planning trajectories in simulation and generating training experience along these trajectories. A planner with full state feedback finds collision-free trajectories that take the robotic hand from a random initial configuration to a target configuration. I then use the observations and actions along these trajectories to train a policy. Alternatively, I learn a Q-function, where the planner calculates the value of a state (e.g., the number of steps to reach the target). Subsequently, I use RL to fine-tune the control policy.
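To make the data-generation idea concrete, here is a minimal sketch in plain Python. It is not the thesis implementation: a small grid world stands in for the V-REP scene, a breadth-first search stands in for the OMPL planner, and grid cells stand in for the camera observations. The names `plan`, `make_training_data`, `ACTIONS`, and `GRID` are hypothetical and chosen only for illustration. The sketch shows how one planned trajectory yields both (state, action) pairs for supervised policy learning and steps-to-go labels for value learning.

```python
from collections import deque

# Hypothetical 4-connected grid world standing in for the simulator (assumption).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def plan(grid, start, goal):
    """Breadth-first search with full state access (toy stand-in for a planner).

    Returns a list of (state, action) pairs leading from start to goal,
    or None if the goal is unreachable. Cells with value 1 are obstacles.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            break
        for name, (dr, dc) in ACTIONS.items():
            nxt = (state[0] + dr, state[1] + dc)
            in_bounds = 0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
            if in_bounds and grid[nxt[0]][nxt[1]] == 0 and nxt not in parent:
                parent[nxt] = (state, name)
                queue.append(nxt)
    if goal not in parent:
        return None
    # Walk back from the goal to recover the (state, action) sequence.
    steps = []
    state = goal
    while parent[state] is not None:
        prev, action = parent[state]
        steps.append((prev, action))
        state = prev
    steps.reverse()
    return steps


def make_training_data(grid, start, goal):
    """Label each state on the planned path with its action and steps-to-go."""
    steps = plan(grid, start, goal)
    if steps is None:
        return [], []
    policy_data = list(steps)  # (state, action) pairs for policy learning
    n = len(steps)
    # Value label: number of remaining steps until the target is reached.
    value_data = [(state, n - i) for i, (state, _) in enumerate(steps)]
    return policy_data, value_data


GRID = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
policy_data, value_data = make_training_data(GRID, (0, 0), (2, 0))
```

In the actual setting, the grid cells would presumably be replaced by images rendered in simulation and the labels fed to a network trained in PyTorch; those parts are left out here to keep the sketch self-contained.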

Thesis proposal document:

Thesis proposal (11-12-18)

Thesis proposal defense presentation:

Thesis proposal presentation (12-12-18)

Thesis committee members:

Robert Platt (advisor), homepage

Christopher Amato (CCIS, Northeastern), homepage

Hanumant Singh (ECE, Northeastern), homepage

Kate Saenko (Boston University), homepage

Justification of the choice of the committee by advisor Robert Platt:

Christopher Amato is an expert on POMDPs, ML, and applications to robotics. As such, he can ensure that our algorithmic contributions are

Hanumant Singh is an expert on field robotics. He can ensure that our applications are strong.

Kate Saenko is an expert in computer vision. As such, she can ensure that the perceptual aspects of Uli's work are strong.