Policy learning in SE(3) action spaces

In the spatial action representation, the action space spans the space of target poses for robot motion commands, i.e. SE(2) or SE(3). This approach has been used to solve challenging robotic manipulation problems and shows promise. However, the method is often limited to a three dimensional action space and short horizon tasks. This paper proposes ASRSE3, a new method for handling higher dimensional spatial action spaces that transforms an original MDP with high dimensional action space into a new MDP with reduced action space and augmented state space. We also propose SDQfD, a variation of DQfD designed for large action spaces. ASRSE3 and SDQfD are evaluated in the context of a set of challenging block construction tasks. We show that both methods outperform standard baselines and can be used in practice on real robotics systems.

Algorithm

Our model learns to select actions sequentially. q1 maps the heightmap image onto a |X| × |Y| map of Q values with a maximum at a_xy. Given that selection of a_xy, q2 predicts Q values for a_theta. Similarly, q3, q4, and q5 predict Q values for a_z, a_phi, and a_psi, respectively.
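The sketch below illustrates this sequential (greedy) action selection, assuming random stand-in Q-functions rather than the paper's trained networks; the heightmap size, bin counts, and the q_next helper are hypothetical choices for illustration. The actual architectures are in the asrse3_corl20 repository linked under Code.

```python
import numpy as np

# Minimal sketch of sequential action selection (not the paper's implementation).
rng = np.random.default_rng(0)

def q1(heightmap):
    # Q-map over the |X| x |Y| grid of candidate (x, y) positions.
    return rng.random(heightmap.shape)

def q_next(partial_action, num_bins):
    # Q-values over one discretized action dimension, conditioned on the
    # action components selected so far (hypothetical signature).
    return rng.random(num_bins)

heightmap = np.zeros((90, 90))                       # top-down heightmap observation
q_xy = q1(heightmap)
a_xy = np.unravel_index(q_xy.argmax(), q_xy.shape)   # argmax over (x, y)

a_theta = q_next((a_xy,), num_bins=16).argmax()            # gripper rotation theta
a_z     = q_next((a_xy, a_theta), num_bins=8).argmax()     # height z
a_phi   = q_next((a_xy, a_theta, a_z), num_bins=8).argmax()
a_psi   = q_next((a_xy, a_theta, a_z, a_phi), num_bins=8).argmax()

action = (a_xy, a_theta, a_z, a_phi, a_psi)
print(action)
```

Each component is chosen by maximizing its Q-function conditioned on the components already selected, so the full SE(3) action is assembled one dimension at a time rather than searched over jointly.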

Paper

arXiv:2010.02798 [cs.RO]

Dian Wang, Colin Kohler, Robert Platt

Northeastern University

@article{wang2020policy,
  title={Policy learning in SE(3) action spaces},
  author={Wang, Dian and Kohler, Colin and Platt, Robert},
  journal={arXiv preprint arXiv:2010.02798},
  year={2020}
}

Video

Presentation

Code

Agent: https://github.com/pointW/asrse3_corl20

Environment: https://github.com/ColinKohler/helping_hands_rl_envs

Contact

If you have any questions, please feel free to contact Dian Wang at wang[dot]dian[at]northeastern[dot]edu.