In the spatial action representation, the action space spans the space of target poses for robot motion commands, i.e. SE(2) or SE(3). This approach has been used to solve challenging robotic manipulation problems and shows promise. However, the method is often limited to a three-dimensional action space and short-horizon tasks. This paper proposes ASRSE3, a new method for handling higher-dimensional spatial action spaces that transforms an original MDP with a high-dimensional action space into a new MDP with a reduced action space and an augmented state space. We also propose SDQfD, a variation of DQfD designed for large action spaces. ASRSE3 and SDQfD are evaluated on a set of challenging block construction tasks. We show that both methods outperform standard baselines and can be used in practice on real robotic systems.
Our model learns to select actions sequentially. q1 maps the heightmap image onto a |X| × |Y| map of Q values with a maximum at a_xy. Given that selection of a_xy, q2 predicts Q values for a_theta. Similarly, q3, q4, and q5 predict Q values for a_z, a_phi, and a_psi, respectively.
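The sequential selection described in the caption above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the discretization sizes, the function names `make_toy_q` and `select_action_sequentially`, and the random lookup tables standing in for the trained networks q1 through q5 are all assumptions made for the example.

```python
import numpy as np

# Hypothetical discretization sizes, chosen only for illustration.
# Order: a_xy, a_theta, a_z, a_phi, a_psi.
SHAPES = [(8, 8), (8,), (6,), (5,), (5,)]


def make_toy_q(shape, seed):
    """Return a stand-in for one Q function (a lookup table, not a network)."""
    table = np.random.default_rng(seed).random(shape)

    def q(heightmap, partial_action):
        # Shift the table by the earlier choices so the output depends on
        # them, loosely mimicking conditioning on the partial action.
        shift = sum(sum(index) for index in partial_action)
        return np.roll(table, shift) + heightmap.mean()

    return q


def select_action_sequentially(heightmap, q_functions):
    """Greedily pick one action component at a time."""
    partial_action = []
    for q in q_functions:
        # Each Q function sees the heightmap and the components chosen so far.
        q_values = q(heightmap, tuple(partial_action))
        flat_best = int(np.argmax(q_values))
        index = np.unravel_index(flat_best, q_values.shape)
        partial_action.append(tuple(int(i) for i in index))
    return partial_action


heightmap = np.zeros((8, 8))  # placeholder heightmap image
q_functions = [make_toy_q(shape, seed) for seed, shape in enumerate(SHAPES)]
action = select_action_sequentially(heightmap, q_functions)
print(action)  # five index tuples, one per action component
```

With these assumed sizes, a single joint Q function would have to score 8 · 8 · 8 · 6 · 5 · 5 = 76,800 actions per step, whereas the sequential scheme evaluates 64 + 8 + 6 + 5 + 5 = 88 values across the five stages.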
Dian Wang, Colin Kohler, Robert Platt
Northeastern University
@article{wang2020policy,
title={Policy learning in SE (3) action spaces},
author={Wang, Dian and Kohler, Colin and Platt, Robert},
journal={arXiv preprint arXiv:2010.02798},
year={2020}
}
Agent: https://github.com/pointW/asrse3_corl20
Environment: https://github.com/ColinKohler/helping_hands_rl_envs
If you have any questions, please feel free to contact Dian Wang at wang[dot]dian[at]northeastern[dot]edu.