ARCH: Hierarchical Hybrid Learning for Long-Horizon Contact-Rich Robotic Assembly

1Stanford University, 2MIT, 3University of Michigan, 4Autodesk Research



Abstract

Generalizable long-horizon robotic assembly requires reasoning at multiple levels of abstraction. End-to-end imitation learning (IL) has proven to be a promising approach, but it requires large amounts of demonstration data for training and often fails to meet the high-precision requirements of assembly tasks. Reinforcement learning (RL) approaches have succeeded at high-precision assembly but suffer from sample inefficiency and are therefore less suited to long-horizon tasks. To address these challenges, we propose a hierarchical, modular approach named ARCH (Adaptive Robotic Compositional Hierarchy), which enables long-horizon, high-precision assembly in contact-rich settings. ARCH employs a hierarchical planning framework consisting of a low-level library of continuously parameterized primitive skills and a high-level policy. The primitive library covers skills essential for assembly, such as grasping and inserting, implemented by both RL and model-based controllers. The high-level policy, learned via imitation learning from a handful of demonstrations, selects the appropriate primitive skill and instantiates it with continuous input parameters. We extensively evaluate our approach on a real robot manipulation platform and show that, although trained on a single task, ARCH generalizes well to unseen tasks and outperforms baseline methods in success rate and data efficiency.


Overview of ARCH


We propose a hierarchical framework for long-horizon robotic assembly. The high-level policy, obtained via imitation learning, takes as input the object pose from pose estimation and robot proprioception, and outputs a categorical distribution for selecting the best low-level primitive together with that primitive's continuous parameters. The low-level policy executes the selected primitive using either an RL-based or a motion-planned (MP) controller. For the contact-rich portion of the task, e.g. insertion, we train an RL policy in simulation based on force-torque feedback; for primitives in free space, e.g. move, we use MP policies.
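The two-level control loop described above can be sketched in code. This is a minimal, hypothetical illustration, not the paper's implementation: the primitive names, the observation layout, and the scoring function are all placeholder assumptions; the point is the interface, where the high-level policy emits a categorical choice over primitives plus continuous parameters, and a dispatcher routes contact-rich primitives to an RL controller and free-space primitives to motion planning.

```python
import math
import random

# Hypothetical primitive library (names are illustrative, not from the paper).
PRIMITIVES = ["grasp", "insert", "move"]

def high_level_policy(object_pose, proprioception, rng):
    """Stand-in for the imitation-learned high-level policy: produces a
    categorical distribution over primitives and continuous parameters."""
    obs = list(object_pose) + list(proprioception)
    # Placeholder scoring; a learned policy would compute these logits.
    logits = [math.tanh(x) for x in obs[: len(PRIMITIVES)]]
    z = sum(math.exp(l) for l in logits)
    probs = [math.exp(l) / z for l in logits]          # softmax -> categorical
    idx = rng.choices(range(len(PRIMITIVES)), weights=probs, k=1)[0]
    params = obs[:3]                                   # e.g. a target pose offset
    return PRIMITIVES[idx], params

def execute_primitive(name, params):
    """Dispatch: the contact-rich primitive runs an RL controller (trained in
    simulation with force-torque feedback); free-space primitives run motion
    planning. Both controllers are stubbed out here."""
    controller = "RL" if name == "insert" else "MP"
    return f"{controller} policy executes '{name}' with params {params}"

if __name__ == "__main__":
    rng = random.Random(0)
    skill, params = high_level_policy([0.1] * 7, [0.0] * 7, rng)
    print(execute_primitive(skill, params))
```

In this sketch the high-level policy is queried once per primitive invocation rather than per control step, which reflects the hierarchy's purpose: the expensive, learned decision happens at a coarse timescale while each primitive handles its own fine-grained control.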


Generalization to Unseen Objects