Cross Embodiment Robot Manipulation Skill Transfer from Cycle Consistency

This paper focuses on transferring control policies between robot manipulators with different morphologies. While reinforcement learning (RL) methods have shown successful results on robot manipulation tasks, transferring a trained policy from simulation to a real robot, or deploying it on a robot with different kinematics or dynamics, remains challenging. Our key insight for cross-embodiment policy transfer is to project the state and action spaces of the source and target robots into a common latent space. We first introduce encoders and decoders for the source robot that map states and actions between its own space and the latent space. To regularize the latent space so that latent state evolution remains consistent, we introduce a latent dynamics constraint. In this first stage, the encoders, decoders, and latent dynamics are trained simultaneously with RL. In the second stage, we use adversarial training with a cycle consistency constraint to align the latent distributions of the source and target domains using unpaired, unaligned, randomly collected data. The latent policy trained in the first stage is then combined with the encoders and decoders trained in the second stage to transfer the policy to the target robot, without access to the task reward and without reward tuning in the target domain.
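The two constraints named in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it assumes linear encoders, decoders, and latent dynamics written as plain matrices, a squared-error penalty, and hypothetical names (`latent_dynamics_loss`, `cycle_consistency_loss`, `transfer_action`) chosen for illustration. The adversarial alignment term and the RL objective are omitted.

```python
# Illustrative sketch only: linear maps stand in for the learned networks,
# and all function and variable names here are hypothetical.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def latent_dynamics_loss(enc_s, enc_a, dyn, s, a, s_next):
    """Penalize disagreement between the predicted next latent state,
    dyn([z_s, z_a]), and the encoding of the observed next state."""
    z_s = matvec(enc_s, s)
    z_a = matvec(enc_a, a)
    z_pred = matvec(dyn, z_s + z_a)  # list concatenation of z_s and z_a
    z_next = matvec(enc_s, s_next)
    return sq_dist(z_pred, z_next)

def cycle_consistency_loss(enc_src, dec_src, enc_tgt, dec_tgt, x_src):
    """Map source -> latent -> target -> latent -> source and compare
    the reconstruction with the original source sample."""
    z = matvec(enc_src, x_src)
    x_tgt = matvec(dec_tgt, z)
    z_back = matvec(enc_tgt, x_tgt)
    x_rec = matvec(dec_src, z_back)
    return sq_dist(x_rec, x_src)

def transfer_action(enc_state_tgt, latent_policy, dec_action_tgt, s_tgt):
    """Run a latent policy on the target robot: encode the target state,
    act in latent space, decode to a target action."""
    z_s = matvec(enc_state_tgt, s_tgt)
    z_a = latent_policy(z_s)
    return matvec(dec_action_tgt, z_a)

# Toy setup: 2-D source state, 1-D source action, 2-D latent, 3-D target.
ENC_S = [[1.0, 0.0], [0.0, 1.0]]
ENC_A = [[1.0]]
DYN = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]  # z_next = z_s + [z_a, 0]

ENC_SRC = [[1.0, 0.0], [0.0, 1.0]]
DEC_SRC = [[1.0, 0.0], [0.0, 1.0]]
BAD_DEC_SRC = [[2.0, 0.0], [0.0, 1.0]]    # deliberately inconsistent decoder
ENC_TGT = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
DEC_TGT = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

DEC_ACTION_TGT = [[1.0], [1.0], [1.0]]

def toy_latent_policy(z_s):
    """Hypothetical latent policy: a 1-D latent action from the latent state."""
    return [z_s[0] + z_s[1]]
```

In this toy setup the consistent maps give zero loss for both terms, while `BAD_DEC_SRC` produces a nonzero cycle loss. In practice these terms would be minimized over learned network parameters rather than evaluated on fixed matrices.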

Cross-Embodiment Robot Manipulation Skill Transfer Using Latent Space Alignment

Abstract

Latent Alignment Losses:

Cross-Embodiment Skill Transfer in Simulation:

Cross-Embodiment Skill Transfer from Sim to Real:

Sim2Real Transfer Demo