Yirui Wang
Xiuwei Xu
Angyuan Ma
Bingyao Yu
Jie Zhou
Jiwen Lu
Tsinghua University
Paper (arXiv)
Code (Coming Soon)
Manipulation policies deployed in uncontrolled real-world scenarios face great in-category geometric diversity among everyday objects. To function robustly under such variation, policies need to work at the category level, i.e. know how to interact with any object in a given category rather than only the specific one seen during training. This in-category generalizability is usually nurtured with shape-diversified training data; however, manually collecting such a corpus is infeasible because it requires intense human labor and a large collection of diverse objects at hand. In this paper, we propose ShapeGen, a data generation method that produces shape-varied manipulation data in a simulator-free, 3D manner. ShapeGen decomposes the process into two stages: Shape Library Curation and Function-Aware Generation. In the first stage, we train spatial warpings between shapes that map each point to its functionally corresponding point, and aggregate the 3D models together with the warpings into a plug-and-play Shape Library. In the second stage, we design a pipeline that leverages the established Libraries and requires only minimal human annotation to generate physically plausible and functionally correct novel demonstrations. Real-world experiments demonstrate that ShapeGen effectively boosts policies' in-category shape generalizability.
Shape Library Curation. To generate data, ShapeGen first curates a Shape Library for each relevant object category. The Library contains not only 3D shapes but also spatial warpings from a common template shape to each of them.
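A Shape Library of this kind can be pictured as a template point set plus one warp per stored shape. The sketch below is a minimal illustration, not ShapeGen's implementation: the `ShapeLibrary` class, the `affine_warp` helper, and all shape names and coordinates are hypothetical, and a simple per-axis scaling stands in for the trained warpings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

import numpy as np

# A warp maps (N, 3) template points to (N, 3) points on one target shape.
Warp = Callable[[np.ndarray], np.ndarray]


@dataclass
class ShapeLibrary:
    """Hypothetical container: one template plus one warp per stored shape."""
    category: str
    template_points: np.ndarray  # (N, 3) points on the common template shape
    warps: Dict[str, Warp] = field(default_factory=dict)

    def add_shape(self, shape_id: str, warp: Warp) -> None:
        self.warps[shape_id] = warp

    def points_on(self, shape_id: str) -> np.ndarray:
        # Row i of the result corresponds to template point i.
        return self.warps[shape_id](self.template_points)


def affine_warp(scale, shift) -> Warp:
    """Stand-in for a trained warping (assumption): per-axis scale and shift."""
    scale = np.asarray(scale, dtype=float)
    shift = np.asarray(shift, dtype=float)
    return lambda pts: pts * scale + shift


# Two illustrative template points: a base center and a point on the handle.
library = ShapeLibrary("mug", np.array([[0.0, 0.0, 0.0], [0.05, 0.0, 0.08]]))
library.add_shape("tall_mug", affine_warp([1.0, 1.0, 1.5], [0.0, 0.0, 0.0]))
library.add_shape("wide_mug", affine_warp([1.4, 1.4, 1.0], [0.0, 0.0, 0.0]))

tall = library.points_on("tall_mug")
wide = library.points_on("wide_mug")
# tall[1] and wide[1] are corresponding points: both come from template point 1.
```

Because every shape is reached from the same template, the correspondence between any two shapes in the Library follows from sharing a template index, with no pairwise matching step.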
Function-Aware Generation. Given a sequence of raw observation-action pairs, we ask an annotator to mark keypoints on the actual objects used and to designate methods for inter-shape alignment. The pipeline then uses the established Shape Library to automatically compute the optimal alignment and the corrected action.
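The page does not state how the optimal alignment is computed. As one plausible illustration, the sketch below uses the Kabsch algorithm, a standard least-squares rigid fit between corresponding point sets, to align keypoints of a replacement shape to keypoints annotated on the original object, and then maps a functional point of the new shape into the scene. The function name `kabsch` and all coordinates are assumptions made for this example.

```python
import numpy as np


def kabsch(src: np.ndarray, dst: np.ndarray):
    """Return (R, t) minimizing sum_i ||R @ src[i] + t - dst[i]||^2.

    src, dst: (N, 3) arrays of corresponding points, N >= 3.
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


# Hypothetical keypoints on the new shape, in its own canonical frame.
new_shape_kps = np.array([[0.0, 0.0, 0.0],
                          [1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0],
                          [0.0, 0.0, 1.0]])

# Hypothetical annotated keypoints on the original object in the scene:
# here a 90-degree rotation about z plus a translation of the points above.
R_true = np.array([[0.0, -1.0, 0.0],
                   [1.0,  0.0, 0.0],
                   [0.0,  0.0, 1.0]])
t_true = np.array([0.1, 0.2, 0.3])
scene_kps = new_shape_kps @ R_true.T + t_true

R, t = kabsch(new_shape_kps, scene_kps)

# Place a functional point of the new shape (e.g. a grasp location)
# into the scene frame; a corrected action could target this position.
grasp_on_new_shape = np.array([0.5, 0.0, 0.2])
grasp_in_scene = R @ grasp_on_new_shape + t
```

In this toy setup the keypoints match exactly, so the fit recovers the rotation and translation used to build them. With real shapes of differing geometry the fit is only a least-squares compromise, which is why the choice of keypoints and alignment method matters.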
Visualization of Shape Libraries. Visualizations are provided for the kettle, mug, hammer, and scissors categories. Within each category, points of the same color are mapped from the same point on the common template shape.
Qualitative comparison with a feature-matching method. Feature matching is prone to imprecise single-point matches; examples are highlighted with red arrows. Corresponding points are marked with the same color.