The Functional Correspondence Problem

Zihang Lai*, Senthil Purushwalkam*, Abhinav Gupta

Carnegie Mellon University

* Authors contributed equally

Abstract

The ability to find correspondences in visual data is the essence of most computer vision tasks. But what are the right correspondences? The task of visual correspondence is well defined for two different images of the same object instance. In the case of two images of objects belonging to the same category, visual correspondence is reasonably well defined in most cases. But what about correspondence between two objects of completely different categories -- e.g., a shoe and a bottle? Does any correspondence exist? Inspired by humans' ability to (a) generalize beyond semantic categories and (b) infer functional affordances, we introduce the problem of functional correspondences in this paper. Given images of two objects, we ask a simple question: what is the set of correspondences between these two images for a given task? For example, what are the correspondences between a bottle and a shoe for the task of pounding or the task of pouring? We introduce a new dataset, FunKPoint, that has ground-truth correspondences for 10 tasks and 20 object categories. We also introduce a modular task-driven representation for attacking this problem and demonstrate that our learned representation is effective for this task. Most importantly, because our supervision signal is not bound by semantics, we show that our learned representation generalizes better on the few-shot classification problem. We hope this paper will inspire our community to think beyond semantics and focus more on cross-category generalization and learning representations for robotics tasks.

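To make the task formulation concrete, below is a minimal sketch of what a functional correspondence sample and a PCK-style keypoint evaluation might look like. The names (FunctionalCorrespondenceSample, pck) and the toy coordinates are illustrative assumptions, not the released FunKPoint API or annotation format.

# Minimal sketch (not the authors' code): a task-conditioned keypoint pair
# and a PCK-style metric for scoring predicted functional correspondences.
from dataclasses import dataclass
import numpy as np

@dataclass
class FunctionalCorrespondenceSample:
    """One annotated pair: a task plus matched keypoints on two objects."""
    task: str                      # e.g. "pour" or "pound"
    source_keypoints: np.ndarray   # (K, 2) pixel coords on the source object
    target_keypoints: np.ndarray   # (K, 2) corresponding coords on the target object

def pck(pred: np.ndarray, gt: np.ndarray, img_size: int, alpha: float = 0.1) -> float:
    """Fraction of predicted keypoints within alpha * img_size of ground truth."""
    dists = np.linalg.norm(pred - gt, axis=1)
    return float((dists <= alpha * img_size).mean())

# Toy usage: a "pour" correspondence between a bottle and a shoe
# (hypothetical coordinates, 128 x 128 images assumed).
sample = FunctionalCorrespondenceSample(
    task="pour",
    source_keypoints=np.array([[40.0, 12.0], [60.0, 110.0]]),   # bottle: opening, base
    target_keypoints=np.array([[52.0, 20.0], [70.0, 118.0]]),   # shoe: opening, sole
)
predicted = sample.target_keypoints + np.array([[3.0, -2.0], [15.0, 20.0]])
print(f"PCK@0.1: {pck(predicted, sample.target_keypoints, img_size=128):.2f}")
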
@inproceedings{lai2021functional,
  title={The functional correspondence problem},
  author={Lai, Zihang and Purushwalkam, Senthil and Gupta, Abhinav},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={15772--15781},
  year={2021}
}
Downloads
Please contact zihang.lai at gmail.com if you have any questions.
Acknowledgements
This research is supported by grants from ONR MURI, the ONR Young Investigator Award to Abhinav Gupta, and the DARPA MCS award.