Learning Spherical Convolution for 360° Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 16.389), Pub Date: 2021-09-20, DOI: 10.1109/tpami.2021.3113612
Yu-Chuan Su, Kristen Grauman

While 360° cameras offer tremendous new possibilities in vision, graphics, and augmented reality, the spherical images they produce make visual recognition non-trivial. Ideally, 360° imagery could inherit the convolutional neural networks (CNNs) trained with great success on perspective projection images. However, existing methods for transferring CNNs from perspective to spherical images introduce significant computational cost, degrade accuracy, or both. We propose to learn a Spherical Convolution Network (SphConv) that translates a planar CNN to the equirectangular projection of 360° images. Given a source CNN for perspective images, SphConv learns to reproduce the flat filter outputs on 360° data. The key benefits are (1) efficient and accurate recognition for 360° images, and (2) the ability to leverage pre-trained networks for perspective images. We propose two instantiations of SphConv: Spherical Kernel, which learns location-dependent kernels on the sphere, and Kernel Transformer Network, which learns a functional transformation that generates SphConv from the source CNN. Validating our approach with multiple source CNNs and datasets, we show that it successfully preserves the source CNN's accuracy, while offering efficiency, transferability, and scalability to typical image resolutions. We further introduce a spherical Faster R-CNN based on SphConv and show that we can learn a spherical object detector without any object annotations in 360° images.
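
As a rough illustration of what "location dependent kernels" can mean on an equirectangular image, the NumPy sketch below applies a separate kernel to each image row, so weights are shared along longitude but not across latitude. This is not the authors' implementation: the function name `row_dependent_conv`, the one-kernel-per-row layout, the single-channel input, and the zero padding at the poles are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def row_dependent_conv(image, kernels):
    """Cross-correlate an equirectangular image with one kernel per row.

    image:   (H, W) array, a single-channel equirectangular map (assumed layout).
    kernels: (H, kh, kw) array; kernels[y] is used at every column of row y.
    Returns an (H, W) array.
    """
    H, W = image.shape
    n_rows, kh, kw = kernels.shape
    assert n_rows == H, "need exactly one kernel per image row"
    assert kh % 2 == 1 and kw % 2 == 1, "odd kernel sizes keep the output centered"
    ph, pw = kh // 2, kw // 2

    # Longitude is periodic, so columns are padded circularly.
    padded = np.pad(image, ((0, 0), (pw, pw)), mode="wrap")
    # Rows are zero-padded at the poles (a simplification, assumed here).
    padded = np.pad(padded, ((ph, ph), (0, 0)), mode="constant")

    out = np.zeros((H, W), dtype=float)
    for y in range(H):
        band = padded[y:y + kh, :]  # rows y-ph .. y+ph of the original image
        for dy in range(kh):
            for dx in range(kw):
                # Shifted slice of the band, weighted by this row's kernel entry.
                out[y] += kernels[y, dy, dx] * band[dy, dx:dx + W]
    return out
```

In a setting like the one the abstract describes, the per-row kernels would be learned so that the output matches what the source CNN's filter produces on the corresponding perspective view; the sketch covers only the forward pass, not that training step.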