Physically plausible spatially varying surface appearance estimated using the proposed SA-SVBRDF-net from a single photograph of planar spatially varying plastic (a, b), wood (c), and metal (d) captured under unknown natural lighting, and revisualized under a novel lighting condition.
We present a convolutional neural network (CNN) based solution for modeling physically plausible spatially varying surface reflectance functions (SVBRDF) from a single photograph of a planar material sample under unknown natural illumination. Gathering a sufficiently large set of labeled training pairs, each consisting of a photograph of an SVBRDF sample and its corresponding reflectance parameters, is a difficult and arduous process. To reduce the amount of required labeled training data, we propose to leverage the appearance information embedded in unlabeled images of spatially varying materials to self-augment the training process. Starting from a coarse network trained on a small set of labeled pairs, we estimate provisional model parameters for each unlabeled training exemplar. Given this provisional reflectance estimate, we then synthesize a temporary labeled training pair by rendering the corresponding image under a new lighting condition. After refining the network using these additional training samples, we re-estimate the provisional model parameters for the unlabeled data and repeat the self-augmentation process until convergence. We demonstrate the efficacy of the proposed network structure on spatially varying wood, metals, and plastics, and thoroughly validate the effectiveness of the self-augmentation training process.
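The self-augmentation loop described above can be sketched in a few lines. The scalar "albedo", the two-component toy "image", and the one-weight "network" below are illustrative stand-ins of our own invention, not the paper's Caffe CNN or SVBRDF parameterization; only the loop structure (estimate provisional parameters, re-render under novel lighting, refine, repeat) mirrors the method.

```python
# Minimal sketch of self-augmentation (assumptions: reflectance is a single
# scalar "albedo", an "image" is a (shading, lighting) pair, and the
# "network" is one scalar weight w; the real method trains a CNN in Caffe).

def render(albedo, lighting):
    """Toy renderer: the image records the shaded value and a lighting cue."""
    return (albedo * lighting, lighting)

def predict(w, image):
    """Toy 'network': a lighting-normalized feature scaled by weight w."""
    shading, lighting = image
    return w * shading / lighting

def fit(pairs):
    """Least-squares fit of w so that predict(w, image) matches the label."""
    feats = [shading / lighting for (shading, lighting), _ in pairs]
    labels = [albedo for _, albedo in pairs]
    return sum(f * a for f, a in zip(feats, labels)) / sum(f * f for f in feats)

# Step 0: coarse network from a small labeled set (photos + reflectance).
labeled = [(render(a, 1.0), a) for a in (0.2, 0.5, 0.8)]
w = fit(labeled)

# Unlabeled photographs (true reflectance unknown to the learner).
unlabeled = [render(a, 1.2) for a in (0.3, 0.6, 0.9)]

for novel_light in (0.7, 1.1, 1.4):  # self-augmentation iterations
    # 1. Provisional reflectance estimates for each unlabeled exemplar.
    provisional = [predict(w, img) for img in unlabeled]
    # 2. Synthesize temporary labeled pairs by re-rendering the provisional
    #    estimates under a novel lighting condition.
    synthetic = [(render(p, novel_light), p) for p in provisional]
    # 3. Refine the network on original + synthetic pairs, then repeat.
    w = fit(labeled + synthetic)
```

In the actual system each step is far heavier (CNN inference, physically based rendering under sampled environment maps, SGD fine-tuning), but the data flow between labeled, unlabeled, and synthesized pairs is the same.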
Keywords
Appearance Modeling, SVBRDF, CNN

Paper and video

Trained model
The trained models for SVBRDF-net and SA-SVBRDF-net are provided in the Caffe CNN format.

Dataset
The data includes both labeled and unlabeled images for training and evaluation, as well as environment maps for rendering. Part of our labeled training data was sourced from an online material repository [vray-materials.de], and the unlabeled training data is partially based on OpenSurfaces [2]. Use of this training dataset is restricted to academic research purposes only. When using our dataset, please cite both our work [1] and OpenSurfaces [2] as below:
[1] Xiao Li, Yue Dong, Pieter Peers, and Xin Tong. Modeling Surface Appearance from a Single Photograph using Self-augmented Convolutional Neural Networks. ACM Trans. Graph. 36, 4 (July 2017), 45:1-45:11.
[2] Sean Bell, Paul Upchurch, Noah Snavely, and Kavita Bala. OpenSurfaces: A Richly Annotated Catalog of Surface Appearance. ACM Trans. Graph. 32, 4 (July 2013).
Implementation
Our implementation is based on Caffe; all training scripts are included in the source code file. A detailed readme.txt file is included in the compressed zip file of each download.

Errata
Unfortunately, we found an error in the network structure illustration (Figure 2 in the paper). The correct network structures can be found here; the source code on the GitHub site is the correct implementation.

Acknowledgements
We would like to thank the reviewers for their constructive feedback, and the Beijing Film Academy for their help in creating the SVBRDF datasets. Pieter Peers was partially supported by NSF grant IIS-1350323.