Published in ACM Transactions on Graphics, Volume 36, Issue 4 (SIGGRAPH 2017)

Modeling Surface Appearance from a Single Photograph using Self-augmented Convolutional Neural Networks

Rendering results

Physically plausible spatially varying surface appearance estimated using the proposed SA-SVBRDF-net from a single photograph of planar spatially varying plastic (a, b), wood (c), and metal (d) captured under unknown natural lighting, and revisualized under a novel lighting condition.


We present a convolutional neural network (CNN) based solution for modeling physically plausible spatially varying surface reflectance functions (SVBRDF) from a single photograph of a planar material sample under unknown natural illumination. Gathering a sufficiently large set of labeled training pairs consisting of photographs of SVBRDF samples and corresponding reflectance parameters is a difficult and arduous process. To reduce the amount of required labeled training data, we propose to leverage the appearance information embedded in unlabeled images of spatially varying materials to self-augment the training process. Starting from a coarse network obtained from a small set of labeled training pairs, we estimate provisional model parameters for each unlabeled training exemplar. Given this provisional reflectance estimate, we then synthesize a novel temporary labeled training pair by rendering the exact corresponding image under a new lighting condition. After refining the network using these additional training samples, we re-estimate the provisional model parameters for the unlabeled data and repeat the self-augmentation process until convergence. We demonstrate the efficacy of the proposed network structure on spatially varying wood, metal, and plastic, and thoroughly validate the effectiveness of the self-augmentation training process.
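The self-augmentation loop described above can be sketched as follows. Everything here is an illustrative toy under stated assumptions, not the authors' Caffe implementation: `ToyNet` stands in for the CNN (a one-parameter least-squares regressor), "images" and reflectance "parameters" are scalars, and `render` is a trivial stand-in for the SVBRDF renderer. Only the control flow mirrors the paper's procedure: fit a coarse network on labeled pairs, predict provisional parameters for unlabeled images, render each prediction under a novel lighting condition to form an exact synthetic pair, then refine and repeat.

```python
import random

# Toy stand-ins: an "image" is a float, reflectance "params" a float,
# and the network a one-parameter linear map fit by least squares.
# These are illustrative placeholders, not the paper's actual models.
class ToyNet:
    def __init__(self):
        self.w = 1.0

    def fit(self, pairs):
        # least-squares slope through the origin: w = sum(x*y) / sum(x*x)
        num = sum(x * y for x, y in pairs)
        den = sum(x * x for x, _ in pairs) or 1.0
        self.w = num / den

    def predict(self, image):
        return self.w * image

def render(params, light):
    # toy renderer: image brightness scales with params and lighting
    return params * light

def self_augment_train(net, labeled, unlabeled, n_rounds=3):
    net.fit(labeled)  # 1. coarse network from the small labeled set
    for _ in range(n_rounds):
        synthetic = []
        for image in unlabeled:
            params = net.predict(image)        # 2. provisional estimate
            light = random.uniform(0.5, 1.5)   # 3. novel lighting condition
            novel = render(params, light)      #    exact pair by construction
            synthetic.append((novel, params))
        net.fit(labeled + synthetic)           # 4. refine on the enlarged set
    return net
```

The key property the sketch preserves is that each synthetic pair is exactly labeled by construction: because the rendered image is generated from the provisional parameters, the (image, parameters) pair is consistent even when the provisional estimate itself is imperfect.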


Appearance Modeling, SVBRDF, CNN

Paper and video

Trained model

The trained models for SVBRDF-net and SA-SVBRDF-net are provided in the Caffe CNN format.


The data includes both labeled and unlabeled images for training and evaluation, as well as environment maps for rendering. Part of our labeled training data was sourced from an online material repository [], and the unlabeled training data is partially based on OpenSurfaces [2].

Use of this training dataset is restricted to academic research purposes. When using our dataset, please cite both our work [1] and OpenSurfaces [2] as follows:

[1] Xiao Li, Yue Dong, Pieter Peers, and Xin Tong. 2017. Modeling Surface Appearance from a Single Photograph using Self-augmented Convolutional Neural Networks. ACM Trans. Graph. 36, 4 (July 2017), 45:1–45:11.
[2] Sean Bell, Paul Upchurch, Noah Snavely, and Kavita Bala. 2013. OpenSurfaces: a richly annotated catalog of surface appearance. ACM Trans. Graph. 32, 4 (July 2013), 111:1–111:17.


Our implementation is based on Caffe; all training scripts are included in the source code archive.

A detailed readme.txt file is included in each downloadable zip archive.


Unfortunately, we found an error in the network structure illustration (Figure 2 in the paper). The correct network structure can be found here; the source code on the GitHub site is the correct implementation.


We would like to thank the reviewers for their constructive feedback, and the Beijing Film Academy for their help in creating the SVBRDF datasets. Pieter Peers was partially supported by NSF grant IIS-1350323.