We propose an end-to-end deep learning architecture that produces a 3D shape as a triangular mesh from a single color image. Limited by the nature of deep neural networks, previous methods usually represent a 3D shape as a volume or point cloud, and it is non-trivial to convert these to the more ready-to-use mesh model. Unlike existing methods, our network represents the 3D mesh in a graph-based convolutional neural network and produces correct geometry by progressively deforming an ellipsoid, leveraging perceptual features extracted from the input image. We adopt a coarse-to-fine strategy to make the whole deformation procedure stable, and define various mesh-related losses that capture properties at different levels to guarantee visually appealing and physically accurate 3D geometry. Extensive experiments show that our method not only qualitatively produces mesh models with better details, but also achieves higher 3D shape estimation accuracy compared to the state of the art.
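
To make the idea of deforming a mesh with a graph-based convolution concrete, the following is a minimal NumPy sketch, not the architecture itself. It assumes a common form of graph convolution in which each vertex combines its own features with the sum of its neighbours' features. The helper names (`adjacency_from_faces`, `graph_conv`), the octahedron used as a very coarse stand-in for the initial ellipsoid, and the random weights in place of learned ones are all illustrative assumptions. Image features and the mesh losses are omitted.

```python
import numpy as np


def adjacency_from_faces(num_vertices, faces):
    """Build a symmetric 0/1 adjacency matrix from triangle faces."""
    adj = np.zeros((num_vertices, num_vertices))
    for a, b, c in faces:
        for i, j in ((a, b), (b, c), (c, a)):
            adj[i, j] = 1.0
            adj[j, i] = 1.0
    return adj


def graph_conv(features, adj, w_self, w_neigh):
    """One graph convolution (assumed form): each vertex mixes its own
    features with the sum of its neighbours' features."""
    return features @ w_self + adj @ features @ w_neigh


# An octahedron stretched along x: a very coarse stand-in for an ellipsoid.
vertices = np.array(
    [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]],
    dtype=float,
) * np.array([1.0, 0.5, 0.5])
faces = [
    (0, 2, 4), (2, 1, 4), (1, 3, 4), (3, 0, 4),
    (2, 0, 5), (1, 2, 5), (3, 1, 5), (0, 3, 5),
]

adj = adjacency_from_faces(len(vertices), faces)

# Random weights stand in for parameters that would be learned in training.
rng = np.random.default_rng(0)
w_self = rng.normal(scale=0.1, size=(3, 3))
w_neigh = rng.normal(scale=0.1, size=(3, 3))

# Treat the layer output as per-vertex 3D offsets and move the vertices.
offsets = graph_conv(vertices, adj, w_self, w_neigh)
deformed = vertices + offsets
```

In this sketch the mesh connectivity (`faces`, `adj`) stays fixed while only the vertex positions change, which is what allows the output to remain a valid triangular mesh after each deformation step.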