We study the problem of inferring an object-centric scene representation from a single image, aiming to derive a representation that explains the image formation process, captures the scene's 3D nature, and is learned without supervision. Most existing methods for scene decomposition lack one or more of these characteristics, due to the fundamental challenge of integrating the complex 3D-to-2D image formation process into powerful inference schemes like deep networks. In this paper, we propose unsupervised discovery of Object Radiance Fields (uORF), integrating recent progress in neural 3D scene representations and rendering with deep inference networks for unsupervised 3D scene decomposition. Trained on multi-view RGB images without annotations, uORF learns to decompose complex scenes with diverse, textured backgrounds from a single image. We show that uORF performs well on unsupervised 3D scene segmentation, novel view synthesis, and scene editing on three datasets.