A fully automated object reconstruction pipeline is crucial for digital content creation. While the area of 3D reconstruction has witnessed profound developments, the removal of background to obtain a clean object model still relies on different forms of manual labor, such as bounding box labeling, mask annotations, and mesh manipulations. In this paper, we propose a novel framework named AutoRecon for the automated discovery and reconstruction of an object from multi-view images. We demonstrate that foreground objects can be robustly located and segmented from SfM point clouds by leveraging self-supervised 2D vision transformer features. Then, we reconstruct decomposed neural scene representations with dense supervision provided by the decomposed point clouds, resulting in accurate object reconstruction and segmentation. Experiments on the DTU, BlendedMVS and CO3D-V2 datasets demonstrate the effectiveness and robustness of AutoRecon.