Being able to learn dense semantic representations of images without supervision is an important problem in computer vision. However, despite its significance, this problem remains rather unexplored, with a few exceptions that considered unsupervised semantic segmentation on small-scale datasets with a narrow visual domain. In this paper, we make a first attempt to tackle the problem on datasets that have been traditionally utilized for the supervised case. To achieve this, we introduce a novel two-step framework that adopts a predetermined prior in a contrastive optimization objective to learn pixel embeddings. This marks a large deviation from existing works that relied on proxy tasks or end-to-end clustering. Additionally, we argue for the importance of having a prior that contains information about objects, or their parts, and discuss several possibilities to obtain such a prior in an unsupervised manner. Extensive experimental evaluation shows that the proposed method comes with key advantages over existing works. First, the learned pixel embeddings can be directly clustered into semantic groups using K-Means. Second, the method can serve as an effective unsupervised pre-training for the semantic segmentation task. In particular, when fine-tuning the learned representations using just 1% of labeled examples on PASCAL, we outperform supervised ImageNet pre-training by 7.1% mIoU. The code is available at https://github.com/wvangansbeke/Unsupervised-Semantic-Segmentation.
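
To illustrate what clustering pixel embeddings with K-Means can look like, the following is a minimal sketch and not the implementation from the repository above. The function name `kmeans_pixels`, the farthest-point initialisation, and the synthetic embeddings are all assumptions made for this example; in practice the embeddings would come from the trained network.

```python
import numpy as np


def kmeans_pixels(embeddings, k, n_iters=10):
    """Cluster per-pixel embeddings of shape (H, W, D) into k groups.

    Hypothetical helper for illustration only. Returns an (H, W) label map.
    """
    h, w, d = embeddings.shape
    x = embeddings.reshape(-1, d).astype(np.float64)

    # Farthest-point initialisation (an assumption, chosen so the sketch
    # is deterministic): start from the first pixel, then repeatedly add
    # the pixel that is farthest from all centers picked so far.
    centers = [x[0]]
    for _ in range(1, k):
        d2 = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[d2.argmax()])
    centers = np.stack(centers)

    def assign(points, ctrs):
        # Squared Euclidean distance from every pixel to every center.
        dists = ((points[:, None, :] - ctrs[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iters):
        labels = assign(x, centers)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)

    return assign(x, centers).reshape(h, w)


# Synthetic stand-in for learned embeddings: a 4x6 "image" with 8-dim
# features whose left and right halves lie far apart in embedding space.
rng = np.random.default_rng(0)
toy_embeddings = rng.normal(0.0, 0.1, size=(4, 6, 8))
toy_embeddings[:, 3:, :] += 10.0

toy_labels = kmeans_pixels(toy_embeddings, k=2)
```

Here `toy_labels` is a 4x6 map of cluster indices, one per pixel. Cluster indices are arbitrary, so comparing them with ground-truth classes would require a separate matching step.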