Visual content often contains recurring elements. Text is made up of glyphsfrom the same font, animations, such as cartoons or video games, are composedof sprites moving around the screen, and natural videos frequently haverepeated views of objects. In this paper, we propose a deep learning approachfor obtaining a graphically disentangled representation of recurring elementsin a completely self-supervised manner. By jointly learning a dictionary oftexture patches and training a network that places them onto a canvas, weeffectively deconstruct sprite-based content into a sparse, consistent, andinterpretable representation that can be easily used in downstream tasks. Ourframework offers a promising approach for discovering recurring patterns inimage collections without supervision.