Abstract
Given an input painting, we reconstruct a time-lapse video of how it may have been painted. We formulate this as an autoregressive image generation problem, in which an initially blank "canvas" is iteratively updated. The model learns from real artists by training on many painting videos. Our approach incorporates text and region understanding to define a set of painting "instructions" and updates the canvas with a novel diffusion-based renderer. The method extrapolates beyond the limited, acrylic-style paintings on which it has been trained, showing plausible results for a wide range of artistic styles and genres.
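To make the autoregressive formulation concrete, the following is a minimal sketch of the outer loop only: start from a blank canvas and repeatedly apply an update conditioned on the current canvas and the target painting. Every name here (`blank_canvas`, `toy_update`, `reconstruct_timelapse`) is hypothetical, and `toy_update` is a trivial stand-in that copies one row of the target per step. It does not model the text and region instructions or the diffusion-based renderer described above.

```python
from typing import Callable, List, Optional

Canvas = List[List[float]]  # grayscale image as rows of values in [0, 1]


def blank_canvas(height: int, width: int) -> Canvas:
    """Return an all-white canvas (1.0 = white)."""
    return [[1.0] * width for _ in range(height)]


def toy_update(canvas: Canvas, target: Canvas, step: int) -> Canvas:
    """Hypothetical stand-in for the learned update.

    Copies row `step` of the target onto the canvas. The actual method
    predicts painting instructions and renders them with a diffusion-based
    renderer; none of that is reproduced here.
    """
    new_canvas = [row[:] for row in canvas]
    if step < len(target):
        new_canvas[step] = target[step][:]
    return new_canvas


def reconstruct_timelapse(
    target: Canvas,
    update: Callable[[Canvas, Canvas, int], Canvas] = toy_update,
    num_steps: Optional[int] = None,
) -> List[Canvas]:
    """Build a list of frames, each computed from the previous frame."""
    steps = len(target) if num_steps is None else num_steps
    canvas = blank_canvas(len(target), len(target[0]))
    frames = [canvas]
    for step in range(steps):
        # Autoregressive step: the next canvas depends on the current one.
        canvas = update(canvas, target, step)
        frames.append(canvas)
    return frames
```

In this sketch the `update` argument is the only place where a learned model would plug in; the loop itself stays the same regardless of how each step is produced.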