Abstract
We propose a new automated digital painting framework, based on a painting agent trained through reinforcement learning. To synthesize an image, the agent selects a sequence of continuous-valued actions representing primitive painting strokes, which are accumulated on a digital canvas. Action selection is guided by a given reference image, which the agent attempts to replicate subject to the limitations of the action space and the agent's learned policy. The painting agent policy is determined using a variant of proximal policy optimization reinforcement learning. During training, our agent is presented with patches sampled from an ensemble of reference images. To accelerate training convergence, we adopt a curriculum learning strategy, whereby reference patches are sampled according to how challenging they are using the current policy. We experiment with differing loss functions, including pixel-wise and perceptual loss, which have consequent differing effects on the learned policy. We demonstrate that our painting agent can learn an effective policy with a high-dimensional continuous action space comprising pen pressure, width, tilt, and color, for a variety of painting styles. Through a coarse-to-fine refinement process our agent can paint arbitrarily complex images in the desired style.
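
The curriculum sampling step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how difficulty is measured or whether harder patches are favored. The sketch assumes that a patch's difficulty is its current loss under the policy and that harder patches are drawn more often through a softmax weighting. The names `curriculum_weights`, `sample_patches`, and `temperature` are hypothetical.

```python
import math
import random


def curriculum_weights(patch_losses, temperature=1.0):
    """Turn per-patch losses into sampling probabilities via a softmax.

    Assumption (not from the paper): a higher loss under the current
    policy means a harder patch, and harder patches are sampled more often.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [loss / temperature for loss in patch_losses]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_patches(patch_losses, batch_size, temperature=1.0, rng=None):
    """Draw patch indices (with replacement) using the curriculum weights."""
    rng = rng or random.Random()
    weights = curriculum_weights(patch_losses, temperature)
    return rng.choices(range(len(patch_losses)), weights=weights, k=batch_size)
```

Calling `sample_patches([0.1, 0.5, 2.0], batch_size=4)` would return four patch indices, with index 2 (the highest assumed loss) the most likely to appear. A larger `temperature` flattens the distribution toward uniform sampling, while a smaller one concentrates training on the hardest patches.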