Abstract
A common and controversial use of text-to-image models is to generate pictures by explicitly naming artists, such as "in the style of Greg Rutkowski". We introduce a benchmark for prompted-artist recognition: predicting which artist names were invoked in the prompt from the image alone. The dataset contains 1.95M images covering 110 artists and spans four generalization settings: held-out artists, increasing prompt complexity, multiple-artist prompts, and different text-to-image models. We evaluate feature similarity baselines, contrastive style descriptors, data attribution methods, supervised classifiers, and few-shot prototypical networks. Generalization patterns vary: supervised and few-shot models excel on seen artists and complex prompts, whereas style descriptors transfer better when the artist's style is pronounced; multi-artist prompts remain the most challenging. Our benchmark reveals substantial headroom and provides a public testbed to advance the responsible moderation of text-to-image models. We release the dataset and benchmark to foster further research: https://graceduansu.github.io/IdentifyingPromptedArtists/