Learning User Preferences for Image Generation Model

Abstract

User preference prediction requires a comprehensive and accurate understanding of individual tastes. This includes both surface-level attributes, such as color and style, and deeper content-related aspects, such as themes and composition. However, existing methods typically rely on general human preferences or assume static user profiles, often neglecting individual variability and the dynamic, multifaceted nature of personal taste. To address these limitations, we propose an approach built upon Multimodal Large Language Models, introducing contrastive preference loss and preference tokens to learn personalized user preferences from historical interactions. The contrastive preference loss is designed to effectively distinguish between user "likes" and "dislikes", while the learnable preference tokens capture shared interest representations among existing users, enabling the model to activate group-specific preferences and enhance consistency across similar users. Extensive experiments demonstrate our model outperforms other methods in preference prediction accuracy, effectively identifying users with similar aesthetic inclinations and providing more precise guidance for generating images that align with individual tastes. The project page is https://learn-user-pref.github.io/.
