Abstract
Determining consumer preferences and utility is a foundational challenge in economics. They are central in determining consumer behaviour through the utility-maximising consumer decision-making process. However, preferences and utilities are not observable and may not even be known to the individual making the choice; only the outcome is observed in the form of demand. Without the ability to observe the decision-making mechanism, demand estimation becomes a challenging task, and current methods fall short because they lack scalability or the ability to identify causal effects. Estimating these effects is critical when considering changes in policy, such as pricing, the impact of taxes and subsidies, and the effect of a tariff. To address the shortcomings of existing methods, we combine revealed preference theory and inverse reinforcement learning to present a novel algorithm, Preference Extraction and Reward Learning (PEARL), which, to the best of our knowledge, is the only algorithm that can uncover a representation of the utility function that best rationalises observed consumer choice data given a specified functional form. We introduce a flexible utility function, the Input-Concave Neural Network, which captures complex relationships across goods, including cross-price elasticities. Results show PEARL outperforms the benchmark on both noise-free and noisy synthetic data.