Black Box Deployed -- Functional Criteria for Artificial Moral Agents in the LLM Era

Abstract

The advancement of powerful yet opaque large language models (LLMs) necessitates a fundamental revision of the philosophical criteria used to evaluate artificial moral agents (AMAs). Pre-LLM frameworks often relied on the assumption of transparent architectures, which LLMs defy due to their stochastic outputs and opaque internal states. This paper argues that traditional ethical criteria are pragmatically obsolete for LLMs due to this mismatch. Engaging with core themes in the philosophy of technology, this paper proffers a revised set of ten functional criteria to evaluate LLM-based artificial moral agents: moral concordance, context sensitivity, normative integrity, metaethical awareness, system resilience, trustworthiness, corrigibility, partial transparency, functional autonomy, and moral imagination. These guideposts, applied to what we term "SMA-LLS" (Simulating Moral Agency through Large Language Systems), aim to steer AMAs toward greater alignment and beneficial societal integration in the coming years. We illustrate these criteria using hypothetical scenarios involving an autonomous public bus (APB) to demonstrate their practical applicability in morally salient contexts.
