LifeTox: Unveiling Implicit Toxicity in Life Advice

Abstract

As large language models become increasingly integrated into daily life, detecting implicit toxicity across diverse contexts is crucial. To this end, we introduce LifeTox, a dataset designed for identifying implicit toxicity within a broad range of advice-seeking scenarios. Unlike existing safety datasets, LifeTox comprises diverse contexts derived from personal experiences through open-ended questions. Experiments demonstrate that RoBERTa fine-tuned on LifeTox matches or surpasses the zero-shot performance of large language models in toxicity classification tasks. These results underscore the efficacy of LifeTox in addressing the complex challenges inherent in implicit toxicity. We open-sourced the dataset (https://huggingface.co/datasets/mbkim/LifeTox) and the LifeTox moderator family (350M, 7B, and 13B).
