LifeTox: Unveiling Implicit Toxicity in Life Advice

  • 2024-03-19 03:20:50
  • Minbeom Kim, Jahyun Koo, Hwanhee Lee, Joonsuk Park, Hwaran Lee, Kyomin Jung
As large language models become increasingly integrated into daily life, detecting implicit toxicity across diverse contexts is crucial. To this end, we introduce LifeTox, a dataset designed for identifying implicit toxicity within a broad range of advice-seeking scenarios. Unlike existing safety datasets, LifeTox comprises diverse contexts derived from personal experiences through open-ended questions. Experiments demonstrate that RoBERTa fine-tuned on LifeTox matches or surpasses the zero-shot performance of large language models in toxicity classification tasks. These results underscore the efficacy of LifeTox in addressing the complex challenges inherent in implicit toxicity. We open-sourced the dataset and the LifeTox moderator family: 350M, 7B, and 13B.


