Pragmatic Inference Chain (PIC) Improving LLMs' Reasoning of Authentic Implicit Toxic Language

  • 2025-08-21 10:49:45
  • Xi Chen, Shuo Wang

Abstract

The rapid development of large language models (LLMs) gives rise to ethical concerns about their performance, while opening new avenues for developing toxic language detection techniques. However, LLMs' unethical output and their capability of detecting toxicity have primarily been tested on language data that do not demand complex meaning inference, such as the biased associations of 'he' with programmer and 'she' with household. Nowadays, toxic language adopts a much more creative range of implicit forms, thanks to advanced censorship. In this study, we collect authentic toxic interactions that evade online censorship and that are verified by human annotators as inference-intensive. To evaluate and improve LLMs' reasoning of the authentic implicit toxic language, we propose a new prompting method, Pragmatic Inference Chain (PIC), drawing on interdisciplinary findings from cognitive science and linguistics. The PIC prompting significantly improves the success rate of GPT-4o, Llama-3.1-70B-Instruct, DeepSeek-v2.5, and DeepSeek-v3 in identifying implicit toxic language, compared to five baseline prompts, such as CoT and rule-based baselines. In addition, it helps the models produce more explicit and coherent reasoning processes, and hence can potentially be generalized to other inference-intensive tasks, e.g., understanding humour and metaphors.
