Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model

Abstract

Many recent studies endeavor to improve open-source language models through imitation learning, re-training them on synthetic instruction data from state-of-the-art proprietary models like ChatGPT and GPT-4. However, synthetic data inherently contains noise, giving rise to a substantial presence of low-quality data replete with erroneous responses and flawed reasoning. Although we intuitively grasp the potential harm of noisy data, we lack a quantitative understanding of its impact. To this end, this paper explores the correlation between the degree of noise and its impact on language models through instruction tuning. We first introduce the Falsity-Controllable (FACO) dataset, which comprises pairs of true answers with corresponding reasoning, as well as false pairs, allowing us to manually control the falsity ratio of the dataset. Through extensive experiments, we report multiple intriguing findings on the correlation between the factuality of the dataset and instruction tuning. Specifically, we verified that the falsity of the instruction data is highly relevant to various benchmark scores. Moreover, when LLMs are trained with false instructions, they learn to lie and generate unfaithful answers, even though they know the correct answer to the user request. Additionally, we noted that once a language model is trained on a dataset contaminated by noise, its performance can be largely restored, but it fails to reach its full original performance.
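To make the idea of a controllable falsity ratio concrete, the following is a minimal sketch of how true and false instruction pairs could be mixed at a chosen ratio. It is an illustration under stated assumptions, not the authors' construction procedure: the function name `build_mixed_dataset` and the field names `instruction`, `true_response`, and `false_response` are hypothetical.

```python
import random


def build_mixed_dataset(examples, falsity_ratio, seed=0):
    """Return a copy of `examples` in which a fraction of responses are false.

    Assumption (hypothetical schema): each example is a dict with the keys
    "instruction", "true_response", and "false_response".
    """
    if not 0.0 <= falsity_ratio <= 1.0:
        raise ValueError("falsity_ratio must lie in [0, 1]")

    rng = random.Random(seed)  # seeded so the same split is reproducible
    n_false = round(len(examples) * falsity_ratio)
    # Choose which positions receive the false answer and reasoning.
    false_indices = set(rng.sample(range(len(examples)), n_false))

    mixed = []
    for i, ex in enumerate(examples):
        is_false = i in false_indices
        mixed.append({
            "instruction": ex["instruction"],
            "response": ex["false_response"] if is_false else ex["true_response"],
            "is_false": is_false,  # kept only for bookkeeping and analysis
        })
    return mixed


# Toy data, invented purely for illustration.
toy = [
    {
        "instruction": f"What is {i} + 1?",
        "true_response": f"{i} + 1 = {i + 1}.",
        "false_response": f"{i} + 1 = {i + 2}.",
    }
    for i in range(10)
]
mixed = build_mixed_dataset(toy, falsity_ratio=0.3)
print(sum(ex["is_false"] for ex in mixed), "of", len(mixed), "responses are false")
```

Sweeping `falsity_ratio` from 0.0 to 1.0 and instruction-tuning one model per setting would give the kind of noise-versus-benchmark comparison the abstract describes.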
