Abstract
Deep learning models are frequently reported to learn from shortcuts such as dataset biases. As deep learning plays an increasingly important role in the modern healthcare system, there is a pressing need to combat shortcut learning in medical data and to develop unbiased and trustworthy models. In this paper, we study the problem of developing debiased chest X-ray diagnosis models from biased training data without knowing the exact bias labels. We start from two observations: an imbalanced bias distribution is one of the key causes of shortcut learning, and the model prefers dataset biases when they are easier to learn than the intended features. Based on these observations, we propose a novel algorithm, pseudo bias-balanced learning, which first captures and predicts per-sample bias labels with a generalized cross entropy loss and then trains a debiased model using the pseudo bias labels and a bias-balanced softmax function. We construct several chest X-ray datasets with various dataset bias situations and demonstrate with extensive experiments that our proposed method achieves consistent improvements over other state-of-the-art approaches.
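As a rough illustration of the two ingredients named above, the following is a minimal sketch and not the implementation used in this paper. It assumes the commonly used form of the generalized cross entropy loss, (1 - p_y^q) / q, and assumes that the bias-balanced softmax shifts each logit by the log of the class prior conditioned on the (pseudo) bias label. The function names, the default q = 0.7, and the `class_prior_given_bias` argument are hypothetical choices made for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gce_loss(logits, label, q=0.7):
    """Generalized cross entropy, assumed form: (1 - p_y^q) / q with 0 < q <= 1.

    As q approaches 0 this approaches the usual cross entropy -log(p_y);
    at q = 1 it equals 1 - p_y. Larger q puts relatively more weight on
    samples the model already fits well, i.e. the "easy" samples.
    """
    p_y = softmax(logits)[label]
    return (1.0 - p_y ** q) / q


def bias_balanced_probs(logits, class_prior_given_bias, eps=1e-12):
    """Assumed form of a bias-balanced softmax.

    Each logit is shifted by log p(y | b), where b is the sample's
    (pseudo) bias label and class_prior_given_bias[y] estimates p(y | b)
    from the training set. eps guards against log(0).
    """
    shifted = [
        z + math.log(p + eps)
        for z, p in zip(logits, class_prior_given_bias)
    ]
    return softmax(shifted)
```

In this sketch, a first model trained with `gce_loss` would supply the pseudo bias labels, and the class frequencies within each pseudo bias group would supply `class_prior_given_bias` when training the second model.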