Abstract
The detection of semantic and covariate out-of-distribution (OOD) examples is a critical yet overlooked challenge in digital pathology (DP). Recently, substantial insights and methods on OOD detection were presented by the ML community, but how do they fare in DP applications? To this end, we establish a benchmark study, our highlights being: 1) the adoption of proper evaluation protocols, 2) the comparison of diverse detectors in both single- and multi-model settings, and 3) the exploration of advanced ML settings like transfer learning (ImageNet vs. DP pre-training) and choice of architecture (CNNs vs. transformers). Through our comprehensive experiments, we contribute new insights and guidelines, paving the way for future research and discussion.