Abstract
Machine learning and deep learning have grown in popularity and use in biological research over the last decade but still present challenges in the interpretability of the fitted model. The development and use of metrics to determine the features driving predictions and to increase model interpretability continue to be an open area of research. We investigate the use of Shapley Additive Explanations (SHAP) on a multi-view deep learning model applied to multi-omics data for the purpose of identifying biomolecules of interest. Rankings of features obtained via these attribution methods are compared across various architectures to evaluate the consistency of the method. We perform multiple computational experiments to assess the robustness of SHAP and investigate modeling approaches and diagnostics to increase and measure the reliability of the identification of important features. The accuracy of a random-forest model fit on subsets of features selected as most influential, as well as clustering quality using only these features, are used as measures of the effectiveness of the attribution method. Our findings indicate that the rankings of features resulting from SHAP are sensitive to the choice of architecture as well as to different random initializations of weights, suggesting caution when using attribution methods on multi-view deep learning models applied to multi-omics data. We present an alternative, simple method to assess the robustness of the identification of important biomolecules.
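To make the notion of ranking consistency concrete, the following is a minimal sketch of how two feature rankings might be compared. It is an illustration under stated assumptions, not the procedure used in this work: the function names, the choice of top-k Jaccard overlap and Spearman rank correlation as agreement measures, and the example scores are all hypothetical.

```python
def rank_features(scores):
    """Return feature indices ordered from most to least important.

    `scores` is assumed to hold one attribution value per feature,
    e.g. a mean absolute SHAP value (hypothetical input).
    """
    return sorted(range(len(scores)), key=lambda i: -abs(scores[i]))


def top_k_jaccard(scores_a, scores_b, k):
    """Jaccard overlap of the k highest-ranked features in two runs."""
    top_a = set(rank_features(scores_a)[:k])
    top_b = set(rank_features(scores_b)[:k])
    return len(top_a & top_b) / len(top_a | top_b)


def spearman(scores_a, scores_b):
    """Spearman rank correlation of two rankings, assuming no tied scores."""
    n = len(scores_a)
    pos_a = {f: r for r, f in enumerate(rank_features(scores_a))}
    pos_b = {f: r for r, f in enumerate(rank_features(scores_b))}
    d2 = sum((pos_a[i] - pos_b[i]) ** 2 for i in range(n))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Made-up attribution scores for five features from two training runs
# that differ only in their random weight initialization.
run_1 = [0.40, 0.10, 0.30, 0.05, 0.20]
run_2 = [0.35, 0.25, 0.05, 0.10, 0.30]

print(top_k_jaccard(run_1, run_2, k=3))  # 0.5
print(spearman(run_1, run_2))
```

An overlap or correlation well below 1 across runs that differ only in architecture or initialization would point to the kind of sensitivity described above.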