Reliable identification of molecular biomarkers is essential for accurate patient stratification. While state-of-the-art machine learning approaches for sample classification continue to push boundaries in terms of performance, most of these methods are not able to integrate different data types and lack generalization power, limiting their application in a clinical setting. Furthermore, many methods behave as black boxes, and we have very little understanding of the mechanisms that lead to the prediction provided. While opaqueness concerning machine behaviour might not be a problem in deterministic domains, in health care, providing explanations about the molecular factors and phenotypes that are driving the classification is crucial to build trust in the performance of the predictive system. We propose Pathway Induced Multiple Kernel Learning (PIMKL), a novel methodology to reliably classify samples that can also help gain insights into the molecular mechanisms that underlie the classification. PIMKL exploits prior knowledge in the form of a molecular interaction network and annotated gene sets, by optimizing a mixture of pathway-induced kernels using a Multiple Kernel Learning (MKL) algorithm, an approach that has demonstrated excellent performance in different machine learning applications. After optimizing the combination of kernels for prediction of a specific phenotype, the model provides a stable molecular signature that can be interpreted in the light of the ingested prior knowledge and that can be used in transfer learning tasks.
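To make the idea of a mixture of pathway-induced kernels more concrete, the following is a minimal sketch under stated assumptions, not the PIMKL implementation. It assumes that each gene set induces a kernel through the normalized graph Laplacian of its subnetwork in the interaction network, and it replaces the MKL optimizer with a simple kernel-target alignment heuristic for the mixture weights. The function names, the toy network, and the gene sets are all hypothetical.

```python
import numpy as np


def normalized_laplacian(A):
    """Symmetric normalized Laplacian of an undirected graph (isolated nodes get a zero row)."""
    deg = A.sum(axis=1)
    inv_sqrt = np.zeros(len(deg))
    connected = deg > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(deg[connected])
    return np.diag(connected.astype(float)) - inv_sqrt[:, None] * A * inv_sqrt[None, :]


def pathway_kernel(X, adjacency, gene_idx):
    """Gram matrix K = X_p L_p X_p^T restricted to the genes of one pathway (assumed kernel form)."""
    L = normalized_laplacian(adjacency[np.ix_(gene_idx, gene_idx)])
    Xp = X[:, gene_idx]
    return Xp @ L @ Xp.T


def alignment_weights(kernels, y):
    """Stand-in for an MKL solver: weights proportional to centered kernel-target alignment."""
    n = len(y)
    H = np.eye(n) - np.full((n, n), 1.0 / n)  # centering matrix
    yc = y - y.mean()
    target = np.outer(yc, yc)
    scores = []
    for K in kernels:
        Kc = H @ K @ H
        denom = np.linalg.norm(Kc) * np.linalg.norm(target)
        score = np.sum(Kc * target) / denom if denom > 0 else 0.0
        scores.append(max(score, 0.0))  # negative alignment is clipped to zero
    scores = np.array(scores)
    total = scores.sum()
    if total == 0:  # fall back to a uniform mixture
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / total


# Toy data: 6 samples, 5 genes, a small hypothetical interaction network.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5))
y = np.array([1, 1, 1, -1, -1, -1], dtype=float)
adjacency = np.array(
    [
        [0, 1, 1, 0, 0],
        [1, 0, 1, 0, 0],
        [1, 1, 0, 1, 0],
        [0, 0, 1, 0, 1],
        [0, 0, 0, 1, 0],
    ],
    dtype=float,
)
pathways = {"pathway_A": [0, 1, 2], "pathway_B": [2, 3, 4]}  # hypothetical gene sets

kernels = [pathway_kernel(X, adjacency, idx) for idx in pathways.values()]
weights = alignment_weights(kernels, y)
K_combined = sum(w * K for w, K in zip(weights, kernels))
```

In this sketch the per-pathway weights play the role of the molecular signature: a larger weight suggests that the corresponding pathway contributes more to separating the phenotypes. The combined Gram matrix could then be passed to any kernel classifier that accepts a precomputed kernel.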