Abstract
Despite the extent of recent advances in Machine Learning (ML) and NeuralNetworks, providing formal guarantees on the behavior of these systems is stillan open problem, and a crucial requirement for their adoption in regulated orsafety-critical scenarios. We consider the task of training differentiable MLmodels guaranteed to satisfy designer-chosen properties, stated as input-outputimplications. This is very challenging, due to the computational complexity ofrigorously verifying and enforcing compliance in modern neural models. Weprovide an innovative approach based on three components: 1) a general, simplearchitecture enabling efficient verification with a conservative semantic; 2) arigorous training algorithm based on the Projected Gradient Method; 3) aformulation of the problem of searching for strong counterexamples. Theproposed framework, being only marginally affected by model complexity, scaleswell to practical applications, and produces models that provide full propertysatisfaction guarantees. We evaluate our approach on properties defined bylinear inequalities in regression, and on mutually exclusive classes inmultilabel classification. Our approach is competitive with a baseline thatincludes property enforcement during preprocessing, i.e. on the training data,as well as during postprocessing, i.e. on the model predictions. Finally, ourcontributions establish a framework that opens up multiple research directionsand potential improvements.