Abstract
Antimicrobial peptides (AMPs) are a promising class of therapeutics to treat bacterial infections. Discovering and designing such peptides is difficult because of the vast number of possible sequences of amino acids. Deep generative models, such as variational autoencoders, have shown value in peptide design due to their ability to model sequence space with a continuous-valued latent space. Although such models have already been used to great effect in biomolecular design, they still suffer from a lack of interpretability and rigorous quantification of latent space quality as a search space. We investigate (1) whether searching through a dimensionally-reduced variant of the latent design space may facilitate optimization, (2) how organizing latent spaces with physicochemical properties may improve the efficiency of optimizing antimicrobial activity, and (3) the interpretability of the spaces. We find that employing a dimensionally-reduced version of the latent space is more interpretable and can be advantageous, while we can organize the latent space with different physicochemical properties even at different percentages of available labels. This work lays crucial groundwork for biophysically-motivated peptide design procedures.