Abstract
Proteins are essential for almost all biological processes and derive their diverse functions from complex 3D structures, which are in turn determined by their amino acid sequences. In this paper, we exploit the rich biological inductive bias of amino acid sequences and introduce FoldFlow-2, a novel sequence-conditioned SE(3)-equivariant flow matching model for protein structure generation. FoldFlow-2 presents substantial new architectural features over the previous FoldFlow family of models, including a protein large language model to encode sequence, a new multi-modal fusion trunk that combines structure and sequence representations, and a geometric transformer-based decoder. To increase the diversity and novelty of generated samples -- crucial for de novo drug design -- we train FoldFlow-2 at scale on a new dataset that is an order of magnitude larger than the PDB datasets of prior works, containing both known proteins in PDB and high-quality synthetic structures obtained through filtering. We further demonstrate the ability to align FoldFlow-2 to arbitrary rewards, e.g. increasing secondary structure diversity, by introducing a Reinforced Finetuning (ReFT) objective. We empirically observe that FoldFlow-2 outperforms previous state-of-the-art protein structure-based generative models, improving over RFDiffusion in unconditional generation on all metrics, including designability, diversity, and novelty, across all protein lengths, as well as exhibiting generalization on the task of equilibrium conformation sampling. Finally, we demonstrate that a fine-tuned FoldFlow-2 makes progress on challenging conditional design tasks such as designing scaffolds for the VHH nanobody.