We use deep learning to model interactions across two or more sets ofobjects, such as user-movie ratings or protein-drug bindings. The canonicalrepresentation of such interactions is a matrix (or tensor) with anexchangeability property: the encoding's meaning is not changed by permutingrows or columns. We argue that models should hence be Permutation Equivariant(PE): constrained to make the same predictions across such permutations. Wepresent a parameter-sharing scheme and prove that it could not be made any moreexpressive without violating PE. This scheme yields three benefits. First, wedemonstrate performance competitive with the state of the art on multiplematrix completion benchmarks. Second, our models require a number of parametersindependent of the numbers of objects, and thus scale well to large datasets.Third, models can be queried about new objects that were not available attraining time, but for which interactions have since been observed. We observedsurprisingly good generalization performance on this matrix extrapolation task,both within domains (e.g., new users and new movies drawn from the samedistribution used for training) and even across domains (e.g., predicting musicratings after training on movie ratings).