Abstract
During the 2017 NBA playoffs, Celtics coach Brad Stevens was faced with a difficult decision when defending against the Cavaliers: "Do you double and risk giving up easy shots, or stay at home and do the best you can?" It's a tough call, but finding a good defensive strategy that effectively incorporates doubling can make all the difference in the NBA. In this paper, we analyze double teaming in the NBA, quantifying the trade-off between risk and reward. Using player trajectory data pertaining to over 643,000 possessions, we identified when the ball handler was double teamed. Given these data and the corresponding outcome (i.e., whether the defense was successful), we used deep reinforcement learning to estimate the quality of the defensive actions. We present qualitative and quantitative results summarizing our learned defensive strategy. We show that our policy value estimates are predictive of points per possession and win percentage. Overall, the proposed framework represents a step toward a more comprehensive understanding of defensive strategies in the NBA.