Abstract
Data from multiple environments offer valuable opportunities to uncovercausal relationships among variables. Leveraging the assumption that the causaloutcome model remains invariant across heterogeneous environments,state-of-the-art methods attempt to identify causal outcome models by learninginvariant prediction models and rely on exhaustive searches over all(exponentially many) covariate subsets. These approaches present two majorchallenges: 1) determining the conditions under which the invariant predictionmodel aligns with the causal outcome model, and 2) devising computationallyefficient causal discovery algorithms that scale polynomially, instead ofexponentially, with the number of covariates. To address both challenges, wefocus on the additive intervention regime and propose nearly necessary andsufficient conditions for ensuring that the invariant prediction model matchesthe causal outcome model. Exploiting the essentially necessary identifiabilityconditions, we introduce Negative Weight Distributionally Robust Optimization(NegDRO), a nonconvex continuous minimax optimization whose global optimizerrecovers the causal outcome model. Unlike standard group DRO problems thatmaximize over the simplex, NegDRO allows negative weights on environmentlosses, which break the convexity. Despite its nonconvexity, we demonstratethat a standard gradient method converges to the causal outcome model, and weestablish the convergence rate with respect to the sample size and the numberof iterations. Our algorithm avoids exhaustive search, making it scalableespecially when the number of covariates is large. The numerical resultsfurther validate the efficiency of the proposed method.