Abstract
Causal understanding is essential for many kinds of decision-making, but causal inference from observational data has typically only been applied to structured, low-dimensional datasets. While text classifiers produce low-dimensional outputs, their use in causal inference has not previously been studied. To facilitate causal analyses based on language data, we consider the role that text classifiers can play in causal inference through established modeling mechanisms from the causality literature on missing data and measurement error. We demonstrate how to conduct causal analyses using text classifiers on simulated and Yelp data, and discuss the opportunities and challenges of future work that uses text data in causal inference.