Scenic: A Language for Scenario Specification and Data Generation

  • 2020-10-13 17:58:31
  • Daniel J. Fremont, Edward Kim, Tommaso Dreossi, Shromona Ghosh, Xiangyu Yue, Alberto L. Sangiovanni-Vincentelli, Sanjit A. Seshia

Abstract

We propose a new probabilistic programming language for the design and analysis of cyber-physical systems, especially those based on machine learning. Specifically, we consider the problems of training a system to be robust to rare events, testing its performance under different conditions, and debugging failures. We show how a probabilistic programming language can help address these problems by specifying distributions encoding interesting types of inputs, then sampling these to generate specialized training and test data. More generally, such languages can be used to write environment models, an essential prerequisite to any formal analysis. In this paper, we focus on systems like autonomous cars and robots, whose environment at any point in time is a 'scene', a configuration of physical objects and agents. We design a domain-specific language, Scenic, for describing scenarios that are distributions over scenes and the behaviors of their agents over time. As a probabilistic programming language, Scenic allows assigning distributions to features of the scene, as well as declaratively imposing hard and soft constraints over the scene. We develop specialized techniques for sampling from the resulting distribution, taking advantage of the structure provided by Scenic's domain-specific syntax. Finally, we apply Scenic in a case study on a convolutional neural network designed to detect cars in road images, improving its performance beyond that achieved by state-of-the-art synthetic data generation methods.
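
To make the idea of distributions over scene features with hard and soft constraints concrete, here is a minimal sketch in plain Python. It is not Scenic syntax, and it uses naive rejection sampling rather than the specialized sampling techniques the abstract mentions. The function name `sample_scene`, the feature names, the numeric ranges, and the 10% tolerance for the soft constraint are all hypothetical choices made for illustration.

```python
import math
import random


def sample_scene(rng, max_tries=1000):
    """Draw one toy scene by rejection sampling (hypothetical illustration)."""
    for _ in range(max_tries):
        # Distributions over scene features: the ego car sits at the origin,
        # and another car is placed uniformly in a 40 m x 10 m region ahead.
        car = {
            "x": rng.uniform(0.0, 40.0),
            "y": rng.uniform(-5.0, 5.0),
            "heading_deg": rng.gauss(0.0, 10.0),
        }
        distance = math.hypot(car["x"], car["y"])
        # Hard constraint: the car must be at least 5 m from the ego car.
        if distance < 5.0:
            continue
        # Soft constraint (one simple interpretation): prefer |y| <= 2 m,
        # but let a violating sample through 10% of the time.
        if abs(car["y"]) > 2.0 and rng.random() >= 0.1:
            continue
        return car
    raise RuntimeError("no valid scene found within max_tries")


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for _ in range(3):
        print(sample_scene(rng))
```

Each call returns one sampled configuration, and repeated calls yield a dataset of scenes that all satisfy the hard constraint. Rejection sampling becomes wasteful when constraints are rarely satisfied, which is the kind of inefficiency that structure-aware sampling is meant to reduce.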

 
