CosmoFlow: Using Deep Learning to Learn the Universe at Scale

  • 2018-08-14 14:54:37
  • Amrita Mathuriya, Deborah Bard, Peter Mendygral, Lawrence Meadows, James Arnemann, Lei Shao, Siyu He, Tuomas Karna, Daina Moise, Simon J. Pennycook, Kristyn Maschoff, Jason Sewall, Nalini Kumar, Shirley Ho, Mike Ringenburg, Prabhat, Victor Lee

Abstract

Deep learning is a promising tool to determine the physical model that describes our universe. To handle the considerable computational cost of this problem, we present CosmoFlow: a highly scalable deep learning application built on top of the TensorFlow framework. CosmoFlow uses efficient implementations of 3D convolution and pooling primitives, together with improvements in threading for many element-wise operations, to improve training performance on Intel(C) Xeon Phi(TM) processors. We also utilize the Cray PE Machine Learning Plugin for efficient scaling to multiple nodes. We demonstrate fully synchronous data-parallel training on 8192 nodes of Cori with 77% parallel efficiency, achieving 3.5 Pflop/s sustained performance. To our knowledge, this is the first large-scale science application of the TensorFlow framework at supercomputer scale with fully synchronous training. These enhancements enable us to process large 3D dark matter distributions and predict the cosmological parameters $\Omega_M$, $\sigma_8$ and $n_s$ with unprecedented accuracy.

 
