t-SNE-CUDA: GPU-Accelerated t-SNE and its Applications to Modern Data

  • 2018-07-31 14:04:33
  • David M. Chan, Roshan Rao, Forrest Huang, John F. Canny
Modern datasets and models are notoriously difficult to explore and analyze due to their inherent high dimensionality and massive numbers of samples. Existing visualization methods that employ dimensionality reduction to two or three dimensions are often inefficient and/or ineffective for these datasets. This paper introduces t-SNE-CUDA, a GPU-accelerated implementation of t-distributed Stochastic Neighbor Embedding (t-SNE) for visualizing datasets and models. t-SNE-CUDA significantly outperforms current implementations, with 50-700x speedups on the CIFAR-10 and MNIST datasets. These speedups enable, for the first time, visualization of the neural network activations on the entire ImageNet dataset, a feat that was previously computationally intractable. We also demonstrate visualization performance in the NLP domain by visualizing the GloVe embedding vectors. From these visualizations, we can draw interesting conclusions about using the L2 metric in these embedding spaces. t-SNE-CUDA is publicly available at https://github.com/CannyLab/tsne-cuda
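
As background on why t-SNE is expensive at scale, the sketch below computes the low-dimensional Student-t affinities from the standard t-SNE formulation, q_ij = (1 + ||y_i - y_j||^2)^-1 / sum over k != l of (1 + ||y_k - y_l||^2)^-1, where the distance is the L2 metric. It is a plain-Python illustration and not code from t-SNE-CUDA; the function name `low_dim_affinities` and the three example points are assumptions made for the example. The nested loop touches every pair of points, which is the quadratic cost that makes naive implementations impractical for large datasets.

```python
import math


def low_dim_affinities(points):
    """Return q[i][j], the normalized Student-t affinities of standard t-SNE.

    `points` is a sequence of equal-length coordinate tuples (the embedding).
    This is a naive O(N^2) reference sketch, not an optimized implementation.
    """
    n = len(points)
    # Unnormalized kernel: w_ij = 1 / (1 + ||y_i - y_j||^2), with w_ii = 0.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq_dist = math.dist(points[i], points[j]) ** 2  # squared L2 distance
                w[i][j] = 1.0 / (1.0 + sq_dist)
    # Normalize over all ordered pairs so the affinities sum to 1.
    total = sum(sum(row) for row in w)
    return [[value / total for value in row] for row in w]


# Hypothetical 2-D embedding of three points, for illustration only.
embedding = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
q = low_dim_affinities(embedding)
```

Closer points receive larger affinities, so here `q[0][1]` exceeds `q[0][2]`. The heavy tail of the Student-t kernel keeps moderately distant points from being crushed together in the embedding.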


