Reliable Distributed Clustering with Redundant Data Assignment

  • 2020-02-20 17:44:37
  • Venkata Gandikota, Arya Mazumdar, Ankit Singh Rawat

Abstract

In this paper, we present distributed generalized clustering algorithms that can handle large scale data across multiple machines in spite of straggling or unreliable machines. We propose a novel data assignment scheme that enables us to obtain global information about the entire data even when some machines fail to respond with the results of the assigned local computations. The assignment scheme leads to distributed algorithms with good approximation guarantees for a variety of clustering and dimensionality reduction problems.

 
