We formulate the problem of sampling and recovering clustered graph signals as a multi-armed bandit (MAB) problem. This formulation lends itself naturally to learning sampling strategies using the well-known gradient MAB algorithm. In particular, the sampling strategy is represented as a probability distribution over the individual arms of the MAB and optimized using gradient ascent. Illustrative numerical experiments indicate that sampling strategies based on the gradient MAB algorithm outperform existing sampling methods.
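
To make the idea of a distribution over arms optimized by gradient ascent concrete, the following is a minimal sketch of the textbook gradient bandit update with a softmax policy and a running-average baseline. It is an illustration under stated assumptions, not the method evaluated in the experiments: the function names, the step size `lr`, and the toy reward `toy_reward` are hypothetical, and the reward actually used for graph signal recovery is not specified here.

```python
import numpy as np

def softmax(prefs):
    """Map arm preferences to a probability distribution (numerically stable)."""
    z = np.exp(prefs - prefs.max())
    return z / z.sum()

def gradient_bandit(reward_fn, num_arms, steps=3000, lr=0.1, seed=0):
    """Textbook gradient bandit: stochastic gradient ascent on expected reward.

    reward_fn(arm, rng) is a hypothetical callback returning a scalar reward.
    Returns the learned sampling distribution over the arms.
    """
    rng = np.random.default_rng(seed)
    prefs = np.zeros(num_arms)   # one preference per arm, uniform start
    baseline = 0.0               # running average of observed rewards
    for t in range(1, steps + 1):
        probs = softmax(prefs)
        arm = rng.choice(num_arms, p=probs)
        reward = reward_fn(arm, rng)
        advantage = reward - baseline
        # Gradient of log pi(arm) with respect to prefs is one_hot(arm) - probs.
        one_hot = np.zeros(num_arms)
        one_hot[arm] = 1.0
        prefs += lr * advantage * (one_hot - probs)
        baseline += (reward - baseline) / t
    return softmax(prefs)

# Assumed toy reward: arm 2 is the most informative one on average.
ARM_MEANS = np.array([0.1, 0.5, 0.9])

def toy_reward(arm, rng):
    return ARM_MEANS[arm] + 0.1 * rng.standard_normal()

probs = gradient_bandit(toy_reward, num_arms=3)
```

In this toy setting the learned distribution `probs` should concentrate on the arm with the largest mean reward. In a graph sampling context each arm might correspond to a candidate node or sampling action, but that mapping is an assumption of this sketch.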

