Towards provably efficient quantum algorithms for large-scale machine-learning models

Abstract

Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include huge computational expenses, power, and time used both in the pre-training and fine-tuning process. In this work, we show that fault-tolerant quantum computing could possibly provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as $\mathcal{O}(T^2 \times \text{polylog}(n))$, where $n$ is the size of the models and $T$ is the number of iterations in the training, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Based on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm for machine learning. In practice, we benchmark instances of large machine learning models from 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work shows solidly that fault-tolerant quantum algorithms could potentially contribute to most state-of-the-art, large-scale machine-learning problems.
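
For orientation, the gradient descent iteration the abstract refers to can be written in its standard textbook form. The symbols $\theta_t$ (the $n$ model parameters at iteration $t$), $\eta$ (the learning rate), and $\mathcal{L}$ (the loss function) are notation assumed here for illustration and are not taken from the paper:

$$\theta_{t+1} = \theta_t - \eta \, \nabla \mathcal{L}(\theta_t), \qquad t = 0, 1, \dots, T-1.$$

In the stochastic variant, $\nabla \mathcal{L}(\theta_t)$ is replaced by a gradient estimate computed on a randomly sampled mini-batch. Read this way, the stated scaling $\mathcal{O}(T^2 \times \text{polylog}(n))$ is polylogarithmic in the number of parameters $n$ and quadratic in the number of iterations $T$, under the abstract's conditions that the model is sufficiently dissipative and sparse and that $\eta$ is small.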
