Measurement-based adaptation protocol with quantum reinforcement learning

Abstract

Machine learning employs dynamical algorithms that mimic the human capacity to learn, with reinforcement learning algorithms among the closest to humans in this respect. Adaptability, in turn, is essential for performing any task efficiently in a changing environment, and it is fundamental to many processes, such as natural selection. Here, we propose an algorithm based on successive measurements to adapt one quantum state to an unknown reference state, in the sense of achieving maximum overlap. The protocol naturally provides many identical copies of the reference state, such that each measurement iteration yields more information about it. In our protocol, we consider a system composed of three parts: the "environment" system, which provides the reference state copies; the register, which is an auxiliary subsystem that interacts with the environment to acquire information from it; and the agent, which corresponds to the quantum state that is adapted by digital feedback, with the outcomes of the measurements on the register as input. With this proposal we can achieve an average fidelity between the environment and the agent of more than $90\%$ with fewer than $30$ iterations of the protocol. In addition, we extend the formalism to $d$-dimensional states, reaching an average fidelity of around $80\%$ in fewer than $400$ iterations for $d=11$, for a variety of genuinely quantum as well as semiclassical states. This work paves the way for the development of quantum reinforcement learning protocols using quantum data, and the future deployment of semi-autonomous quantum systems.
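To make the measurement-and-feedback loop described above concrete, the following is a minimal classical simulation for a single qubit. It is a sketch under stated assumptions, not the authors' implementation: the abstract does not specify the update rule, so the reward and punishment factors, the choice of random x and z rotations, the exploration range `delta`, and all function names are hypothetical. The register measurement is modelled simply as a projective measurement of a fresh environment copy in the agent's frame, so that outcome 0 occurs with probability equal to the current fidelity.

```python
import cmath
import math
import random


def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def apply(u, v):
    """Apply a 2x2 matrix to a 2-component state vector."""
    return [u[0][0] * v[0] + u[0][1] * v[1],
            u[1][0] * v[0] + u[1][1] * v[1]]


def rot_x(angle):
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c + 0j, -1j * s], [-1j * s, c + 0j]]


def rot_z(angle):
    return [[cmath.exp(-1j * angle / 2), 0j], [0j, cmath.exp(1j * angle / 2)]]


def fidelity(env, agent):
    """Squared overlap |<env|agent>|^2 of two normalized states."""
    overlap = env[0].conjugate() * agent[0] + env[1].conjugate() * agent[1]
    return abs(overlap) ** 2


def adapt(env, iterations=30, reward=0.9, punishment=1.1, seed=0):
    """Return the fidelity after each iteration (all parameters are assumptions)."""
    rng = random.Random(seed)
    d = [[1 + 0j, 0j], [0j, 1 + 0j]]   # accumulated unitary; agent = d|0>
    max_delta = 2 * math.pi            # assumed cap on the exploration range
    delta = max_delta
    history = []
    for _ in range(iterations):
        # Rotate a fresh environment copy into the agent's frame and measure it.
        probe = apply(dagger(d), env)
        p1 = abs(probe[1]) ** 2        # probability of the "mismatch" outcome
        outcome = 1 if rng.random() < p1 else 0
        if outcome == 0:
            # Reward: keep the agent and narrow the exploration range.
            delta = reward * delta
        else:
            # Punishment: widen the range and kick the agent with a random rotation.
            delta = min(punishment * delta, max_delta)
            kick = matmul(rot_z(rng.uniform(-delta / 2, delta / 2)),
                          rot_x(rng.uniform(-delta / 2, delta / 2)))
            d = matmul(d, kick)
        history.append(fidelity(env, apply(d, [1 + 0j, 0j])))
    return history


if __name__ == "__main__":
    theta, phi = 1.0, 0.5              # arbitrary example environment state
    env = [math.cos(theta / 2) + 0j, cmath.exp(1j * phi) * math.sin(theta / 2)]
    print(adapt(env, iterations=30)[-1])
```

In this sketch the agent only ever sees single-bit measurement outcomes, never the environment amplitudes themselves; the fidelity is computed purely for monitoring. Results depend on the seed and on the assumed reward and punishment factors, so any single run should not be read as reproducing the averages quoted in the abstract.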
