### Abstract

Hamiltonian simulation is believed to be one of the first tasks where quantum computers can yield a quantum advantage. One of the most popular methods of Hamiltonian simulation is Trotterization, which makes use of the approximation $e^{i\sum_jA_j}\sim \prod_je^{iA_j}$ and higher-order corrections thereto. However, this leaves open the question of the order of operations (i.e. the order of the product over $j$, which is known to affect the quality of approximation). In some cases this order is fixed by the desire to minimise the error of approximation; when it is not the case, we propose that the order can be chosen to optimize compilation to a native quantum architecture. This presents a new compilation problem -- order-agnostic quantum circuit compilation -- which we prove is NP-hard in the worst case. In lieu of an easily-computable exact solution, we turn to methods of heuristic optimization of compilation. We focus on reinforcement learning due to the sequential nature of the compilation task, comparing it to simulated annealing and Monte Carlo tree search. While two of the methods outperform a naive heuristic, reinforcement learning clearly outperforms all others, with a gain of around 12% with respect to the second-best method and of around 50% compared to the naive heuristic in terms of the gate count. We further test the ability of RL to generalize across instances of the compilation problem, and find that a single learner is able to solve entire problem families. This demonstrates the ability of machine learning techniques to provide assistance in an order-agnostic quantum compilation task.
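The order dependence of the first-order product formula can be checked numerically. The following is a minimal sketch, not the setup used in this work: the three single-qubit terms, the scale `t`, and the helper names `expi` and `trotter_error` are all assumptions chosen for illustration. It compares $\prod_j e^{iA_j}$ against $e^{i\sum_j A_j}$ for every ordering of the terms.

```python
import itertools
import numpy as np

def expi(hermitian):
    """Return exp(1j * hermitian) via eigendecomposition (Hermitian input only)."""
    w, v = np.linalg.eigh(hermitian)
    # Scale column k of v by exp(i * w_k), then rotate back.
    return (v * np.exp(1j * w)) @ v.conj().T

# Pauli matrices.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Hypothetical toy terms A_j (an assumption, not taken from the paper).
t = 0.02
terms = [t * X, t * Y, t * (X + Y + Z)]

# Target unitary exp(i * sum_j A_j).
exact = expi(sum(terms))

def trotter_error(order):
    """Spectral-norm error of the product of exp(i A_j) taken in the given order."""
    product = np.eye(2, dtype=complex)
    for j in order:
        product = product @ expi(terms[j])
    return np.linalg.norm(product - exact, ord=2)

errors = {order: trotter_error(order)
          for order in itertools.permutations(range(len(terms)))}

for order, err in sorted(errors.items(), key=lambda kv: kv[1]):
    print(order, f"{err:.2e}")
```

For these toy terms the printed errors are not the same for every ordering, which illustrates why the order is sometimes fixed by accuracy requirements. Where several orderings are equally acceptable, the remaining freedom is what the order-agnostic compilation problem exploits.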