The performance of the code generated by a compiler depends on the order in which the optimization passes are applied. In high-level synthesis (HLS), the quality of the generated circuit relates directly to the code generated by the front-end compiler. Choosing a good order, often referred to as the phase-ordering problem, is NP-hard. In this paper, we evaluate a new technique to address the phase-ordering problem: deep reinforcement learning. We implement a framework in the context of the LLVM compiler to optimize the ordering for HLS programs and compare the performance of deep reinforcement learning to state-of-the-art algorithms that address the phase-ordering problem. Overall, our framework runs one to two orders of magnitude faster than these algorithms, and achieves a 16% improvement in circuit performance over the -O3 compiler flag.