Genomic selection (GS) is a technique that plant breeders use to select individuals to mate and produce new generations of a species. Resource allocation is a key factor in GS: at each selection cycle, breeders must decide how to allocate a budget to make crosses and produce the next generation of breeding parents. Inspired by recent advances in reinforcement learning for AI problems, we develop a reinforcement learning-based algorithm that automatically learns to allocate limited resources across different generations of breeding. We formulate the problem mathematically as a Markov Decision Process (MDP) by defining state and action spaces. To avoid an explosion of the state space, we propose an integer linear program that quantifies the trade-off between resources and time. Finally, we propose a value function approximation method to estimate the action-value function and then develop a greedy policy improvement technique to find the optimal resource allocation. We demonstrate the effectiveness of the proposed method in enhancing genetic gain using a case study with realistic data.
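As an illustration only, the following minimal Python sketch shows the general pattern of approximating an action-value function and then acting greedily with respect to it in a budget-allocation setting. It is not the algorithm developed in this work. The feature map `features`, the reward `toy_gain`, the horizon, the budget, and the spending step are all hypothetical stand-ins; the action-value function is fitted by least squares on Monte Carlo returns of a random policy; and the integer linear program is omitted entirely.

```python
import numpy as np


def features(t, remaining, a, horizon, budget):
    """Hypothetical feature map phi(s, a) for state (t, remaining) and spend a."""
    x = a / budget          # fraction of the total budget spent now
    r = remaining / budget  # fraction of the budget still available
    g = t / horizon         # normalized generation index
    return np.array([1.0, x, x * x, r, g, x * g, x * r])


def toy_gain(a, t, horizon):
    """Toy reward with diminishing returns in spend (NOT a genetic simulator)."""
    return np.sqrt(a) * (1.0 + 0.5 * t / horizon)


def rollout(rng, horizon, budget, step):
    """Run one episode under a random feasible policy.

    Returns a list of (phi(s, a), return-to-go) pairs.
    """
    remaining, trajectory = budget, []
    for t in range(horizon):
        actions = np.arange(0, remaining + 1, step)  # feasible spends
        a = int(rng.choice(actions))
        phi = features(t, remaining, a, horizon, budget)
        trajectory.append((phi, toy_gain(a, t, horizon)))
        remaining -= a
    pairs, ret = [], 0.0
    for phi, reward in reversed(trajectory):  # accumulate returns backwards
        ret += reward
        pairs.append((phi, ret))
    return pairs


def fit_q(horizon=4, budget=100, step=10, episodes=2000, seed=0):
    """Least-squares fit of a linear model Q(s, a) ~ w . phi(s, a)."""
    rng = np.random.default_rng(seed)
    data = [p for _ in range(episodes)
            for p in rollout(rng, horizon, budget, step)]
    X = np.stack([phi for phi, _ in data])
    y = np.array([ret for _, ret in data])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def greedy_allocation(w, horizon=4, budget=100, step=10):
    """One greedy improvement step: spend argmax_a Q(s, a) in each generation."""
    remaining, plan = budget, []
    for t in range(horizon):
        actions = np.arange(0, remaining + 1, step)
        q_values = [features(t, remaining, a, horizon, budget) @ w
                    for a in actions]
        a = int(actions[int(np.argmax(q_values))])
        plan.append(a)
        remaining -= a
    return plan
```

Calling `greedy_allocation(fit_q())` returns one spend per generation that never exceeds the remaining budget. Because the fitted function evaluates the random policy, a single greedy step is only a policy improvement over that baseline, not a guarantee of optimality.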