In this study, we analyze and compare the performance of state-of-the-art deep reinforcement learning algorithms for solving the supply chain inventory management problem. This complex sequential decision-making problem consists of determining the optimal quantity of products to be produced and shipped across different warehouses over a given time horizon. In particular, we present a mathematical formulation of a two-echelon supply chain environment with stochastic and seasonal demand, which allows managing an arbitrary number of warehouses and product types. Through a rich set of numerical experiments, we compare the performance of different deep reinforcement learning algorithms under various supply chain structures, topologies, demands, capacities, and costs. The results of the experimental plan indicate that deep reinforcement learning algorithms outperform traditional inventory management strategies, such as the static (s, Q)-policy. Furthermore, this study provides detailed insight into the design and development of an open-source software library that provides a customizable environment for solving the supply chain inventory management problem using a wide range of data-driven approaches.
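
For readers unfamiliar with the baseline, the static (s, Q)-policy can be sketched as follows: whenever the inventory level falls to or below a reorder point s, a fixed quantity Q is ordered. This is a minimal illustration and not the implementation from the library described here. It assumes a single warehouse, a single product, zero lead time, and lost sales when demand exceeds stock. The function names `sq_policy_order` and `simulate` are hypothetical.

```python
def sq_policy_order(inventory_level, s, Q):
    """Return the order quantity under a static (s, Q) rule.

    Orders the fixed quantity Q when the inventory level is at or
    below the reorder point s, and nothing otherwise.
    """
    return Q if inventory_level <= s else 0


def simulate(s, Q, demands, initial_stock=0):
    """Apply the (s, Q) rule over a sequence of per-period demands.

    Assumptions (illustrative only): orders arrive immediately
    (zero lead time) and unmet demand is lost, not backordered.
    Returns the list of orders placed and the final stock level.
    """
    stock = initial_stock
    orders = []
    for demand in demands:
        order = sq_policy_order(stock, s, Q)
        stock += order                  # replenishment arrives at once
        stock = max(stock - demand, 0)  # serve demand; excess demand is lost
        orders.append(order)
    return orders, stock


if __name__ == "__main__":
    orders, final_stock = simulate(s=5, Q=10, demands=[3, 4, 2, 6], initial_stock=8)
    print(orders, final_stock)
```

The parameters s and Q stay fixed over the whole horizon, which is what makes the policy static. A learned policy can instead adapt its order quantities to stochastic and seasonal demand.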