A model and expansion plan have been developed to optimally determine microgrid designs as they evolve to react dynamically to changing conditions and to exploit energy storage capabilities. With a highly electrified future ahead, the role of energy storage is crucial wherever distributed generation is abundant, such as in microgrid settings. Given the variety of storage options that have recently become more economical, determining which type of storage technology to invest in, along with the appropriate timing and capacity, becomes a critical research question. In problems where investment timing is of high priority, like this one, developing analytical and systematic frameworks for rigorously considering these issues is indispensable. From a business perspective, these strategic frameworks aim to optimize the process of investment planning by leveraging novel approaches and by capturing the problem details that traditional approaches are unable to capture. Reinforcement learning algorithms have recently proven successful in problems where sequential decision-making is inherent. In the operations planning area, these algorithms are already used, but mostly in short-term problems with well-defined constraints and low levels of uncertainty modeling. In contrast, in this work we expand and tailor these techniques to long-term investment planning by utilizing model-free approaches, like the Q-learning algorithm, combined with simulation-based models. We find that specific types of energy storage units, including the vanadium-redox battery, can be expected to be at the core of future microgrid applications and therefore require further attention. Another key finding is that the optimal storage capacity threshold for a system depends heavily on the price movements of the available storage units in the market.