The paper presents a reinforcement learning solution to dynamic resource allocation for 5G radio access network slicing. Available communication resources (frequency-time blocks and transmit powers) and computational resources (processor usage) are allocated to stochastic arrivals of network slice requests. Each request arrives with priority (weight), throughput, computational resource, and latency (deadline) requirements and, if feasible, is served with communication and computational resources allocated over its requested duration. Because each resource allocation decision makes some of the resources temporarily unavailable for future requests, a myopic solution that optimizes only the current allocation becomes ineffective for network slicing. Therefore, a Q-learning solution is presented to maximize the network utility, defined as the total weight of granted network slicing requests over a time horizon, subject to communication and computational constraints. Results show that reinforcement learning provides major improvements in the 5G network utility relative to myopic, random, and first-come, first-served solutions. Reinforcement learning sustains scalable performance as the number of served users increases, and it can also be used effectively to assign resources to network slices when 5G must share the spectrum with incumbent users that may dynamically occupy some of the frequency-time blocks.
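
To make the admission decision concrete, the following is a minimal tabular Q-learning sketch in Python. It is an illustration under simplifying assumptions, not the paper's formulation: a single pool of `CAPACITY` resource blocks stands in for the joint communication and computational resources, the request distribution in `new_request` is invented, the hyperparameters `ALPHA`, `GAMMA`, and `EPSILON` are arbitrary, and the state omits throughput, latency, and the remaining lifetimes of active slices. All function names are hypothetical.

```python
import random
from collections import defaultdict

# Assumed hyperparameters and capacity; none of these values come from the paper.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
CAPACITY = 4  # single pool of resource blocks (a simplification)


def new_request(rng):
    """Draw a hypothetical request: (weight, blocks needed, duration in steps)."""
    return (rng.choice([1, 5]), rng.choice([1, 2]), rng.choice([1, 3]))


def q_update(q, state, action, reward, next_state):
    """Standard tabular Q-learning update toward reward + discounted best next value."""
    best_next = max(q[(next_state, a)] for a in (0, 1))
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])


def run_episode(q, rng, steps=50, learn=True):
    """Serve one stream of requests; return the total weight of granted requests."""
    active = []  # (blocks held, remaining steps) for each granted request
    req = new_request(rng)
    total = 0
    for _ in range(steps):
        free = CAPACITY - sum(b for b, _ in active)
        weight, need, duration = req
        state = (free, weight, need, duration)
        # Epsilon-greedy choice between reject (0) and grant (1).
        if learn and rng.random() < EPSILON:
            action = rng.choice([0, 1])
        else:
            action = max((0, 1), key=lambda a: q[(state, a)])
        if need > free:
            action = 0  # an infeasible request cannot be granted
        reward = weight if action == 1 else 0
        if action == 1:
            active.append((need, duration))
        total += reward
        # Advance time: release blocks whose requested duration has elapsed.
        active = [(b, t - 1) for b, t in active if t > 1]
        req = new_request(rng)
        next_free = CAPACITY - sum(b for b, _ in active)
        next_state = (next_free,) + req
        if learn:
            q_update(q, state, action, reward, next_state)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    q = defaultdict(float)
    for _ in range(500):
        run_episode(q, rng)
    print("greedy utility after training:", run_episode(q, rng, learn=False))
```

The reward for granting a request is its weight, so the return accumulated over an episode corresponds to the total weight of granted requests. Because a granted request holds its blocks for its whole duration, the learned values can favor rejecting a low-weight request in order to keep capacity for a later high-weight one, which is the behavior a myopic policy cannot capture.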