Long Short-Term Memory Networks (LSTMs) have been applied to daily discharge prediction with remarkable success. Many practical scenarios, however, require predictions at more granular timescales. For instance, accurate prediction of short but extreme flood peaks can make a life-saving difference, yet such peaks may escape the coarse temporal resolution of daily predictions. Naively training an LSTM on hourly data, however, entails very long input sequences that make learning hard and computationally expensive. In this study, we propose two Multi-Timescale LSTM (MTS-LSTM) architectures that jointly predict multiple timescales within one model, as they process long-past inputs at a single temporal resolution and branch out into each individual timescale for more recent input steps. We test these models on 516 basins across the continental United States and benchmark against the US National Water Model. Compared to naive prediction with a distinct LSTM per timescale, the multi-timescale architectures are computationally more efficient with no loss in accuracy. Beyond prediction quality, the multi-timescale LSTM can process different input variables at different timescales, which is especially relevant to operational applications where the lead time of meteorological forcings depends on their temporal resolution.
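To make the sequence-length argument concrete, the following is a minimal sketch of the input-side idea only: long-past hourly inputs are aggregated to a single coarse (here daily) resolution, and only the most recent steps are kept at hourly resolution. It is not the MTS-LSTM itself and contains no network. The function name `split_multi_timescale`, the use of a daily mean as the aggregation, and the window lengths in the usage lines are illustrative assumptions, not values taken from the study.

```python
from statistics import fmean


def split_multi_timescale(hourly, recent_hours, hours_per_day=24):
    """Split an hourly series into a coarse long-past part and a fine recent part.

    Hypothetical helper for illustration. Returns (daily_means, recent_hourly):
    daily means of everything before the last `recent_hours` steps, followed by
    those last steps unchanged. Mean aggregation is an assumption; a sum may be
    more appropriate for variables such as precipitation.
    """
    if not 0 <= recent_hours <= len(hourly):
        raise ValueError("recent_hours must lie between 0 and len(hourly)")
    split = len(hourly) - recent_hours
    past, recent = hourly[:split], hourly[split:]
    if len(past) % hours_per_day != 0:
        raise ValueError("long-past part must cover whole days")
    # One coarse step per day of long-past input.
    daily_means = [
        fmean(past[i:i + hours_per_day])
        for i in range(0, len(past), hours_per_day)
    ]
    return daily_means, recent


# Illustrative window lengths (assumed): one year of history, last 14 days hourly.
hourly = [float(h % 24) for h in range(365 * 24)]
daily_means, recent = split_multi_timescale(hourly, recent_hours=14 * 24)
print(len(hourly), len(daily_means) + len(recent))
```

Under these assumed window lengths, the 8760 hourly steps shrink to 351 daily steps plus 336 hourly steps, 687 in total. This reduction is the kind of saving the abstract attributes to processing long-past inputs at a single temporal resolution.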