We examine the applicability of modern neural network architectures to the midterm prediction of earthquakes. Our data-driven classification model aims to predict whether an earthquake with magnitude above a threshold will occur in a given area of size $10 \times 10$ kilometers in $30$-$180$ days from a given moment. Our deep neural network model has a recurrent part (LSTM) that accounts for temporal dependencies between earthquakes and a convolutional part that accounts for spatial dependencies. The results show that neural network-based models outperform baseline feature-based models that also account for spatio-temporal dependencies between earthquakes. On historical data on earthquakes in Japan from $1990$ to $2016$, our best model predicts earthquakes with magnitude $M_c > 5$ with ROC AUC $0.975$ and PR AUC $0.0890$, making $1.18 \cdot 10^3$ correct predictions while missing $2.09 \cdot 10^3$ earthquakes and raising $192 \cdot 10^3$ false alarms. The baseline approach has a similar ROC AUC of $0.992$, a similar number of correct predictions ($1.19 \cdot 10^3$) and missed earthquakes ($2.07 \cdot 10^3$), but a significantly worse PR AUC of $0.00911$ and many more false alarms ($1004 \cdot 10^3$).
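The reported counts can be turned into precision and recall, which helps explain why the two models have similar ROC AUC but very different PR AUC. The sketch below is illustrative only: it assumes that "correct predictions", "missed earthquakes", and "false alarms" correspond to true positives, false negatives, and false positives at a single decision threshold, and the helper name `precision_recall` is hypothetical rather than part of the authors' code.

```python
def precision_recall(tp, fn, fp):
    """Return (precision, recall) from confusion-matrix counts."""
    precision = tp / (tp + fp)  # share of alarms that were real earthquakes
    recall = tp / (tp + fn)     # share of earthquakes that were predicted
    return precision, recall


# Counts as quoted in the abstract (rounded there to three significant figures).
# Assumption: they map to true positives, false negatives, and false positives.
best_model = {"tp": 1.18e3, "fn": 2.09e3, "fp": 192e3}
baseline = {"tp": 1.19e3, "fn": 2.07e3, "fp": 1004e3}

for name, counts in (("best model", best_model), ("baseline", baseline)):
    p, r = precision_recall(**counts)
    print(f"{name}: precision = {p:.4f}, recall = {r:.3f}")
```

Under these assumptions both models recover roughly 36% of the earthquakes, while the neural network's precision is about five times that of the baseline. This is consistent with the gap in PR AUC and in the number of false alarms.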