Accurately predicting the motion of surrounding agents, such as pedestrians and vehicles, is a critical task for robots performing autonomous navigation. Recent research on multi-modal trajectory prediction, including regression and classification approaches, performs very well at short-term prediction. However, when it comes to long-term prediction, most Long Short-Term Memory (LSTM) based models tend to diverge far from the ground truth. Therefore, in this work, we present a two-stage framework for long-term trajectory prediction, named Long-Term Network (LTN). Our Long-Term Network integrates both the regression and classification approaches. We first generate a set of proposed trajectories with our proposed distribution using a Conditional Variational Autoencoder (CVAE), then classify them with binary labels, and output the trajectories with the highest scores. We demonstrate our Long-Term Network's performance with experiments on two real-world pedestrian datasets, ETH/UCY and the Stanford Drone Dataset (SDD), and on one challenging real-world driving forecasting dataset, nuScenes. The results show that our method outperforms multiple state-of-the-art approaches in long-term trajectory prediction in terms of accuracy.