In this paper, we propose a reinforcement learning-based algorithm for trajectory optimization for constrained dynamical systems. This problem is motivated by the fact that for most robotic systems, the dynamics may not always be known. Generating smooth, dynamically feasible trajectories can be difficult for such systems. Using sampling-based algorithms for motion planning may result in trajectories that are prone to undesirable control jumps. However, they can usually provide a good reference trajectory that a model-free reinforcement learning algorithm can then exploit by limiting the search domain and quickly finding a dynamically smooth trajectory. We use this idea to train a reinforcement learning agent to learn a dynamically smooth trajectory in a curriculum learning setting. Furthermore, for generalization, we parameterize the policies with goal locations, so that the agent can be trained for multiple goals simultaneously. To validate the proposed idea, we show results in both simulated environments and real experiments for a $6$-DoF manipulator arm operated in position-controlled mode. We compare the proposed approach against a PID controller that is used to track a designed trajectory in configuration space. Our experiments show that our RL agent trained with a reference path outperformed a model-free PID controller of the type commonly used on many robotic platforms for trajectory tracking.