Deep Reinforcement Learning (Deep RL) has been in the spotlight for the past few years, due to its remarkable ability to solve problems that were considered practically unsolvable using traditional Machine Learning methods. However, even state-of-the-art Deep RL algorithms have various weaknesses that prevent them from being used extensively in industry applications, a major one being their sample inefficiency. In an effort to address these issues, we integrate a meta-learning technique that shifts the objective from learning to solve a task to learning how to learn to solve a task (or a set of tasks), which we empirically show improves the overall stability and performance of Deep RL algorithms. Our model, named REIN-2, is a meta-learning scheme formulated within the RL framework, the goal of which is to develop a meta-RL agent (meta-learner) that learns how to produce other RL agents (inner-learners) capable of solving given environments. To this end, we convert the typical interaction of an RL agent with its environment into a new, single environment for the meta-learner to interact with. Compared to traditional state-of-the-art Deep RL algorithms, experimental results show remarkable performance of our model in popular OpenAI Gym environments in terms of both score and sample efficiency, including the hard-exploration Mountain Car environment.
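
As an illustration only, the idea of folding an inner-learner's interaction with its task into a single environment for a meta-learner might be sketched as below. This is a minimal toy under stated assumptions, not the REIN-2 implementation: the chain task, the tabular Q-learning inner-learner, the choice of an exploration rate as the meta-action, and the use of the inner episode return as both meta-observation and meta-reward are all hypothetical choices made for this sketch, as are the names `ChainEnv`, `QLearner`, and `MetaEnv`.

```python
import random


class ChainEnv:
    """Hypothetical toy task: walk from state 0 to state n-1, reward -1 per step."""

    def __init__(self, n=6, max_steps=50):
        self.n, self.max_steps = n, max_steps

    def reset(self):
        self.s, self.t = 0, 0
        return self.s

    def step(self, a):  # a: 0 = left, 1 = right
        self.s = max(0, min(self.n - 1, self.s + (1 if a == 1 else -1)))
        self.t += 1
        done = self.s == self.n - 1 or self.t >= self.max_steps
        return self.s, -1.0, done


class QLearner:
    """Tabular Q-learning inner-learner (an assumption; any RL agent could sit here)."""

    def __init__(self, n_states, n_actions=2, alpha=0.5, gamma=0.99):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma = alpha, gamma

    def act(self, s, eps, rng):
        if rng.random() < eps:  # explore
            return rng.randrange(len(self.q[s]))
        row = self.q[s]  # exploit (ties resolve to the lowest index)
        return max(range(len(row)), key=row.__getitem__)

    def update(self, s, a, r, s2, done):
        target = r + (0.0 if done else self.gamma * max(self.q[s2]))
        self.q[s][a] += self.alpha * (target - self.q[s][a])


class MetaEnv:
    """One meta-step = one full training episode of the inner-learner on the task."""

    EPSILONS = (0.0, 0.1, 0.5)  # assumed meta-action set: exploration rates

    def __init__(self, horizon=20, seed=0):
        self.horizon, self.rng = horizon, random.Random(seed)

    def reset(self):
        self.task = ChainEnv()
        self.learner = QLearner(self.task.n)
        self.k = 0
        return 0.0  # meta-observation: last inner return (none yet)

    def step(self, meta_action):
        eps = self.EPSILONS[meta_action]
        s, done, ret = self.task.reset(), False, 0.0
        while not done:  # the usual agent-environment loop, hidden inside step()
            a = self.learner.act(s, eps, self.rng)
            s2, r, done = self.task.step(a)
            self.learner.update(s, a, r, s2, done)
            s, ret = s2, ret + r
        self.k += 1
        # Meta-observation and meta-reward are both the inner episode return.
        return ret, ret, self.k >= self.horizon
```

A meta-learner would then treat `MetaEnv` like any other environment, calling `reset()` once and `step(meta_action)` repeatedly, while the inner-learner's entire training episode happens inside each meta-step.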