Reinforcement learning (RL) is increasingly adopted in job shop schedulingproblems (JSSP). But RL for JSSP is usually done using a vectorizedrepresentation of machine features as the state space. It has three majorproblems: (1) the relationship between the machine units and the job sequenceis not fully captured, (2) exponential increase in the size of the state spacewith increasing machines/jobs, and (3) the generalization of the agent tounseen scenarios. We present a novel framework - GraSP-RL, GRAph neuralnetwork-based Scheduler for Production planning problems using ReinforcementLearning. It represents JSSP as a graph and trains the RL agent using featuresextracted using a graph neural network (GNN). While the graph is itself in thenon-euclidean space, the features extracted using the GNNs provide a richencoding of the current production state in the euclidean space, which is thenused by the RL agent to select the next job. Further, we cast the schedulingproblem as a decentralized optimization problem in which the learning agent isassigned to all the production units and the agent learns asynchronously fromthe data collected on all the production units. The GraSP-RL is then applied toa complex injection molding production environment with 30 jobs and 4 machines.The task is to minimize the makespan of the production plan. The scheduleplanned by GraSP-RL is then compared and analyzed with a priority dispatch rulealgorithm like first-in-first-out (FIFO) and metaheuristics like tabu search(TS) and genetic algorithm (GA). The proposed GraSP-RL outperforms the FIFO,TS, and GA for the trained task of planning 30 jobs in JSSP. We further testthe generalization capability of the trained agent on two different problemclasses: Open shop system (OSS) and Reactive JSSP (RJSSP) where our methodproduces results better than FIFO and comparable results to TS and GA.