Research on Autonomous Robots Navigation based on Reinforcement Learning

Abstract

Reinforcement learning optimizes decision-making from real-time reward feedback through continuous interaction with the environment, demonstrating strong adaptive and self-learning capabilities. In recent years, it has become one of the key methods for achieving autonomous robot navigation. In this work, an autonomous robot navigation method based on reinforcement learning is introduced. We use Deep Q-Network (DQN) and Proximal Policy Optimization (PPO) models to optimize path planning and decision-making through continuous interaction between the robot and the environment, guided by real-time reward signals. By combining the Q-value function with a deep neural network, DQN can handle high-dimensional state spaces and thereby perform path planning in complex environments. PPO is a policy-gradient method that enables robots to explore and exploit environmental information more efficiently by optimizing the policy function. These methods not only improve the robot's navigation ability in unknown environments, but also enhance its adaptive and self-learning capabilities. Through repeated training and simulation experiments, we have verified the effectiveness and robustness of these models in various complex scenarios.
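
As a rough illustration of the two learning signals named in the abstract, the sketch below computes the standard one-step DQN target and the standard PPO clipped surrogate for a single sample. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the discount factor `gamma`, and the clipping range `clip_eps` are hypothetical choices made for illustration, and the abstract does not specify the authors' network architecture, reward design, or hyperparameters.

```python
def dqn_td_target(reward, next_q_values, done, gamma=0.99):
    """One-step TD target used to train a Q-network.

    target = r                                  if the episode ended
    target = r + gamma * max_a' Q(s', a')       otherwise
    `gamma` is an assumed discount factor, not a value from the paper.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def ppo_clipped_objective(ratio, advantage, clip_eps=0.2):
    """PPO clipped surrogate for one sample.

    ratio     = pi_new(a|s) / pi_old(a|s)
    objective = min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)
    `clip_eps` is an assumed clipping range, not a value from the paper.
    """
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Hypothetical numbers: reward 1.0, next-state Q-values for two actions.
    print(dqn_td_target(1.0, [0.5, 2.0], done=False, gamma=0.9))
    # Policy ratio 1.5 with a positive advantage is clipped at 1 + clip_eps.
    print(ppo_clipped_objective(1.5, 2.0))
```

The clipping keeps each policy update close to the previous policy, which is the usual reason PPO is chosen for stable training; the TD target is what lets a Q-network bootstrap value estimates from its own predictions.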
