Abstract
Robots are increasingly integrated across industries, particularly in healthcare. However, many valuable applications for quadrupedal robots remain overlooked. This research explores the effectiveness of three reinforcement learning algorithms in training a simulated quadruped robot for autonomous navigation and obstacle avoidance. The goal is to develop a robotic guide dog simulation capable of path following and obstacle avoidance, with long-term potential for real-world assistance to guide dogs and visually impaired individuals. It also seeks to expand research into medical 'pets', including robotic guide and alert dogs. A comparative analysis of thirteen related research papers shaped key evaluation criteria, including collision detection, pathfinding algorithms, sensor usage, robot type, and simulation platforms. The study focuses on sensor inputs, collision frequency, reward signals, and learning progression to determine which algorithm best supports robotic navigation in complex environments. Custom-made environments were used to ensure fair evaluation of all three algorithms under controlled conditions, allowing consistent data collection. Results show that Proximal Policy Optimization (PPO) outperformed Deep Q-Network (DQN) and Q-learning across all metrics, particularly in average and median steps to goal per episode. By analysing these results, this study contributes to robotic navigation, AI and medical robotics, offering insights into the feasibility of AI-driven quadruped mobility and its role in assistive robotics.