Abstract
Multi-agent hierarchical reinforcement learning (MAHRL) has been studied as an effective means to solve intelligent decision problems in complex and large-scale environments. However, most current MAHRL algorithms follow the traditional way of using reward functions in reinforcement learning, which limits their use to a single task. This study aims to design a multi-agent cooperative algorithm with logic reward shaping (LRS), which sets rewards in a more flexible way and allows multiple tasks to be completed effectively. LRS uses Linear Temporal Logic (LTL) to express the internal logical relations among the subtasks of a complex task. It then evaluates whether the subformulae of the LTL expressions are satisfied, based on a designed reward structure. This helps agents learn to complete tasks effectively by adhering to the LTL expressions, thus enhancing the interpretability and credibility of their decisions. To enhance coordination and cooperation among multiple agents, a value iteration technique is designed to evaluate the actions taken by each agent. Based on this evaluation, a reward function is shaped for coordination, which enables each agent to evaluate its status and complete the remaining subtasks through experiential learning. Experiments have been conducted on various types of tasks in a Minecraft-like environment. The results demonstrate that the proposed algorithm can improve the performance of multiple agents when learning to complete multiple tasks.
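The idea of rewarding an agent according to which LTL subformulae have been satisfied can be illustrated with a minimal sketch. The sketch assumes a tiny LTL fragment (atoms, `and`, `or`, `eventually`) encoded as nested tuples, and uses standard formula progression to track what remains of the task after each step. The encoding, the function names `simplify`, `progress`, and `shaped_reward`, and the reward values are all hypothetical choices made for illustration; they are not the reward structure defined in this paper.

```python
# Illustrative sketch only: encoding, names, and reward values are assumptions.
# Formulas: True/False, an atom (str), ("and", a, b), ("or", a, b), ("eventually", a).

def simplify(op, a, b):
    """Build (op, a, b), collapsing trivial cases involving True/False."""
    if op == "and":
        if a is False or b is False:
            return False
        if a is True:
            return b
        if b is True:
            return a
    else:  # op == "or"
        if a is True or b is True:
            return True
        if a is False:
            return b
        if b is False:
            return a
    return a if a == b else (op, a, b)

def progress(formula, labels):
    """Return the formula that must hold from the next step on,
    given the set of atomic propositions true in the current step."""
    if isinstance(formula, bool):
        return formula
    if isinstance(formula, str):          # atomic proposition
        return formula in labels
    op = formula[0]
    if op == "eventually":                # F a  ==  a or X(F a)
        return simplify("or", progress(formula[1], labels), formula)
    left = progress(formula[1], labels)
    right = progress(formula[2], labels)
    return simplify(op, left, right)

def shaped_reward(formula, labels):
    """Hypothetical reward: +1 when the task is satisfied, -1 when it is
    violated, a small bonus when a subformula is discharged, and a small
    step cost otherwise. Returns (reward, remaining_formula)."""
    nxt = progress(formula, labels)
    if nxt is True:
        return 1.0, nxt
    if nxt is False:
        return -1.0, nxt
    if nxt != formula:
        return 0.1, nxt
    return -0.01, nxt

# Usage: "eventually get wood, and after that eventually reach the workbench".
task = ("eventually", ("and", "wood", ("eventually", "workbench")))
remaining = task
for labels in [set(), {"wood"}, {"workbench"}]:
    reward, remaining = shaped_reward(remaining, labels)
    print(sorted(labels), reward, remaining)
```

In this sketch the remaining formula acts as a compact record of which subtasks are still open, which is one plausible way for an agent to evaluate its own status; how the proposed algorithm combines such signals across agents is described in the body of the paper.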