DSDF: An approach to handle stochastic agents in collaborative multi-agent reinforcement learning

Abstract

Multi-agent reinforcement learning has received a lot of attention in recent years and has applications in many different areas. Existing methods involving centralized training and decentralized execution attempt to train the agents toward learning a pattern of coordinated actions to arrive at an optimal joint policy. However, if some agents are stochastic to varying degrees, these methods often fail to converge and provide poor coordination among agents. In this paper we show how this stochasticity of agents, which could be a result of malfunction or aging of robots, can add to the uncertainty in coordination and thereby contribute to unsatisfactory global coordination. In this case, the deterministic agents have to understand the behavior and limitations of the stochastic agents while arriving at an optimal joint policy. Our solution, DSDF, tunes the discount factor for the agents according to uncertainty and uses the values to update the utility networks of individual agents. DSDF also helps in imparting an extent of reliability in coordination, thereby granting stochastic agents tasks which are immediate and of shorter trajectory, with deterministic ones taking the tasks which involve longer planning. Such a method enables joint coordination of agents, some of which may be partially performing, and thereby can reduce or delay the investment of agent/robot replacement in many circumstances. Results on a benchmark environment for different scenarios show the efficacy of the proposed approach when compared with existing approaches.
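To make the central idea concrete, the following is a minimal sketch of how a per-agent discount factor could enter a temporal-difference target for an agent's utility. The abstract does not say how the discount factor is computed, so the linear mapping from a stochasticity level to a discount factor, the function names `discount_for_agent` and `td_target`, and the bounds `gamma_min` and `gamma_max` are all assumptions made purely for illustration, not the paper's method.

```python
def discount_for_agent(stochasticity, gamma_min=0.5, gamma_max=0.99):
    """Map a stochasticity level in [0, 1] to a discount factor.

    Hypothetical linear rule (an assumption, not taken from the paper):
    a fully deterministic agent (0.0) gets gamma_max and plans far ahead;
    a fully stochastic agent (1.0) gets gamma_min and favors immediate reward.
    """
    if not 0.0 <= stochasticity <= 1.0:
        raise ValueError("stochasticity must lie in [0, 1]")
    return gamma_max - stochasticity * (gamma_max - gamma_min)


def td_target(reward, next_utility, gamma, done=False):
    """One-step TD target for a single agent's utility estimate.

    The per-agent gamma replaces the usual shared discount factor.
    """
    bootstrap = 0.0 if done else gamma * next_utility
    return reward + bootstrap


# Illustrative values only: two agents with different stochasticity levels.
for level in (0.0, 0.8):
    gamma = discount_for_agent(level)
    target = td_target(reward=1.0, next_utility=10.0, gamma=gamma)
    print(f"stochasticity={level:.1f} gamma={gamma:.3f} target={target:.3f}")
```

Under this assumed rule, the more stochastic agent weights future utility less, which matches the abstract's description of stochastic agents receiving immediate, shorter-trajectory tasks while deterministic agents take on tasks that require longer planning.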
