Federated learning (FL) is a privacy-preserving machine learning paradigm that enables collaborative training among geographically distributed and heterogeneous users without gathering their data. Extending FL beyond the conventional supervised learning paradigm, federated reinforcement learning (RL) was proposed to handle sequential decision-making problems in various privacy-sensitive applications such as autonomous driving. However, existing federated RL algorithms directly combine model-free RL with FL, and thus generally have high sample complexity and lack theoretical guarantees. To address these challenges, we propose a new federated RL algorithm that incorporates model-based RL and ensemble knowledge distillation into FL. Specifically, we utilise FL and knowledge distillation to create an ensemble of dynamics models from clients, and then train the policy solely on the ensemble model, without interacting with the real environment. Furthermore, we theoretically prove that the monotonic improvement of the proposed algorithm is guaranteed. Extensive experimental results demonstrate that our algorithm obtains significantly higher sample efficiency than federated model-free RL algorithms on challenging continuous control benchmark environments. The results also show the impact of non-IID client data and local update steps on the performance of federated RL, validating the insights obtained from our theoretical analysis.
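
To make the high-level idea concrete, the following is a minimal toy sketch and not the algorithm proposed here. It assumes a hypothetical 1-D linear environment, least-squares dynamics models fitted on each client's non-IID local data, and a simple grid search over a linear policy gain. The knowledge distillation step is omitted entirely; plain averaging of the client models' predictions stands in for the ensemble. All names (`collect_client_data`, `fit_dynamics`, `ensemble_step`, `imagined_return`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy environment: s' = 0.9*s + 0.5*a + noise,
# with reward r = -(s^2 + 0.1*a^2). Not taken from the source.
TRUE_A, TRUE_B = 0.9, 0.5


def collect_client_data(center, n=200):
    """Sample local transitions; each client sees states around its own center (non-IID)."""
    s = rng.normal(center, 1.0, size=n)
    a = rng.uniform(-1.0, 1.0, size=n)
    s_next = TRUE_A * s + TRUE_B * a + rng.normal(0.0, 0.01, size=n)
    return s, a, s_next


def fit_dynamics(s, a, s_next):
    """Fit a linear dynamics model s' ~ w[0]*s + w[1]*a by least squares."""
    X = np.stack([s, a], axis=1)
    w, *_ = np.linalg.lstsq(X, s_next, rcond=None)
    return w


def ensemble_step(models, s, a):
    """Predict the next state as the mean prediction of all client models."""
    preds = [w[0] * s + w[1] * a for w in models]
    return float(np.mean(preds))


def imagined_return(models, gain, s0=1.0, horizon=20):
    """Roll out the policy a = -gain*s inside the ensemble only (no real environment)."""
    s, total = s0, 0.0
    for _ in range(horizon):
        a = float(np.clip(-gain * s, -1.0, 1.0))
        total += -(s ** 2 + 0.1 * a ** 2)
        s = ensemble_step(models, s, a)
    return total


# Clients fit models on their own data; only the fitted models are shared.
client_models = [fit_dynamics(*collect_client_data(c)) for c in (-2.0, 0.0, 2.0)]

# "Policy training" here is a grid search over the gain using imagined rollouts.
gains = np.linspace(0.0, 2.0, 21)
returns = [imagined_return(client_models, g) for g in gains]
best_gain = float(gains[int(np.argmax(returns))])
print("selected gain:", best_gain)
```

In this sketch the raw transitions never leave the clients, and the policy is selected purely from rollouts inside the learned ensemble, which mirrors the two properties emphasised above. A realistic implementation would replace the linear models with neural networks and the grid search with a proper policy optimiser.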