We consider a Bayesian approach to offline model-based inverse reinforcement learning (IRL). The proposed framework differs from existing offline model-based IRL approaches by simultaneously estimating the expert's reward function and the expert's subjective model of the environment dynamics. Using a class of prior distributions that parameterizes how accurate the expert's model of the environment is, we develop efficient algorithms to estimate the expert's reward and subjective dynamics in high-dimensional settings. Our analysis reveals a novel insight: the estimated policy exhibits robust performance when the expert is believed (a priori) to have a highly accurate model of the environment. We verify this observation in the MuJoCo environments and show that our algorithms outperform state-of-the-art offline IRL algorithms.