Recent research efforts have enabled the study of natural language grounded navigation in photo-realistic environments, e.g., following natural language instructions or dialog. However, existing methods tend to overfit the training data in seen environments and fail to generalize well to previously unseen environments. To close the gap between seen and unseen environments, we aim to learn a generalized navigation model from two novel perspectives: (1) we introduce a multitask navigation model that can be seamlessly trained on both Vision-Language Navigation (VLN) and Navigation from Dialog History (NDH) tasks, which benefits from richer natural language guidance and effectively transfers knowledge across tasks; (2) we propose to learn environment-agnostic representations for the navigation policy that are invariant among the environments seen during training, thus generalizing better to unseen environments. Extensive experiments show that training with the environment-agnostic multitask learning objective significantly reduces the performance gap between seen and unseen environments, and that the navigation agent so trained outperforms the baselines on unseen environments by 16% (relative measure on success rate) on VLN and 120% (goal progress) on NDH. Our submission to the CVDN leaderboard establishes a new state of the art for the NDH task, outperforming the existing best model by more than 66% (goal progress) on the holdout test set. The code for training the navigation model using environment-agnostic multitask learning is available at https://github.com/google-research/valan.