We present a new neural representation, called Neural Ray (NeuRay), for the novel view synthesis (NVS) task with multi-view images as input. Existing neural scene representations for solving the NVS problem, such as NeRF, cannot generalize to new scenes and take an excessively long time to train on each new scene from scratch. Subsequent neural rendering methods based on stereo matching, such as PixelNeRF, SRF and IBRNet, are designed to generalize to unseen scenes but suffer from view inconsistency in complex scenes with self-occlusions. To address these issues, our NeuRay method represents every scene by encoding the visibility of rays associated with the input views. This neural representation can be efficiently initialized from depths estimated by external MVS methods, so it generalizes to new scenes and produces satisfactory rendered images without any training on the scene. The initialized NeuRay can then be further optimized on each scene with little training time to enforce spatial coherence, ensuring view consistency in the presence of severe self-occlusion. Experiments demonstrate that NeuRay can quickly generate high-quality novel view images of unseen scenes with little finetuning and can handle complex scenes with severe self-occlusions, which previous methods struggle with.