This paper explores the use of extreme points in an object (left-most, right-most, top, and bottom pixels) as input to obtain precise object segmentation for images and videos. We do so by adding an extra channel to the image in the input of a convolutional neural network (CNN), which contains a Gaussian centered on each of the extreme points. The CNN learns to transform this information into a segmentation of an object that matches those extreme points. We demonstrate the usefulness of this approach for guided segmentation (GrabCut-style), interactive segmentation, video object segmentation, and dense segmentation annotation. We show that we obtain the most precise results to date, while requiring less user input, on an extensive and varied selection of benchmarks and datasets. All our models and code will be made publicly available.
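The input encoding described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the value of `sigma`, and the choice to merge the four Gaussians with a pixel-wise maximum are assumptions made for illustration only. The extreme points are taken here from a binary mask, which stands in for user clicks.

```python
import numpy as np


def extreme_points(mask):
    """Return (x, y) of the left-most, right-most, top and bottom mask pixels."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("mask is empty")
    # Ties are broken by taking the first pixel in row-major order.
    idx = [xs.argmin(), xs.argmax(), ys.argmin(), ys.argmax()]
    return [(int(xs[i]), int(ys[i])) for i in idx]


def gaussian_heatmap(shape, points, sigma=10.0):
    """Build an (H, W) map with one 2D Gaussian centered on each point.

    sigma and the max-merge are illustrative assumptions, not values
    taken from the paper.
    """
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    heat = np.zeros((h, w), dtype=np.float32)
    for px, py in points:
        g = np.exp(-((xx - px) ** 2 + (yy - py) ** 2) / (2.0 * sigma ** 2))
        heat = np.maximum(heat, g.astype(np.float32))
    return heat


def add_extreme_point_channel(image, points, sigma=10.0):
    """Concatenate the heatmap to an (H, W, 3) image, giving (H, W, 4)."""
    heat = gaussian_heatmap(image.shape[:2], points, sigma)
    return np.concatenate([image.astype(np.float32), heat[..., None]], axis=-1)


# Toy example: a rectangular object on a blank 64x64 RGB image.
mask = np.zeros((64, 64), dtype=bool)
mask[20:40, 10:50] = True
pts = extreme_points(mask)
net_input = add_extreme_point_channel(np.zeros((64, 64, 3)), pts)
print(net_input.shape)  # (64, 64, 4)
```

The resulting four-channel array is what a segmentation CNN would receive in place of the plain RGB image. Any cropping, resizing, or normalization applied before the network is outside the scope of this sketch.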