Abstract
Scene text detection has made great progress in recent years. Detection approaches have evolved from axis-aligned rectangles to rotated rectangles and further to quadrangles. However, current datasets contain very little curve text, which is widely observed in scene images such as signboards and product names. To draw attention to the problem of reading curve text in the wild, we construct a curve text dataset named CTW1500, which includes over 10k text annotations in 1,500 images (1,000 for training and 500 for testing). Based on this dataset, we propose, for the first time, a polygon-based curve text detector (CTD) that can directly detect curve text without empirical combination. Moreover, by seamlessly integrating the recurrent transverse and longitudinal offset connection (TLOC), the proposed method can be trained end-to-end to learn the inherent connection among the position offsets. This allows the CTD to explore context information instead of predicting points independently, resulting in smoother and more accurate detection. We also propose two simple but effective post-processing methods, named non-polygon suppress (NPS) and polygonal non-maximum suppression (PNMS), to further improve detection accuracy. Furthermore, the proposed approach is designed in a universal manner and can also be trained with rectangular or quadrilateral bounding boxes without extra effort. Experimental results on CTW1500 demonstrate that our method, with only a light backbone, can outperform state-of-the-art methods by a large margin. When evaluated only on the curve or non-curve subset, CTD + TLOC still achieves the best results. Code is available at https://github.com/Yuliang-Liu/Curve-Text-Detector.