Abstract
This paper addresses writer identification and writer retrieval, which are considered challenging problems in the field of document analysis and recognition. We propose a novel pipeline built on a unified neural network architecture that consists of ResNet-20 as a feature extractor, followed by an integrated NetVLAD layer, inspired by the vector of locally aggregated descriptors (VLAD), at the head of the network. Given this architecture, the triplet semi-hard loss function is used to directly learn an embedding for individual input image patches. Subsequently, generalized max-pooling is employed to aggregate the embedded descriptors of each handwritten image. In addition, a novel re-ranking strategy based on $k$-reciprocal nearest neighbors is introduced for identification and retrieval, and it is shown that the pipeline benefits substantially from this step. Experimental evaluation was carried out on three publicly available datasets: ICDAR 2013, CVL, and KHATT. The results indicate that, while our pipeline performs comparably to the state of the art on KHATT, it achieves superior performance on the ICDAR 2013 and CVL datasets in terms of mAP.
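To make the notion of $k$-reciprocal nearest neighbors concrete, the following minimal sketch computes, for each descriptor, the set of items that appear in its $k$-nearest-neighbor list and also list it among their own $k$ nearest neighbors. It illustrates only the neighbor relation, not the re-ranking strategy proposed in this paper. The function name `k_reciprocal_neighbors`, the use of a precomputed pairwise distance matrix, and the toy data are assumptions made for illustration.

```python
import numpy as np


def k_reciprocal_neighbors(dist, k):
    """For each item i, return the set of items j such that j is among the
    k nearest neighbors of i AND i is among the k nearest neighbors of j.

    `dist` is an (n, n) matrix of pairwise distances (hypothetical input;
    any distance measure could be used to build it).
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    if dist.shape != (n, n):
        raise ValueError("dist must be a square matrix")
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")

    # Work on a copy and exclude every item from its own neighbor list.
    d = dist.copy()
    np.fill_diagonal(d, np.inf)

    # Indices of the k nearest neighbors of each item.
    knn = np.argsort(d, axis=1)[:, :k]
    knn_sets = [set(row.tolist()) for row in knn]

    # Keep only the neighbors for which the relation holds in both directions.
    return [{j for j in knn_sets[i] if i in knn_sets[j]} for i in range(n)]


if __name__ == "__main__":
    # Toy 1-D "descriptors" (illustrative values only).
    points = np.array([0.0, 1.0, 2.5, 10.0])
    pairwise = np.abs(points[:, None] - points[None, :])
    print(k_reciprocal_neighbors(pairwise, k=1))
```

A re-ranking step would typically use such reciprocal sets to adjust the initial distances before the final ranking; how that adjustment is done is specific to each method and is not shown here.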