Abstract
Dynamic texture (DT) segmentation, and video processing in general, is currently dominated by methods based on deep neural networks that require the deployment of a large number of layers. Although this parametric approach has shown superior performance for dynamic texture segmentation, all current deep learning methods suffer from a significant weakness: the lack of sufficient reference annotations to train the models and make them functional. This study explores an unsupervised segmentation approach that can be used in the absence of training data to segment new videos. We present an effective unsupervised learning consensus model (ULCM) for the segmentation of dynamic textures. This model is designed to merge different segmentation maps, each containing multiple regions of weak quality, in order to achieve a more accurate final segmentation result. The diverse labeling fields required for the combination process are obtained by a simplified grouping scheme applied to the input video on the basis of three orthogonal planes: xy, yt and xt. In the proposed model, the set of values of the requantized local binary pattern (LBP) histogram around the pixel to be classified is used as the feature vector, which represents both the spatial and temporal information replicated in the video. Experiments conducted on the challenging SynthDB dataset show that, in contrast to current dynamic texture segmentation approaches that require either parameter estimation or a training step, ULCM is significantly faster, simpler, easier to implement and has few parameters. Further qualitative experiments on the YUP++ dataset demonstrate the efficiency and competitiveness of ULCM.
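As an illustration of the feature described above, the following minimal sketch computes a requantized LBP histogram around one pixel on each of the three orthogonal planes and concatenates the results. It is not the authors' implementation: the basic 8-neighbour LBP operator, the uniform requantization of the 256 codes into `n_bins` bins, the window half-size `half`, and the function names are all assumptions made for illustration. The grouping and consensus (fusion) steps are not shown.

```python
import numpy as np

def lbp_codes(plane):
    """Basic 8-neighbour LBP codes for the interior pixels of a 2D array."""
    h, w = plane.shape
    center = plane[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = plane[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit where the neighbour is at least as bright as the centre.
        codes += (neigh >= center).astype(np.int64) << bit
    return codes

def requantized_hist(codes, n_bins=16):
    """Normalized histogram of LBP codes after requantization.

    Assumption: the 256 possible codes are mapped uniformly onto n_bins bins.
    """
    q = codes * n_bins // 256
    hist = np.bincount(q.ravel(), minlength=n_bins).astype(float)
    return hist / hist.sum()

def pixel_feature(video, t, y, x, half=3, n_bins=16):
    """Concatenated xy, xt and yt histograms around pixel (t, y, x).

    video has shape (T, H, W); the pixel must lie at least half + 1
    samples away from every border of the volume.
    """
    r = half + 1  # one extra sample so LBP is defined at the window edge
    xy = video[t, y - r:y + r + 1, x - r:x + r + 1]
    xt = video[t - r:t + r + 1, y, x - r:x + r + 1]
    yt = video[t - r:t + r + 1, y - r:y + r + 1, x]
    return np.concatenate(
        [requantized_hist(lbp_codes(p), n_bins) for p in (xy, xt, yt)]
    )

rng = np.random.default_rng(0)
video = rng.random((16, 16, 16))        # synthetic stand-in for a real clip
feature = pixel_feature(video, 8, 8, 8)  # length 3 * n_bins
```

Under these assumptions, each pixel is described by a vector of length `3 * n_bins`, which could then be passed to any simple clustering procedure to produce one labeling field per configuration.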