Abstract
In this paper, we propose a new architecture, the Rotation-invariant Mixed Graphical Model Network (R-MGMN), for 2D hand pose estimation from a monocular RGB image. By integrating a rotation net, the R-MGMN is invariant to rotations of the hand in the image. It also maintains a pool of graphical models, from which a combination is selected conditioned on the input image. Belief propagation is performed on each graphical model separately, generating a set of marginal distributions, which are taken as the confidence maps of hand keypoint positions. The final confidence maps are obtained by aggregating these confidence maps. We evaluate the R-MGMN on two public hand pose datasets. Experimental results show that our model outperforms the state-of-the-art algorithm widely used in 2D hand pose estimation by a noticeable margin.