Abstract
Humans recognize object structure from both appearance and motion; often, motion helps to resolve ambiguities in object structure that arise when we observe object appearance only. There are particular scenarios, however, where neither appearance nor spatial-temporal motion signals are informative: occluding twigs may look connected and have almost identical movements, though they belong to different, possibly disconnected branches. We propose to tackle this problem through spectrum analysis of motion signals, because vibrations of disconnected branches, though visually similar, often have distinctive natural frequencies. We propose a novel formulation of tree structure based on a physics-based link model, and validate its effectiveness by theoretical analysis, numerical simulation, and empirical experiments. With this formulation, we use nonparametric Bayesian inference to reconstruct tree structure from both spectral vibration signals and appearance cues. Our model performs well in recognizing hierarchical tree structure from real-world videos of trees and vessels.