video stream images; pose estimation; minimum spanning tree
The extrinsic camera parameters of video stream images can be accurately estimated by tracking features through the image sequence and using these features to compute parameter estimates. The poses for long video sequences have been estimated in this manner. However, the poses of large sets of still images cannot be estimated using the same strategy because wide-baseline correspondences are not as robust as narrow-baseline feature tracks. Moreover, video pose estimation requires a linear or hierarchically-linear ordering on the images to be calibrated, which restricts image matching to neighboring video frames. We propose a novel generalization of the linear ordering requirement of video pose estimation: we compute the Minimum Spanning Tree of the camera adjacency graph and use the tree hierarchy to determine the calibration order for a set of input images. We validate the pose accuracy using an error metric that is functionally independent of the estimation process. Because we do not rely on feature tracking to generate feature correspondences, our method can use internally calibrated wide- or narrow-baseline images as input, and can estimate the camera poses from multiple video streams without special pre-processing to concatenate the streams.
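
The idea of deriving a calibration order from a spanning tree can be illustrated with a minimal sketch. It is not the implementation evaluated here: the function name `calibration_order`, the choice of Prim's algorithm, the per-edge `cost` (for example, something that decreases as the number of reliable correspondences between two images grows), and the choice of root camera are all assumptions made for illustration, and the graph is assumed to be connected.

```python
import heapq
from collections import defaultdict


def calibration_order(edges, root):
    """Build a minimum spanning tree of a camera adjacency graph (Prim's
    algorithm) and return the order in which cameras join the tree.

    edges: iterable of (cam_a, cam_b, cost) for an undirected graph.
           The cost definition is hypothetical; lower means the image
           pair is assumed to be easier to match reliably.
    root:  camera chosen to define the reference frame (an assumption).

    Returns (mst_edges, order), where order is a list of (camera, parent)
    pairs and every camera appears after its parent.
    """
    adj = defaultdict(list)
    for a, b, cost in edges:
        adj[a].append((cost, a, b))
        adj[b].append((cost, b, a))

    visited = {root}
    order = [(root, None)]  # the root has no parent
    mst_edges = []

    # Candidate edges leaving the current tree, cheapest first.
    heap = list(adj[root])
    heapq.heapify(heap)

    while heap:
        cost, parent, child = heapq.heappop(heap)
        if child in visited:
            continue  # this edge would close a cycle
        visited.add(child)
        mst_edges.append((parent, child, cost))
        order.append((child, parent))
        for edge in adj[child]:
            if edge[2] not in visited:
                heapq.heappush(heap, edge)

    return mst_edges, order


if __name__ == "__main__":
    # Toy graph of four cameras with made-up costs.
    toy_edges = [
        ("A", "B", 1.0),
        ("B", "C", 2.0),
        ("A", "C", 5.0),
        ("C", "D", 1.5),
        ("B", "D", 4.0),
    ]
    tree, order = calibration_order(toy_edges, "A")
    for cam, parent in order:
        print(cam, "calibrated relative to", parent)
```

Because each camera enters the tree through an edge to a camera that is already in it, the returned order guarantees that a pose is only estimated relative to a previously calibrated neighbor. A video stream is the special case in which the tree degenerates into a chain.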