
PhD Thesis Defense

March 27 @ 2:00 PM - 5:00 PM IST

Name of the Candidate: Lalit Manam
Research Supervisor: Venu Madhav Govindu
Date and Time: March 27, 2025, Thursday, 2:00 PM
Venue: C-241, First Floor, Multimedia Classroom (MMCR), EE
Title: Global Methods for Camera Motion Estimation
Abstract:
In computer vision, Structure-from-Motion (SfM) aims to recover a 3D reconstruction of a scene from a collection of images of the scene. SfM has been of interest to the 3D computer vision community for the last few decades, with a wide range of scientific, industrial and domestic applications. A key component of global approaches to SfM is estimating the motion of individual cameras, i.e. their rotation and translation with respect to a frame of reference. The camera motions are generally not available a priori, making their estimation a crucial component. This thesis focuses on accurate, reliable and efficient estimation of camera motions in SfM. We examine a number of issues concerning the estimation of camera motions, namely input quality, choice of cost function and the underlying graph representation, and develop methods to address these issues.
Input Quality: We examine the problem of translation averaging, where all camera translations are simultaneously estimated given relative directions between them as input. The accuracy of relative camera directions is limited by multiple factors relating to how they are obtained. We take recourse to the keypoint correspondences between image pairs from which the relative motions between cameras are estimated. We propose a modular framework that iteratively reweights keypoint correspondences based on their global consistency, as measured via translation averaging. Our framework improves the relative translation directions, irrespective of the translation averaging scheme used.
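The iterative reweighting idea can be caricatured with a toy robust-averaging loop: repeatedly estimate a consensus direction, measure each correspondence's disagreement with it, and downweight the inconsistent ones. This is only a sketch under simplified assumptions (a single 2D direction, Huber-style weights); the function names and parameters are illustrative stand-ins, not the thesis's actual framework.

```python
import numpy as np

def reweight_correspondences(residuals, sigma=0.1):
    # Huber-style weights: correspondences whose residuals exceed
    # sigma are downweighted in proportion to their inconsistency.
    w = np.ones_like(residuals)
    large = np.abs(residuals) > sigma
    w[large] = sigma / np.abs(residuals[large])
    return w

# Toy data: noisy observations of one 2D direction, plus gross outliers
# standing in for mismatched keypoint correspondences.
rng = np.random.default_rng(0)
true_dir = np.array([1.0, 0.0])
obs = true_dir + rng.normal(0.0, 0.05, size=(50, 2))
obs[:5] += 1.0  # simulated outliers

# Alternate between weighted averaging and consistency-based reweighting.
w = np.ones(len(obs))
for _ in range(10):
    est = (w[:, None] * obs).sum(axis=0) / w.sum()
    residuals = np.linalg.norm(obs - est, axis=1)
    w = reweight_correspondences(residuals)
est = est / np.linalg.norm(est)  # converges near the true direction
```

The loop recovers the true direction because the outliers' weights shrink toward zero while consistent observations keep unit weight.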
Choice of Cost Function: In translation averaging, recovering camera translations from relative directions involves two types of optimization costs, comparing either directions or displacements. These cost functions are often relaxed to obtain simpler optimization problems. We observe that neither cost performs best under all distributions of the underlying camera translations and of the noise in the input directions. We propose a principled approach that recursively fuses the estimates obtained from the two relaxed costs using an uncertainty model. Our method improves camera translation estimates compared to either cost alone.
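Inverse-variance fusion is the standard way to combine two estimates of the same quantity when each comes with an uncertainty; the sketch below illustrates only this general principle, not the specific recursive uncertainty model proposed in the thesis.

```python
import numpy as np

def fuse(x1, var1, x2, var2):
    # Inverse-variance fusion: weight each estimate by 1/variance.
    # The fused variance is smaller than either input variance.
    w1, w2 = 1.0 / var1, 1.0 / var2
    x = (w1 * x1 + w2 * x2) / (w1 + w2)
    var = 1.0 / (w1 + w2)
    return x, var

# Two hypothetical translation estimates of the same camera coordinate,
# one from each relaxed cost, with their (assumed known) variances.
x, var = fuse(1.0, 0.04, 1.2, 0.16)
# The fused estimate lies between the two, closer to the more
# certain (lower-variance) one, and is itself more certain than both.
```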
Graph Representation: We leverage the underlying graph representation that arises from relationships between cameras in two applications to improve camera motion estimates. In the first application, we introduce the idea of sensitivity in translation averaging, which analyzes the change in camera translations under small perturbations of the input directions. We develop two formulations to theoretically analyze the sensitivity/conditioning of the problem based solely on the inputs. We propose efficient algorithms to remove ill-conditioned configurations of inputs, which are abundant in real data. Removing ill-conditioned inputs significantly improves the translation estimates, demonstrating the benefits of our analysis.
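The notion of conditioning can be illustrated with a toy measurement matrix: when direction constraints are nearly parallel, the smallest singular value becomes tiny, so small input perturbations cause large changes in the solution. The `condition_score` helper below is an illustrative stand-in, not the sensitivity formulation developed in the thesis.

```python
import numpy as np

def condition_score(A):
    # Ratio of largest to smallest nonzero singular value of a
    # toy constraint matrix; large values flag ill-conditioning.
    s = np.linalg.svd(A, compute_uv=False)
    s = s[s > 1e-12]
    return s[0] / s[-1]

# Two toy configurations of 2D direction constraints:
well = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])   # well spread
ill = np.array([[1.0, 0.0], [1.0, 1e-3], [1.0, -1e-3]])  # nearly parallel
# condition_score(ill) is orders of magnitude larger than
# condition_score(well), so the ill configuration would be flagged.
```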
The second application leverages the graph representation to solve the tasks of graph sparsification and disambiguation of repeated structures in a unified manner. We present a scoring mechanism that identifies redundant and false edges and removes them with a threshold that is optimal under an edge-selection cost. We design efficient algorithms that can be applied as a preprocessing step to any SfM pipeline. Since our approach handles both tasks in a unified manner, it is practical to use. Applying our methods reduces reconstruction time through graph sparsification and, by disambiguating repeated structures, avoids superimposed reconstructions.
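As a toy proxy for edge scoring, one can count the triangles supporting each view-graph edge: an edge with no cyclic support is a candidate for removal. The scoring and threshold below are purely illustrative; the thesis's actual scoring mechanism and optimal threshold differ.

```python
from collections import defaultdict

def triangle_support(edges):
    # Score each edge by the number of triangles it participates in,
    # i.e. the count of common neighbors of its two endpoints: a crude
    # proxy for redundancy/consistency in the view graph.
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {(u, v): len(adj[u] & adj[v]) for u, v in edges}

def sparsify(edges, min_support=1):
    # Keep only edges with at least min_support supporting triangles.
    scores = triangle_support(edges)
    return [e for e in edges if scores[e] >= min_support]

# Toy view graph: a-b-c form a triangle; c-d is an unsupported edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
# sparsify(edges) drops ("c", "d"), which has no triangle support.
```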

Details

Date: Thursday, March 27, 2025
Time: 2:00 PM - 5:00 PM IST
Venue: Multimedia Classroom (MMCR), EE Department (hybrid mode)