Fusing Directions and Displacements in Translation Averaging

Lalit Manam, Venu Madhav Govindu
[Paper] [Supp] [Poster] [Code] (Updated Mar 15, 2024)

Sensitive Triangles

Translation averaging solves for 3D camera translations given many pairwise relative translation directions. The mismatch between the inputs (directions) and the output estimates (absolute translations) makes translation averaging a challenging problem, which is often addressed by comparing either directions or displacements using relaxed cost functions that are relatively easy to optimize. However, the distinctly different nature of the two cost functions leads to varied behaviour under different baselines and noise conditions. In this paper, we argue that translation averaging can benefit from a fusion of the two approaches. Specifically, we recursively fuse the individual updates suggested by the direction-based and displacement-based methods using their uncertainties. The uncertainty of each estimate is modelled by the inverse of the Hessian of the corresponding optimization problem. As a result, our method utilizes the advantages of both methods in a principled manner. The superiority of our translation averaging scheme is demonstrated by the improved accuracy of camera translations on benchmark datasets compared to state-of-the-art methods.
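To make the idea of uncertainty-weighted fusion concrete, the following is a minimal sketch of the standard information-form fusion of two Gaussian estimates, where each estimate's covariance is taken to be the inverse of its Hessian. It is a generic illustration under that assumption, not the authors' exact recursive update. The function names `solve` and `fuse` and all numeric values are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fuse(x_a, H_a, x_b, H_b):
    """Fuse two estimates whose covariances are H_a^{-1} and H_b^{-1}.

    Returns the fused estimate (H_a + H_b)^{-1} (H_a x_a + H_b x_b)
    and the fused Hessian H_a + H_b.
    """
    n = len(x_a)
    H = [[H_a[r][c] + H_b[r][c] for c in range(n)] for r in range(n)]
    rhs = [sum(H_a[r][c] * x_a[c] + H_b[r][c] * x_b[c] for c in range(n))
           for r in range(n)]
    return solve(H, rhs), H

# Made-up example: estimate A is confident along x, estimate B along z.
x_a, H_a = [1.0, 0.0, 0.0], [[4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x_b, H_b = [0.0, 0.0, 1.0], [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 4.0]]
x_fused, H_fused = fuse(x_a, H_a, x_b, H_b)
print(x_fused)
```

Each component of the fused estimate is pulled towards whichever input has the larger Hessian (smaller uncertainty) in that direction. In a recursive scheme, the fused Hessian could be carried into the next fusion step, though how the paper does this is not specified here.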

Notations

  • Underlying viewgraph: \( \mathcal{G}=\left(\mathcal{V},\mathcal{E}\right) \).
  • Global rotation and translation: \(\mathbf{R}_{i} \in \mathbb{SO}(3), \mathbf{T}_{i} \in \mathbb{R}^{3}, \forall i \in \mathcal{V} \).
  • Relative rotation and translation direction: \( \mathbf{R}_{ij}= \mathbf{R}_j \mathbf{R}_i^{-1}, \mathbf{t}_{ij}= \frac{\mathbf{R}_j(\mathbf{T}_i-\mathbf{T}_j)}{\|\mathbf{R}_j(\mathbf{T}_i-\mathbf{T}_j)\|_2}, \forall \left(i,j\right) \in \mathcal{E} \).
  • Relative translation direction in global reference frame: \( \mathbf{v}_{ij}=-\mathbf{R}_j^{-1} \mathbf{t}_{ij}= \frac{\mathbf{T}_j-\mathbf{T}_i}{\|\mathbf{T}_j-\mathbf{T}_i\|_2}, \forall \left(i,j\right) \in \mathcal{E} \).
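The notation above can be checked numerically. The sketch below builds \( \mathbf{t}_{ij} \) from a rotation and two camera translations, maps it to the global frame, and compares the result with the normalized displacement \( \mathbf{T}_j - \mathbf{T}_i \). The helper names and the numeric values are illustrative assumptions, not part of the released code.

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def matvec(M, v):
    return [sum(M[r][c] * v[c] for c in range(3)) for r in range(3)]

def transpose(M):
    return [[M[c][r] for c in range(3)] for r in range(3)]

def rot_z(theta):
    """Rotation about the z-axis, used only to obtain a valid rotation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def relative_direction(R_j, T_i, T_j):
    """t_ij = R_j (T_i - T_j) / ||R_j (T_i - T_j)||."""
    diff = [a - b for a, b in zip(T_i, T_j)]
    return normalize(matvec(R_j, diff))

def global_direction(R_j, t_ij):
    """v_ij = -R_j^{-1} t_ij, using R^{-1} = R^T for rotations."""
    return [-x for x in matvec(transpose(R_j), t_ij)]

# Illustrative values (assumptions).
R_j = rot_z(0.7)
T_i = [1.0, 2.0, 0.5]
T_j = [4.0, -1.0, 2.5]

t_ij = relative_direction(R_j, T_i, T_j)
v_ij = global_direction(R_j, t_ij)
expected = normalize([b - a for a, b in zip(T_i, T_j)])
print(v_ij, expected)
```

The two printed vectors agree because a rotation preserves norms, so the normalization in \( \mathbf{t}_{ij} \) is unaffected by \( \mathbf{R}_j \).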

Cost Functions for Translation Averaging

  • Comparing displacement vectors: \begin{align*} \min_{\mathbf{T}_{i, i \in \mathcal{V}}} & \sum_{(i,j) \in \mathcal{E}} \rho \left( \| \mathbf{T}_j - \mathbf{T}_i - \|\mathbf{T}_j - \mathbf{T}_i\|_2 \mathbf{v}_{ij} \|_2 \right) \\ \text{s.t. } \sum_{i \in \mathcal{V}} \mathbf{T}_i =\mathbf{0}, \sum_{(i,j) \in \mathcal{E}} & \left\langle \mathbf{T}_j - \mathbf{T}_i, \mathbf{v}_{ij} \right\rangle =1 \end{align*}
  • Comparing direction vectors: \begin{align*} \min_{\mathbf{T}_{i, i \in \mathcal{V}}} & \sum_{(i,j) \in \mathcal{E}} \rho \left(\left\| \frac{\mathbf{T}_j - \mathbf{T}_i}{\| \mathbf{T}_j - \mathbf{T}_i \|_2} - \mathbf{v}_{ij} \right\|_2 \right) \\ \text{s.t. } \sum_{i \in \mathcal{V}} \mathbf{T}_i =\mathbf{0}, \sum_{(i,j) \in \mathcal{E}} & \left\langle \mathbf{T}_j - \mathbf{T}_i, \mathbf{v}_{ij} \right\rangle =1 \end{align*}
  • These cost functions are relaxed by introducing non-negative slack variables, \(\lambda_{ij}\) and \(\gamma_{ij}\), which ideally equal the baseline and the inverse baseline of the edge \((i,j)\), respectively.

  • Comparing relaxed displacement vectors: \begin{align*} \min_{\mathbf{T}_{i, i \in \mathcal{V}}, \lambda_{ij, (i,j) \in \mathcal{E}} } & \sum_{(i,j) \in \mathcal{E}} \rho \left( \| \mathbf{T}_j - \mathbf{T}_i - \lambda_{ij} \mathbf{v}_{ij} \|_2 \right) \\ \text{s.t. } \sum_{i \in \mathcal{V}} \mathbf{T}_i =\mathbf{0}, \sum_{(i,j) \in \mathcal{E}} & \left\langle \mathbf{T}_j - \mathbf{T}_i, \mathbf{v}_{ij} \right\rangle =1, \lambda_{ij} \ge 0, \text{ } \forall (i,j) \in \mathcal{E} \end{align*}
  • Comparing relaxed direction vectors: \begin{align*} \min_{\mathbf{T}_{i, i \in \mathcal{V}}, \gamma_{ij, (i,j) \in \mathcal{E}} } & \sum_{(i,j) \in \mathcal{E}} \rho \left( \| \left(\mathbf{T}_j - \mathbf{T}_i\right) \gamma_{ij} - \mathbf{v}_{ij} \|_2 \right) \\ \text{s.t. } \sum_{i \in \mathcal{V}} \mathbf{T}_i =\mathbf{0}, \sum_{(i,j) \in \mathcal{E}} & \left\langle \mathbf{T}_j - \mathbf{T}_i, \mathbf{v}_{ij} \right\rangle =1, \gamma_{ij} \ge 0, \text{ } \forall (i,j) \in \mathcal{E} \end{align*}
  • The behaviour of the relaxed cost functions varies with the noise and the spread of baselines, as seen in the top figure. In real data, the noise and baseline distributions differ across scenes, as seen in the figures below.
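For a single edge with the translations held fixed, the optimal slack variables of the two relaxed costs have simple closed forms, which makes the role of the baseline easy to see. The sketch below is an illustration under that assumption (one edge, fixed \( \mathbf{T}_i, \mathbf{T}_j \), unit-norm \( \mathbf{v}_{ij} \), no robust loss \( \rho \)). The function names are hypothetical and the 5 degree error is chosen only for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def best_lambda(d, v):
    """argmin over lambda >= 0 of ||d - lambda * v||, assuming ||v|| = 1."""
    return max(0.0, dot(d, v))

def best_gamma(d, v):
    """argmin over gamma >= 0 of ||gamma * d - v||, assuming d != 0."""
    return max(0.0, dot(d, v) / dot(d, d))

def displacement_residual(d, v):
    lam = best_lambda(d, v)
    return norm([di - lam * vi for di, vi in zip(d, v)])

def direction_residual(d, v):
    gam = best_gamma(d, v)
    return norm([gam * di - vi for di, vi in zip(d, v)])

# d = T_j - T_i deviates from the observed direction v by a fixed angle.
v = [1.0, 0.0, 0.0]
theta = math.radians(5.0)
for baseline in (0.5, 5.0):
    d = [baseline * math.cos(theta), baseline * math.sin(theta), 0.0]
    print(baseline, displacement_residual(d, v), direction_residual(d, v))
```

When \( \mathbf{T}_j - \mathbf{T}_i \) is exactly parallel to \( \mathbf{v}_{ij} \), `best_lambda` returns the baseline and `best_gamma` its inverse. For a fixed angular error \( \theta < 90^\circ \), the displacement residual equals the baseline times \( \sin\theta \), whereas the direction residual equals \( \sin\theta \) regardless of the baseline, so the two costs weight long and short edges differently.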

Figure 1. Histograms of normalized ground truth baselines. Panels: Montreal Notre Dame, Yorkminster.

Figure 2. Histograms of input direction errors (in degrees). Panels: Ellis Island, Madrid Metropolis.

Results

    Synthetic Data

    Figure 3. Histogram of mean errors for synthetic datasets on noisy data with different baselines. Panels: disparate baselines with input noise \( \sigma = 5\); similar baselines with input noise \( \sigma = 5\). The leftward shift indicates the superior performance of our method.

    Real Data

    Dataset               1DSfM   LUD     ShapeFit   BATA    Fused-TA (Ours)
    Alamo                 2.4     3.0     2.6        2.4     2.4
    Ellis Island          32.3    26.0    15.6       25.3    17.5
    Montreal Notre Dame   1.9     2.4     2.3        1.9     1.8
    NYC Library           3.1     3.0     3.6        2.5     2.3
    Piazza del Popolo     7.2     5.5     15.8       5.0     4.5
    Piccadilly            2.2     2.6     2.4        2.3     2.1
    Roman Forum           4.3     8.6     8.6        4.4     5.3
    Tower of London       8.0     10.7    8.6        6.9     6.2
    Trafalgar             11.7    7.6     8.3        6.9     6.5
    Union Square          9.2     8.5     14.3       8.1     7.9
    Vienna Cathedral      12.8    6.5     7.1        6.5     6.1

    Table 1. Median camera translation errors (in meters) on 1DSfM datasets.
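As a quick way to read Table 1, the short script below tallies the datasets on which Fused-TA attains the lowest (or tied-lowest) median error. The numbers are copied from the table; the variable names are illustrative only.

```python
# Columns: 1DSfM, LUD, ShapeFit, BATA, Fused-TA (median errors in meters).
errors = {
    "Alamo":               [2.4, 3.0, 2.6, 2.4, 2.4],
    "Ellis Island":        [32.3, 26.0, 15.6, 25.3, 17.5],
    "Montreal Notre Dame": [1.9, 2.4, 2.3, 1.9, 1.8],
    "NYC Library":         [3.1, 3.0, 3.6, 2.5, 2.3],
    "Piazza del Popolo":   [7.2, 5.5, 15.8, 5.0, 4.5],
    "Piccadilly":          [2.2, 2.6, 2.4, 2.3, 2.1],
    "Roman Forum":         [4.3, 8.6, 8.6, 4.4, 5.3],
    "Tower of London":     [8.0, 10.7, 8.6, 6.9, 6.2],
    "Trafalgar":           [11.7, 7.6, 8.3, 6.9, 6.5],
    "Union Square":        [9.2, 8.5, 14.3, 8.1, 7.9],
    "Vienna Cathedral":    [12.8, 6.5, 7.1, 6.5, 6.1],
}

# Fused-TA is the last column; ties with another method count as best.
best_or_tied = [name for name, row in errors.items() if row[-1] == min(row)]
not_best = [name for name in errors if name not in best_or_tied]
print(len(best_or_tied), "of", len(errors), "datasets; exceptions:", not_best)
```

Fused-TA is best or tied on 9 of the 11 datasets; the exceptions are Ellis Island (ShapeFit is lower) and Roman Forum (1DSfM and BATA are lower).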

Publication

  1. Fusing Directions and Displacements in Translation Averaging (Lalit Manam and Venu Madhav Govindu), International Conference on 3D Vision, 2024 (Oral Presentation). [bibtex]