Image Correspondence

Correspondence

Relevant to structure from motion, pose estimation

- I.e., given an image pair, which parts of one image correspond to which parts of the other?

Task

Given two image patches (not images), do they correspond?

Task Solution

Build a discriminative descriptor

Network

The same CNN (shared weights W, i.e. a Siamese setup) is applied to both patches, mapping each one to a point in a common descriptor space.
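A minimal sketch of the weight sharing, assuming a single random linear layer with tanh as a stand-in for the CNN (the actual architecture is not given in these notes). The names `make_weights`, `describe`, and `l2_distance` are hypothetical.

```python
import math
import random

def make_weights(in_dim, out_dim, seed=0):
    """Hypothetical stand-in for the CNN parameters W: one random linear layer."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

def describe(patch, W):
    """Map a flattened patch to a descriptor using weights W and a tanh nonlinearity."""
    return [math.tanh(sum(w * x for w, x in zip(row, patch))) for row in W]

def l2_distance(d1, d2):
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))

# One set of weights, used for every patch (this is the "same W" in the notes).
W = make_weights(in_dim=16, out_dim=4)

patch_a = [0.1 * i for i in range(16)]
patch_b = [0.1 * i + 0.01 for i in range(16)]  # slightly perturbed copy of patch_a
patch_c = [1.0 - 0.1 * i for i in range(16)]   # a different patch

desc_a = describe(patch_a, W)
desc_b = describe(patch_b, W)
desc_c = describe(patch_c, W)

print(l2_distance(desc_a, desc_b), l2_distance(desc_a, desc_c))
```

Because both branches call `describe` with the same `W`, the two descriptors live in the same space and can be compared directly.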

Loss

p1 and p2 are the keypoint indices of the two patches: p1 = p2 when both patches show the same point on the 3D object, and they differ otherwise.
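A minimal sketch of a pairwise loss driven by these indices, assuming a hinge-style loss on the L2 distance between descriptors. The notes do not state the loss formula, so the form below, the name `pair_loss`, and the margin value are assumptions.

```python
import math

def l2_distance(d1, d2):
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))

def pair_loss(d1, d2, p1, p2, margin=4.0):
    """Hinge-style pair loss (assumed form; the margin value is an assumption).

    Matching pair (p1 == p2): penalize the descriptor distance itself.
    Non-matching pair:        penalize only if the distance is below the margin.
    """
    dist = l2_distance(d1, d2)
    if p1 == p2:
        return dist
    return max(0.0, margin - dist)

# Same keypoint, descriptors 5 apart: loss is the distance.
print(pair_loss([0.0, 0.0], [3.0, 4.0], p1=7, p2=7))
# Different keypoints, already further apart than the margin: no loss.
print(pair_loss([0.0, 0.0], [3.0, 4.0], p1=7, p2=8))
# Different keypoints, too close: loss is margin minus distance.
print(pair_loss([0.0, 0.0], [0.0, 1.0], p1=1, p2=2))
```

With this form, matching pairs are pulled together and non-matching pairs are pushed apart only until they clear the margin.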

Solution

Get rid of noisy weight updates.

- Key idea: predict which updates are likely to be noisy, and skip them.

How?

Backpropagate only through the high-loss samples.
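A minimal sketch of the selection step, assuming the losses for a pool of candidate pairs have already been computed with a forward pass. The name `mine_hard_samples` and the toy loss values are hypothetical. The pool is taken to be `mining_ratio` times larger than the number of samples kept.

```python
def mine_hard_samples(losses, keep):
    """Return the indices of the `keep` highest-loss samples, hardest first."""
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return ranked[:keep]

# Toy per-pair losses from a forward pass over a pool of 16 candidate pairs.
pool_losses = [0.1, 2.5, 0.0, 1.7, 0.3, 3.2, 0.0, 0.9,
               0.4, 0.0, 1.1, 0.2, 0.0, 2.9, 0.6, 0.0]

mining_ratio = 8                          # pool size / number of samples kept
keep = len(pool_losses) // mining_ratio   # 16 // 8 = 2

hard = mine_hard_samples(pool_losses, keep)

# Only these samples would contribute to the backward pass.
mean_hard_loss = sum(pool_losses[i] for i in hard) / len(hard)
print(hard, mean_hard_loss)
```

The cost is visible here: all 16 candidates need a forward pass, but only 2 are used for the weight update.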

Problem solved by paper

Training can be tough.

Paper

Link to ICCV paper

Mining

Costly, but really helps.

- Several mining ratios were tried; a ratio of 8 worked best.

- Performance beats

Rotation Invariant

Beats SIFT, which is not rotation invariant.

Wide baseline matching

Performs well.

Deformation and Illumination

Performs well.

Results

Key Takeaways

- L2 distance for training/testing.

- Generalizes well across tasks (rotation, wide-baseline matching, deformation and illumination).

However, it is not much better than VGG, and it is computationally expensive: mining during training requires forward propagation of the candidate samples, which is inefficient to compute.
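To make the L2-distance takeaway concrete: at test time, matching can be reduced to nearest-neighbour search over descriptors. This is a minimal brute-force sketch under that assumption. The name `match_descriptors` and the toy descriptors are hypothetical.

```python
import math

def l2_distance(d1, d2):
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))

def match_descriptors(descs_a, descs_b):
    """For each descriptor in descs_a, return (index of nearest in descs_b, distance)."""
    matches = []
    for da in descs_a:
        dists = [l2_distance(da, db) for db in descs_b]
        best = min(range(len(dists)), key=lambda j: dists[j])
        matches.append((best, dists[best]))
    return matches

# Toy 2-D descriptors for patches from two images.
descs_a = [[0.0, 0.0], [1.0, 1.0]]
descs_b = [[0.9, 1.1], [0.1, -0.1], [5.0, 5.0]]

matches = match_descriptors(descs_a, descs_b)
print(matches)
```

Because the comparison is a plain L2 distance, the descriptor could in principle replace an existing one in a matching pipeline without changing the matching code.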

CNN Architecture