Leverages occlusion reasoning, modeling occlusion through the relationship between occluding and occluded objects.
The underlying intuition is that 1) every invisible region of a selected
detection should be explained either by another occluding
object or by image truncation, and 2) the visible regions of selected
detections should not overlap with each other. The
model is formulated below.
'm_i' is the 2D visibility mask of detection i, composed of three components: 'mv_i' (visible region), 'mo_i' (occluded region), and 'mt_i' (truncated region).
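
To make the two intuitions concrete, here is a minimal sketch that checks them on toy masks. It is not the formulation from the source: the boolean NumPy representation, the function names `unexplained_occlusion` and `visible_overlap`, and the choice to treat "explained by an occluding object" as "covered by another detection's visible region" are all assumptions made for illustration. The truncated region 'mt_i' is taken to be explained by the image boundary and is not checked here.

```python
import numpy as np

def unexplained_occlusion(mo_i, other_visible):
    """Count pixels of the occluded region mo_i that are not covered by
    the visible region of any other selected detection (assumed proxy
    for 'explained by an occluding object')."""
    covered = np.zeros_like(mo_i, dtype=bool)
    for mv_j in other_visible:
        covered |= mv_j
    return int(np.sum(mo_i & ~covered))

def visible_overlap(mv_i, mv_j):
    """Count pixels where two visible regions overlap (ideally zero)."""
    return int(np.sum(mv_i & mv_j))

# Toy 1x6 image strip with two hypothetical detections, a and b.
mv_a = np.array([[1, 1, 1, 0, 0, 0]], dtype=bool)  # visible region of a
mv_b = np.array([[0, 0, 0, 0, 1, 1]], dtype=bool)  # visible region of b
mo_b = np.array([[0, 0, 1, 1, 0, 0]], dtype=bool)  # occluded region of b

unexplained = unexplained_occlusion(mo_b, [mv_a])
overlap = visible_overlap(mv_a, mv_b)
print(unexplained, overlap)
```

In this toy case one occluded pixel of b is left unexplained by a, while the visible regions do not overlap; a selection procedure built on these intuitions would penalize the former and accept the latter.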