Learnable Online Graph Representations for 3D Multi-Object Tracking
JN Zaech, D Dai, A Liniger, M Danelljan, L Van Gool

Figure 1. The proposed method uses a graph representation for detections and tracks. An architecture based on neural message passing matches detections to tracks and provides a learning-based framework for track initialization, effectively replacing the heuristics that current approaches require.

Figure 2. The proposed tracking graph combines tracks, each represented by a sequence of track nodes, and detections in a single representation. During the neural message passing (NMP) iterations, information is exchanged between nodes and edges and is thus distributed globally throughout the graph.
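The global spread of information described in this caption can be illustrated with a minimal sketch. It is not the paper's architecture: plain averaging stands in for the learned node and edge update networks, features are scalars, and the function name `message_passing_step`, the node names, and the toy graph are all hypothetical.

```python
def message_passing_step(node_feats, edge_feats):
    """One round of node-to-edge, then edge-to-node updates.

    node_feats: dict mapping node id -> float feature
    edge_feats: dict mapping (u, v) -> float feature
    Averaging is an assumed stand-in for learned update networks.
    """
    # Edge update: combine each edge with its two endpoint nodes.
    new_edges = {
        (u, v): (node_feats[u] + node_feats[v] + e) / 3.0
        for (u, v), e in edge_feats.items()
    }
    # Node update: combine each node with its incident, already updated edges.
    new_nodes = {}
    for n, h in node_feats.items():
        incident = [e for (u, v), e in new_edges.items() if n in (u, v)]
        new_nodes[n] = (h + sum(incident)) / (1 + len(incident))
    return new_nodes, new_edges


# Hypothetical toy graph: a track node chained to two detection nodes.
nodes = {"track": 1.0, "det_a": 0.0, "det_b": 0.0}
edges = {("track", "det_a"): 0.0, ("det_a", "det_b"): 0.0}

# Two iterations are needed before the track's signal reaches det_b,
# which is two hops away from the track node.
for _ in range(2):
    nodes, edges = message_passing_step(nodes, edges)
```

In this toy setup, `det_b` is unchanged after one iteration and becomes non-zero only after the second, which is the sense in which repeated iterations distribute information beyond direct neighbors.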

Figure 3. Visualization of different update scenarios, showing only the active edges in the graph. The graph represents a single track and two detections at each timestep. (a) The ideal case, in which the track is matched to one detection node at every timestep and the detection nodes are connected to each other. (b) The case in which the match at one timestep is dropped and the track is matched to only two detection nodes. (c) A situation in which the proposed approach is able to select the globally best solution even though two detection nodes have been matched to the track in the last frame.

Table 1: Results on the nuScenes test set. Methods marked with an asterisk use private detections, so no direct comparison is possible.