https://arxiv.org/abs/2111.09621
SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking
3D multi-object tracking (MOT) has witnessed numerous novel benchmarks and approaches in recent years, especially those under the "tracking-by-detection" paradigm.
Contents
1. Overview of 3D MOT
2. Paper pipeline
3. Walking through the process in code
4. Conclusion
1. 3D MOT
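3D MOT stands for 3D multi-object tracking.
Both TBD (tracking-by-detection) and JDT (joint detection and tracking) methods have a detector and an association component inside the overall architecture. In TBD, as mentioned above, an existing model is used as the detector, and its outputs are associated using a Kalman filter (KF). This is the AB3MOT method that this post describes.
In JDT, the association part must also be learning-based, so association is performed with a sequential network such as an RNN, or with a transformer.
Because 3D MOT has to track multiple objects, an association step is required.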
2. Paper pipeline
1. Pre-processing of input detections
Select the bounding boxes with scores higher than a certain threshold.
2. Motion model
Use a Kalman filter (KF) or a constant-velocity (CV) model driven by the velocities that CenterPoint predicts -> predict the next frame
KF -> better suited to the high-frequency case
CV -> more robust in the low-frequency case
3. Association
A similarity measure gives the distance between each pair of detection and tracklet (the tracking output).
4. Life cycle management
Controls the birth, death, and output of tracklets.
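The constant-velocity branch of the motion model can be sketched as follows. This is a minimal illustration, not the paper's code: the `Box` class and `cv_predict` function are hypothetical names, and the velocity is assumed to be supplied by the detector.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical minimal box: center position plus detector-reported velocity."""
    x: float
    y: float
    vx: float  # m/s, assumed to come from the detector (e.g. CenterPoint)
    vy: float


def cv_predict(box: Box, dt: float) -> Box:
    """Constant-velocity prediction: move the center by velocity * elapsed time."""
    return Box(box.x + box.vx * dt, box.y + box.vy * dt, box.vx, box.vy)


# A box moving at 2 m/s along x, predicted 0.5 s ahead (a 2 Hz frame rate).
predicted = cv_predict(Box(x=10.0, y=5.0, vx=2.0, vy=0.0), dt=0.5)
print(predicted)
```

Because the prediction uses only the latest velocity, it needs no history of past frames, which is one plausible reason it holds up when frames are far apart.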
3. Code
1) preprocessing
Preprocessing is done in the dataloader stage.
result['dets'], result['det_types'], result['aux_info']['velos'] = \
    self.frame_nms(result['dets'], result['det_types'], result['aux_info']['velos'], 0.1)
NMS (non-maximum suppression) keeps the highest-scoring box among heavily overlapping detections and discards the rest.
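A rough sketch of what NMS does is shown below. It is simplified to axis-aligned 2D boxes, whereas the repository's `frame_nms` works on 3D detections. The helper names `iou_2d` and `nms` are hypothetical, and reading the `0.1` argument above as an IoU threshold is an assumption.

```python
def iou_2d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Return indices of the boxes kept, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with an already kept one.
        if all(iou_2d(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 2, 2), (0.1, 0, 2.1, 2), (5, 5, 6, 6)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.1)
print(kept)  # the second box overlaps the first and is suppressed
```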
2) motion model
Prediction is performed with a Kalman filter.
kf.x : position and velocity
kf.F : state transition matrix
kf.H : measurement function
kf.B : control transition matrix
kf.P : covariance matrix
# predict()
self.x = dot(F, self.x)
# P = FPF' + Q
self.P = self._alpha_sq * dot(dot(F, self.P), F.T) + Q
# save prior
self.x_prior = self.x.copy()
self.P_prior = self.P.copy()
State prediction
Covariance prediction
Residual computation
Kalman gain
State update
Covariance update
# y = z - Hx
self.y = z - dot(H, self.x)
PHT = dot(self.P, H.T)
# S = HPH' + R
self.S = dot(H, PHT) + R
self.SI = self.inv(self.S)
# K = PH'inv(S)
self.K = dot(PHT, self.SI)
# x = x + Ky
# predict new x with residual scaled by the kalman gain
self.x = self.x + dot(self.K, self.y)
# P = (I-KH)P(I-KH)' + KRK'
I_KH = self._I - dot(self.K, H)
self.P = dot(dot(I_KH, self.P), I_KH.T) + dot(dot(self.K, R), self.K.T)
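To see the predict and update steps run end to end, here is a small self-contained sketch using NumPy. It assumes a 1D constant-velocity state `[position, velocity]` where only position is measured. The values of `dt`, `Q`, and `R` are illustrative, and the fading-memory factor `_alpha_sq` from the listing is left out.

```python
import numpy as np

dt = 0.1  # assumed time step between frames, in seconds

# State x = [position, velocity]; only position is measured.
F = np.array([[1.0, dt], [0.0, 1.0]])  # state transition
H = np.array([[1.0, 0.0]])             # measurement function
Q = 0.01 * np.eye(2)                   # process noise (illustrative value)
R = np.array([[0.1]])                  # measurement noise (illustrative value)


def predict(x, P):
    """x = Fx, P = FPF' + Q"""
    return F @ x, F @ P @ F.T + Q


def update(x, P, z):
    """Correct the prediction with measurement z."""
    y = z - H @ x                        # residual
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ y                        # state update
    I_KH = np.eye(len(x)) - K @ H
    P = I_KH @ P @ I_KH.T + K @ R @ K.T  # covariance update (Joseph form)
    return x, P


x, P = np.array([0.0, 1.0]), np.eye(2)
for z in [0.1, 0.2, 0.3]:  # object moving at roughly 1 m/s
    x, P = predict(x, P)
    x, P = update(x, P, np.array([z]))
print(x)
```

The Joseph form of the covariance update costs a few more multiplications than `P = (I - KH)P`, but it keeps `P` symmetric under rounding error.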
dets = input_data.dets
det_indexes = [i for i, det in enumerate(dets) if det.s >= self.score_threshold]
dets = [dets[i] for i in det_indexes]
trk_preds = list()
for trk in self.trackers:
    trk_preds.append(trk.predict(input_data.time_stamp, input_data.aux_info['is_key_frame']))
matched, unmatched_dets, unmatched_trks = associate_dets_to_tracks(dets, trk_preds,
    self.match_type, self.asso, self.asso_thres, trk_innovation_matrix)
3) association
def associate_dets_to_tracks(dets, tracks, mode, asso,
                             dist_threshold=0.9, trk_innovation_matrix=None):
    """ associate the tracks with detections
    """
    matched_indices, dist_matrix = \
        bipartite_matcher(dets, tracks, asso, dist_threshold, trk_innovation_matrix)
    unmatched_dets = list()
    for d, det in enumerate(dets):
        if d not in matched_indices[:, 0]:
            unmatched_dets.append(d)
    unmatched_tracks = list()
    for t, trk in enumerate(tracks):
        if t not in matched_indices[:, 1]:
            unmatched_tracks.append(t)
    matches = list()
    for m in matched_indices:
        if dist_matrix[m[0], m[1]] > dist_threshold:
            unmatched_dets.append(m[0])
            unmatched_tracks.append(m[1])
        else:
            matches.append(m.reshape(2))
    return matches, np.array(unmatched_dets), np.array(unmatched_tracks)
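Three concepts are involved here: bipartite matching, GIoU, and the Hungarian algorithm.
The GIoU values are computed, and then a linear sum assignment is performed.
giou -> computes the loss (cost) between boxes
linear sum assignment -> selects the pairs with the minimum total cost based on the loss values (using the Hungarian algorithm)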
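The two steps above can be sketched in plain Python. Several things here are assumptions made for illustration: boxes are axis-aligned 2D rather than 3D, the cost is taken to be `1 - GIoU`, and the helper names are hypothetical. The assignment is found by brute force over permutations as a stand-in for the Hungarian algorithm; in practice a library routine such as `scipy.optimize.linear_sum_assignment` would be used.

```python
from itertools import permutations


def giou_2d(a, b):
    """GIoU of two axis-aligned boxes (x1, y1, x2, y2); value lies in [-1, 1]."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = area_a + area_b - inter
    # Smallest box enclosing both a and b.
    hull = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (hull - union) / hull


def min_cost_assignment(cost):
    """Brute-force stand-in for the Hungarian algorithm (small square matrices only)."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return [(i, best[i]) for i in range(n)]


dets = [(0, 0, 2, 2), (10, 10, 12, 12)]
trks = [(10.5, 10, 12.5, 12), (0, 0.5, 2, 2.5)]
# Cost = 1 - GIoU, so well-overlapping pairs are cheap.
cost = [[1.0 - giou_2d(d, t) for t in trks] for d in dets]
pairs = min_cost_assignment(cost)
print(pairs)  # (detection index, track index) pairs
```

Unlike plain IoU, GIoU stays informative when two boxes do not overlap at all: the enclosing-box term makes distant pairs score lower than nearby ones.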
4) life cycle management
This stage updates the matched tracks, creates new tracks for unmatched detections, and removes dead tracks.
# update the matched tracks
for t, trk in enumerate(self.trackers):
    if t not in unmatched_trks:
        for k in range(len(matched)):
            if matched[k][1] == t:
                d = matched[k][0]
                break
        aux_info = {
            'velo': list(input_data.aux_info['velos'][d]),
            'is_key_frame': input_data.aux_info['is_key_frame']}
        update_info = UpdateInfoData(mode=1, bbox=input_data.dets[d], ego=input_data.ego,
                                     frame_index=self.frame_count,
                                     dets=input_data.dets, aux_info=aux_info)
        trk.update(update_info)
# create new tracks for unmatched detections
for index in unmatched_dets:
    if self.has_velo:
        aux_info = {
            'velo': list(input_data.aux_info['velos'][index]),
            'is_key_frame': input_data.aux_info['is_key_frame']}
    else:
        aux_info = {'is_key_frame': input_data.aux_info['is_key_frame']}
    track = tracklet.Tracklet(self.configs, self.count, input_data.dets[index], input_data.det_types[index],
                              self.frame_count, aux_info=aux_info, time_stamp=input_data.time_stamp)
    self.trackers.append(track)
    self.count += 1
# remove dead tracks
track_num = len(self.trackers)
for index, trk in enumerate(reversed(self.trackers)):
    if trk.death(self.frame_count):
        self.trackers.pop(track_num - 1 - index)
# output the results
result = list()
for trk in self.trackers:
    state_string = trk.state_string(self.frame_count)
    result.append((trk.get_state(), trk.id, state_string, trk.det_type))
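The birth and death logic behind `trk.death(...)` and `trk.state_string(...)` is not shown in the listing. One common way to implement it is with hit and miss counters, sketched below. The class `TrackLife`, its parameters `min_hits` and `max_age`, and the state names are all hypothetical, not taken from the repository.

```python
class TrackLife:
    """Hypothetical birth/death bookkeeping for a single track."""

    def __init__(self, min_hits=3, max_age=2):
        self.min_hits = min_hits  # matches needed before the track is output
        self.max_age = max_age    # consecutive misses tolerated before death
        self.hits = 0
        self.misses = 0

    def step(self, matched):
        """Call once per frame with whether a detection was matched."""
        if matched:
            self.hits += 1
            self.misses = 0
        else:
            self.misses += 1

    @property
    def state(self):
        if self.misses > self.max_age:
            return "dead"
        if self.hits >= self.min_hits:
            return "alive"
        return "birth"


life = TrackLife()
history = []
for matched in [True, True, True, False, False, False]:
    life.step(matched)
    history.append(life.state)
print(history)
```

Requiring several hits before output filters out short-lived false positives, while tolerating a few misses lets a track survive brief occlusions.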