This paper considers the detection of the ball in team sport scenes observed with still or motion-compensated calibrated cameras. Foreground masks provide primary cues to identify circular moving objects in the scene, but are shown to be too noisy to reliably detect weakly contrasted balls, especially when only a single viewpoint is available, as is often desired to reduce deployment cost. In those cases, trajectory analysis has been shown to provide valuable complementary information to differentiate true and false positives among the candidates detected by the foreground mask(s). In this paper, we focus on the detection of ball trajectory segments, exclusively from visual cues, without considering semantic reasoning about team play to connect those segments into long trajectories. We revisit several recent works, mainly the ones that we presented in Amit Kumar et al. (ICDSC, 1) and Parisot and De Vleeschouwer (ICME, 2), and introduce a publicly available dataset to compare them. We conclude that randomized consensus-based methods are competitive with the alternative deterministic graph-based solutions, while offering the additional advantage of extending naturally to the cost-effective single-view scenario. As an original contribution, we also introduce a procedure to efficiently clean up the foreground mask in correlation-based methods such as that of Amit Kumar et al. (ICDSC, 1), and a nonlinear rank-order filter to merge the foreground cues from multiple viewpoints. We also derive recommendations regarding the camera positioning and the buffering needs of a real-time acquisition system.
In this paper, the player positions are computed as in [4].
The use of a small threshold induces a significant number of false-positive foreground pixels in the mask, which makes the cleanup stage, as well as the subsequent template-based correlation and ballistic trajectory analysis, especially helpful for avoiding false ball detections.
Each player is approximated by a 40 cm (width) \(\times\) 180 cm (height) rectangular box.
To prevent the unfounded removal of ball candidates, the operating point of the player detector is chosen to maintain a very small false positive rate, at the cost of a relatively moderate detection rate.
Assuming that the single viewpoint is at least 15 m away from the field, which is common in practice, the actual template size lies within a margin of less than \(30\,\%\) around the constant-size template, chosen to fit the ball silhouette when the ball is in the field center.
False detections generally appear in bursts, and candidate location inaccuracies result from calibration and synchronization errors in multi-view setups.
For some \(M\), the order \(r\) might be larger than the number of views in which \(M\) is visible. In that case, the order of the filter is set to the number of available views.
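The clamping rule above can be illustrated with a minimal sketch. It assumes that the rank-order filter of order r returns the r-th largest foreground value among the views in which the point is visible; this convention, the function name `rank_order_merge`, the use of `None` to mark a view in which the point is not visible, and the value returned when no view sees the point are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Optional, Sequence


def rank_order_merge(values: Sequence[Optional[float]], r: int) -> float:
    """Merge the per-view foreground values observed for one 3D point.

    values[i] is the foreground value at the projection of the point in
    view i, or None when the point is not visible in that view
    (hypothetical encoding). The result is the r-th largest visible value,
    with r = 1 giving the maximum (assumed convention).
    """
    if r < 1:
        raise ValueError("the filter order r must be at least 1")
    # Keep only the views that actually see the point, largest value first.
    visible = sorted((v for v in values if v is not None), reverse=True)
    if not visible:
        return 0.0  # assumption: no supporting evidence when no view sees the point
    # Clamp the order to the number of available views, as stated in the text.
    order = min(r, len(visible))
    return visible[order - 1]
```

With three views and r = 2, the sketch returns the second-largest value. If the point is visible in a single view only, the order falls back to 1 and that view's value is returned unchanged.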
Note that the first rank-order filter has not been depicted because it leads to poor performance, due to the noisy nature of individual foreground masks, but also due to the fact that a single-view 2D location is associated with many 3D positions.
We neglect the complexity associated with the RANSAC-based ball trajectory estimation, since it runs over a relatively small number of preselected ball position candidates (\({\le} T\) for the Corr approach and \({\le} 3T\) for the CCA approach, \(T\) being the length of the window, typically \(T=24\)).
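To give a sense of why this step is cheap, the following is a minimal sketch of a RANSAC fit of a ballistic model to a short list of 3D candidates. It is not the implementation evaluated in the paper. The world frame (metric units, z pointing up, gravity along -z), the two-point minimal sample, the inlier tolerance `tol`, the iteration count `n_iter`, and all function names are assumptions made for illustration.

```python
import math
import random
from typing import List, Optional, Tuple

Candidate = Tuple[float, float, float, float]  # (t [s], x [m], y [m], z [m])
Vec3 = Tuple[float, float, float]
Model = Tuple[Vec3, Vec3]  # (position at t = 0, velocity at t = 0)
G = 9.81  # gravity in m/s^2, assumed to act along -z


def ballistic_from_two(a: Candidate, b: Candidate) -> Model:
    """Return the gravity-driven parabola passing through two candidates."""
    ta, tb = a[0], b[0]
    dt = tb - ta
    # Adding back the gravity term makes the z motion linear in t.
    za = a[3] + 0.5 * G * ta * ta
    zb = b[3] + 0.5 * G * tb * tb
    v0 = ((b[1] - a[1]) / dt, (b[2] - a[2]) / dt, (zb - za) / dt)
    p0 = (a[1] - v0[0] * ta, a[2] - v0[1] * ta, za - v0[2] * ta)
    return p0, v0


def predict(model: Model, t: float) -> Vec3:
    """Evaluate the ballistic model at time t."""
    p0, v0 = model
    return (p0[0] + v0[0] * t,
            p0[1] + v0[1] * t,
            p0[2] + v0[2] * t - 0.5 * G * t * t)


def ransac_ballistic(cands: List[Candidate], n_iter: int = 200,
                     tol: float = 0.15, seed: int = 0
                     ) -> Tuple[Optional[Model], List[Candidate]]:
    """Keep the two-point model supported by the largest number of candidates."""
    best_model: Optional[Model] = None
    best_inliers: List[Candidate] = []
    if len(cands) < 2:
        return best_model, best_inliers
    rng = random.Random(seed)  # fixed seed for reproducible sampling
    for _ in range(n_iter):
        a, b = rng.sample(cands, 2)
        if a[0] == b[0]:
            continue  # two candidates from the same frame do not define a model
        model = ballistic_from_two(a, b)
        # A candidate supports the model if it lies within tol metres of it.
        inliers = [c for c in cands
                   if math.dist(predict(model, c[0]), c[1:]) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

Each iteration costs one pass over the candidate list. With at most \(T\) or \(3T\) candidates and \(T=24\), this amounts to a few dozen distance evaluations per iteration.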
\(14 \times 10^6 \approx 12{,}000\) fps \(\times\) \((40 \times 30 \times 30)/30\) fps.
