An efficient pitch-by-pitch extraction algorithm through multimodal information
Introduction
Sport video analysis [23], [22], [33], [8], [24], [7], [5], [4], [18], [20], [17] has long been an active research field. It facilitates the discovery of semantic structures in recorded sport broadcasts and enables a wide spectrum of applications. For example, to plan strategies for future games, coaches and managers carefully watch specific shots, e.g., pitching or batting shots in a baseball game, to study the strengths and weaknesses of rival players. In general, identifying shots of interest, such as pitch shots, in broadcast baseball videos requires considerable human labor to browse through the videos. This study proposes a novel technique that efficiently and accurately segments complete pitch shots through multimodal information [34], [35], [37], [38].
In previous work, Chen et al. [4] designed a technique to extract pitch-by-pitch videos by first tracking ball trajectories to locate pitch shots in per-batter videos. They analyzed or extracted two types of videos:
- Pitch-by-pitch video: a video clip that encompasses the frames between the starting time, when the pitcher takes a ready stance on the pitching rubber, and the ending time, when the ball reaches home plate.
- Per-batter video: a video clip that includes the frames between two adjacent batter-change events [33]; such an event happens when the batter is either out (fly out, ground out, or strikeout) or credited with a base (a base hit or a walk).
Based on the identified pitch shots, Chen et al. [4] extracted pitch-by-pitch videos mainly by computing baseball trajectories. This technique fails to extract pitch-by-pitch shots in at least the following three cases: (1) the ball trajectory falls outside a predefined search region, (2) pitch replays, and (3) white-on-white problems [4]. Furthermore, the computational complexity of [4] is high because computing ball trajectories is time-consuming. To overcome these limitations, we propose an efficient algorithm that detects the pitch speed appearing on the scoreboard as an indication of the ending time of a pitch, and then utilizes the motion degree of the pitcher to determine the starting time of the pitch.
To display the game status to the audience, the scoreboard generally appears at a fixed position in sports broadcasts. Important information, e.g., the score, the current inning, and the counts of balls, strikes, and outs, is updated when certain pitching or batting events happen. For example, the pitch speed appears on the scoreboard once the radar gun measures the moving speed of the ball at the end of a pitch. We therefore propose a novel technique that infers the ending events of pitch shots by recognizing pitch speeds appearing on the scoreboard and by estimating when and where the pitcher is present. Based on the identified pitch shots and the estimated pitcher presence information, the proposed technique searches for starting and releasing events by measuring the motion degree of the pitcher, and generates a list of potential pitch shots that satisfy time-based pitching constraints [3] to filter out noisy non-pitch events. By incorporating rules observed from best practices of efficient pitching [14], the proposed technique finally groups nearby events into final pitch shots and then extracts pitch-by-pitch video clips from the original broadcast footage based on the event distribution in each shot.
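As a concrete illustration of the scoreboard cue, the appearance of the speed readout can be approximated by frame differencing inside a fixed region of interest. The following sketch is a minimal stand-in for the pitch speed recognition idea; the ROI coordinates, thresholds, and function names are illustrative assumptions, not the implementation used in this work:

```python
import numpy as np

def speed_appeared(prev_frame, frame, roi, thresh=25, min_changed=0.02):
    """Detect a sudden change inside a fixed scoreboard ROI, used here as a
    proxy for the pitch speed appearing at the end of a pitch.

    roi: (top, bottom, left, right) in pixel coordinates.
    Returns True when the fraction of pixels whose gray level changed by
    more than `thresh` exceeds `min_changed`.
    """
    t, b, l, r = roi
    prev_roi = prev_frame[t:b, l:r].astype(np.int16)
    cur_roi = frame[t:b, l:r].astype(np.int16)
    changed = np.abs(cur_roi - prev_roi) > thresh
    return bool(changed.mean() > min_changed)

# Synthetic example: the speed digits "light up" inside the scoreboard ROI.
prev = np.zeros((120, 160), dtype=np.uint8)
cur = prev.copy()
cur[100:112, 10:40] = 255          # speed readout appears
roi = (95, 118, 5, 50)             # fixed scoreboard region (assumed known)
print(speed_appeared(prev, cur, roi))    # True
print(speed_appeared(prev, prev, roi))   # False
```

In practice the changed region would additionally be passed to a digit recognizer to read the speed value; the differencing step above only localizes when the readout appears.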
The major contribution of this study is to design, prototype, and evaluate a novel pitch-by-pitch video extraction technique that combines a pitch speed recognition scheme, a pitcher localization technique, and a motion degree measurement scheme to accurately identify pitch shots. To validate the performance and accuracy of the algorithm, we collected a dataset of per-batter videos broadcast in various countries by randomly selecting segmented videos generated by the per-batter video segmentation module (described in Section 3). The experimental results show that the proposed algorithm accurately extracts all pitch-by-pitch videos from the testing data while dropping 84% of the non-pitching frames of the original videos. Compared with the existing state-of-the-art approach [4], the proposed method not only extracts better pitch-by-pitch videos but also reduces the processing time by 92% on average.
The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 gives an overview of the proposed algorithm that extracts pitch-by-pitch segments from baseball broadcast videos. Section 4 infers pitching events by detecting when the pitch speed appears, estimating when and where the pitcher is present, and analyzing the pitcher’s motion degree. Section 5 filters out noisy non-pitching events and extracts final pitch-by-pitch videos based on inferred potential pitching events. Section 6 summarizes the experimental results and verifies the promising performance of this technique. Section 7 concludes and gives future research directions of this study.
Related work
Many existing studies have designed techniques to analyze broadcast sport videos, such as baseball [23], [22], [33], [8], [24], [7], [5], [4], tennis [25], [9], soccer [16], [30], or basketball [32], [31], [12] videos, to extract sport events of interest for coaches and managers to plan future game strategies. Among these studies, techniques for identifying significant baseball events or scenes from original broadcast baseball videos involve various topics, such as scoreboard localization [13],
System overview
Fig. 1 presents an overview of this technique, which consists of three modules: (1) per-batter video segmentation, (2) pitching event inference, and (3) pitch-by-pitch video extraction. These modules are described as follows.
The per-batter video segmentation module segments input broadcast videos into per-batter video clips, each of which records the process in which a batter tries to hit the ball before he/she is out or reaches first base, based on recognizing the basic baseball events defined in [33]. By detecting
Pitching event inference
Fig. 2 illustrates the three processing pipelines of the pitching event inference module: (1) pitch completion recognition, (2) pitcher localization, and (3) pitch action detection. These pipelines, which detect the pitcher presence information and the times associated with the pitch starting, ball releasing, and pitch ending events, are described as follows.
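One simple way to realize the motion degree measurement used by the pitch action detection pipeline is mean absolute inter-frame differencing within the pitcher's estimated bounding box: high values suggest a windup or release, low values a ready stance. The snippet below is a hedged sketch; the bounding box, grayscale input, and the use of a plain mean difference are assumptions for illustration:

```python
import numpy as np

def motion_degree(prev_frame, frame, bbox):
    """Mean absolute gray-level change inside the pitcher's bounding box,
    a simple proxy for the pitcher's motion degree between two frames.

    bbox: (top, bottom, left, right) in pixel coordinates.
    """
    t, b, l, r = bbox
    prev_roi = prev_frame[t:b, l:r].astype(np.int16)
    cur_roi = frame[t:b, l:r].astype(np.int16)
    return float(np.abs(cur_roi - prev_roi).mean())

# Synthetic frames: a still pitcher vs. one whose arm sweeps the bounding box.
still = np.full((240, 320), 80, dtype=np.uint8)
moving = still.copy()
moving[60:120, 150:180] = 200       # arm region changes between frames
bbox = (40, 160, 120, 220)          # pitcher bounding box (assumed given)
print(motion_degree(still, still, bbox))    # 0.0
print(motion_degree(still, moving, bbox))   # 18.0
```

Thresholding this signal over time yields candidate starting (motion rises from near zero) and releasing (motion peaks) events, which the later grouping step then validates.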
Pitch-by-pitch video extraction
The pitch-by-pitch video extraction module generates pitch-by-pitch videos through two processing steps: (1) time-based event grouping and (2) rule-based pitch shot detection. These steps are described as follows.
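The time-based grouping step can be sketched as matching each detected pitch-ending event backward to the nearest releasing and starting events within plausible time bounds; triplets that violate the bounds are discarded as noise. The timing constants below are illustrative assumptions, not the calibrated constraints of [3]:

```python
def group_pitch_events(starts, releases, ends,
                       max_windup=15.0, max_flight=1.5):
    """Pair candidate (start, release, end) timestamps (in seconds) into
    pitch shots. An end event must follow a release within `max_flight`
    seconds, and that release must follow a start within `max_windup`.
    """
    shots = []
    for end in ends:
        rel = max((r for r in releases if 0 < end - r <= max_flight),
                  default=None)
        if rel is None:
            continue  # no plausible release nearby: likely a noisy detection
        start = max((s for s in starts if 0 < rel - s <= max_windup),
                    default=None)
        if start is not None:
            shots.append((start, rel, end))
    return shots

# Two pitches plus a spurious end event at t=100 with no nearby release.
print(group_pitch_events(starts=[2.0, 30.0],
                         releases=[9.0, 36.0],
                         ends=[9.5, 36.6, 100.0]))
# [(2.0, 9.0, 9.5), (30.0, 36.0, 36.6)]
```

The surviving (start, release, end) triplets delimit candidate pitch shots, which the rule-based detection step then confirms or rejects.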
Evaluations
To examine the performance and verify the robustness of the proposed system, we tested the proposed technique on baseball videos collected from Major League Baseball (MLB), Nippon Professional Baseball (NPB), and the Chinese Professional Baseball League (CPBL). Note that these videos were broadcast in different countries.
Conclusions and future work
This study presents a novel technique to accurately extract pitch-by-pitch videos from broadcast baseball videos. The algorithm first detects the pitch speed appearing on the scoreboard as an indication that a pitch ends, and performs pitcher localization to predict when and where the pitcher is present. Based on the identified pitch shots and pitcher presence information, the proposed technique searches for starting and releasing events by measuring the motion degree of the pitcher, and
References (38)
- et al., k-Partite graph reinforcement and its application in multimedia information retrieval, Inform. Sci. (2012)
- et al., VisionGo: towards video retrieval with joint exploration of human and computer, Inform. Sci. (2011)
- Average Speed of Baseball Pitches by Age Group....
- Baseball Basics: On the Field | MLB.com: Official info....
- Holding Runners....
- Hsuan Sheng Chen, Hua Tsung Chen, Wen Jiin Tsai, Suh Yin Lee, Jen Yu Yu, Pitch-by-pitch extraction from single view...
- Hua Tsung Chen, Wen Jiin Tsai, Suh Yin Lee, Stance-based strike zone shaping and visualization in broadcast baseball...
- et al., Video adaptation for small display based on content recomposition, IEEE Trans. Circ. Syst. Video Technol. (2007)
- Chih Yi Chiu, Po Chih Lin, Wei Ming Chang, Hsin Min Wang, Shi Nine Yang, Detecting pitching frames in baseball game...
- et al., Explicit semantic events detection and development of realistic applications for broadcasting baseball videos, Multimedia Tools Appl. (2008)
- Identifying functional modules in protein-protein interaction networks: an integrated exact approach, Bioinformatics
- Screen-strategy analysis in broadcast basketball video using player tracking, IEEE Visual Commun. Image Process.
- Localization and recognition of the scoreboard in sports video based on SIFT point matching
- Event detection of broadcast baseball videos, IEEE Trans. Circ. Syst. Video Technol.
- A template-based baseball video scene classification using efficient playfield segmentation, Multimedia Tools Appl.