Elsevier

Information Sciences

Volume 294, 10 February 2015, Pages 64-77
An efficient pitch-by-pitch extraction algorithm through multimodal information

https://doi.org/10.1016/j.ins.2014.09.001

Highlights

  • A novel automated method for extracting pitch-by-pitch shots is proposed.

  • The idea is based on the use of multimodal information.

  • Our method achieves the best performance in terms of accuracy and time complexity.

Abstract

Sport video analysis facilitates the discovery of semantic structures in sport broadcast videos and enables a wide spectrum of applications. For example, coaches can analyze offensive and defensive plays performed during games to assess a team’s capabilities. In general, identifying shots of interest, e.g., pitch shots, in broadcast baseball videos requires considerable human labor to browse through the footage. In this work, we propose a novel technique that automatically extracts pitch-by-pitch shots by recognizing the reliable emergence of the pitching speed displayed on the scoreboard, estimating when and where the pitcher is present, and identifying the pitch shots based on the pitcher’s motion degree. To validate the performance and accuracy of the proposed technique, we collected a dataset of baseball videos broadcast in various countries. The experimental results verify that the proposed technique successfully extracts the desired pitch-by-pitch videos. Furthermore, it outperforms the state-of-the-art approach in terms of accuracy and time complexity.

Introduction

Sport video analysis [23], [22], [33], [8], [24], [7], [5], [4], [18], [20], [17] has been an active research field for a long time. It facilitates the discovery of semantic structures in recorded sport broadcast videos and enables a wide spectrum of applications. For example, to decide on suitable strategies for future games, coaches and managers have to carefully watch specific shots, e.g., pitching or batting shots in a baseball game, to study the strengths and weaknesses of rival players. In general, identifying shots of interest, such as pitch shots, in broadcast baseball videos requires considerable human labor to browse through the footage. This study proposes a novel technique to efficiently and accurately segment complete pitch shots through multimodal information [34], [35], [37], [38].

Chen et al. [4] designed a technique to extract pitch-by-pitch videos by first tracking ball trajectories to locate pitch shots in per-batter videos. They analyzed or extracted two types of videos:

  • Pitch-by-pitch video: A video clip that encompasses the frames captured between the starting time, when the pitcher takes a ready stance on the pitching rubber, and the ending time, when the ball reaches home plate.

  • Per-batter video: A video clip that includes the frames between two adjacent batter change events [33]. A batter change happens when the batter is either out (flies out, grounds out, or strikes out) or credited with a base (on a base hit or on balls).

Based on the identified pitch shots, Chen et al. [4] extracted pitch-by-pitch videos mainly by computing baseball trajectories. This technique fails to extract pitch-by-pitch shots in at least the following three cases: (1) the ball trajectory falls outside a predefined search region, (2) pitch replays, and (3) white-on-white problems [4]. Furthermore, the computational complexity of [4] is high because computing ball trajectories takes a lot of time. To overcome these limitations, we propose an efficient algorithm that detects the pitch speed emerging on the scoreboard as an indication of the ending time of a pitch. We then use the motion degree of the pitcher to determine the starting time of the pitch.
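As a rough illustration of using the pitch speed readout as an end-of-pitch cue, the following minimal Python sketch marks the frames at which a speed value first appears and then stays unchanged for a few frames. It is not the authors' implementation: the per-frame readings (an integer speed, or `None` when no speed is shown) are assumed to come from some recognizer that is not shown here, and the function name `speed_emergence_frames` and the `min_stable` parameter are hypothetical.

```python
from typing import List, Optional


def speed_emergence_frames(readings: List[Optional[int]],
                           min_stable: int = 3) -> List[int]:
    """Return frame indices where a speed reading reliably emerges.

    A reading "emerges" at frame i when it differs from the previous
    frame's reading and then persists for at least `min_stable` frames.
    Both inputs are assumptions made for this sketch.
    """
    onsets = []
    n = len(readings)
    i = 0
    while i < n:
        value = readings[i]
        previous = readings[i - 1] if i > 0 else None
        if value is not None and value != previous:
            # Count how long the new value stays on screen.
            run = 1
            while i + run < n and readings[i + run] == value:
                run += 1
            # Short-lived readings are treated as recognition noise.
            if run >= min_stable:
                onsets.append(i)
            i += run
        else:
            i += 1
    return onsets


# Hypothetical per-frame readings: two stable readouts and one noisy blip.
readings = [None, None, 92, 92, 92, 92, None, 88, None, 95, 95, 95]
onsets = speed_emergence_frames(readings)
```

Requiring a short run of identical readings is one simple way to make the emergence "reliable"; a single-frame reading such as the 88 above is ignored.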

To display the game status to viewers of a sport video, the scoreboard generally appears frequently and in a fixed position. Usually, the important information, e.g., scores, the current inning, and the number of balls, strikes, and outs, is updated when certain pitching or batting events happen. For example, the pitch speed emerges on the scoreboard once the radar gun detects the speed of the ball at the end of a pitch. Therefore, we propose a novel technique that infers the ending events of pitch shots by recognizing pitch speeds appearing on the scoreboard and estimating when and where the pitcher is present. Based on the identified pitch shots and the estimated pitcher presence information, the proposed technique searches for starting and releasing events by measuring the motion degree of the pitcher, and generates a list of potential pitch shots that satisfy time-based pitching constraints [3] to filter out noisy non-pitch events. By incorporating rules observed from best practices of efficient pitching [14], the proposed technique finally groups nearby events into final pitch shots and then extracts pitch-by-pitch video clips from the original broadcast footage based on the event distribution in each shot.

The major contribution of this study is to design, prototype, and evaluate a novel pitch-by-pitch video extraction technique that incorporates a pitch speed recognition scheme, a pitcher localization technique, and a motion degree measurement scheme to accurately identify pitch shots. To validate the performance and accuracy of this algorithm, we collected a dataset of per-batter videos broadcast in various countries by randomly selecting segmented videos generated by the per-batter video segmentation module (described later in Section 3). The experimental results show that the proposed algorithm accurately extracts all pitch-by-pitch videos from the testing data and keeps a compact set of pitching frames by dropping 84% of the non-pitching frames of the original videos. Compared with the existing state-of-the-art approach [4], the proposed method not only extracts better pitch-by-pitch videos, but also significantly reduces the time complexity, by 92% on average.

The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 gives an overview of the proposed algorithm that extracts pitch-by-pitch segments from baseball broadcast videos. Section 4 infers pitching events by detecting when the pitch speed appears, estimating when and where the pitcher is present, and analyzing the pitcher’s motion degree. Section 5 filters out noisy non-pitching events and extracts final pitch-by-pitch videos based on inferred potential pitching events. Section 6 summarizes the experimental results and verifies the promising performance of this technique. Section 7 concludes and gives future research directions of this study.

Section snippets

Related work

Many existing studies have designed techniques to analyze broadcast sport videos, such as baseball [23], [22], [33], [8], [24], [7], [5], [4], tennis [25], [9], soccer [16], [30], or basketball [32], [31], [12] videos, to extract sport events of interest for coaches and managers to plan future game strategies. Among these studies, techniques for identifying significant baseball events or scenes from original broadcast baseball videos involve various topics, such as scoreboard localization [13],

System overview

Fig. 1 presents an overview of this technique, which includes three modules: (1) per-batter video segmentation, (2) pitching event inference, and (3) pitch-by-pitch video extraction. These modules are described as follows.

The per-batter video segmentation module segments input broadcast videos into per-batter video clips, each of which records the process in which a batter tries to hit the ball before he/she is out or reaches first base, based on recognizing the basic baseball events defined in [33]. By detecting

Pitching event inference

Fig. 2 illustrates the three processing pipelines of the pitching event inference module: (1) pitch completion recognition, (2) pitcher localization, and (3) pitch action detection. These three pipelines, which detect the pitcher presence information and the times associated with pitch starting, ball releasing, and pitch ending events, are described as follows.
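To make the idea of locating a pitch start from the pitcher's motion degree more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the method detailed in the full paper: motion degree is approximated by the mean absolute intensity difference between consecutive grayscale pitcher regions, and the names `motion_degree` and `find_pitch_start`, the threshold `still_threshold`, and the window `max_pitch_seconds` are all hypothetical.

```python
from typing import List, Optional


def motion_degree(prev: List[List[int]], curr: List[List[int]]) -> float:
    """Mean absolute difference between two grayscale pitcher regions."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def find_pitch_start(motion: List[float], end_frame: int, fps: float,
                     still_threshold: float = 2.0,
                     max_pitch_seconds: float = 8.0) -> Optional[int]:
    """Search backward from the pitch-ending frame for the ready stance.

    `motion[i]` is the motion degree at frame i. The search is limited
    to an assumed time window before `end_frame`.
    """
    earliest = max(0, end_frame - int(max_pitch_seconds * fps))
    i = min(end_frame, len(motion) - 1)
    # Skip still frames between the delivery and the speed readout.
    while i >= earliest and motion[i] < still_threshold:
        i -= 1
    if i < earliest:
        return None  # no delivery motion found inside the window
    # Walk back through the delivery to the last still frame before it.
    while i >= earliest and motion[i] >= still_threshold:
        i -= 1
    return max(i, earliest)


# Hypothetical motion profile: still, delivery (frames 3-5), still again.
profile = [0.5, 0.4, 0.3, 5.0, 6.0, 7.0, 0.2, 0.1]
start = find_pitch_start(profile, end_frame=7, fps=1.0)
```

Bounding the backward search by a time window is one simple way to apply a time-based pitching constraint; the 8-second default is an arbitrary placeholder, not a value from the paper.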

Pitch-by-pitch video extraction

The pitch-by-pitch video extraction module generates pitch-by-pitch videos through two processing steps: (1) time-based event grouping and (2) rule-based pitch shot detection. These steps are described as follows.
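The two steps can be pictured with a small Python sketch. It is a simplified illustration rather than the paper's actual rules: events are assumed to be `(timestamp, label)` pairs with the labels `"start"`, `"release"`, and `"end"`, the gap threshold `max_gap` is a hypothetical parameter, and the only rule applied is that a group must contain the three events in order.

```python
from typing import List, Optional, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, event label)


def group_events(events: List[Event], max_gap: float = 3.0) -> List[List[Event]]:
    """Time-based grouping: events closer than `max_gap` share a group."""
    groups: List[List[Event]] = []
    for event in sorted(events):
        if groups and event[0] - groups[-1][-1][0] <= max_gap:
            groups[-1].append(event)
        else:
            groups.append([event])
    return groups


def pitch_clip(group: List[Event]) -> Optional[Tuple[float, float]]:
    """Rule-based check: require start, release, end in that order."""
    labels = [label for _, label in group]
    try:
        s = labels.index("start")
        r = labels.index("release", s + 1)
        e = labels.index("end", r + 1)
    except ValueError:
        return None  # incomplete group, treated as a non-pitch event
    return (group[s][0], group[e][0])


def extract_clips(events: List[Event], max_gap: float = 3.0) -> List[Tuple[float, float]]:
    """Return (start_time, end_time) for every group that passes the rule."""
    clips = []
    for group in group_events(events, max_gap):
        clip = pitch_clip(group)
        if clip is not None:
            clips.append(clip)
    return clips


# Hypothetical events: two complete pitches and one isolated "end" event.
events = [(10.0, "start"), (11.5, "release"), (12.1, "end"),
          (30.0, "end"),
          (50.0, "start"), (51.2, "release"), (51.9, "end")]
clips = extract_clips(events)
```

In this sketch the isolated event at 30.0 s forms its own group and is rejected, which mirrors the role of the rule-based step in filtering out noisy non-pitch events.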

Evaluations

To examine the performance and verify the robustness of the proposed system, we tested the proposed technique on various baseball videos collected from Major League Baseball (MLB), Nippon Professional Baseball (NPB), and the Chinese Professional Baseball League (CPBL). Note that these baseball videos were broadcast in different countries.

Conclusions and future work

This study presents a novel technique to accurately extract pitch-by-pitch videos from baseball broadcast videos. The algorithm first detects the pitch speed emerging on the scoreboard as an indication that a pitch has ended and performs a pitcher localization technique to predict when and where the pitcher is present. Based on the identified pitch shots and pitcher presence information, the proposed technique searches for starting and releasing events by measuring the motion degree of the pitcher, and

References (38)

  • Yue Gao et al.

    k-Partite graph reinforcement and its application in multimedia information retrieval

    Inform. Sci.

    (2012)
  • Huanbo Luan et al.

    VisionGo: towards video retrieval with joint exploration of human and computer

    Inform. Sci.

    (2011)
  • Average Speed of Baseball Pitches by Age Group....
  • Baseball Basics: On the Field ∣ MLB.com: Official info....
  • Holding Runners....
  • Hsuan Sheng Chen, Hua Tsung Chen, Wen Jiin Tsai, Suh Yin Lee, Jen Yu Yu, Pitch-by-pitch extraction from single view...
  • Hua Tsung Chen, Wen Jiin Tsai, Suh Yin Lee, Stance-based strike zone shaping and visualization in broadcast baseball...
  • Wen-Huang Cheng et al.

    Video adaptation for small display based on content recomposition

    IEEE Trans. Circ. Syst. Video Technol.

    (2007)
  • Chih Yi Chiu, Po Chih Lin, Wei Ming Chang, Hsin Min Wang, Shi Nine Yang, Detecting pitching frames in baseball game...
  • Wei Ta Chu et al.

    Explicit semantic events detection and development of realistic applications for broadcasting baseball videos

    Multimedia Tools Appl.

    (2008)
  • Damien Connaghan, Philip Kelly, Noel E. O’Connor, Game, shot and match: event-based indexing of tennis, in: IEEE...
  • Marcus T. Dittrich et al.

    Identifying functional modules in protein-protein interaction networks: an integrated exact approach

    Bioinformatics

    (2008)
  • David A. Forsyth, Jean Ponce, Computer Vision: A Modern Approach, Prentice Hall Professional Technical Reference,...
  • Tsung Sheng Fu et al.

    Screen-strategy analysis in broadcast basketball video using player tracking

    IEEE Visual Commun. Image Process.

    (2011)
  • Jinlin Guo et al.

    Localization and recognition of the scoreboard in sports video based on sift point matching

  • T. House, G. Heil, S. Johnson, The Art & Science of Pitching. Coaches Choice Series, Coaches Choice,...
  • Mao-Hsiung Hung et al.

    Event detection of broadcast baseball videos

    IEEE Trans. Circ. Syst. Video Technol.

    (2008)
  • Jong Yun Kim, Tae Yong Kim, Soccer ball tracking using dynamic kalman filter with velocity control, in: IEEE...
  • Chung-Ming Kuo et al.

    A template-based baseball video scene classification using efficient playfield segmentation

    Multimedia Tools Appl.

    (2011)