Abstract
This study presents a new technique for translating subtitles in sports events, addressing the challenges of real-time translation with improved accuracy and efficiency. Unlike standard methods, which often produce delayed or inaccurate subtitles, the proposed method integrates advanced annotation techniques and machine learning algorithms to improve subtitle recognition and extraction. Annotation here means systematically labeling spoken elements such as commentary and dialogue, enabling accurate subtitle recognition and real-time adjustment in live sports broadcasts so that both accuracy and contextual relevance are preserved. These ideas allow seamless adaptation to multiple speech sources, including the voices of commentators, off-site hosts, and athletes, while keeping critical information within strict word-count limits. Key improvements include faster processing times and higher translation precision, both crucial in the dynamic environment of live sports broadcasts. The study builds on prior work in audiovisual translation, tailoring its strategy to the unique demands of sports media. By emphasizing clear and contextually appropriate real-time subtitles, this research advances beyond existing methods and offers insights for future translation projects in sports and similar contexts. The results contribute to a more effective subtitle translation framework, enhancing accessibility and the viewing experience for audiences during live sports events.
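To make the pipeline described above concrete, the following is a minimal sketch of how recognition output, speaker-aware translation, and word-count limiting might be composed. The class, the function names, and the 12-word limit are illustrative assumptions for this sketch, not the authors' implementation.

```python
from dataclasses import dataclass

# Hypothetical names and limits throughout; this sketches the shape of the
# pipeline described in the abstract, not the authors' implementation.

@dataclass
class SubtitleEvent:
    speaker: str   # annotation label: "commentator", "off-site host", "athlete"
    text: str      # recognized source-language line
    start_ms: int
    end_ms: int

MAX_WORDS = 12     # assumed per-line word-count limit

def translate(text: str, speaker: str) -> str:
    """Placeholder for the ML translation step; any MT backend could sit here.

    The speaker annotation lets the backend adapt tone and terminology.
    """
    return f"[{speaker}] {text}"  # stand-in output

def fit_word_limit(text: str, max_words: int = MAX_WORDS) -> str:
    """Trim a translated line so critical leading content fits the limit."""
    return " ".join(text.split()[:max_words])

def process(event: SubtitleEvent) -> str:
    """Recognized line -> speaker-aware translation -> length-limited subtitle."""
    return fit_word_limit(translate(event.text, event.speaker))

print(process(SubtitleEvent("commentator", "What a goal in the final minute!", 0, 2000)))
```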
Data availability
No datasets were generated or analysed during the current study.
Abbreviations
- NLP: Natural language processing
- CV: Computer vision
- ML: Machine learning
- DL: Deep learning
- SLR: Systematic literature review
- NER: Named entity recognition
- DHH: Deaf and hard of hearing
- IBM: International Business Machines Corporation
- HMM: Hidden Markov model
- EM: Expectation–maximization
- GB: Gigabytes
- OCR: Optical character recognition
- BP: Length-based penalty factor (brevity penalty; see the worked formulas after this list)
- BLEU: Bilingual evaluation understudy
- LS: Length statistics
- CLM: Character-level modeling
- ES: Experimental setup
- TT: Training time
- ASD: Abnormal subtitle displays
- RA: Relationship analysis
- N: Gray level series of the image/total pixels
- z: Detection framework
- S: Mean value of the gray-level difference of adjacent frames over the whole video (see the detection sketch after this list)
- \(P_{r}(\overline{e}_{l},\overline{f})\): Number of times the phrase pair appears in the corpus
- M: Length of the video frame sequence
- r: Window size
- W: Inter-frame difference measurement of each frame
- g: Function
- A: Gray value histogram
- L: Inter-frame difference measurement of each frame
- j: Frame index
- k: Cumulative number of blocks
- D: Euclidean distance
- F: Sobel gradient amplitude
- k: Weighting factor for the Sobel operator
- ∏: Product operator
- G: Horizontal template for convolution
- g: Vertical template for convolution
- x, y: Pixel coordinates
- E: Translation probability estimation
- ξ: Normalization factor
- γ: Number of times a phrase appears in the target sentence
- δ: Translation probability
- τ: Number of times a word appears in the target sentence
- n: Number of word pairs
- a_j(x): Gray value histogram of frame j
- b_k(y): Gray value histogram of frame k
- F_j(x, y): Gray value at pixel point (x, y) in frame j
- F_k(x, y): Gray value at pixel point (x, y) in frame k
- G_x: Sobel gradient in the horizontal direction
- G_y: Sobel gradient in the vertical direction
- U_1: Gradient matrix from the horizontal template
- U_2: Gradient matrix from the vertical template
- \(Ecount(P_{r}(\overline{e}_{l},\overline{f}))\): Parallel bilingual phrase pair
- w(e_i, f_i): Lexicalized weighted feature between words e_i and f_i
- count(f_i, e_i): Number of times the word pair (f_i, e_i) appears in the corpus
- w_n: Weight of the co-occurring n-grams
- p_n: Precision of the n-grams
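Several of the video symbols above (a_j(x), b_k(y), D, G_x, G_y, F, and the weighting factor k) describe a histogram-plus-gradient measure of inter-frame change of the kind used to locate subtitle transitions. Below is a minimal NumPy sketch of such a measure, assuming 8-bit grayscale frames and the standard 3×3 Sobel templates; it illustrates the general technique rather than the paper's exact formulation.

```python
import numpy as np
from scipy.ndimage import convolve

# Standard 3x3 Sobel templates, playing the role of the horizontal
# template G and vertical template g from the symbol list above.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def gray_histogram(frame: np.ndarray, n_levels: int = 256) -> np.ndarray:
    """Normalized gray-value histogram, in the role of a_j(x) / b_k(y)."""
    hist, _ = np.histogram(frame, bins=n_levels, range=(0, n_levels))
    return hist / frame.size

def histogram_distance(frame_j: np.ndarray, frame_k: np.ndarray) -> float:
    """D: Euclidean distance between the histograms of adjacent frames."""
    return float(np.linalg.norm(gray_histogram(frame_j) - gray_histogram(frame_k)))

def sobel_amplitude(frame: np.ndarray) -> np.ndarray:
    """F: gradient amplitude combining G_x and G_y."""
    gx = convolve(frame.astype(float), SOBEL_X)   # U_1: horizontal-template gradient
    gy = convolve(frame.astype(float), SOBEL_Y)   # U_2: vertical-template gradient
    return np.hypot(gx, gy)

def frame_change_score(frame_j, frame_k, k_weight: float = 0.5) -> float:
    """Histogram distance plus a k-weighted gradient-change term (assumed combination)."""
    grad_change = float(np.abs(sobel_amplitude(frame_j) - sobel_amplitude(frame_k)).mean())
    return histogram_distance(frame_j, frame_k) + k_weight * grad_change
```

A frame whose score stands out against the video-wide mean difference S would then be flagged as a candidate subtitle transition and passed on to OCR.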
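The translation-related symbols (δ, w(e_i, f_i), count(f_i, e_i), BP, w_n, p_n) follow the standard phrase-based machine translation and BLEU conventions. As a reference point, the conventional formulas from the literature (not necessarily the paper's exact variants) are:

$$
\delta(\overline{f}\mid\overline{e}) \;=\; \frac{\operatorname{count}(\overline{e},\overline{f})}{\sum_{\overline{f}'} \operatorname{count}(\overline{e},\overline{f}')},
\qquad
w(e_i, f_i) \;=\; \frac{\operatorname{count}(f_i, e_i)}{\sum_{f} \operatorname{count}(f, e_i)},
$$

$$
\mathrm{BLEU} \;=\; BP \cdot \exp\!\left(\sum_{n=1}^{N} w_n \log p_n\right),
\qquad
BP \;=\;
\begin{cases}
1, & c > r,\\
e^{\,1 - r/c}, & c \le r,
\end{cases}
$$

where \(c\) is the candidate translation length and \(r\) the reference length (a separate use of \(r\) from the window size above).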
Acknowledgements
The manuscript has been read and approved by all authors, the requirements for authorship have been met, and each author believes that the manuscript represents honest work.
Funding
This research is a periodical achievement of the Gansu Province Philosophy and Social Science Planning Project (2021YB019).
Author information
Contributions
Liu Qiang: Writing—original draft preparation, conceptualization, supervision, project administration. Zeng Zhiliang: formal analysis, methodology. Wang Lei: software, validation.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical approval
All authors have been personally and actively involved in substantial work leading to the paper and will take public responsibility for its content.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhiliang, Z., Lei, W. & Qiang, L. A method for real-time translation of online video subtitles in sports events. SIViP 19, 146 (2025). https://doi.org/10.1007/s11760-024-03606-2