Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

11-2020

Abstract

Programming screencasts have become a pervasive resource on the Internet, favoured by many developers for learning new programming skills. For developers, the source code in screencasts is valuable and important. However, the streaming nature of screencasts limits the ways in which developers can interact with the code. Many studies apply the Optical Character Recognition (OCR) technique to convert screen images into text, which can be easily searched and indexed. However, we observe that the noise in the screen images significantly affects the quality of OCRed code. In this paper, we develop a tool named psc2code, which has two components: denoising code extraction from screencasts and enhancing programming video interaction. Experiment results on 1142 programming screencasts from YouTube show that psc2code can effectively identify frames containing valid code regions with an F1-score of 0.88 and improve the quality of OCRed code by fixing 46% of the errors. We also conduct a user study to evaluate the applicability of psc2code in enhancing video interaction, which shows that it helps participants learn the knowledge in tutorials more efficiently.

Keywords

Programming, videos, code extraction, computer vision

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

ESEC/FSE '20: Proceedings of the 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering: 8-13 November, online

First Page

1581

Last Page

1585

ISBN

9781450370431

Identifier

10.1145/3368089.3417925

Publisher

ACM

City or Country

New York

Copyright Owner and License

Publisher

Additional URL

https://doi.org/10.1145/3368089.3417925
