Research article
DOI: 10.1145/2971648.2971743

ROC comment: automated descriptive and subjective captioning of behavioral videos

Published: 12 September 2016

ABSTRACT

We present an automated interface, ROC Comment, for generating natural language comments on behavioral videos. We focus on the domain of public speaking, which many people consider their greatest fear. We collect a dataset of 196 public speaking videos from 49 individuals and gather 12,173 comments, generated by more than 500 independent human judges. We then train a k-Nearest-Neighbor (k-NN) based model by extracting prosodic (e.g., volume) and facial (e.g., smiles) features. Given a new video, we extract its features and select the closest comments using the k-NN model. We further filter the comments by clustering them with DBSCAN and eliminating the outliers. An evaluation of our system with 30 participants concludes that while the generated comments are helpful, there is room for improvement in further personalizing them. Our model has been deployed online, allowing individuals to upload their videos and receive open-ended and interpretative comments. Our system is available at http://tinyurl.com/roccomment.
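For readers who want a concrete picture of the retrieve-and-filter pipeline the abstract describes, the sketch below shows one possible implementation with scikit-learn. The function name, the TF-IDF embedding of comments, and all parameter values (k, eps, min_samples) are illustrative assumptions rather than the authors' implementation; only the overall structure (k-NN over prosodic/facial feature vectors, then DBSCAN to drop outlier comments) follows the paper.

    # Minimal sketch of the retrieve-and-filter pipeline from the abstract.
    # Feature names, the TF-IDF comment embedding, and all parameters are
    # illustrative assumptions, not the authors' implementation.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors
    from sklearn.cluster import DBSCAN
    from sklearn.feature_extraction.text import TfidfVectorizer

    def suggest_comments(train_features, train_comments, query_features,
                         k=10, eps=0.7, min_samples=2):
        """Return comments attached to the k nearest training videos,
        keeping only those that fall inside a DBSCAN cluster.

        train_features : (n_videos, n_features) array of prosodic/facial features
        train_comments : list where train_comments[i] is the list of crowd
                         comments for video i
        query_features : (n_features,) array for the new video
        """
        # 1. k-NN over prosodic/facial feature vectors (e.g., volume, smiles).
        knn = NearestNeighbors(n_neighbors=k).fit(train_features)
        _, idx = knn.kneighbors(query_features.reshape(1, -1))
        candidates = [c for i in idx[0] for c in train_comments[i]]

        # 2. Embed the candidate comments and cluster them; DBSCAN labels
        #    noise points as -1, which are treated as outliers and dropped.
        vecs = TfidfVectorizer().fit_transform(candidates).toarray()
        labels = DBSCAN(eps=eps, min_samples=min_samples,
                        metric="cosine").fit_predict(vecs)
        return [c for c, lab in zip(candidates, labels) if lab != -1]

In this sketch, comments whose wording is unlike any cluster of agreeing comments are filtered out, which mirrors the outlier-elimination step described above.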


    • Published in

      UbiComp '16: Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing
      September 2016, 1288 pages
      ISBN: 9781450344616
      DOI: 10.1145/2971648
      Copyright © 2016 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States



      Acceptance Rates

      UbiComp '16 paper acceptance rate: 101 of 389 submissions (26%). Overall acceptance rate: 764 of 2,912 submissions (26%).
