Mining Videos from the Web for Electronic Textbooks

Agrawal, Rakesh; Christoforaki, Maria; Gollapudi, Sreenivas; Kannan, Anitha; Kenthapadi, Krishnaram; Swaminathan, Adith

doi:10.1007/978-3-319-07248-7_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8478)

Included in the following conference series:

International Conference on Formal Concept Analysis

Abstract

We propose a system for mining videos from the web for supplementing the content of electronic textbooks in order to enhance their utility. Textbooks are generally organized into sections such that each section explains very few concepts and every concept is primarily explained in one section. Building upon these principles from the education literature and drawing upon the theory of Formal Concept Analysis, we define the focus of a section in terms of a few indicia, which themselves are combinations of concept phrases uniquely present in the section. We identify videos relevant for a section by ensuring that at least one of the indicia for the section is present in the video and measuring the extent to which the video contains the concept phrases occurring in different indicia for the section. Our user study employing two corpora of textbooks on different subjects from two countries demonstrates that our system is able to find useful videos relevant to individual sections.
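The idea of indicia and video matching described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the actual Formal Concept Analysis construction, so the sketch simply assumes that an indicium is a minimal combination of concept phrases that occurs together in one section and in no other, and that relevance is the fraction of indicia phrases found in a video. The names `find_indicia`, `score_video`, and `max_size`, and the toy phrase sets, are hypothetical.

```python
from itertools import combinations


def find_indicia(sections, target, max_size=2):
    """Return minimal phrase combinations present in `target` and in no other section.

    Assumption: `sections` maps a section name to its set of concept phrases.
    """
    phrases = sorted(sections[target])
    others = [p for name, p in sections.items() if name != target]
    indicia = []
    for size in range(1, max_size + 1):
        for combo in combinations(phrases, size):
            candidate = frozenset(combo)
            # Skip supersets of an indicium already found, so indicia stay minimal.
            if any(found <= candidate for found in indicia):
                continue
            # The combination is unique if no other section contains all its phrases.
            if not any(candidate <= other for other in others):
                indicia.append(candidate)
    return indicia


def score_video(indicia, video_phrases):
    """Return 0.0 unless some indicium is fully present; otherwise phrase coverage."""
    if not any(indicium <= video_phrases for indicium in indicia):
        return 0.0
    focus = set().union(*indicia)  # all phrases occurring in any indicium
    return len(focus & video_phrases) / len(focus)


# Toy data, invented for illustration only.
sections = {
    "photosynthesis": {"chlorophyll", "light reaction", "glucose", "carbon dioxide"},
    "respiration": {"glucose", "carbon dioxide", "mitochondria", "atp"},
    "transport": {"xylem", "phloem", "atp"},
}
indicia = find_indicia(sections, "respiration")
print(score_video(indicia, {"atp", "glucose", "enzyme"}))   # 0.5
print(score_video(indicia, {"glucose", "carbon dioxide"}))  # 0.0
```

In this toy example, "mitochondria" alone identifies the respiration section, while "atp" needs to be paired with "glucose" or "carbon dioxide" to do so. The second video is rejected even though it shares two phrases with the section, because those phrases also occur together in the photosynthesis section and so do not form an indicium.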

Author information

Authors and Affiliations

Microsoft Research, USA
Rakesh Agrawal, Sreenivas Gollapudi, Anitha Kannan & Krishnaram Kenthapadi
Polytechnic Institute of New York University, USA
Maria Christoforaki
Cornell University, USA
Adith Swaminathan

Editor information

Editors and Affiliations

Technische Universität Dresden, 01062, Dresden, Germany
Cynthia Vera Glodeanu
INSA-Lyon, CNRS, LIRIS UMR 5205, 9621, 69621, Lyon, France
Mehdi Kaytoue
Babes-Bolyai University, 400084, Cluj, Romania
Christian Sacarea

About this paper

Cite this paper

Agrawal, R., Christoforaki, M., Gollapudi, S., Kannan, A., Kenthapadi, K., Swaminathan, A. (2014). Mining Videos from the Web for Electronic Textbooks. In: Glodeanu, C.V., Kaytoue, M., Sacarea, C. (eds) Formal Concept Analysis. ICFCA 2014. Lecture Notes in Computer Science, vol 8478. Springer, Cham. https://doi.org/10.1007/978-3-319-07248-7_16

DOI: https://doi.org/10.1007/978-3-319-07248-7_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07247-0
Online ISBN: 978-3-319-07248-7
eBook Packages: Computer Science, Computer Science (R0)
