Abstract
We propose a system that mines videos from the web to supplement the content of electronic textbooks and enhance their utility. Textbooks are generally organized into sections such that each section explains only a few concepts and each concept is primarily explained in one section. Building on these principles from the education literature and drawing on the theory of Formal Concept Analysis, we define the focus of a section in terms of a few indicia, each of which is a combination of concept phrases uniquely present in that section. We identify videos relevant to a section by requiring that at least one of the section's indicia be present in the video and by measuring the extent to which the video contains the concept phrases occurring in the section's different indicia. Our user study, which employs two corpora of textbooks on different subjects from two countries, demonstrates that our system is able to find useful videos relevant to individual sections.
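The selection rule described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function names `find_indicia` and `score_video`, the `max_size` cap, the reading of "uniquely present" as "contained in no other section", and the coverage score (share of indicium phrases found in the video) are all assumptions made for illustration, and the toy sections are invented.

```python
from itertools import combinations


def find_indicia(sections, target, max_size=2):
    """Return minimal phrase combinations found in `target` and in no other section.

    Assumption: "uniquely present" means no other section contains every
    phrase of the combination. `max_size` is a hypothetical cap on its size.
    """
    others = [phrases for name, phrases in sections.items() if name != target]
    indicia = []
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(sections[target]), size):
            candidate = frozenset(combo)
            # Keep indicia minimal: skip supersets of one already found.
            if any(found <= candidate for found in indicia):
                continue
            # Unique to the target if no other section contains all its phrases.
            if not any(candidate <= other for other in others):
                indicia.append(candidate)
    return indicia


def score_video(video_phrases, indicia):
    """Return 0.0 unless some indicium is fully present in the video;
    otherwise the share of all indicium phrases the video contains
    (an assumed scoring rule, not taken from the paper)."""
    if not any(indicium <= video_phrases for indicium in indicia):
        return 0.0
    vocabulary = set().union(*indicia)
    return len(vocabulary & video_phrases) / len(vocabulary)


# Toy data (invented): no single phrase is unique to "motion",
# so its indicia are pairs of phrases.
sections = {
    "motion": {"velocity", "time", "force"},
    "work": {"force", "energy", "distance"},
    "speed": {"velocity", "distance", "time"},
}
motion_indicia = find_indicia(sections, "motion")
print(motion_indicia)
print(score_video({"force", "time", "energy"}, motion_indicia))
print(score_video({"velocity", "time"}, motion_indicia))
```

In this toy case the second video mentions two phrases from the section but contains no complete indicium, so it is rejected outright. That is the gating behaviour the abstract describes, as opposed to ranking by phrase overlap alone.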
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Agrawal, R., Christoforaki, M., Gollapudi, S., Kannan, A., Kenthapadi, K., Swaminathan, A. (2014). Mining Videos from the Web for Electronic Textbooks. In: Glodeanu, C.V., Kaytoue, M., Sacarea, C. (eds) Formal Concept Analysis. ICFCA 2014. Lecture Notes in Computer Science, vol 8478. Springer, Cham. https://doi.org/10.1007/978-3-319-07248-7_16
DOI: https://doi.org/10.1007/978-3-319-07248-7_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07247-0
Online ISBN: 978-3-319-07248-7
eBook Packages: Computer Science (R0)