
Mining Videos from the Web for Electronic Textbooks

  • Conference paper
Formal Concept Analysis (ICFCA 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8478)

Abstract

We propose a system for mining videos from the web for supplementing the content of electronic textbooks in order to enhance their utility. Textbooks are generally organized into sections such that each section explains very few concepts and every concept is primarily explained in one section. Building upon these principles from the education literature and drawing upon the theory of Formal Concept Analysis, we define the focus of a section in terms of a few indicia, which themselves are combinations of concept phrases uniquely present in the section. We identify videos relevant for a section by ensuring that at least one of the indicia for the section is present in the video and measuring the extent to which the video contains the concept phrases occurring in different indicia for the section. Our user study, employing two corpora of textbooks on different subjects from two countries, demonstrates that our system is able to find useful videos relevant to individual sections.
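The abstract gives only an outline of the method, so the following Python sketch is an illustration of that outline rather than the authors' algorithm. It assumes each section and each video has already been reduced to a set of concept phrases, treats an indicium as a minimal combination of phrases that no other section contains in full, and scores a video by the fraction of indicium phrases it covers. The function names, the `max_size` cap, and the toy data are all hypothetical.

```python
from itertools import combinations


def find_indicia(sections, target, max_size=2):
    """Return minimal phrase combinations found together only in `target`.

    Assumption: `sections` maps a section name to its set of concept phrases.
    """
    phrases = sorted(sections[target])
    others = [p for name, p in sections.items() if name != target]
    indicia = []
    for size in range(1, max_size + 1):
        for combo in combinations(phrases, size):
            combo_set = frozenset(combo)
            # Keep only minimal combinations: skip supersets of a found indicium.
            if any(found <= combo_set for found in indicia):
                continue
            # Unique to the target if no other section contains the whole combination.
            if not any(combo_set <= other for other in others):
                indicia.append(combo_set)
    return indicia


def score_video(video_phrases, indicia):
    """Return 0.0 unless some indicium is fully present, else phrase coverage."""
    if not any(ind <= video_phrases for ind in indicia):
        return 0.0
    all_phrases = set().union(*indicia)
    return len(all_phrases & video_phrases) / len(all_phrases)


# Hypothetical toy data, not taken from the paper's corpora.
sections = {
    "photosynthesis": {"chlorophyll", "glucose", "light reaction"},
    "respiration": {"glucose", "mitochondria", "atp"},
    "cell structure": {"mitochondria", "chlorophyll", "nucleus"},
}
indicia = find_indicia(sections, "photosynthesis")
relevant = score_video({"chlorophyll", "glucose", "sunlight"}, indicia)
irrelevant = score_video({"glucose", "atp"}, indicia)
```

In this toy example "light reaction" is unique on its own, while "chlorophyll" and "glucose" are unique only as a pair. The first video contains that pair and so passes the presence check; the second contains no complete indicium and scores zero.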




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Agrawal, R., Christoforaki, M., Gollapudi, S., Kannan, A., Kenthapadi, K., Swaminathan, A. (2014). Mining Videos from the Web for Electronic Textbooks. In: Glodeanu, C.V., Kaytoue, M., Sacarea, C. (eds) Formal Concept Analysis. ICFCA 2014. Lecture Notes in Computer Science, vol 8478. Springer, Cham. https://doi.org/10.1007/978-3-319-07248-7_16

  • DOI: https://doi.org/10.1007/978-3-319-07248-7_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07247-0

  • Online ISBN: 978-3-319-07248-7

  • eBook Packages: Computer Science, Computer Science (R0)
