Matrices, Compression, Learning Curves: Formulation, and the GroupNteach Algorithms

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9652)

Abstract

Suppose you are a teacher who has to convey a set of object-property pairs (e.g., ‘lions eat meat’). A good teacher conveys a lot of information while demanding little effort from the student. What is the best and most intuitive way to convey this information without overwhelming the student? A related, harder problem is: how can we assign a numerical score to each lesson plan (i.e., each way of conveying the information)? Here, we give a formal definition of this problem of forming learning units, and we provide an information-theoretic metric for comparing different approaches. We also design an algorithm, groupNteach, for this problem. groupNteach is scalable (near-linear in the dataset size); it is effective, achieving excellent results on real data with respect to both our proposed metric and encoding length; and it is intuitive, conforming to well-known educational principles. Experiments on real and synthetic datasets demonstrate the effectiveness of groupNteach.

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant No. IIS-1247489.

Research was sponsored by the Army Research Laboratory and was accomplished under Cooperative Agreement Number W911NF-09-2-0053.

This work is also partially supported by an IBM Faculty Award and a Google Focused Research Award. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or other funding parties. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

Author information

Corresponding author

Correspondence to Bryan Hooi.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Hooi, B., Song, H.A., Papalexakis, E., Agrawal, R., Faloutsos, C. (2016). Matrices, Compression, Learning Curves: Formulation, and the GroupNteach Algorithms. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9652. Springer, Cham. https://doi.org/10.1007/978-3-319-31750-2_30

  • DOI: https://doi.org/10.1007/978-3-319-31750-2_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31749-6

  • Online ISBN: 978-3-319-31750-2

  • eBook Packages: Computer Science, Computer Science (R0)
