DOI: 10.1145/2682571.2797062
research-article

Concept Hierarchy Extraction from Textbooks

Published: 08 September 2015

Abstract

Concept hierarchies have been useful tools for presenting and organizing knowledge. With the rapid growth in the number of online knowledge resources, automatic concept hierarchy extraction is increasingly attractive. Here, we focus on concept extraction from textbooks based on the knowledge in Wikipedia. Given a book, we extract the important concepts in each chapter using Wikipedia as a resource and from these construct a concept hierarchy for the book. We define local and global features that capture both the local relatedness and the global coherence embedded in the textbook. To evaluate the proposed features and the extracted concept hierarchies, we manually construct concept hierarchies for three widely used textbooks by labeling the important concepts in each chapter. Experiments show that our proposed local and global features achieve better performance than using only keyphrases to construct the concept hierarchies. Moreover, we observe that incorporating global features improves concept ranking precision and reaffirms the global coherence of the book.
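
The abstract does not define the local and global features, so the following is only a toy sketch of the general idea, not the authors' method. It assumes a local score given by bag-of-words cosine similarity between a chapter and a short description of a candidate concept, and a global score given by how much better the concept matches this chapter than the other chapters of the book. The function names, the `alpha` weight, the `top_k` cutoff, and the sample chapter and concept texts are all hypothetical; in practice the descriptions would come from Wikipedia articles.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a crude stand-in for real text features."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_hierarchy(book_title, chapters, candidates, alpha=0.5, top_k=2):
    """Rank candidate concepts per chapter and return book -> chapter -> concepts.

    Assumed scoring (hypothetical, for illustration only):
      local  = similarity between the chapter and the concept description
      global = local minus the mean similarity to all other chapters
    """
    chapter_vecs = {t: bag_of_words(x) for t, x in chapters.items()}
    concept_vecs = {c: bag_of_words(d) for c, d in candidates.items()}
    hierarchy = {book_title: {}}
    for title, vec in chapter_vecs.items():
        scored = []
        for concept, cvec in concept_vecs.items():
            local = cosine(vec, cvec)
            others = [cosine(v, cvec) for t, v in chapter_vecs.items() if t != title]
            global_ = local - (sum(others) / len(others) if others else 0.0)
            scored.append((alpha * local + (1 - alpha) * global_, concept))
        scored.sort(reverse=True)  # highest combined score first
        hierarchy[book_title][title] = [c for _, c in scored[:top_k]]
    return hierarchy


# Hypothetical two-chapter book and three candidate concepts.
chapters = {
    "Sorting": "sorting algorithms such as quicksort and merge sort order a list of keys",
    "Graphs": "graph algorithms traverse vertices and edges using breadth first search",
}
candidates = {
    "Quicksort": "quicksort is a sorting algorithm that partitions a list",
    "Breadth-first search": "breadth first search explores vertices of a graph level by level",
    "Algorithm": "an algorithm is a finite sequence of instructions",
}
hierarchy = build_hierarchy("Book", chapters, candidates)
```

The resulting nested dictionary places each chapter under the book and the top-ranked concepts under each chapter. Subtracting the similarity to other chapters is one simple way to favor concepts that are specific to a chapter over ones that fit the whole book equally well.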


Published In

DocEng '15: Proceedings of the 2015 ACM Symposium on Document Engineering
September 2015
248 pages
ISBN: 9781450333078
DOI: 10.1145/2682571
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. concept hierarchy
  2. open education
  3. textbooks
  4. web knowledge

Conference

DocEng '15: ACM Symposium on Document Engineering 2015
September 8 - 11, 2015
Lausanne, Switzerland

Acceptance Rates

DocEng '15 Paper Acceptance Rate 11 of 31 submissions, 35%;
Overall Acceptance Rate 194 of 564 submissions, 34%

Cited By

  • (2025) Intelligent Textbooks. International Journal of Artificial Intelligence in Education. DOI: 10.1007/s40593-024-00451-9. Online publication date: 11-Feb-2025
  • (2023) A Programming Language Learning Service by Linking Stack Overflow with Textbooks. 2023 IEEE International Conference on Web Services (ICWS), pages 234-245. DOI: 10.1109/ICWS60048.2023.00043. Online publication date: Jul-2023
  • (2023) Annotation Protocol for Textbook Enrichment with Prerequisite Knowledge Graph. Technology, Knowledge and Learning, 29:1, pages 197-228. DOI: 10.1007/s10758-023-09682-6. Online publication date: 21-Sep-2023
  • (2022) MEduKG: A Deep-Learning-Based Approach for Multi-Modal Educational Knowledge Graph Construction. Information, 13:2, article 91. DOI: 10.3390/info13020091. Online publication date: 15-Feb-2022
  • (2022) An Optimization Approach to Automatic Construction of Browsable Concept Index for Organizing Online Educational Content. 2022 IEEE International Conference on Knowledge Graph (ICKG), pages 22-31. DOI: 10.1109/ICKG55886.2022.00011. Online publication date: Nov-2022
  • (2022) Learning-Relevant Concept Extraction By Utilizing Automatically Generated Textbook Corpora. 2022 International Conference on Advanced Learning Technologies (ICALT), pages 379-383. DOI: 10.1109/ICALT55010.2022.00117. Online publication date: Jul-2022
  • (2022) PREP: Prerequisite Relationship Extraction Using Position-Biased Burst Analysis. Pattern Recognition and Data Analysis with Applications, pages 217-228. DOI: 10.1007/978-981-19-1520-8_17. Online publication date: 2-Sep-2022
  • (2022) Recent Advances in Intelligent Textbooks for Better Learning. AI in Learning: Designing the Future, pages 247-261. DOI: 10.1007/978-3-031-09687-7_15. Online publication date: 27-Nov-2022
  • (2021) A Knowledge Image Construction Method for Effective Information Filtering and Mining From Education Big Data. IEEE Access, 9, pages 77341-77348. DOI: 10.1109/ACCESS.2021.3074383. Online publication date: 2021
  • (2021) Knowledge models from PDF textbooks. New Review of Hypermedia and Multimedia, pages 1-49. DOI: 10.1080/13614568.2021.1889692. Online publication date: 28-Feb-2021
