A graph-based approach to commonsense concept extraction and semantic similarity detection

Authors: Dheeraj Rajagopal, Kenneth Kwok

WWW '13 Companion: Proceedings of the 22nd International Conference on World Wide Web

Pages 565 - 570

https://doi.org/10.1145/2487788.2487995

Published: 13 May 2013

Abstract

Commonsense knowledge representation and reasoning support a wide variety of potential applications in fields such as document auto-categorization, Web search enhancement, topic gisting, social process modeling, and concept-level opinion and sentiment analysis. Solutions to these problems, however, demand robust knowledge bases capable of supporting flexible, nuanced reasoning. Populating such knowledge bases is highly time-consuming, making it necessary to develop techniques for deconstructing natural language texts into commonsense concepts. In this work, we propose an approach for effective multi-word commonsense expression extraction from unrestricted English text, in addition to a semantic similarity detection technique allowing additional matches to be found for specific concepts not already present in knowledge bases.

Index Terms

A graph-based approach to commonsense concept extraction and semantic similarity detection
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
2. Information systems
  1. Information retrieval
    1. Document representation
      1. Content analysis and feature selection

Published In

WWW '13 Companion: Proceedings of the 22nd International Conference on World Wide Web

May 2013

1636 pages

ISBN:9781450320382

DOI:10.1145/2487788

General Chairs: Daniel Schwabe (PUC-Rio, Brazil), Virgílio Almeida (UFMG, Brazil), Hartmut Glaser (CGI.br, Brazil)

Program Chairs: Ricardo Baeza-Yates (Yahoo! Labs, Spain & Chile), Sue Moon (KAIST, South Korea)
Copyright © 2013 is held by the International World Wide Web Conference Committee (IW3C2).

Sponsors

NICBR: Núcleo de Informação e Coordenação do Ponto BR
CGIBR: Comitê Gestor da Internet no Brasil

In-Cooperation

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

WWW '13

WWW '13: 22nd International World Wide Web Conference

May 13 - 17, 2013

Rio de Janeiro, Brazil

Acceptance Rates

WWW '13 Companion Paper Acceptance Rate 831 of 1,250 submissions, 66%;

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
