Mining Hidden Knowledge from the Counterterrorism Dataset Using Graph-Based Approach

Jha, Kishlay; Jin, Wei

doi:10.1007/978-3-319-41754-7_29

Conference paper
First Online: 17 June 2016

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9612)

Abstract

Information overload is now a fact of life. This enormous stock of information holds great potential for discovering previously uncharted knowledge. In this paper, we propose a graph-based approach, integrated with a statistical correlation measure, to discover latent but valuable information buried in large corpora. Given two concepts, \(C_i\) and \(C_j\) (e.g. bush and bin ladin), we find the best set of intermediate concepts linking them by gleaning evidence across multiple documents. We perform query enrichment on the input concepts using the Longest Common Substring (LCSubstr) algorithm to enhance the level of granularity. Moreover, we use the Kulczynski correlation measure to determine the strength of interdependence between concepts and to demote associations with relatively weak statistical significance. Finally, we present users with ranked paths, along with sentence-level evidence, to facilitate better interpretation of the underlying context. A counterterrorism dataset is used to demonstrate the effectiveness and applicability of our technique.
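
The two building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how LCSubstr drives query enrichment or how concept occurrences are counted, so the function names, the document-ID sets, and the toy values below are all assumptions made for illustration. The Kulczynski measure is used in its standard form, the average of the two conditional probabilities P(A|B) and P(B|A), estimated here from document co-occurrence.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous substring shared by a and b.

    Standard dynamic-programming LCSubstr, keeping only two rows of the table.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def kulczynski(docs_a: set, docs_b: set) -> float:
    """Kulczynski measure: 0.5 * (P(A|B) + P(B|A)).

    docs_a and docs_b are the sets of document IDs mentioning each concept
    (an assumed representation; the abstract does not specify the unit).
    """
    if not docs_a or not docs_b:
        return 0.0
    both = len(docs_a & docs_b)
    return 0.5 * (both / len(docs_a) + both / len(docs_b))


# Hypothetical toy data: document IDs in which each concept appears.
docs_with_a = {1, 2, 3, 4}
docs_with_b = {3, 4}
score = kulczynski(docs_with_a, docs_with_b)  # 0.5 * (2/4 + 2/2)

# Hypothetical enrichment step: a query term matched against a longer variant.
shared = longest_common_substring("bin ladin", "usama bin ladin")
```

A score near 1 means the two concepts almost always appear together, while a score near 0 marks a weak association that a pipeline like the one described would demote. Because the measure averages two conditional probabilities, it does not depend on the number of documents that mention neither concept.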

Acknowledgments

This work was supported by National Science Foundation grant IIS-1452898.

Author information

Authors and Affiliations

North Dakota State University, Fargo, ND, USA
Kishlay Jha & Wei Jin

Corresponding author

Correspondence to Kishlay Jha.

Editor information

Editors and Affiliations

Conservatoire National des Arts et Métiers, Paris, France
Elisabeth Métais
University of Salford, Salford, United Kingdom
Farid Meziane
University of Salford, Salford, United Kingdom
Mohamad Saraee
Oakland University, Rochester, Michigan, USA
Vijayan Sugumaran
University of Salford, Salford, United Kingdom
Sunil Vadera

About this paper

Cite this paper

Jha, K., Jin, W. (2016). Mining Hidden Knowledge from the Counterterrorism Dataset Using Graph-Based Approach. In: Métais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (eds) Natural Language Processing and Information Systems. NLDB 2016. Lecture Notes in Computer Science, vol 9612. Springer, Cham. https://doi.org/10.1007/978-3-319-41754-7_29

DOI: https://doi.org/10.1007/978-3-319-41754-7_29
Published: 17 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41753-0
Online ISBN: 978-3-319-41754-7
eBook Packages: Computer Science, Computer Science (R0)
