Abstract
Social network analysis and link diagrams are popular tools among intelligence analysts for analyzing and understanding criminal and terrorist organizations. A bottleneck in the use of such techniques is the manual effort needed to create the network from the available source information. We describe how text mining techniques can be used to extract named entities and the relations among them, enabling automatic construction of networks from unstructured text. Since the text mining techniques used, viz. algorithms for named entity recognition and relation extraction, are not perfect, we also describe a method for incorporating information about uncertainty, both when constructing the networks and when performing the social network analysis. The presented approach is applied to text documents describing terrorist activities in Indonesia.
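As a rough illustration of the idea of carrying extraction uncertainty into the network, the following minimal sketch merges extracted relations into edges with an existence probability and computes an expected degree per node. It is not the method of the chapter: the input format (triples of two entity names and a confidence in [0, 1]), the noisy-OR combination of repeated extractions, the independence assumption, and the function names `build_uncertain_network` and `expected_degree` are all assumptions made for illustration. The entity names are placeholders.

```python
from collections import defaultdict


def build_uncertain_network(extractions):
    """Merge (entity_a, entity_b, confidence) triples into edge probabilities.

    Assumption: extractions are independent, so an edge is absent only if
    every extraction supporting it is wrong (noisy-OR combination).
    """
    # Probability that all extractions seen so far for an edge are wrong.
    all_wrong = defaultdict(lambda: 1.0)
    for a, b, confidence in extractions:
        if a == b:
            continue  # ignore self-relations
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        edge = frozenset((a, b))  # undirected edge
        all_wrong[edge] *= 1.0 - confidence
    return {edge: 1.0 - p_wrong for edge, p_wrong in all_wrong.items()}


def expected_degree(edge_probs):
    """Expected number of neighbours per node, given edge probabilities."""
    degree = defaultdict(float)
    for edge, probability in edge_probs.items():
        for node in edge:
            degree[node] += probability
    return dict(degree)


# Hypothetical output of an entity and relation extraction step.
extractions = [
    ("Person A", "Person B", 0.5),
    ("Person B", "Person A", 0.5),
    ("Person B", "Organization X", 0.8),
]

edge_probs = build_uncertain_network(extractions)
degrees = expected_degree(edge_probs)
for node in sorted(degrees):
    print(node, round(degrees[node], 3))
```

Under these assumptions, two independent extractions of the same relation with confidence 0.5 each yield an edge probability of 0.75, so repeated weak evidence strengthens an edge without ever making it certain.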
Notes
- 1. A drawback of closeness centrality is that it is not well defined for networks with several disconnected components. A possible solution is to consider the inverse closeness centrality instead.
- 2. Many real-world networks are scale-free, i.e., their degree distribution (the number of edges per node) follows a power-law distribution [15].
- 3. A more complete description of the workings of the NER in NLTK can be found in [46].
- 4.
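To make the point of note 1 concrete, here is a small sketch of a closeness-style measure that stays well defined on disconnected networks. It sums inverse shortest-path distances (often called harmonic centrality), so unreachable nodes simply contribute zero. Whether this matches the exact definition of inverse closeness centrality intended in the note is an assumption; the adjacency-dictionary input format and the function names are likewise hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    return distance


def harmonic_centrality(adjacency):
    """Sum of inverse distances to all other nodes, normalised by n - 1.

    Unreachable nodes are absent from the BFS result and therefore
    contribute 0, so disconnected components cause no division by zero
    or infinite distances.
    """
    n = len(adjacency)
    scores = {}
    for node in adjacency:
        distance = bfs_distances(adjacency, node)
        total = sum(1.0 / d for other, d in distance.items() if other != node)
        scores[node] = total / (n - 1) if n > 1 else 0.0
    return scores


# Toy undirected network with two components: a-b-c and d-e.
adjacency = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b"},
    "d": {"e"},
    "e": {"d"},
}

scores = harmonic_centrality(adjacency)
for node in sorted(scores):
    print(node, scores[node])
```

In this toy network the middle node of the three-node path receives the highest score, while nodes in the smaller component still receive a finite, comparable value.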
References
Raab J, Milward HB (2003) Dark networks as problems. J Public Adm Res Theory 13:413–439
Svenson P, Svensson P, Tullberg H (2006) Social network analysis and information fusion for anti-terrorism. In: Proceedings of the conference on civil and military readiness 2006
Zhu B, Watts S, Chen H (2010) Visualizing social network concepts. Decis Support Syst 49:151–161
Geffre JL, Deckro RF, Knighton SA (2009) Determining critical members of layered operational terrorist networks. J Defense Model Simul, Appl Methodol Technol 6:97–109
Hougham V (2005) Sociological skills used in the capture of Saddam Hussein. http://www.asanet.org/footnotes/julyaugust05/fn3.html
Koelle D, Pfautz J, Farry M, Cox Z, Catto G, Campolongo J (2006) Applications of Bayesian belief networks in social network analysis. In: Proceedings of the 4th Bayesian modeling applications workshop during the 22nd annual conference on uncertainty in artificial intelligence
Fellegi IP, Sunter AB (1969) A theory for record linkage. J Am Stat Assoc 64(328):1183–1210
Dahlin J (2011) Entity matching. Swedish Defence Research Agency, Tech Rep
Frantz TL, Cataldo M, Carley KM (2009) Robustness of centrality measures under uncertainty: examining the role of network topology. Comput Math Organ Theory 303–328
Freeman LC (1979) Centrality in social networks: conceptual clarification. Soc Netw 1(3):215–239
Scott J (2000) Social network analysis, 2nd edn. Sage, Thousand Oaks
Newman MEJ (2001) Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 64:016132
Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge University Press, Cambridge
de Nooy W, Mrvar A, Batagelj V (2005) Exploratory social network analysis with Pajek. Structural analysis in the social sciences. Cambridge University Press, Cambridge
Barabási A-L, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for exploring and manipulating networks. In: Adar E, Hurst M, Finin T, Glance NS, Nicolov N, Tseng BL (eds) Proceedings of the 3rd international AAAI conference on weblogs and social media
Batagelj V, Mrvar A (2002) Pajek—analysis and visualization of large networks. In: Mutzel P, Jünger M, Leipert S (eds) Graph drawing. Lecture Notes in Computer Science, vol 2265. Springer, Berlin, pp 8–11
Blondel V, Guillaume J, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech, Theory Exp P10008
Adar E, Ré C (2007) Managing uncertainty in social networks. IEEE Data Eng Bull 30(2):23–31
Kossinets G (2006) Effects of missing data in social networks. Soc Netw 28:247–268
Costenbader E, Valente TW (2003) The stability of centrality measures when networks are sampled. Soc Netw 25:283–307
Borgatti SP, Carley KM, Krackhardt D (2006) On the robustness of centrality measures under conditions of imperfect data. Soc Netw 28(2):124–136
Svenson P (2008) Social network analysis of uncertain networks. In: Proceedings of the 2nd Skövde workshop on information fusion topics
Dahlin J, Svenson P (2011) A method for community detection in uncertain networks. In: Proceedings of the European intelligence and security informatics conference, EISIC 2011
Yager RR (2008) Intelligent social network analysis using granular computing. Int J Intell Syst 23:1196–1219
Dahlin J (2011) Community detection in imperfect networks. Master’s thesis, Umeå University
Opsahl T, Agneessens F, Skvoretz J (2010) Node centrality in weighted networks: generalizing degree and shortest paths. Soc Netw 32(3):245–251
Newman MEJ (2004) Analysis of weighted networks. Phys Rev E 70:056131
Barrat A, Barthelemy M, Pastor-Satorras R, Vespignani A (2004) The architecture of complex weighted networks. Proc Natl Acad Sci (PNAS) 101:3747
Brandes U (2001) A faster algorithm for betweenness centrality. J Math Sociol 25:163–177
Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci (PNAS) 99(12):7821–7826
Newman MEJ (2006) Modularity and community structure in networks. Proc Natl Acad Sci USA 103(23):8577–8582
Feldman R, Sanger J (2007) The text mining handbook—advanced approaches in analyzing unstructured data. Cambridge University Press, Cambridge
Nadeau D, Sekine S (2007) A survey of named entity recognition and classification. Linguist Investig 30(1):3–26
Hasegawa T, Sekine S, Grishman R (2004) Discovering relations among named entities from large corpora. In: Proceedings of the 42nd annual meeting on association for computational linguistics
Doddington G, Mitchell A, Przybocki M, Ramshaw L, Strassel S, Weischedel R (2004) The automatic content extraction (ACE) program: tasks, data, and evaluation. In: Proceedings of LREC’04
Banko M, Etzioni O (2008) The tradeoffs between open and traditional relation extraction. In: Proceedings of ACL-08: HLT, pp 28–36
Zelenko D, Aone C, Richardella A (2003) Kernel methods for relation extraction. J Mach Learn Res 3:1083–1106
Mesquita F, Merhav Y, Barbosa D (2010) Extracting information networks from the blogosphere: state-of-the-art and challenges. In: Proceedings of the fourth international conference on weblogs and social media
Banko M, Cafarella MJ, Soderland S, Broadhead M, Etzioni O (2007) Open information extraction from the web. In: Proceedings of the 20th international joint conference on artificial intelligence, pp 2670–2676
Zhu J, Nie Z, Liu X, Zhang B, Wen J-R (2009) StatSnowball: a statistical approach to extracting entity relationships. In: Proceedings of the 18th international conference on world wide web, ser. WWW ’09, pp 101–110
GuoDong Z, Jian S, Jie Z, Min Z (2005) Exploring various knowledge in relation extraction. In: Proceedings of the 43rd annual meeting on association for computational linguistics, pp 427–434
Morris JF, Anthony K, Kennedy KT, Deckro RF (2011) Extraction distractions: a comparison of social network model construction methods. In: Proceedings of the 2011 European intelligence and security informatics conference, EISIC2011
Makrehchi M, Kamel MS (2005) Building social networks from web documents: a text mining approach. In: Proceedings of the 2nd LORNET scientific conference
Elson DK, Dames N, McKeown KR (2010) Extracting social networks from literary fiction. In: Proceedings of the 48th annual meeting of the association for computational linguistics, pp 138–147
Bird S, Klein E, Loper E (2009) Natural language processing with Python: analyzing text with the natural language toolkit. O’Reilly Media
Fang Y, Chang KC-C (2011) Searching patterns for relation extraction over the Web: rediscovering the pattern-relation duality. In: Proceedings of the fourth ACM international conference on Web search and data mining, ser. WSDM ’11, pp 825–834
Shafer G (1976) A mathematical theory of evidence. Princeton University Press, Princeton
Acknowledgements
This work was supported by the R&D programme of the Swedish Armed Forces. We would like to express our thanks to the other members of the FOI Information Fusion and Data Mining group and the VIA project for fruitful discussions and valuable feedback.
Copyright information
© 2013 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Johansson, F., Svenson, P. (2013). Constructing and Analyzing Uncertain Social Networks from Unstructured Textual Data. In: Özyer, T., Erdem, Z., Rokne, J., Khoury, S. (eds) Mining Social Networks and Security Informatics. Lecture Notes in Social Networks. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6359-3_3
DOI: https://doi.org/10.1007/978-94-007-6359-3_3
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-6358-6
Online ISBN: 978-94-007-6359-3
eBook Packages: Computer Science, Computer Science (R0)