Abstract
We address the relative value of three information sources: costly interviews conducted in the field, newspaper articles that mention the areas in which the interviews took place, and keywords used to index the newspaper articles. Our research questions are (1) whether the information obtained from each source overlaps and (2) how the three information acquisition and extraction strategies can inform one another. This research project uses network text analysis as a framework for a mixed-methods approach to knowledge discovery. We show that the concepts, as well as the network structure, of information obtained from interviews may be almost completely covered by networks representing the information extracted from a large number of news articles from a wide variety of sources, while the overlap between interviews and article keywords is less straightforward. We also show how a conceptual network constructed from a small number of interviews can be used in a semantic pattern search that localizes interview topics in a larger network of news article topics. This approach thus uses newspaper articles to frame and elaborate the narratives of interviews in a larger cultural context.
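As a rough illustration of what concept and structure overlap between two sources can mean, the following minimal sketch builds a small co-occurrence network per source and measures how much of the interview network is covered by the article network. The co-occurrence rule, the function names `cooccurrence_network` and `coverage`, and the toy concept lists are assumptions made for this example only; they are not the extraction procedure or the data used in the study.

```python
from itertools import combinations


def cooccurrence_network(texts):
    """Build a simple concept network from lists of concepts.

    Nodes are concepts; an undirected edge links two concepts that
    appear in the same text. This rule is an assumption for the sketch.
    """
    nodes, edges = set(), set()
    for concepts in texts:
        unique = sorted(set(concepts))  # sorting keeps edge tuples canonical
        nodes.update(unique)
        edges.update(combinations(unique, 2))
    return nodes, edges


def coverage(small, large):
    """Fraction of the items in `small` that also occur in `large`."""
    if not small:
        return 0.0
    return len(small & large) / len(small)


# Invented toy data: each inner list holds the concepts coded in one text.
interviews = [["water", "local", "market"], ["water", "school"]]
articles = [
    ["water", "market", "school", "election"],
    ["market", "election"],
    ["water", "school"],
]

i_nodes, i_edges = cooccurrence_network(interviews)
a_nodes, a_edges = cooccurrence_network(articles)

node_cov = coverage(i_nodes, a_nodes)  # share of interview concepts found in articles
edge_cov = coverage(i_edges, a_edges)  # share of interview links found in articles
missing = sorted(i_nodes - a_nodes)    # interview concepts absent from articles

print(node_cov, edge_cov, missing)
```

Reporting node coverage and edge coverage separately reflects the distinction drawn above between shared concepts and shared network structure: two sources can mention the same concepts while connecting them differently.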





Notes
Articles contained all interview concepts except one. Interviews contained a knowledge concept, “local”, which was not found in the news articles.
Acknowledgments
This work is supported in part by the Office of Naval Research (ONR), United States Navy (ONR MURI N000140811186, ONR MMT N00014060104). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Office of Naval Research or the U.S. government.
Cite this article
Martin, M.K., Pfeffer, J. & Carley, K.M. Network text analysis of conceptual overlap in interviews, newspaper articles and keywords. Soc. Netw. Anal. Min. 3, 1165–1177 (2013). https://doi.org/10.1007/s13278-013-0129-5
DOI: https://doi.org/10.1007/s13278-013-0129-5