Towards a wider perspective in the social sciences using a network of variables based on thousands of results

Zhitomirsky-Geffet, Maayan; Bergman, Ofer; Hilel, Shir

doi:10.1007/s11192-020-03446-0

Published: 06 May 2020

Scientometrics, Volume 123, pages 1385–1406 (2020)

Abstract

This paper addresses the problem of information burying in the social sciences, where a large number of experimental findings reported across many scientific articles may be missed by scholars because these findings are not actively accumulated, organized and synthesized into a centralized information system. To tackle this problem, we present a new network-based data model and methodology for aggregating, organizing, linking and mining quantitative results published in multiple academic articles in particular sub-fields of the social sciences. The goal of the proposed methodology is to give researchers a wider perspective on the scientific results in their own fields and to let them use it in their research. To validate the proposed approach, we conducted a manual experiment with a corpus of 41 scientific articles in the field of personal information management. The experiment indicates that the constructed network-based information system can be used effectively to explore the relationships between the results of various articles and to raise new research questions and hypotheses based on results from multiple articles that tested similar variables. The proposed system can serve as a catalyst for the advancement of research in various fields of social science.
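
As a rough illustration of the idea in the abstract, the sketch below builds a small network in which nodes are variables and each edge collects the results that link two variables across articles. It is only a minimal sketch under stated assumptions: the record layout, the function names and every data value are hypothetical and are not taken from the paper or its corpus, and a plain neighbour count stands in for whatever centrality measure the authors actually used.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical result records: (article_id, independent_var, dependent_var, effect).
# All values are invented for illustration; none come from the paper's corpus.
RESULTS = [
    ("A1", "folder depth", "retrieval time", "+"),
    ("A2", "folder depth", "retrieval time", "+"),
    ("A2", "retrieval method", "retrieval time", "-"),
    ("A3", "file age", "retrieval method", "+"),
    ("A3", "file age", "retrieval time", "+"),
]


def build_network(results):
    """Map each undirected variable pair to the (article, effect) results linking it."""
    edges = defaultdict(list)
    for article, source, target, effect in results:
        if source == target:
            continue  # ignore degenerate self-relations
        edges[frozenset((source, target))].append((article, effect))
    return edges


def neighbour_sets(edges):
    """For each variable, collect the variables it has been tested against."""
    neighbours = defaultdict(set)
    for pair in edges:
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)
    return neighbours


def unlinked_pairs(neighbours):
    """Pairs never tested together that share a neighbour: candidate hypotheses."""
    return [
        (a, b)
        for a, b in combinations(sorted(neighbours), 2)
        if b not in neighbours[a] and neighbours[a] & neighbours[b]
    ]


edges = build_network(RESULTS)
neighbours = neighbour_sets(edges)
# Simple degree count per variable (an assumed stand-in for a centrality score).
centrality = {variable: len(linked) for variable, linked in neighbours.items()}
candidates = unlinked_pairs(neighbours)

if __name__ == "__main__":
    for variable, score in sorted(centrality.items(), key=lambda item: -item[1]):
        print(f"{variable}: linked to {score} other variable(s)")
    print("Untested pairs with a shared neighbour:", candidates)
```

Keeping every result on its edge, rather than a single aggregated weight, leaves room to compare or pool the findings of different articles that tested the same pair of variables.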

Notes

National Science Board (2018). Science and engineering indicators 2018. Arlington, VA: National Science Foundation. Retrieved from https://www.nsf.gov/statistics/2018/nsb20181/report/sections/academic-research-and-development/outputs-of-s-e-research-publications.
National Science Board (2016). Science and engineering indicators 2016. Arlington, VA: National Science Foundation. Retrieved from https://www.nsf.gov/statistics/2016/nsb20161/uploads/1/nsb20161.pdf.
Retrieved from http://www.w3.org/TR/rdf-sparql-query/.

Acknowledgements

We thank our research assistants Natalie Friedman and Sarah Cohen for their excellent work. This study was partly supported by Google Faculty Research Award 2014_R2_79.1.

Author information

Authors and Affiliations

Department of Information Science, Bar-Ilan University, Ramat Gan, Israel
Maayan Zhitomirsky-Geffet, Ofer Bergman & Shir Hilel

Corresponding author

Correspondence to Maayan Zhitomirsky-Geffet.

Additional information

This paper is dedicated to the memory of Judit Bar-Ilan (1958–2019), an outstanding scholar and an inimitable friend and colleague.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 16 kb)

Appendix

See Tables 1, 2, 3 and 4 and Figs. 4, 5, 6, 7, 8 and 9.

Table 1 The top-10 central variables in the corpus

Table 2 Five largest variable community groups

Table 3 Table of meta-analyses presenting the results of 5 meta-analyses based on a fixed model

Table 4 Variables currently not connected to the variable ‘retrieval method’ and their hypothesized effect on it

About this article

Cite this article

Zhitomirsky-Geffet, M., Bergman, O. & Hilel, S. Towards a wider perspective in the social sciences using a network of variables based on thousands of results. Scientometrics 123, 1385–1406 (2020). https://doi.org/10.1007/s11192-020-03446-0

Published: 06 May 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s11192-020-03446-0
