Abstract
This chapter presents a methodology and a software application to support the analysis of collaborations and collaboration content in scientific communities. High-quality terminology extraction, semantic graphs, and clustering techniques are used to identify the relevant research topics. Traditional and novel social analysis tools are then used to study the emergence of interest in certain topics and the evolution of collaborations around these themes, and to identify potential for better cooperation.
Notes
- 1.
Note that specialized domains are those that offer greater potential for application by social analysts: thematic blogs, research communities, networked companies, etc.
- 2.
- 3.
- 4.
- 5.
- 6.
The measure is described in detail in [4].
- 7.
Singular value decomposition (SVD) is used to reduce data sparseness of the similarity matrix X.
- 8.
Evaluating on data sets from different applications would add little, since k-means++ and Repeated Bisections have already been evaluated in the literature. What matters here is measuring the added value of the feature extraction methodology.
- 9.
All the authors had direct responsibilities in the research network monitoring and management tasks.
- 10.
The clustering tendency of concepts is measurable by computing the entropy of the related similarity vectors. Sparse distribution of values over the vector’s dimensions indicates low clustering tendency.
- 11.
For example, if the analysis reveals that partner X and partner Y have common research interests but do not cooperate, this can easily be verified by looking at the partners’ publications and activities in the INTEROP knowledge-base, the KMap [18].
- 12.
Clearly, if a document (a paper) is authored by members of different groups, it contributes to more than one centroid calculation.
- 13.
- 14.
As a practical example, the partners from Ancona (UNIVPM-DIIGA) and Roma (UoR) were oriented toward research on natural language processing and information retrieval, which was initially not a shared theme in the INTEROP community. During the project, a fruitful application of our techniques to interoperability problems led to a better integration of our organizations within the NoE, as well as to the emergence of NLP-related concepts among the “hot” INTEROP research themes.
- 15.
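Note 10 above mentions computing the entropy of a concept's similarity vector. The following is a minimal sketch of that computation, not the chapter's implementation: it assumes the vector is non-negative and can be normalized to sum to 1, and it uses Shannon entropy in bits. The function name `similarity_entropy` and the two toy vectors are hypothetical. How the resulting score maps onto clustering tendency is defined in the full text; the sketch only computes the quantity.

```python
import math


def similarity_entropy(vector):
    """Shannon entropy (in bits) of a non-negative similarity vector.

    Assumption of this sketch: the vector is normalized to sum to 1 so
    that it can be read as a distribution over its dimensions.
    """
    total = sum(vector)
    if total <= 0:
        raise ValueError("vector must contain at least one positive value")
    # Zero entries contribute nothing to the entropy, so they are skipped.
    probs = [v / total for v in vector if v > 0]
    return sum(p * math.log2(1.0 / p) for p in probs)


# Hypothetical similarity vectors for two concepts over four dimensions.
concentrated = [1.0, 0.0, 0.0, 0.0]  # all similarity mass on one dimension
spread = [0.25, 0.25, 0.25, 0.25]    # mass spread evenly over all dimensions

print(similarity_entropy(concentrated))
print(similarity_entropy(spread))
```

The entropy is lowest when all similarity mass falls on a single dimension and highest when it is spread evenly, so the two toy vectors mark the two extremes of the score for a four-dimensional vector.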
References
Arthur D, Vassilvitskii S (2007) k-Means++: The advantages of careful seeding. In: SODA ’07: Proceedings of the eighteenth annual ACM-SIAM symposium on discrete algorithms. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, pp 1027–1035
Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. Addison-Wesley
Blei DM, Ng AY, Jordan MI, Lafferty J (2003) Latent Dirichlet allocation. J Machine Learn Res 3
Cucchiarelli A, D’Antonio F, Navigli R, Velardi P (2009) Semantically interconnected social networks. TKDD (submitted)
Finin T, Ding L, Zhou L, Joshi A (2005) Social networking on the semantic web. Learn Organ 12(5):418–435
Joshi D, Gatica-Perez D (2006) Discovering groups of people in Google News. In: HCM ’06: Proceedings of the 1st ACM international workshop on Human-centered multimedia. ACM, New York, NY, pp 55–64
Jung JJ, Euzenat J (2007) Towards semantic social networks. In: ESWC ’07: Proceedings of the 4th European conference on the semantic web. Springer, Berlin, pp 267–280
Legány C, Juhász S, Babos A (2006) Cluster validity measurement techniques. In: AIKED’06: Proceedings of the 5th WSEAS international conference on artificial intelligence, knowledge engineering and data bases. World Scientific and Engineering Academy and Society (WSEAS), Stevens Point, WI, pp 388–393
McCallum A, Corrada-Emmanuel A, Wang X (2005) Topic and role discovery in social networks, pp 786–791
Mei Q, Cai D, Zhang D, Zhai C (2008) Topic modeling with network regularization. In: WWW ’08: Proceeding of the 17th international conference on World Wide Web. ACM, New York, NY, pp 101–110
Nallapati RM, Ahmed A, Xing EP, Cohen WW (2008) Joint latent topic models for text and citations. In: KDD ’08: Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, New York, NY, pp 542–550
Navigli R, Velardi P (2007) Glossextractor: A web application to automatically create a domain glossary. In: AI*IA ’07: Proceedings of the 10th congress of the Italian association for artificial intelligence on AI*IA 2007. Springer, Berlin, pp 339–349
Salton G, McGill MJ (1986) Introduction to modern information retrieval. McGraw-Hill, New York, NY
Sclano F, Velardi P (2007) Termextractor: A web application to learn the shared terminology of emergent web communities. In: Proceedings of the 3rd international conference on interoperability for enterprise software and applications (I-ESA 2007), Funchal, Portugal
Scott JP (2000) Social network analysis: A handbook. Sage Publications
Tagarelli A, Karypis G (2008) A segment-based approach to clustering multi-topic documents. In: Text mining workshop, SIAM datamining conference
Steinbach KT (2006) Introduction to data mining. Addison-Wesley
Velardi P, Cucchiarelli A, Petit M (2007) A taxonomy learning method and its application to characterize a scientific web community. IEEE Trans Knowl Data Eng 19(2):180–191
Wasserman S, Faust K, Iacobucci D (1994) Social network analysis: Methods and applications (structural analysis in the social sciences). Cambridge University Press
Zhao Y, Karypis G (2004) Empirical and theoretical comparisons of selected criterion functions for document clustering. Mach Learn 55(3):311–331
Zhou D, Ji X, Zha H, Lee Giles C (2006) Topic evolution and social interactions: How authors effect research. In: CIKM ’06: Proceedings of the 15th ACM international conference on Information and knowledge management. ACM, New York, NY, pp 248–257
Copyright information
© 2010 Springer-Verlag London Limited
About this chapter
Cite this chapter
Cucchiarelli, A., D’Antonio, F., Velardi, P. (2010). Analyzing Collaborations Through Content-Based Social Networks. In: Abraham, A., Hassanien, AE., Snášel, V. (eds) Computational Social Network Analysis. Computer Communications and Networks. Springer, London. https://doi.org/10.1007/978-1-84882-229-0_15
DOI: https://doi.org/10.1007/978-1-84882-229-0_15
Publisher Name: Springer, London
Print ISBN: 978-1-84882-228-3
Online ISBN: 978-1-84882-229-0
eBook Packages: Computer Science, Computer Science (R0)