
Scalable semantic analytics on social networks for addressing the problem of conflict of interest detection

Published: 03 March 2008

Abstract

In this article, we demonstrate the applicability of semantic techniques to the detection of Conflict of Interest (COI). We explain the common challenges involved in building scalable Semantic Web applications, in particular those addressing connecting-the-dots problems. We describe in detail the challenges in two important aspects of building Semantic Web applications: data acquisition and entity disambiguation (or reference reconciliation). We extend our previous work, in which we integrated the collaborative network of a subset of DBLP researchers with persons in a Friend-of-a-Friend (FOAF) social network. Our method finds the connections between people, measures collaboration strength, and includes heuristics that use friendship/affiliation information to estimate potential COI in a peer-review scenario. We evaluate the method by measuring what could have been the COI between accepted papers in various conference tracks and their respective program committee members. The experimental results on a dataset of over 3 million entities (all bibliographic data from DBLP and a large collection of FOAF documents) demonstrate that scalability can be achieved.
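
As a rough illustration of the pipeline the abstract outlines (connections between people, a collaboration-strength measure, and a COI heuristic), the following is a minimal Python sketch. It is not the article's implementation. It assumes a commonly used co-authorship weighting in which each paper with n authors adds 1/(n - 1) to every co-author pair, and the function names, the `high` threshold, and the level labels are all hypothetical choices made for this example.

```python
from collections import defaultdict
from itertools import combinations


def collaboration_strength(papers):
    """Map each co-author pair to a weight, given a list of author lists.

    Assumed weighting (not taken from the article): a paper with n >= 2
    distinct authors adds 1 / (n - 1) to every pair of its authors.
    """
    strength = defaultdict(float)
    for authors in papers:
        unique = sorted(set(authors))
        if len(unique) < 2:
            continue  # single-author papers create no connection
        share = 1.0 / (len(unique) - 1)
        for a, b in combinations(unique, 2):
            strength[frozenset((a, b))] += share
    return dict(strength)


def neighbors(person, strength):
    """Return everyone who shares at least one paper with `person`."""
    return {other
            for pair in strength if person in pair
            for other in pair if other != person}


def coi_level(reviewer, author, strength, friends, high=1.0):
    """Hypothetical heuristic; the threshold and labels are illustrative."""
    pair = frozenset((reviewer, author))
    weight = strength.get(pair, 0.0)
    if weight >= high:
        return "high"      # strong direct collaboration
    if weight > 0.0 or pair in friends:
        return "medium"    # weak collaboration or a declared friendship
    if neighbors(reviewer, strength) & neighbors(author, strength):
        return "low"       # connected only through a shared co-author
    return "none"


# Toy data; all names are made up.
papers = [["Ann", "Bob"], ["Ann", "Bob", "Cy"], ["Cy", "Dee"]]
friends = {frozenset(("Bob", "Eve"))}
strength = collaboration_strength(papers)

for reviewer, author in [("Ann", "Bob"), ("Ann", "Cy"),
                         ("Ann", "Dee"), ("Bob", "Eve"), ("Ann", "Eve")]:
    print(reviewer, author, coi_level(reviewer, author, strength, friends))
```

In a peer-review setting, such a function would be evaluated for every (program committee member, paper author) pair. The entity disambiguation step the abstract mentions would have to run beforehand so that each name refers to exactly one person.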



      Information

      Published In

      ACM Transactions on the Web  Volume 2, Issue 1
      February 2008
      280 pages
      ISSN:1559-1131
      EISSN:1559-114X
      DOI:10.1145/1326561

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 03 March 2008
      Accepted: 01 November 2007
      Revised: 01 October 2007
      Received: 01 March 2007
      Published in TWEB Volume 2, Issue 1


      Author Tags

      1. DBLP
      2. RDF
      3. Semantic Web
      4. conflict of interest
      5. data fusion
      6. entity disambiguation
      7. ontologies
      8. peer review process
      9. semantic analytics
      10. semantic associations
      11. social networks
      12. swetoDblp

      Qualifiers

      • Research-article
      • Research
      • Refereed
