DOI: 10.1145/1835449.1835690
extended-abstract

Entity information management in complex networks

Published: 19 July 2010

ABSTRACT

Entity information management (EIM) deals with organizing, processing and delivering information about entities. It has emerged in response to more sophisticated information needs that go beyond document search. In recent years, entity retrieval has attracted much attention in the IR community: INEX has run the XML Entity Ranking track since 2007, and TREC has run the Entity track since 2009 to investigate the problem of related entity finding. Some EIM problems go beyond retrieval and ranking, such as 1) entity profiling, which is about characterizing a specific entity, and 2) entity distillation, which is about discovering trends concerning an entity. These problems have received less attention even though they have many important applications.

Entities in the real world or on the Web are usually not isolated; they are connected or related to each other in one way or another. For example, coauthorship connects authors with similar research interests. The emergence of social media such as Facebook, Twitter and YouTube has further interwoven related entities on a much larger scale. Millions of users of these sites can become friends, fans or followers of others, or taggers or commenters of different types of entities (e.g., bookmarks, photos and videos). These networks are complex in the sense that they are heterogeneous, with multiple types of entities and interactions, and they are large-scale, multilingual and dynamic. These features of complex networks go beyond traditional social network analysis and require further research.

In this proposed research, I investigate entity information management in the environment of complex networks. The main research question is: how can the EIM tasks be facilitated by modeling the content and structure of complex networks? The research lies at the intersection of content-based information retrieval and complex network analysis, and deals with both unstructured text data and structured networks. The specific EIM tasks targeted are entity retrieval, entity profiling and entity distillation. In addition to the main research question, the following questions are considered: How can we accomplish an EIM task involving diverse entity and interaction types? How can we model the evolution of entity profiles as well as of the underlying complex networks? How can existing cross-language IR work be leveraged to build entity profiles with multilingual evidence?

I propose to use probabilistic models, and discriminative models in particular, to address the above research questions. In my research, I have developed discriminative models for expert search that integrate arbitrary document features [3] and learn flexible combination strategies to rank experts in heterogeneous information sources [1]. Discriminative graphical models are proposed to jointly discover homepages by inference on the homepage dependence network [2]. The dependence among table elements is exploited to perform the entity retrieval task collectively [4]. This work has shown the power of discriminative models for entity search and the benefits of utilizing the dependencies among related entities. As a next step, I would like to develop a unified probabilistic framework to investigate the research questions raised in this proposal.
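To make the idea of a discriminative model over arbitrary features concrete, the following is a minimal sketch, not the models of [1] or [3]. It assumes a plain logistic model that turns a weighted sum of per-candidate features into a relevance probability and ranks candidates by it. The feature names, weights, bias and candidate values are all hypothetical and chosen only for illustration; in practice the weights would be learned from relevance judgments.

```python
import math
from typing import Dict, List, Tuple


def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relevance_probability(
    features: Dict[str, float], weights: Dict[str, float], bias: float = 0.0
) -> float:
    """P(relevant | features) under a logistic model.

    Features without a weight contribute nothing, so new feature types
    can be added without changing the scoring code.
    """
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return sigmoid(z)


def rank_candidates(
    candidates: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
    bias: float = 0.0,
) -> List[Tuple[str, float]]:
    """Return (candidate, probability) pairs sorted by descending probability."""
    scored = [
        (name, relevance_probability(feats, weights, bias))
        for name, feats in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical weights; a real system would estimate them from training data.
WEIGHTS = {
    "query_term_match": 2.0,
    "document_candidate_association": 1.5,
    "source_is_homepage": 0.5,
}

# Hypothetical candidates with illustrative feature values.
CANDIDATES = {
    "candidate_a": {
        "query_term_match": 0.9,
        "document_candidate_association": 0.8,
        "source_is_homepage": 1.0,
    },
    "candidate_b": {
        "query_term_match": 0.4,
        "document_candidate_association": 0.3,
        "source_is_homepage": 0.0,
    },
    "candidate_c": {
        "query_term_match": 0.7,
        "document_candidate_association": 0.1,
        "source_is_homepage": 1.0,
    },
}

ranking = rank_candidates(CANDIDATES, WEIGHTS, bias=-2.0)

if __name__ == "__main__":
    for name, prob in ranking:
        print(f"{name}: {prob:.3f}")
```

Because the model conditions directly on the features rather than modeling how they are generated, evidence from heterogeneous sources can be added as extra entries in the feature dictionary.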

References

  1. Y. Fang, L. Si, and A. Mathur. Ranking experts with discriminative probabilistic models. In Proceedings of SIGIR Workshops, 2009.
  2. Y. Fang, L. Si, and A. Mathur. Discriminative graphical models for faculty homepage discovery. Information Retrieval, 2010.
  3. Y. Fang, L. Si, and A. Mathur. Discriminative models of integrating document evidence and document-candidate associations for expert search. In Proceedings of SIGIR, 2010.
  4. Y. Fang, L. Si, Z. Yu, Y. Xian, and Y. Xu. Entity retrieval by hierarchical relevance model, exploiting the structure of tables and learning homepage classifiers. In Proceedings of TREC-18, 2009.

Published in

SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
July 2010
944 pages
ISBN: 9781450301534
DOI: 10.1145/1835449

Copyright © 2010 Copyright is held by the owner/author(s)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Acceptance Rates

SIGIR '10 paper acceptance rate: 87 of 520 submissions, 17%. Overall acceptance rate: 792 of 3,983 submissions, 20%.
