A Generalized Joint Inference Approach for Citation Matching

  • Conference paper
AI 2008: Advances in Artificial Intelligence (AI 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5360)

Abstract

Citation matching is the problem of extracting bibliographic records from citation lists in technical papers and merging the records that represent the same publication. Citation-matching datasets generally fall into three types: sparse, dense, and hybrid. Two typical approaches are Joint Segmentation (Jnt-Seg) and Joint Segmentation Entity Resolution (Jnt-Seg-ER). The Jnt-Seg method is effective on sparse datasets but often produces many errors on dense ones. Conversely, the Jnt-Seg-ER method handles dense datasets well but performs poorly on sparse ones. In this paper we propose an alternative joint inference approach, Generalized Joint Segmentation (Generalized-Jnt-Seg), which can deal effectively with datasets whose type is unknown. In particular, when analyzing hybrid datasets there is often no a priori information for choosing between Jnt-Seg and Jnt-Seg-ER to perform segmentation and entity resolution, and either method may produce many errors. Our method effectively avoids segmentation errors and produces good field boundaries. Experimental results on both types of citation datasets show that our method outperforms many alternative approaches for citation matching.
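
To make the two subtasks named in the abstract concrete, the following is a minimal sketch of a plain pipeline: segment each citation string into fields, then merge records that appear to describe the same publication. It is not the Generalized-Jnt-Seg method, nor Jnt-Seg or Jnt-Seg-ER, and it performs no joint inference. The function names (`segment`, `same_publication`, `match_citations`), the assumed "Authors: Title. Venue (Year)" layout, and the 0.9 similarity threshold are all illustrative assumptions.

```python
import re
from difflib import SequenceMatcher


def segment(citation: str) -> dict:
    """Toy segmentation, assuming the layout 'Authors: Title. Venue (Year)'."""
    year_match = re.search(r"\((\d{4})\)\s*$", citation)
    year = year_match.group(1) if year_match else ""
    body = citation[:year_match.start()] if year_match else citation
    # Everything before the first colon is treated as the author field.
    authors, _, rest = body.partition(":")
    # The first '. ' after the colon is treated as the title/venue boundary.
    title, _, venue = rest.strip().partition(". ")
    return {
        "author": authors.strip(),
        "title": title.strip(),
        "venue": venue.strip(),
        "year": year,
    }


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so surface variation matters less."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()


def same_publication(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Toy entity resolution: similar titles and identical years (assumed rule)."""
    ratio = SequenceMatcher(None, normalize(a["title"]), normalize(b["title"])).ratio()
    return ratio >= threshold and a["year"] == b["year"]


def match_citations(citations: list) -> list:
    """Segment every citation, then greedily cluster matching records."""
    clusters = []
    for record in (segment(c) for c in citations):
        for cluster in clusters:
            if same_publication(record, cluster[0]):
                cluster.append(record)
                break
        else:
            clusters.append([record])
    return clusters


if __name__ == "__main__":
    sample = [
        "Richardson, M., Domingos, P.: Markov logic networks. Machine Learning 62, 107-136 (2006)",
        "M. Richardson, P. Domingos: Markov Logic Networks. Mach. Learn. 62 (2006)",
        "Fellegi, I., Sunter, A.: A theory for record linkage. J. American Statistical Association 64, 1183-1210 (1969)",
    ]
    for cluster in match_citations(sample):
        print([record["title"] for record in cluster])
```

In a pipeline like this, a segmentation mistake (for example, a title containing ". ") carries straight into the matching step, because the matching step never feeds information back to segmentation. Joint approaches such as those compared in the abstract make the two decisions together.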

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liao, Z., Zhang, Z. (2008). A Generalized Joint Inference Approach for Citation Matching. In: Wobcke, W., Zhang, M. (eds) AI 2008: Advances in Artificial Intelligence. AI 2008. Lecture Notes in Computer Science, vol 5360. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89378-3_61

  • DOI: https://doi.org/10.1007/978-3-540-89378-3_61

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89377-6

  • Online ISBN: 978-3-540-89378-3

  • eBook Packages: Computer Science, Computer Science (R0)
