DOI: 10.1145/1998076.1998161
poster

Detecting academic papers on the web

Published: 13 June 2011

Abstract

Our research goal is to develop a search engine for open access to academic papers. English and Japanese test sets for detecting academic papers were built from 20,000 PDF files in each language using five annotators. Six classifiers were trained using similar features for each language. We report F1 scores of 0.74 for English and 0.54 for Japanese and argue that similar features could easily be generated for other languages as well.

Published In

JCDL '11: Proceedings of the 11th annual international ACM/IEEE joint conference on Digital libraries
June 2011
500 pages
ISBN:9781450307444
DOI:10.1145/1998076

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. academic papers
  2. pdf
  3. search engine

Qualifiers

  • Poster

Conference

JCDL '11: Joint Conference on Digital Libraries
June 13 - 17, 2011
Ottawa, Ontario, Canada

Acceptance Rates

Overall Acceptance Rate 415 of 1,482 submissions, 28%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 157
  • Downloads (Last 12 months): 1
  • Downloads (Last 6 weeks): 0

Reflects downloads up to 20 Jan 2025
