DOI: 10.1145/1242572.1242732
Article

Automatic searching of tables in digital libraries

Published: 08 May 2007

Abstract

Tables are ubiquitous. Unfortunately, no search engine supports table search. In this paper, we propose a novel table-specific search engine, TableSeer, to facilitate table extraction, indexing, searching, and sharing. In addition, we propose an extensive set of medium-independent metadata to precisely represent tables. Given a query, TableSeer ranks the returned results using an innovative ranking algorithm, TableRank, with a tailored vector space model and a novel term weighting scheme. Experimental results show that TableSeer outperforms existing search engines on table search. In addition, incorporating multiple weighting factors can significantly improve the ranking results.
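
The abstract names a vector space model with a term weighting scheme but does not spell out how TableRank computes its scores. As a rough illustration of the general idea only, the sketch below ranks tables by cosine similarity over TF-IDF vectors, with term counts boosted according to the metadata field they come from. The field names, the boost values in `FIELD_WEIGHTS`, and the functions `term_counts` and `rank_tables` are assumptions made for this example; they are not taken from the paper and should not be read as the authors' algorithm.

```python
import math
from collections import Counter

# Hypothetical per-field boosts; the abstract does not list TableRank's
# actual weighting factors, so these values are illustrative only.
FIELD_WEIGHTS = {"caption": 2.0, "header": 1.5, "cell": 1.0}


def tokenize(text):
    """Lowercase whitespace tokenizer, kept deliberately simple."""
    return text.lower().split()


def term_counts(table):
    """Field-weighted term counts for one table (dict of field -> text)."""
    counts = Counter()
    for field, text in table.items():
        boost = FIELD_WEIGHTS.get(field, 1.0)
        for term in tokenize(text):
            counts[term] += boost
    return counts


def rank_tables(query, tables):
    """Return (score, table_index) pairs sorted by descending cosine score."""
    n = len(tables)
    per_table = [term_counts(t) for t in tables]

    # Document frequency: in how many tables each term appears.
    df = Counter()
    for counts in per_table:
        df.update(counts.keys())

    # Smoothed inverse document frequency (one common variant).
    idf = {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}

    def to_vector(counts):
        # Terms never seen in any table are dropped from the vector.
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vec = to_vector(Counter(tokenize(query)))
    scored = [(cosine(query_vec, to_vector(c)), i)
              for i, c in enumerate(per_table)]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))
```

Calling `rank_tables("binding energy", tables)` on a list of such field dictionaries returns the table indices ordered by score. Weighting by field is one plausible reading of "multiple weighting factors"; the paper itself should be consulted for the factors TableRank actually uses.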



    Published In

    WWW '07: Proceedings of the 16th international conference on World Wide Web
    May 2007
    1382 pages
    ISBN:9781595936547
    DOI:10.1145/1242572

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. table crawler
    2. table extraction
    3. table indexing
    4. table metadata
    5. table ranking
    6. table search

    Qualifiers

    • Article

    Conference

    WWW'07
    WWW'07: 16th International World Wide Web Conference
    May 8 - 12, 2007
    Banff, Alberta, Canada

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Cited By

    • (2022) Table understanding: Problem overview. WIREs Data Mining and Knowledge Discovery 13:1. DOI: 10.1002/widm.1482. Online publication date: 21-Nov-2022
    • (2011) Enabling efficient browsing and manipulation of web tables on smartphone. Proceedings of the 14th international conference on Human-computer interaction: towards mobile and intelligent interaction environments - Volume Part III, pages 117-126. DOI: 10.5555/2027296.2027311. Online publication date: 9-Jul-2011
    • (2011) Annotation based classification of the PDF document for semantic web. 2011 3rd International Conference on Electronics Computer Technology, pages 370-376. DOI: 10.1109/ICECTECH.2011.5941625. Online publication date: Apr-2011
    • (2011) Enabling Efficient Browsing and Manipulation of Web Tables on Smartphone. Human-Computer Interaction. Towards Mobile and Intelligent Interaction Environments, pages 117-126. DOI: 10.1007/978-3-642-21616-9_14. Online publication date: 2011
    • (2010) Enhancing browsing experience of table and image elements in web pages. International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction, pages 1-8. DOI: 10.1145/1891903.1891935. Online publication date: 8-Nov-2010
    • (2009) Enabling Interactive Access to Web Tables. Proceedings of the 13th International Conference on Human-Computer Interaction. Part I: New Trends, pages 760-768. DOI: 10.1007/978-3-642-02574-7_85. Online publication date: 14-Jul-2009
    • (2008) State of the art in metadata abstraction crawlers. 2008 IEEE International Conference on Industrial Technology, pages 1-6. DOI: 10.1109/ICIT.2008.4608573. Online publication date: Apr-2008
    • (2007) ChemXSeer. Proceedings of the ACM first workshop on CyberInfrastructure: information management in eScience, pages 7-10. DOI: 10.1145/1317353.1317356. Online publication date: 9-Nov-2007
