ABSTRACT
Tables are widely used for data storage and presentation due to their high flexibility in layout. The importance of tables as information carriers and the complexity of tabular data understanding have attracted a great deal of research on large-scale pre-training for tabular data. However, most existing work designs models for specific types of tables, such as relational tables and tables with well-structured headers, neglecting tables with complex layouts. In real-world scenarios, however, many tables fall outside this target scope and are therefore poorly supported. In this paper, we propose GetPt, a unified pre-training architecture for general table representation that applies even to tables with complex structures and layouts. First, we convert a table to a heterogeneous graph with multiple types of edges to represent the layout of the table. Based on the graph, a specially designed transformer jointly models the semantics and structure of the table. Second, we devise the Alternate Attention Network (AAN) to better model contextual information across multiple granularities of a table, including tokens, cells, and the table itself. To better support a wide range of downstream tasks, we further employ three pre-training objectives and pre-train the model on a large table dataset. We fine-tune and evaluate GetPt on two representative tasks: table type classification and table structure recognition. Experiments show that GetPt outperforms existing state-of-the-art methods on these tasks.
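As a rough illustration of the table-to-graph conversion sketched in the abstract (not the authors' implementation, which is not shown here), the following minimal Python sketch turns a grid of cells into a graph whose edges are typed by layout relation. The names (`Cell`, `TableGraph`, `build_table_graph`) and the particular choice of edge types (horizontal and vertical adjacency) are illustrative assumptions; the paper's heterogeneous graph may use additional edge types, e.g., for merged cells or header relations.

```python
from dataclasses import dataclass, field

@dataclass
class Cell:
    row: int
    col: int
    text: str

@dataclass
class TableGraph:
    cells: list
    # Edges are grouped by type so the layout relation is preserved;
    # the two types here are an illustrative subset.
    edges: dict = field(default_factory=lambda: {"horizontal": [], "vertical": []})

def build_table_graph(grid):
    """Build a heterogeneous graph from a 2-D grid of cell strings.

    Each cell becomes a node; typed edges connect horizontal and
    vertical neighbours so the table layout survives serialization.
    """
    cells = [Cell(r, c, text)
             for r, row in enumerate(grid)
             for c, text in enumerate(row)]
    index = {(cell.row, cell.col): i for i, cell in enumerate(cells)}
    graph = TableGraph(cells)
    for (r, c), i in index.items():
        if (r, c + 1) in index:          # neighbour in the same row
            graph.edges["horizontal"].append((i, index[(r, c + 1)]))
        if (r + 1, c) in index:          # neighbour in the same column
            graph.edges["vertical"].append((i, index[(r + 1, c)]))
    return graph

if __name__ == "__main__":
    g = build_table_graph([["Year", "Sales"], ["2021", "14k"], ["2022", "18k"]])
    print(len(g.cells), {k: len(v) for k, v in g.edges.items()})
    # -> 6 {'horizontal': 3, 'vertical': 4}
```

A graph transformer can then attend over these nodes while biasing attention by edge type, which is the general mechanism the abstract describes for jointly modeling table semantics and structure.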