DOI: 10.1145/3459637.3482197 (ACM conferences, CIKM conference proceedings)
short-paper

Tabular Data Concept Type Detection Using Star-Transformers

Published: 30 October 2021

ABSTRACT

Tabular data is an invaluable information resource for search, information extraction, and question answering about the world. To fully exploit the information in tabular data, it is critical to understand the semantic concept types of table columns. In this paper, we focus on learning-based approaches for column concept type detection that do not rely on any metadata or on queries to existing knowledge bases. We propose a model that employs both statistical and semantic features of table columns and uses Star-Transformers to gather and scatter information across the whole table, boosting performance on individual columns. We apply distant supervision to construct a tabular dataset whose columns are annotated with DBpedia classes. Our experimental results show that our model achieves 93.57 accuracy on the dataset, exceeding that of the state-of-the-art baselines.
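The gather-and-scatter idea can be illustrated with a minimal sketch. This is not the model described in the paper: it assumes each column is already encoded as a small feature vector, uses plain dot-product attention with no learned weights, and initialises a shared relay vector as the mean of the columns. The names `attend` and `gather_scatter` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Scaled dot-product attention where the keys also serve as values.

    Illustrative assumption: no learned projection matrices are used.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    return [
        sum(w * key[i] for w, key in zip(weights, keys))
        for i in range(len(query))
    ]


def gather_scatter(columns, rounds=2):
    """Exchange information between column vectors through one relay vector.

    columns: list of equal-length feature vectors, one per table column.
    Returns the updated column states and the final relay vector.
    """
    dim = len(columns[0])
    # Assumption: the relay starts as the mean of all column vectors.
    relay = [sum(col[i] for col in columns) / len(columns) for i in range(dim)]
    states = [list(col) for col in columns]
    for _ in range(rounds):
        # Scatter: each column attends over its own state, its original
        # features, and the relay, so table-level context reaches it.
        states = [attend(s, [s, c, relay]) for s, c in zip(states, columns)]
        # Gather: the relay attends over itself and every column state.
        relay = attend(relay, [relay] + states)
    return states, relay


if __name__ == "__main__":
    toy_columns = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    new_states, table_summary = gather_scatter(toy_columns)
    print(new_states)
    print(table_summary)
```

Because every update is a weighted average, each column's new state blends its own features with a summary of the other columns. That is the intuition behind using table-wide context to help classify an individual column.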
