DOI: 10.1145/3269206.3271781
research-article

Type Prediction Combining Linked Open Data and Social Media

Published: 17 October 2018

Abstract

Linked Open Data (LOD) and social media often contain representations of the same real-world entities, such as persons and organizations. These representations are increasingly interlinked, making it possible to combine LOD and social media data in prediction problems so that their relative strengths complement each other: LOD knowledge is highly structured but scarce and obsolete for some entities, whereas social media data provide real-time updates and increased coverage, albeit in mostly unstructured form. In this paper, we investigate the feasibility of using social media data to perform type prediction for entities in a LOD knowledge graph. We discuss how to gather training data for such a task, and how to build an efficient domain-independent vector representation of entities based on social media data. Our experiments on several type prediction tasks using DBpedia and Twitter data show the effectiveness of this representation, both alone and combined with knowledge graph-based features, suggesting its potential for ontology population.
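
As a rough illustration of what "combined with knowledge graph-based features" could look like in practice, the sketch below concatenates a social media vector with a knowledge graph vector for each entity and predicts a type from the result. Everything in it is an assumption made for illustration: the function names, the toy two-dimensional vectors, the type labels, and the nearest-centroid classifier, which is only a simple stand-in because the abstract does not say which classifier or feature layout the authors use.

```python
import math
from collections import defaultdict


def l2_normalize(vec):
    """Scale a vector to unit length so neither feature block dominates."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine(social_vec, kg_vec):
    """Hypothetical feature combination: normalize each block, then concatenate."""
    return l2_normalize(social_vec) + l2_normalize(kg_vec)


def train_centroids(examples):
    """Compute one mean vector per type label (nearest-centroid stand-in)."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, vec):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], vec))


# Toy training data (invented): (social media vector, KG vector) -> type label.
training = [
    (combine([1.0, 0.0], [0.9, 0.1]), "Person"),
    (combine([0.8, 0.2], [1.0, 0.0]), "Person"),
    (combine([0.0, 1.0], [0.1, 0.9]), "Organisation"),
    (combine([0.2, 0.8], [0.0, 1.0]), "Organisation"),
]
centroids = train_centroids(training)

# An untyped entity that has both kinds of features.
print(predict(centroids, combine([0.9, 0.1], [0.8, 0.2])))
```

Normalizing each block before concatenation is one simple way to keep features from two differently scaled sources comparable; a real system would likely use learned embeddings and a stronger classifier.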

Published In

CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
October 2018
2362 pages
ISBN:9781450360142
DOI:10.1145/3269206

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. linked open data
  2. machine learning
  3. ontology population
  4. semantic web
  5. social media
  6. type prediction

Qualifiers

  • Research-article

Conference

CIKM '18
Acceptance Rates

CIKM '18 Paper Acceptance Rate: 147 of 826 submissions, 18%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%

Cited By

  • (2024) Higher-Order Vision-Language Alignment for Social Media Prediction. Proceedings of the 32nd ACM International Conference on Multimedia, 11457-11463. DOI: 10.1145/3664647.3688999. Online publication date: 28-Oct-2024
  • (2023) Missing Types Prediction in Linked Data Using Deep Neural Network with Attention Mechanism: Case Study on DBpedia and UniProt Datasets. Information Technology for Management: Approaches to Improving Business and Society, 212-231. DOI: 10.1007/978-3-031-29570-6_11. Online publication date: 28-Mar-2023
  • (2022) Knowledge Graphs: A Practical Review of the Research Landscape. Information 13:4 (161). DOI: 10.3390/info13040161. Online publication date: 23-Mar-2022
  • (2021) Open Data in Prediction Using Machine Learning: A Systematic Review. Innovative Systems for Intelligent Health Informatics, 536-553. DOI: 10.1007/978-3-030-70713-2_50. Online publication date: 6-May-2021
  • (2019) Knowledge Graph Embeddings over Hundreds of Linked Datasets. Metadata and Semantic Research, 150-162. DOI: 10.1007/978-3-030-36599-8_13. Online publication date: 4-Dec-2019
  • (2018) Twitter User Recommendation for Gaining Followers. AI*IA 2018 – Advances in Artificial Intelligence, 539-552. DOI: 10.1007/978-3-030-03840-3_40. Online publication date: 9-Nov-2018
