research-article

Author Set Identification via Quasi-Clique Discovery

Authors:

Yanfang Ye

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

Pages 771 - 780

https://doi.org/10.1145/3357384.3357966

Published: 03 November 2019

Abstract

Author identification on heterogeneous bibliographic networks, the task of identifying the potential authors of an anonymous paper, has been studied in recent years. However, most existing work considers only the relationship between authors and anonymous papers and ignores the relationships among authors. In this paper, we take the relationships among authors into account and study the problem of author set identification, which is to identify an author set, rather than an individual author, related to an anonymous paper. The proposed problem has important applications in new collaborator discovery and group building. We propose a novel Author Set Identification approach, namely ASI. ASI first extracts a task-guided embedding to learn low-dimensional representations of the nodes in the bibliographic network. ASI then leverages the learned embedding to construct a weighted paper-ego-network, which contains the anonymous paper and the candidate authors. Finally, by converting optimal author set identification into quasi-clique discovery in the constructed network, ASI uses a local-search heuristic, guided by the devised density function, to find the optimal quasi-clique. Extensive experiments on bibliographic networks demonstrate that ASI outperforms the state-of-the-art baselines in author set identification.
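The three-stage pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on several assumptions: the embeddings are toy vectors rather than learned task-guided embeddings, edge weights are cosine similarities, and the density function is the edge-surplus objective from the quasi-clique literature (total internal weight minus `alpha` times the number of node pairs), swapped in because the abstract does not define ASI's own density function. All function names, the node label `"p"`, and the parameter `alpha` are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_ego_network(paper_vec, author_vecs):
    """Weighted graph over the paper node "p" and every candidate author.

    Assumption: edge weight = cosine similarity of the two embeddings.
    """
    vecs = {"p": paper_vec, **author_vecs}
    return {
        frozenset((a, b)): cosine(vecs[a], vecs[b])
        for a, b in combinations(vecs, 2)
    }


def edge_surplus(nodes, weights, alpha):
    """Assumed density: internal weight minus alpha * (number of node pairs)."""
    ordered = sorted(nodes)
    total = sum(weights[frozenset(pair)] for pair in combinations(ordered, 2))
    return total - alpha * len(ordered) * (len(ordered) - 1) / 2


def local_search(weights, candidates, alpha=0.5, max_iters=100):
    """Toggle one candidate at a time, keeping any move that raises the density.

    The paper node "p" always stays in the set.
    """
    current = {"p"}
    best = edge_surplus(current, weights, alpha)
    for _ in range(max_iters):
        improved = False
        for author in candidates:
            trial = current ^ {author}  # add if absent, remove if present
            score = edge_surplus(trial, weights, alpha)
            if score > best + 1e-12:
                current, best, improved = trial, score, True
        if not improved:
            break
    return current - {"p"}, best


# Toy data: a1 and a2 are close to the paper and to each other; a3 is not.
paper_vec = [1.0, 0.0]
author_vecs = {"a1": [0.9, 0.1], "a2": [0.8, 0.2], "a3": [0.0, 1.0]}
weights = build_ego_network(paper_vec, author_vecs)
authors, score = local_search(weights, list(author_vecs))
print(sorted(authors), round(score, 3))
```

In this sketch, `alpha` sets how strongly a pair of nodes must be related before including both raises the objective. A higher value yields smaller, tighter author sets. Like any local search, the procedure can stop at a local optimum, so the returned set is not guaranteed to be globally optimal.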

Cited By

Chen H, Shuai H, Yang D, Lee W, Shi C, Yu P, Chen M. (2021). Structure-Aware Parameter-Free Group Query via Heterogeneous Information Network Transformer. 2021 IEEE 37th International Conference on Data Engineering (ICDE), 2075-2080. Online publication date: Apr-2021.
https://doi.org/10.1109/ICDE51399.2021.00203
Shi C, Wang X, S. Yu P. (2021). Heterogeneous Graph Representation for Recommendation. In: Shi C, Wang X, Yu P, Heterogeneous Graph Representation Learning and Applications, 175-208. Online publication date: 5-Nov-2021.
https://doi.org/10.1007/978-981-16-6166-2_7

Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

November 2019

3373 pages

ISBN: 9781450369763

DOI: 10.1145/3357384

General Chairs:
Wenwu Zhu, Tsinghua University, China
Dacheng Tao, University of Massachusetts, USA
Xueqi Cheng, Institute of Computing Technology, CAS, China

Program Chairs:
Peng Cui, Tsinghua University, China
Elke Rundensteiner, Worcester Polytechnic Institute, USA
David Carmel, Amazon Research, USA
Qi He, LinkedIn, USA
Jeffrey Xu Yu, Chinese University of Hong Kong, China

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIKM '19: The 28th ACM International Conference on Information and Knowledge Management

November 3 - 7, 2019

Beijing, China

Acceptance Rates

CIKM '19 Paper Acceptance Rate: 202 of 1,031 submissions, 20%

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
