Abstract
Deep learning has been widely used to solve graph and combinatorial optimization problems; however, proper model deployment is critical for training a model and solving such problems. Existing frameworks mainly use reinforcement learning to learn to solve combinatorial optimization problems, in which a partial solution of the problem is regarded as an environmental state and each vertex of the corresponding graph is regarded as an action. As a result, it is challenging to use the sample data effectively during model training across different graphs. This study proposes a sampling-based, data-driven and distributed independent graph learning framework, based on decoupling the graph structure learning and problem-solving processes, which to some extent facilitates industrial applications. Specifically, the framework consists of two independent parts: extracting the graph structure and learning to solve the problem. Under this framework, graph contrastive learning (GCL) is used to carry out the graph structure learning process. Then, by aggregating state values over all nodes of a graph, a global reinforcement learning method is established to learn to solve the graph problem, combined with repair policies that improve performance. Experiments on synthetic graph datasets show that graph contrastive learning benefits training stability and improves the accuracy of solving the graph problem, and that the repair policies are stable for solution search. However, they also demonstrate that a graph neural network is not necessarily needed in the process of learning to solve the graph problem. Moreover, learning to solve the MDP (metric dimension problem) still faces some challenges: learning performance decreases as the edge-existence probability of the graphs increases, and it remains unknown what kind of reward function is appropriate for solving the MDP.
Data availability and access
The authors agree that the data and code in this work will be made public. They are used only for scientific research; please cite them if they are used. Data and code are available at: https://github.com/wujian1112/GCLMDP
References
Agasucci V, Grani G, Lamorgese L (2023) Solving the train dispatching problem via deep reinforcement learning. J Rail Trans Plan Manag 26:100394
Apicella A, Isgrò F, Pollastro A et al (2023) Adaptive filters in Graph Convolutional Neural Networks. Patt Recognit 144:109867
Chen T, Chen X, Chen W et al (2022) Learning to Optimize: A Primer and A Benchmark. J Mach Learn Res 23:1–59
Dai H, Khalil EB, Zhang Y et al (2017) Learning combinatorial optimization algorithms over graphs. Adv Neural Inf Process Syst 30
Eroh L, Kang CX, Yi E (2020) The connected metric dimension at a vertex of a graph. Theor Comput Sci 806:53–69
Fang U, Li J, Lu X et al (2023) Robust image clustering via context-aware contrastive graph learning. Patt Recognit 138:109340
Geneson J (2020) Metric dimension and pattern avoidance in graphs. Discret Appl Math 284:1–7
Hagberg A A, National L A, Alamos L et al. (2008) Exploring Network Structure, Dynamics, and Function using NetworkX. In Proceedings of the 7th python in science conference (SciPy2008). Gel Varoquaux, Travis Vaught, and Jarrod Millman (Eds), Pasadena, CA USA, pp 11–15
Hassani K, Khasahmadi AH (2020) Contrastive multi-view representation learning on graphs. In: Proceedings of the international conference on machine learning. PMLR, 4116–4126
Kallestad J, Hasibi R, Hemmati A et al (2023) A general deep reinforcement learning hyperheuristic framework for solving combinatorial optimization problems. Eur J Operat Res 309(1):446–468
Latifpour MH, Mills MS, Miri MA (2022) Combinatorial optimization with photonics-inspired clock models. Commun Phys 5:104
Lei K, Guo P, Wang Y et al (2022) Solve routing problems with a residual edge-graph attention neural network. Neurocomputing 508:79–98
Liang H, Du X, Zhu B et al (2023) Graph contrastive learning with implicit augmentations. Neural Netw 163:156–164
Li Q, Chen W, Fang Z et al (2023) A multi-view contrastive learning for heterogeneous network embedding. Sci Rep 13:6732
Li S, Han L, Wang Y et al (2023) GCL: Contrastive learning instead of graph convolution for node classification. Neurocomputing 551:126491
Li W, Guo C, Liu Y et al (2023) Rumor source localization in social networks based on infection potential energy. Inf Sci 634:172–188
Ma F, Liu Z-M, Yang L et al (2021) Source localization in large-scale asynchronous sensor networks. Digit Signal Process 109:102920
Mazyavkina N, Sviridov S, Ivanov S, Burnaev E (2021) Reinforcement learning for combinatorial optimization: A survey. Comput Operat Res 134:105400
Danas MM (2023) The difference between several metric dimension graph invariants. Discret Appl Math 332:1–6
Mohseni N, McMahon PL, Byrnes T (2022) Ising machines as hardware solvers of combinatorial optimization problems. Nat Rev Phys 4:363–379
Nie KR, Xu KX (2023) Mixed metric dimension of some graphs. Appl Math Comput 442:127737
Padhye V, Lakshmanan K (2023) A deep actor critic reinforcement learning framework for learning to rank. Neurocomputing 547:126314
Pinto PC, Thiran P, Vetterli M (2012) Locating the Source of Diffusion in Large-Scale Networks. Phys Rev Lett 109(6):068702
Qin W, Zhuang Z, Huang Z, Huang H (2021) A novel reinforcement learning-based hyper-heuristic for heterogeneous vehicle routing problem. Comput Ind Eng 156:107252
Ribeiro LFR, Saverese PHP, Figueiredo DR (2017) struc2vec: Learning Node Representations from Structural Identity. In: Proceedings of the 23rd ACM SIGKDD International conference on knowledge discovery and data mining 385–394
Mashkaria S, Ódor G, Thiran P (2022) On the robustness of the metric dimension of grid graphs to adding a single edge. Discret Appl Math 316:1–27
Schulman J, Wolski F, Dhariwal P et al (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347
Shen Y, Sun Y, Li X et al (2023) Adaptive solution prediction for combinatorial optimization. Eur J Operat Res 309(3):1392–1408
Staudt CL, Sazonovs A, Meyerhenke H (2016) NetworKit: A tool suite for large-scale complex network analysis. Netw Sci 4(4):508–530
Tran VP, Garratt MA, Kasmarik K et al (2022) Multi-gas source localization and mapping by flocking robots. Inf Fusion 91:665–680
Wang H, Fu T, Du Y et al (2023) Scientific discovery in the age of artificial intelligence. Nature 620:47–60
Wang Q, Lai KH, Tang CL (2023) Solving combinatorial optimization problems over graphs with BERT-Based Deep Reinforcement Learning. Inf Sci 619:930–946
Wang Q, Tang C (2021) Deep reinforcement learning for transportation network combinatorial optimization: A survey. Knowl-Based Syst 233:107526
Wang Z, Sun C, Rui X et al (2021) Localization of multiple diffusion sources based on overlapping community detection. Knowl-Based Syst 226:106613
Wu L, Lin H, Gao Z et al (2023) Self-supervised Learning on Graphs: Contrastive, Generative, or Predictive. IEEE Trans Knowl Data Eng 35(1):857–876
Wu J, Zhao HX, Yang WH (2020) Computing Partition Metric Dimension of Graphs Based on Genetic Algorithm. Acta Math Appl Sin 43(6):1013–1028
Wu J, Wang L, Yang W (2022) Learning to compute the metric dimension of graphs. Appl Math Comput 432:127350
Wu ZH, Pan SR, Chen FW et al (2021) A Comprehensive Survey on Graph Neural Networks. IEEE Trans Neural Netw Learn Syst 32(1):4–24
Yan D, Weng J, Huang S et al (2022) Deep reinforcement learning with credit assignment for combinatorial optimization. Patt Recognit 124:108466
You Y, Chen T, Sui Y et al (2020) Graph contrastive learning with augmentations. Adv Neural Inf Process Syst 33:5812–5823
Zhang Z, Sun S, Ma G et al (2023) Line graph contrastive learning for link prediction. Patt Recognit 140:109537
Zhao J, Cheong KH (2023) Early identification of diffusion source in complex networks with evidence theory. Inf Sci 642:119061
Zhu T, Shi X, Xu X, Cao J (2023) An accelerated end-to-end method for solving routing problems. Neural Netw 164:535–545
Zhu Y, Xu Y, Yu F et al (2020) Deep graph contrastive representation learning. In: ICML workshop on graph representation learning and beyond
Zhu Y, Xu Y, Yu F et al (2021) Graph contrastive learning with adaptive augmentation. In: WWW '21: Proceedings of the Web Conference 2021, pp 2069–2080
Zhang Y, Bai R, Qu R et al (2022) A deep reinforcement learning based hyper-heuristic for combinatorial optimisation with uncertainties. Eur J Operat Res 300(2):418–427
Zhu Y, Xu Y, Liu Q et al (2021) An empirical study of graph contrastive learning. NeurIPS
Acknowledgements
This work is supported by the Regional Innovation and Development Joint Fund of NSFC (No. U22A20167), the National Key Research and Development Program of China (No. 2021YFB3300503), the National Natural Science Foundation of China (No. 12102236), the Natural Science Foundation of Shanxi Province (Nos. 20210302124258, 20210302123097 and 202103021224287), and the Philosophy and Social Science Planning Project of Shanxi Province (No. 2022YJ075). We thank the editors and reviewers for their valuable suggestions for revising this paper.
Author information
Authors and Affiliations
Contributions
Jian Wu developed the methodology, code and algorithms to solve the problem and prepared the original draft. Li Wang and Weihua Yang offered essential advice for completing this work, revising the draft and adjusting the critical idea. Rui Wang and Jianji Cao generated the original data to perform the experiments and run the baselines. Fuhong Wei reviewed this work and helped develop the methods. Haixia Zhao developed the code for the 10-fold cross-validation experiment.
Corresponding authors
Ethics declarations
Ethical and informed consent for data used
The data used in this paper are generated by the authors and are used only for scientific research. They do not have any ethical implications.
Competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: model setup
1.1 A.1 GCN-NF model
The GCN-NF model consists of one graph convolutional network (GCN) layer and a projection layer, where the projection layer is a two-layer perceptron. Specifically, the input, hidden and output dimensions of the GCN model are 128, 64 and 64, respectively; the input, hidden and output dimensions of the projection layer are 64, 64 and 1, respectively. The ReLU activation function is used in the model. In the output layer of the actor network, the Sigmoid activation function is adopted, while no activation function is used in the output layer of the critic network.
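As a rough sketch of this architecture, the following NumPy forward pass combines one GCN propagation step with the two-layer projection head of the actor; the class name, the weight initialization and the use of randomly drawn weights are illustrative, not the paper's implementation:

```python
import numpy as np

def normalized_adjacency(A):
    # A_hat = D^{-1/2} (A + I) D^{-1/2}, the standard GCN propagation matrix
    A = A + np.eye(A.shape[0])
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A @ D_inv_sqrt

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GCNNFActor:
    """One GCN layer followed by a two-layer projection head (illustrative)."""
    def __init__(self, d_in=128, d_hid=64, rng=None):
        rng = rng or np.random.default_rng(0)
        self.W_gcn = rng.normal(scale=0.1, size=(d_in, d_hid))
        self.W1 = rng.normal(scale=0.1, size=(d_hid, d_hid))
        self.W2 = rng.normal(scale=0.1, size=(d_hid, 1))

    def forward(self, A, X):
        H = relu(normalized_adjacency(A) @ X @ self.W_gcn)  # GCN layer
        H = relu(H @ self.W1)                               # hidden projection
        return sigmoid(H @ self.W2)                         # per-node probability
```

The critic would share the same structure but omit the final sigmoid, as described above.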
1.2 A.2 MLP-NF model
The MLP-NF model is a two-layer perceptron, whose input, hidden and output dimensions are 128, 64 and 1, respectively. The ReLU activation function is used in the model. In the output layer of the actor network, the Sigmoid activation function is adopted, while no activation function is used in the critic network's output layer.
1.3 A.3 GCL model
In the graph contrastive learning, one GCL model is constructed. The model is composed of a two-layer GIN graph neural network and a two-layer perceptron. In the GIN module, the input, hidden and output dimensions are 64, 32 and 32, respectively; in the perceptron module, they are 64, 64 and 64, respectively. The ReLU activation function is used in the model. The representations from this model will be sent to the graph classifier.
In addition, in order to carry out the GCL learning, graph-level augmentations are adopted. Specifically, the two augmentation pipelines can be written as [47]:

- aug1 = A.RandomChoice([A.RWSampling(num_seeds=1000, walk_length=10), A.NodeDropping(pn=0.1), A.FeatureMasking(pf=0.1), A.EdgeRemoving(pe=0.1)], 1);

- aug2 = A.RandomChoice([A.RWSampling(num_seeds=1000, walk_length=10), A.NodeDropping(pn=0.1), A.FeatureMasking(pf=0.1), A.EdgeRemoving(pe=0.1)], 1),

where the PyGCL library is used to finish the graph augmentations.
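Purely as an illustration of what two of these augmentations do to a graph, here is a minimal plain-Python sketch; the function names and the edge-list representation are ours, not PyGCL's API:

```python
import random

def node_dropping(edges, num_nodes, pn=0.1, rng=None):
    """Drop each node with probability pn and remove its incident edges."""
    rng = rng or random.Random(0)
    dropped = {v for v in range(num_nodes) if rng.random() < pn}
    return [(u, v) for (u, v) in edges if u not in dropped and v not in dropped]

def edge_removing(edges, pe=0.1, rng=None):
    """Remove each edge independently with probability pe."""
    rng = rng or random.Random(0)
    return [e for e in edges if rng.random() >= pe]

def random_choice(augmentors, k, rng=None):
    """Pick k augmentors uniformly at random, mirroring A.RandomChoice([...], 1)."""
    rng = rng or random.Random(0)
    return rng.sample(augmentors, k)
```

Each call thus applies one randomly chosen perturbation, yielding two stochastically augmented views of the same graph for the contrastive objective.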
1.4 A.4 GCN-F model
This model, named the GCN sampler, is used to learn to solve the MDP (metric dimension problem of graphs) on a set of graphs. The input of the GCN sampler is the node representations generated by the GCL model. The sampler's architecture is the same as that of the GCN-NF model.
1.5 A.5 MLP-F model
This model, named the MLP sampler, is used to learn to solve the MDP (metric dimension problem of graphs) on a set of graphs. The input of the MLP sampler is the node representations generated by the GCL model. The sampler's architecture is the same as that of the MLP-NF model.
Appendix B: training and evaluation setup
The running environment is: Intel(R) Core(TM) i7-8750H CPU @ 2.20 GHz, 16 GB RAM. The PyGCL library, Gurobi, NetworkX, NetworKit and PyTorch are used.
1.1 B.1 GCL training and evaluation
In the GCL training, the Adam optimization algorithm is used with a learning rate of 0.01, and the DualBranchContrast contrastive model with the InfoNCE loss function is adopted [47]. To evaluate the quality of GCL learning, the node representations are sent to a graph classifier, where a pre-task of graph classification is given. In this paper, the performance of GCL learning is evaluated on test data with the SVMEvaluator classifier from the PyGCL library [47].
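As a reminder of what the InfoNCE objective computes, the following pure-Python sketch scores each anchor embedding against its positive from the other augmented view, treating the remaining embeddings as negatives; PyGCL's DualBranchContrast handles this internally, and the temperature value here is illustrative:

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchors, positives, tau=0.2):
    """InfoNCE over two views: for anchor i, positives[i] is the positive pair
    and every other positives[j] serves as a negative."""
    loss = 0.0
    for i, z in enumerate(anchors):
        sims = [math.exp(cosine(z, p) / tau) for p in positives]
        loss += -math.log(sims[i] / sum(sims))
    return loss / len(anchors)
```

The loss is small when each embedding is closest to its own augmented counterpart, which is exactly what the contrastive pre-training encourages.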
1.2 B.2 LS training and evaluation
In the LS training process, the sampler is trained to learn to solve the MDP on a set of graphs, called the train data, and is then evaluated on test data. Note that we use the PPO reinforcement learning method to train the sampler. Furthermore, we train the sampler in mini-batches with a batch size of 32.
For training the GCN-NF and MLP-NF samplers, the feature of each node in a graph is the all-ones vector \({\textbf {1}}_{1\times d}\). In contrast, for training the GCN-F and MLP-F samplers, the node features of a graph are generated by the well-trained GCL model, i.e., the GNN encoder with its parameters frozen.
The Adam optimization algorithm is used to train the samplers, where the learning rate for both the actor and critic networks is 0.01. The number of line-search steps for the samplers is 1. The parameter \(\lambda \) for computing the advantage function is selected from \(\{0.001,0.1,0.2,0.3,0.5\}\), and the clipping parameter is selected from \(\{0.1,0.2\}\). We run the training process for 1000 episodes on each train dataset, with an early-stop threshold of 50. In the reward function, the parameters are \(\alpha _1=2\), \(\alpha _2=2\) and \(\beta =3\). The discount factor for calculating the expected reward is 0.98.
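A minimal sketch of how \(\lambda \) and the clipping parameter enter PPO training, assuming the standard generalized advantage estimator and clipped surrogate objective (the paper's exact estimator may differ in detail):

```python
def gae_advantages(rewards, values, gamma=0.98, lam=0.2, last_value=0.0):
    """Generalized advantage estimation: A_t = sum_l (gamma*lam)^l * delta_{t+l},
    where delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)."""
    advantages = [0.0] * len(rewards)
    next_value, running = last_value, 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate: min(r*A, clip(r, 1-eps, 1+eps)*A)."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)
```

Larger \(\lambda \) trades lower bias for higher variance in the advantage estimates, which is why it is tuned over the grid listed above.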
In the evaluation process, we use the sampler to solve the MDP on all of the train and test data, and then report the sampler's performance on each, where the relative ratio is computed on each graph and the average relative ratio is calculated over all of the datasets. In particular, the number of iterations for the add-repair and reduce-repair policies is 10.
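The paper's exact repair policies are not detailed in this appendix; under the standard definition of a resolving set for the metric dimension problem, a reduce-repair step can be sketched as follows (all function names and the greedy removal order are illustrative):

```python
from collections import deque

def bfs_distances(adj, src):
    # Unweighted shortest-path distances from src via breadth-first search.
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def is_resolving(adj, S):
    """S resolves the graph if every vertex has a distinct distance vector to S."""
    tables = [bfs_distances(adj, s) for s in S]
    signatures = {tuple(t[v] for t in tables) for v in adj}
    return len(signatures) == len(adj)

def reduce_repair(adj, S):
    """Greedily drop redundant landmarks while the set remains resolving."""
    S = list(S)
    for v in sorted(S):
        trial = [u for u in S if u != v]
        if trial and is_resolving(adj, trial):
            S = trial
    return S
```

An add-repair step would do the converse: insert vertices into a non-resolving candidate set until every pair of vertices is distinguished.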
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wu, J., Wang, L., Yang, W. et al. Learning to solve graph metric dimension problem based on graph contrastive learning. Appl Intell 53, 30300–30318 (2023). https://doi.org/10.1007/s10489-023-05130-1