1 Introduction

Bridge structures span obstacles such as rivers, valleys and roads. Common structural forms include simply supported beams, continuous beams and rigid frames, and the bridge design process can generally be divided into three stages: conceptual design, detailed design and construction drawing design. At the conceptual design stage, the structural form of the bridge must be initially confirmed, usually through a trial-and-error procedure to satisfy varying requirements; that is, the design plan may be modified several times. Each revised design plan must therefore be completed as quickly as possible to maintain design efficiency. However, the bridge design process relies heavily on professional knowledge and experience and is mainly completed manually by competent engineers, which is time-consuming. Consequently, the demand for automated bridge design methods is high.

In recent decades, optimisation methods, such as evolutionary algorithms [1, 2] and topology optimisation [3], have been widely used in automated building design to improve design efficiency and quality. Optimisation methods often rely on explicitly defined constraints and rules. However, for bridge design work, considering all the constraints and resolving conflicts between multiple objectives and constraints are prohibitively challenging. Moreover, constraint rules are difficult to encode as computer-executable code. Consequently, optimisation methods cannot satisfy the requirements of automated bridge design. Recently, deep learning algorithms, which can learn from the experience embedded in existing design plans, have provided a new option for addressing these difficulties [4]. Liao et al. [5] proposed an automated shear wall design method for high-rise buildings using generative adversarial networks. Moreover, Lu et al. [6] optimised the structural design accuracy through physics-enhanced methods. Zhao et al. [7] proposed an intelligent beam and slab design method for reinforced concrete shear wall structures using deep neural networks. Gan et al. [8] developed two deep learning models to predict the progressive collapse resistance of RC frames. Pizarro et al. [9, 10] used a deep neural network trained on previous projects to predict the wall thickness and length of reinforced concrete buildings. These studies show that the deep learning-based design method is highly efficient and can generate scheme designs of shear wall and frame structures.

Traditional deep neural networks are designed to process data in Euclidean space and may not be suitable for handling non-Euclidean data, such as graphs. Graph neural networks (GNNs) operate directly on graphs and explicitly encode crucial collaborative signals to enhance user and item representations. They have shown great success in architectural and structural design during the past few years. Chang and Cheng [11] represented building structures as graphs and suggested the optimal cross-sections of columns and beams using two GNNs. Wang et al. [12] represented building information models as graphs and classified room types in residential apartments using GNNs. Zhao et al. [13] represented a shear wall structure in graph data form and trained GNN models using the data from predefined shear wall structures. Then, the well-trained GNN models generated a shear wall layout design similar to that of experienced engineers. Zhao et al. [14] trained GNN models using a predefined frame structure dataset and generated the beam layout of the frame structure using the well-trained GNN models. The methods above have mainly been applied to building structures, and GNN methods have not yet been applied to the design of railway bridge structures. Building construction is a form of point engineering that is typically situated at a specific location, making layout design a primary design concern. In contrast, railway bridges are classified as linear engineering projects that traverse obstacles, such as mountains and rivers, with design considerations focussing on span length, beam type and pier placement.

An intelligent railway bridge design method named AGOAM is proposed to efficiently recommend the beam type of the bridge for the main control points of a railway route. The features of the main control points and beam types are first converted to encoding vectors, and then an independent embedding vector is assigned to each feature. The embedding vectors are subsequently represented in the form of attribute graphs, and the complex relationships between attributes are captured through inner interactions and cross-interactions. The prediction score of each beam type is obtained using graph matching technology. In addition, the accuracy of the recommendation results is improved through ontology-enhanced attribute interaction and attention mechanism-based graph pooling. The effectiveness of the proposed method is demonstrated with a real-world railway bridge design dataset.

2 Railway Bridge Design Procedure

Bridge structures cross obstacles, and route control points are special locations that constrain the layout of railway alignment. The span layout design of bridges involves selecting a reasonable bridge plan to extend across the control points beneath the bridge, ensuring a reasonable arrangement of bridge spans. This design is the primary objective of railway bridge planning.

In the span layout design of railway bridges, bridge engineers must consider how the railway bridge can reasonably cross all control points along the line. A main control point is a key control point, or a combination of control points, on the railway route, such as a highway or railway, for which the spanning plan must satisfy the relevant requirements. Non-main control points are control points that the design engineer can remove after weighing the pros and cons, such as village roads.

The main control point determines the selection of beam types at key holes on the railway route. The remaining beam types and hole span locations can be identified only by determining the beam type and hole span layout of the main control point. Therefore, the beam layout of the main control point is the most important step in the span layout design of railway bridges. Selection and arrangement of an appropriate beam type for the main control point have become a difficult problem in the intelligent layout of railway bridges. Railway bridge design engineers must select and adjust from the beam type library according to the type of the main control points and corresponding attributes until a suitable beam type is achieved.

The experts classified the control points in detail, and the classification and related attributes of the 13 types of main control points are listed in Table 1. In the table, the main control points are classified by gradient priority, where a smaller value indicates higher priority. The lowest priority of the beam that may be placed at each main control point is also listed; that is, the priority of the beam type laid out at this main control point should be greater than or equal to this value. In addition, the main control point has length and angle attributes, which are determined by the specific main control point conditions.

Table 1 Common types of main control points and related attributes (partial)

The bridge selected and arranged for the main control points of the railway bridge is called the main girder. Different bridge types exhibit varying properties, which are also distinct in bridges with the same beam type but different lengths. The beam types are also classified in detail according to the cost and use scenarios of the bridges, and the classification and related attributes of several core beam types are listed in Table 2. The beam types mainly include 32 m simply supported beams and eight types of continuous beams with main spans ranging from 48 to 135 m. The table also shows that the beam types are classified by gradient priority. A smaller value indicates higher priority. In addition, each beam type has a fixed main span length. When railway bridge engineers select the beam type for each control point, they must consider the control point’s priority requirements for the beam type and satisfy the length requirements. Therefore, the main span of the recommended beam type should be larger than the influence range of the control points after processing the control point angle and gamma coefficient.

Table 2 Common beam types and related attributes of high-speed railway (partial, Unit: m)

3 Intelligent Railway Bridge Design Based on GNNs

To address issues such as multiple controlling factors and complex design criteria in the conceptual bridge design stage, an intelligent main beam recommendation method is proposed using the existing bridge engineering case library and GNNs. The proposed method is named AGOAM and consists of a preprocessing layer, a subgraph construction layer, a node matching layer, a graph pooling layer and a graph matching layer. The architecture of the AGOAM model is shown in Fig. 1.

Fig. 1 The architecture diagram of the AGOAM model

In the AGOAM model, the features of the main control points and beam types are mapped as embedding vectors, which are subsequently represented in the form of attribute graphs. The inner and cross-interactions are adopted on the main control point and beam type attribute graphs to exploit their complex relationships. Then, a global ontology structure is constructed to enhance the attribute interactions, and an attention mechanism is introduced into the graph pooling process to highlight the contributions of important attribute nodes. Finally, the prediction score of each beam type is estimated through graph matching technology, and the beam type with the highest prediction score is recommended.

3.1 Preprocessing Layer

The attributes of the main control point and beam type are expressed in the form of key–value pairs as (attr, val), where attr represents the attribute name and val represents the attribute value. For example, the key–value pair (height, 20) indicates that the height attribute of an item is 20. For a batch of data samples, the attribute values of all main control points and beam types are counted, and each attribute value is assigned an ID number in order.
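The ID assignment described above can be sketched as follows. This is a minimal illustration, assuming that IDs are assigned in first-seen order; the function name `build_value_ids` and the sample attribute names and values are hypothetical and are not taken from the dataset.

```python
def build_value_ids(samples):
    """Assign consecutive integer IDs to distinct (attr, val) pairs in first-seen order."""
    value_ids = {}
    for sample in samples:
        for attr, val in sample.items():
            key = (attr, val)
            if key not in value_ids:
                value_ids[key] = len(value_ids)
    return value_ids


# Hypothetical batch of samples; names and values are illustrative only.
samples = [
    {"category": "highway", "priority": 2},
    {"category": "river", "priority": 2},
]
value_ids = build_value_ids(samples)
for key, idx in value_ids.items():
    print(key, idx)
```

Identical key–value pairs share one ID, so the repeated pair (priority, 2) is counted once.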

In the preprocessing layer, the features of the main control points and beam types are initially encoded as sparse vectors and then mapped as dense embedding vectors. An \(n_{a}\)-dimensional vector table is initialised randomly, where \(n_{a}\) denotes the total number of main control point and beam type attribute names, and the encoding vector of each attribute key–value pair is assigned according to the attribute name. Then, the encoding vectors are mapped as d-dimensional initialised embedding vectors, where d is the embedding size. For discrete features, the corresponding initialised embedding vector can be obtained directly through the ID number of the attribute value. For continuous features (such as the length attribute of the control point), a pretraining layer named AutoDis is utilised to perform embedding learning on the continuous attribute values of all main control points and beam types. As shown in Fig. 2, AutoDis is composed of three core modules: meta-embedding, automatic discretisation and aggregation. AutoDis can assign an independent embedding to each continuous value and achieve high-capacity, end-to-end training of the model [15].

Fig. 2 The architecture diagram of the AutoDis framework
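The roles of the three AutoDis modules can be illustrated with a heavily simplified sketch. It assumes a single linear score per bucket followed by a softmax with temperature, whereas the original framework [15] may use a more elaborate scoring network. The function names, bucket weights, meta-embedding values and input values below are all hypothetical; the sketch only shows how a continuous value can receive its own embedding as a weighted sum of shared meta-embeddings.

```python
import math


def soft_bucket_weights(x, bucket_weights, temperature=1.0):
    """Automatic discretisation (simplified): softly assign scalar x to each bucket."""
    scores = [w * x / temperature for w in bucket_weights]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def autodis_embedding(x, bucket_weights, meta_embeddings, temperature=1.0):
    """Aggregation: probability-weighted sum of the shared meta-embeddings."""
    probs = soft_bucket_weights(x, bucket_weights, temperature)
    d = len(meta_embeddings[0])
    return [sum(p * emb[k] for p, emb in zip(probs, meta_embeddings)) for k in range(d)]


# Hypothetical parameters: three buckets, embedding size d = 2.
bucket_weights = [-1.0, 0.0, 1.0]
meta_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Hypothetical normalised length values of two control points.
emb_short = autodis_embedding(0.1, bucket_weights, meta_embeddings)
emb_long = autodis_embedding(0.9, bucket_weights, meta_embeddings)
print(emb_short, emb_long)
```

Because the bucket weights vary smoothly with the input, nearby continuous values obtain similar but not identical embeddings.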

3.2 Subgraph Construction Layer

In this study, the initialised embedding vectors are represented as attribute graphs, in which each node corresponds to an attribute and the edges correspond to the interactive relationships between the attributes. Thus, the interactions between attributes can be converted to the interactions between nodes in the attribute graph [16]. In this graph, every two nodes are connected by an edge, which represents the interactive relationship between the node pair. The nodes in the attribute graph are denoted as symbol V, and the edges are represented by E. The subgraph of a certain main control point or a certain beam type can be expressed as follows:

$$ G^{C} = \left\langle {V^{C} ,E^{C} } \right\rangle ,\;G^{B} = \left\langle {V^{B} ,E^{B} } \right\rangle , $$
(1)

where \(G^{C}\) denotes the subgraph of a certain main control point, \(G^{B}\) indicates the subgraph of a certain beam type, \(V^{C}\) is the node set that contains all the nodes in the main control point attribute graph, and \(V^{B}\) is the node set that contains all the nodes in the beam type attribute graph; \(E^{C}\) is the edge set that contains all the edges in the main control point attribute graph, and \(E^{B}\) is the edge set that includes all the edges in the beam type attribute graph.
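Because every two nodes are connected, each attribute graph is a complete graph. A minimal sketch of the construction in Eq. (1) is given below; the function name `build_attribute_graph` and the attribute names are hypothetical.

```python
from itertools import combinations


def build_attribute_graph(attribute_names):
    """Return (V, E) of a complete undirected graph with one node per attribute."""
    nodes = list(attribute_names)
    # Each unordered pair of node indices is one edge.
    edges = list(combinations(range(len(nodes)), 2))
    return nodes, edges


# Hypothetical attribute names of a main control point.
nodes_c, edges_c = build_attribute_graph(["category", "priority", "length", "angle"])
print(len(nodes_c), len(edges_c))
```

A graph with n attribute nodes therefore has n(n - 1)/2 edges, which is 6 for the four attributes used here.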

3.3 Node Matching Layer

In the node matching layer, the attributes of the main control points and beam types are interacted to exploit the complex relationships between each other, achieving accurate matching between the main control points and beam types. The interactions between attributes can be divided into two different types: inner interaction and cross-interaction [17].

3.3.1 Inner Attribute Interaction

The inner interaction is the attribute interaction inside the main control point or the beam type. In this study, inner interaction is modelled through the message passing method, a procedure of aggregating neighbourhood information.

The interaction between node i and neighbour node j can be performed by the dot product operation as follows:

$$ u_{ij} = u_{i} \cdot u_{j} , $$
(2)

where \(u_{i}\) and \(u_{j}\) are the initial embeddings of node i and neighbour node j respectively, and \(u_{ij}\) is the preliminary interactive representation of node i and neighbour node j.

A fully connected neural network, multilayer perceptron (MLP), is used to model each inner interaction, and the architecture diagram of the MLP is shown in Fig. 3. The nodes in the present layer are connected to all the nodes in the next layer, and the information in node i and neighbour node j is aggregated through the fully connected layer. Subsequently, the output of each layer passes to the next layer through the nonlinear activation function ReLU. The interactive information of node i and neighbour node j is effectively captured through multiple layers, and the interactive embedding of node i and neighbour node j is obtained in the final output layer.

Fig. 3 Architecture diagram of the MLP

Considering the preliminary interaction representation of node i and neighbour node j as the input of the MLP, the output of the MLP can be expressed as follows:

$$ z_{ij} = f_{MLP} \left( {u_{ij} } \right), $$
(3)

where \(f_{MLP} :{\mathbb{R}}^{d} \to {\mathbb{R}}^{d}\) denotes the fully connected neural network MLP, and \(z_{ij}\) is the output of the MLP.

The output of the MLP can be regarded as the inner interaction result of node i and neighbour node j. This process can be expressed in the form of a function \(f_{neural} :{\mathbb{R}}^{2 \times d} \to {\mathbb{R}}^{d}\) as follows:

$$ z_{ij} = f_{neural} \left( {u_{i} ,u_{j} } \right), $$
(4)

where \(z_{ij}\) is the inner interaction result of node i and neighbour node j.

The embeddings of node i and all neighbouring nodes are calculated sequentially, and the neighbourhood embeddings are aggregated as the message passing information. In this study, the mean message aggregation function of the GNN is used to aggregate the neighbourhood embeddings into node i. The aggregation result can be expressed as follows:

$$ z_{i} = \frac{1}{{n_{i} }}\sum\limits_{{j \in V_{i} }} {z_{ij} } , $$
(5)

where \(z_{i}\) is the message passing result of node i, \(V_{i}\) is the set of all neighbours of node i, and \(n_{i}\) is the number of neighbours of node i.
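Equations (2) to (5) can be sketched in a few lines. The sketch assumes that the product in Eq. (2) is taken element-wise, so that its result remains d-dimensional as required by the input of \(f_{MLP}\), and it replaces the multilayer MLP with a single fully connected layer followed by ReLU. The function names and the weight values are hypothetical.

```python
def hadamard(a, b):
    """Element-wise product of two vectors of equal length."""
    return [x * y for x, y in zip(a, b)]


def mlp(x, weights, biases):
    """One fully connected layer (d -> d) with ReLU, a stand-in for f_MLP."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def inner_interaction(embeddings, weights, biases):
    """Compute z_i for every node of a complete attribute graph (Eqs. 2-5)."""
    n = len(embeddings)
    d = len(embeddings[0])
    results = []
    for i in range(n):
        # z_ij for every neighbour j of node i (all other nodes in a complete graph).
        messages = [
            mlp(hadamard(embeddings[i], embeddings[j]), weights, biases)
            for j in range(n)
            if j != i
        ]
        # Mean aggregation over the neighbours.
        results.append([sum(m[k] for m in messages) / len(messages) for k in range(d)])
    return results


# Hypothetical embeddings (d = 2) and identity weights for easy checking.
embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
z = inner_interaction(embeddings, identity, [0.0, 0.0])
print(z)
```

With identity weights, the result for node 0 is simply the mean of its element-wise products with nodes 1 and 2.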

3.3.2 Cross-Attribute Interaction

Cross-interaction refers to the attribute interaction between the main control point and the beam type, facilitating node matching. The cross-attribute interaction is used to describe the dependence of a certain attribute of the main control point on a certain attribute of the beam type. If a main control point attribute is strongly dependent on a beam type attribute, then their embeddings should be similar after model training. For example, if the interaction between the length attribute of the main control point and the main span length attribute of the beam type is highly dependent, then the embedding of the length attribute of the main control point and that of the main span length attribute of the beam type should be similar after model training. To achieve this condition, the Bi-interaction algorithm [18] is used for node matching so that the cross-interaction modelling results and attribute similarities maintain a monotonically increasing correlation. The cross-interaction using the bi-interaction algorithm can be expressed as follows:

$$ s_{ij} = u_{i} \odot \hat{u}_{j} , $$
(6)

where \(s_{ij}\) is the node matching result of node i and node j in the other attribute graph; \(u_{i}\) is the embedding of node i in one attribute graph; \(\hat{u}_{j}\) is the embedding of node j in the other attribute graph; and \(\odot\) denotes the Hadamard product.

The node matching results of node i and all nodes in another attribute graph are calculated sequentially. Similar to the inner interaction, the mean message aggregation function of the GNN is used to aggregate the node matching results into node i. The aggregation result of node matching can be expressed as follows:

$$ s_{i} = \frac{1}{{\hat{n}_{i} }}\sum\limits_{{j \in \hat{V}}} {s_{ij} } , $$
(7)

where \(s_{i}\) is the aggregated node matching result of node i, \(\hat{V}\) is the set of all nodes in another attribute graph, and \(\hat{n}_{i}\) is the number of nodes in another attribute graph.
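A minimal sketch of Eqs. (6) and (7) follows: each node of one attribute graph is multiplied element-wise with every node of the other graph, and the products are averaged. The function name `cross_interaction` and the embedding values are hypothetical.

```python
def cross_interaction(own_embeddings, other_embeddings):
    """Return s_i for every node i of one graph, matched against the other graph."""
    d = len(own_embeddings[0])
    n_hat = len(other_embeddings)
    results = []
    for u_i in own_embeddings:
        # Mean over j of the Hadamard product u_i * u_hat_j, computed per dimension.
        s_i = [sum(u_i[k] * u_j[k] for u_j in other_embeddings) / n_hat for k in range(d)]
        results.append(s_i)
    return results


# Hypothetical embeddings: one control point node, two beam type nodes (d = 2).
control_nodes = [[1.0, 2.0]]
beam_nodes = [[2.0, 0.0], [4.0, 2.0]]
s = cross_interaction(control_nodes, beam_nodes)
print(s)
```

The same function can be called with the arguments swapped to obtain the matching results for the beam type nodes.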

3.3.3 Node and Attribute Information Fusion

To capture the node-level matching results and the node information, the recurrent gated neural network is used to fuse the message passing results, the node matching results and the initial node embeddings. The fused results can be expressed as follows:

$$ u^{\prime}_{i} = f_{{{\text{fuse}}}} \left( {u_{i} ,z_{i} ,s_{i} } \right), $$
(8)

where \(u^{\prime}_{i}\) is the fused node representation of node i, and \(f_{fuse} :{\mathbb{R}}^{3 \times d} \to {\mathbb{R}}^{d}\) denotes the fusing function.

3.4 Graph Pooling Layer

3.4.1 Ontology Construction of Main Control Points and Beam Types

Ontology, an expression of the essence and relations of real objects, mainly emphasises the hierarchical structure of concepts and the formal expression of mutual relations between concepts [19]. Ontology modelling primarily aims to clarify the relevant concepts in the ontology domain, the attributes and constraints of the concepts and the hierarchical relations between concepts. It also attempts to build a model with reasonable logic and a clear hierarchical structure utilising the above information. The general process of ontology construction includes determining the domain and scope of the ontology, enumerating important concepts in the ontology domain, determining ontology classes and their hierarchical structures, defining the properties of classes, formalising the ontology and evaluating and revising the ontology.

In accordance with this process, the ontologies of the main control point and beam type are constructed, as shown in Fig. 4. In the figure, attributes with direct relations are connected by relationship edges that represent their attribute interactions, whereas attributes without direct relations are not connected, indicating that they have no direct attribute interaction. For example, the category type and priority of the main control point are directly related, whereas the category type and the subcategory priority are only indirectly related. Moreover, the grayscale coefficient, the left distance-rich threshold, the right distance-rich threshold and the angle all affect the actual length of the main control point and thereby influence the beam type selection. Direct relations therefore exist between these attributes and the length attribute, so “effect” relations are added between them, and the corresponding relationship edges are drawn as blue dashed lines in the ontology. In the inner attribute interaction of the main control point, the ontology structure can be utilised to further enhance the interactive feature extraction of the attribute nodes with high matching degrees.

Fig. 4 The global ontology structure of the main control point and beam type

3.4.2 Global Ontology Construction of the Main Control Point and Beam Type

To select a suitable beam type for the main control point, railway bridge engineers must decide whether a type of beam can be arranged for the main control point based on the beam type priority and the minimum beam type priority allowed for the main control point. The actual length of the main control point can be determined by the length, angle, grayscale coefficient, left distance-rich threshold and right distance-rich threshold. Then, the selected beam type should be longer than the actual length of the main control point. Therefore, a global ontology structure must be constructed to enhance the priority and length attribute interaction between the main control point and the beam type.

In this study, the global ontology structure of the main control point beam type is constructed, as shown in Fig. 4. This ontology structure includes the inner and cross-interactions between the main control point and the beam type. Relationship edges denoted by red dashed lines are added to represent the priority and length attribute interactions between the main control point and the beam type. Therefore, the global ontology structure can be used to enhance the interactive feature extraction of attribute nodes with high matching degrees, such as the cross-attribute interaction between the actual length attribute of the main control point and the main span length attribute of the beam, as well as the minimum beam type priority allowed for the main control point and the priority attribute of the beam type.

3.4.3 Node Interaction Capture Based on the Ontology Structure

Different from the inner interaction and cross-interaction described in Sect. 3.3, in this section, the global ontology structure is used to organise the global attribute graph of the main control point and beam type. In the global attribute graph, the inner and cross-interactions of the main control point and the beam type are represented through the graph data structure. The graph convolutional network (GCN) is used to capture the inner and cross-attribute interactions between the main control point and beam type. The node representation captured by the GCN can be expressed as follows:

$$ u^{\prime\prime}_{i} = f_{GCN} \left( {u^{\prime}_{i} } \right), $$
(9)

where \(u^{\prime\prime}_{i}\) is the embedding of node i after being captured by the GCN, and \(u^{\prime}_{i}\) is the embedding of node i after fusion.

3.4.4 Attention Mechanism-Based Graph Pooling

The representations of the attribute graphs can be obtained by aggregating the embeddings of all nodes in the attribute graph. However, the attributes do not contribute equally to beam type selection. The priority of the beam type should be greater than or equal to the minimum beam type priority allowed, and the main span length of the beam should be longer than the actual length of the main control point. Therefore, the minimum priority allowed, the length of the main control point and the main span length of the beam should receive larger attention coefficients in the aggregation of node embeddings. By contrast, attributes of secondary importance, such as the category type of the main control point and the beam joint, should be assigned smaller weight coefficients.

In this section, an attention mechanism-based graph pooling method is used to assign different weight coefficients to the attribute nodes, highlighting the contributions of important attribute nodes to the representation of the attribute graph. Initially, the context information of the main control point and the beam type attribute graph are obtained by calculating the nonlinear transformation of the average node embedding value, as follows:

$$ c = \tanh \left( {\frac{1}{{n_{c} }}W\sum\limits_{i = 1}^{{n_{c} }} {u_{i}^{^{\prime\prime}} } } \right), $$
(10)

where c is the contextual information of the main control point (or beam type) attribute graph, with \(c \in {\mathbb{R}}^{d}\); \(W \in {\mathbb{R}}^{d \times d}\) is a learnable weight matrix through which the nonlinear transformation of the attribute node embeddings is conducted; and \(n_{c}\) is the number of attribute nodes in the main control point (or beam type) attribute graph.

Different attention weights can be assigned to each attribute node in the main control point (or beam type) attribute graph through the contextual information c.

$$ a_{i} = softmax\left( {u_{i}^{^{\prime\prime}T} c} \right), $$
(11)

In the equation, \(a_{i}\) is the attention weight coefficient of node i. The attention weight coefficient captures the global context information c by performing an inner product operation on the embedding of node i and the global context information c. Through the inner product operation, the attribute node similar to the global context information obtains a larger attention weight. Moreover, the attention weight coefficients are normalised via the softmax function. Thus, the sum of the attention weight coefficients for all attribute nodes is 1.

Then, the embeddings of all the attribute nodes are weighted by the corresponding attention coefficients and summed to obtain the final representation of the attribute graph as follows:

$$ h = f_{G} (G,V) = \sum\limits_{i = 1}^{{n_{c} }} {a_{i} } u_{i}^{^{\prime\prime}} , $$
(12)

where \(h \in {\mathbb{R}}^{d}\) is the representation of the attribute graph, G represents a certain attribute graph and V is the set of all nodes in that attribute graph.

Through the above operations, the final representations of the main control point attribute graph and the beam type attribute graph can be obtained as

$$ h_{G}^{C} = f_{G} \left( {G^{C} ,V^{C} } \right),\;h_{G}^{B} = f_{G} \left( {G^{B} ,V^{B} } \right), $$
(13)

where \(h_{G}^{C}\) is the final representation of the main control point attribute graph and \(h_{G}^{B}\) is the final representation of the beam type attribute graph.
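The pooling in Eqs. (10) to (12) can be sketched as follows: a context vector is computed from the mean node embedding, each node is scored by its inner product with the context, the scores are normalised with softmax, and the node embeddings are summed with these weights. The function name `attention_pool`, the weight matrix and the node embeddings are hypothetical.

```python
import math


def attention_pool(node_embeddings, weight_matrix):
    """Return (h, a, c): graph representation, attention weights, context vector."""
    n = len(node_embeddings)
    d = len(node_embeddings[0])
    # Eq. (10): c = tanh(W * mean of node embeddings).
    mean = [sum(u[k] for u in node_embeddings) / n for k in range(d)]
    context = [
        math.tanh(sum(weight_matrix[r][k] * mean[k] for k in range(d)))
        for r in range(d)
    ]
    # Eq. (11): softmax over the inner products u_i^T c.
    scores = [sum(u[k] * context[k] for k in range(d)) for u in node_embeddings]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]
    # Eq. (12): attention-weighted sum of the node embeddings.
    pooled = [sum(attention[i] * node_embeddings[i][k] for i in range(n)) for k in range(d)]
    return pooled, attention, context


# Hypothetical node embeddings (d = 2) and an identity weight matrix.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h, a, c = attention_pool(nodes, [[1.0, 0.0], [0.0, 1.0]])
print(h, a)
```

In this example the third node is the most similar to the context vector and therefore receives the largest attention weight.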

3.5 Graph Matching Layer

The main control point attribute graph and the beam type attribute graph can be matched through the dot product on the graph representations. The prediction score \(y^{\prime}\) can be calculated as follows:

$$ y^{\prime} = \left( {h_{G}^{C} } \right)^{T} h_{G}^{B} $$
(14)

The prediction score of each beam type can be obtained using Eq. (14), and the beam type with the highest prediction score is recommended.
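The recommendation step amounts to a dot product per candidate followed by an arg-max. A minimal sketch is given below; the function name `recommend_beam_type`, the beam type labels and the representation values are hypothetical.

```python
def recommend_beam_type(h_control, beam_representations):
    """Score every beam type with Eq. (14) and return the best one with all scores."""
    scores = {
        name: sum(x * y for x, y in zip(h_control, h_beam))
        for name, h_beam in beam_representations.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical graph representations (d = 2); labels are illustrative only.
h_control = [1.0, 2.0]
beam_representations = {
    "simply_supported_32m": [0.5, 0.1],
    "continuous_48m": [0.2, 0.9],
}
best, scores = recommend_beam_type(h_control, beam_representations)
print(best)
```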

3.6 Model Training and Optimisation

The binary cross-entropy loss function is utilised to optimise the parameters of the proposed recommendation algorithm. The binary cross-entropy loss function is defined as follows:

$$ L = - \frac{1}{N}\sum\limits_{i = 1}^{N} {\left[ {y_{i} \cdot \log \left( {y^{\prime}_{i} } \right) + \left( {1 - y_{i} } \right) \cdot \log \left( {1 - y^{\prime}_{i} } \right)} \right]} , $$
(15)

where N is the total number of samples in model training; \(y_{i}\) is the true label of the i-th training sample, which is 1 for a positive sample and 0 for a negative sample; \(y^{\prime}_{i}\) is the prediction score of the i-th training sample.

In addition, the L2 norm is used to regularise all the parameters of the proposed recommendation algorithm.

$$ R(\theta ) = \frac{1}{N}\sum\limits_{i = 1}^{N} {L\left( {F_{AGM} \left( {x_{i} ;\theta } \right),y_{i} } \right)} + \lambda \left\| \theta \right\|_{2} , $$
(16)
$$ \theta^{ * } = \mathop {\arg \min }\limits_{\theta } R(\theta ), $$
(17)

where \(F_{AGM}\) denotes the prediction function of the proposed method that outputs \(y^{\prime}_{i}\); L(⋅) denotes the loss function, i.e. the binary cross-entropy loss function; θ represents all the parameters of the proposed algorithm; \(\lambda\) is the L2 regularisation weight; \(\theta^{ * }\) denotes the optimised parameters.
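The objective in Eqs. (15) and (16) can be evaluated with a short sketch. It assumes that the prediction scores have already been mapped into the interval (0, 1), for example by a sigmoid, which the text does not state explicitly; the clipping constant `eps`, the function names and the numerical values are likewise hypothetical.

```python
import math


def bce_loss(labels, scores, eps=1e-12):
    """Binary cross-entropy of Eq. (15); scores are assumed to lie in (0, 1)."""
    total = 0.0
    for y, p in zip(labels, scores):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


def regularised_objective(labels, scores, params, lam):
    """Eq. (16): mean loss plus lambda times the L2 norm of the parameters."""
    l2_norm = math.sqrt(sum(t * t for t in params))
    return bce_loss(labels, scores) + lam * l2_norm


# Hypothetical labels, scores and parameters.
loss = bce_loss([1, 0], [0.5, 0.5])
objective = regularised_objective([1, 0], [0.5, 0.5], [3.0, 4.0], 0.1)
print(loss, objective)
```

In practice the minimisation in Eq. (17) is carried out by a gradient-based optimiser in the deep learning framework rather than by evaluating the objective directly.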

4 Experiments

To evaluate the effectiveness and accuracy of the proposed recommendation model AGOAM, experiments were conducted on a real-world railway bridge design dataset. In addition, the effectiveness of ontology-enhanced attribute interaction and attention mechanism-based graph pooling was evaluated through an ablation study.

4.1 Experimental Setting

4.1.1 Railway Bridge Design Dataset

In this study, the design documents of several existing railway bridge projects are collected. Then, the attributes of the main control points, as well as those of the beam type are extracted to construct a railway bridge main control point beam type design dataset. In the dataset, each sample contains the attributes of one main control point and those of the corresponding beam type. The detailed statistics of the dataset are summarised in Table 3.

Table 3 Statistics of the railway bridge design dataset

The railway bridge design dataset contains 13 types of main control points, totalling 1530 main control points. The length of the main control points ranges from 5 to 155 m, and the angle ranges from 60 to 120 degrees. The other attributes of the main control points in the dataset are the default attributes shown in Table 1. In addition, the dataset includes nine beam types, as detailed in Table 2.

In this study, the combination of each main control point and its actual beam type is regarded as a positive sample, and the combination of the main control point and each remaining beam type is regarded as a negative sample. Thus, each main control point corresponds to one positive sample and eight negative samples. The 1530 main control points in the dataset therefore yield a total of 13,770 main control point beam type pair samples. These samples are divided into two sets: the training set and the test set. Approximately 80% of the samples in the dataset were randomly selected as the training set, and the remaining 20% of the samples were regarded as the test set.
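The sample construction and split can be reproduced with a short sketch that checks the arithmetic: 1530 main control points paired with nine beam types give 1530 × 9 = 13,770 samples, of which 1530 are positive. The function names, the placeholder identifiers, the assignment of actual beam types and the random seed are hypothetical.

```python
import random


def build_pairs(control_points, beam_types):
    """Pair each control point with every beam type; label 1 only for the actual one."""
    return [
        (cp_id, beam, 1 if beam == actual else 0)
        for cp_id, actual in control_points
        for beam in beam_types
    ]


def split_pairs(pairs, train_fraction=0.8, seed=0):
    """Randomly split the samples into a training set and a test set."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder data: nine beam type labels and 1530 control points.
beam_types = [f"B{k}" for k in range(9)]
control_points = [(i, beam_types[i % 9]) for i in range(1530)]

pairs = build_pairs(control_points, beam_types)
train_set, test_set = split_pairs(pairs)
print(len(pairs), len(train_set), len(test_set))
```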

4.1.2 Test Environment

This study used Python 3.9 as the programming environment and PyTorch 1.13.1 as the deep learning framework. The specific test environment is shown in Table 4.

Table 4 The test environment

4.1.3 Evaluation Indicators

In recommender systems, the evaluation of recommendation results is a crucial step, and the performance of a recommendation model is directly reflected by the evaluation indicators. In this study, four commonly used evaluation indicators, namely, the area under the ROC curve (AUC) [20], logarithmic loss (Logloss) [21], precision [22] and normalised discounted cumulative gain (NDCG) [23], are utilised to evaluate the performance of the proposed recommendation model.

AUC indicates the probability that, for a randomly selected pair of positive and negative samples, the score of the positive sample is greater than that of the negative sample. In a recommender system, the AUC is used to evaluate the ranking ability of a model. Its value lies between 0 and 1, where 0.5 corresponds to random ranking, and a larger AUC value indicates better performance of the recommender system.
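The pairwise definition of the AUC can be computed directly for small examples. The sketch below counts, over all positive and negative pairs, how often the positive sample scores higher, with ties counted as one half by convention; the function name `auc` and the scores are hypothetical.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and prediction scores.
auc_value = auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.3])
print(auc_value)
```

Three of the four positive and negative pairs are ordered correctly here, giving an AUC of 0.75.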

The Logloss measures the difference between the predicted score and the true label; a smaller Logloss indicates better performance of the recommender system.

Precision refers to the proportion of true positive samples amongst all samples that are predicted to be positive. The calculation formula is expressed as follows:

$$ {\text{Precision}} = \frac{TP}{{TP + FP}}, $$
(18)

where TP represents the number of true positive examples and FP represents the number of false positive examples. A higher precision value indicates better recommendation performance.
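Eq. (18) can be evaluated as in the minimal sketch below. The decision threshold of 0.5 used to turn scores into positive predictions is an assumption made here, as is the convention of returning 0 when no sample is predicted positive; the text does not state either.

```python
def precision(labels, scores, threshold=0.5):
    """TP / (TP + FP) for samples whose score reaches the (assumed) threshold."""
    predicted_positive = [y for y, s in zip(labels, scores) if s >= threshold]
    if not predicted_positive:
        return 0.0  # convention chosen here when nothing is predicted positive
    tp = sum(predicted_positive)              # predicted positive and actually positive
    fp = len(predicted_positive) - tp         # predicted positive but actually negative
    return tp / (tp + fp)


labels = [1, 0, 1, 0]
scores = [0.8, 0.7, 0.6, 0.2]
p = precision(labels, scores)  # three predicted positives, two of them correct
print(p)
```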

The NDCG represents the normalised discounted cumulative gain, which is a common indicator used to measure the quality of top-k recommendations and is obtained by normalising the DCG. The cumulative gain (CG) accumulates the gain of the k items in the recommendation list, where the gain denotes the relevance of each item. The DCG additionally considers the order of the items in the recommendation list and introduces a discount factor to capture the importance of items in different positions of the list. Consequently, the influence of low-ranking items is increasingly weakened. The calculation formula of the DCG can be expressed as follows:

$$ DCG_{k} = \sum\limits_{i = 1}^{k} {\frac{{2^{{rel_{i} }} - 1}}{{\log_{2} (i + 1)}}} , $$
(19)

where k indicates the size of the recommendation list to be observed, and \(rel_{i}\) denotes the relevance of the i-th recommendation result in the recommendation list. The DCG has no fixed upper bound: its value grows with the length of the recommendation list and the number of relevant results, so the DCG values of different lists are not directly comparable. Therefore, the DCG is normalised to obtain the NDCG as follows:

$$ NDCG_{k} = \frac{{DCG_{k} }}{{IDCG_{k} }}, $$
(20)

where IDCG represents the DCG under ideal conditions, i.e. when the results are sorted in descending order of relevance. The NDCG value lies in the interval [0, 1]. A value closer to 1 indicates a better ranking effect, which intuitively reflects the ranking quality of the recommender system.
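Eqs. (19) and (20) translate into the short sketch below. The binary relevance list, in which the single actual beam type has relevance 1 and the eight other candidates have relevance 0, is an assumption made for illustration; the function names are hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """Eq. (19): discounted cumulative gain of the first k ranked items."""
    return sum((2 ** rel - 1) / math.log2(i + 1)
               for i, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(relevances, k):
    """Eq. (20): DCG divided by the DCG of the ideally ordered list."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Nine ranked candidates; the actual beam type is ranked second (assumed example).
ranked_relevances = [0, 1, 0, 0, 0, 0, 0, 0, 0]
ndcg1 = ndcg_at_k(ranked_relevances, 1)
ndcg2 = ndcg_at_k(ranked_relevances, 2)
ndcg3 = ndcg_at_k(ranked_relevances, 3)
print(ndcg1, ndcg2, ndcg3)
```

With a single relevant item, the NDCG@k reduces to 1/log2(r + 1) when the item is ranked at position r ≤ k, and to 0 otherwise.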

4.2 Test results

4.2.1 Implementation Details

In this section, the implementation details of the proposed method are described. The samples in the training set are used for model training, and those in the test set are subsequently used to evaluate the model.

The main control point and beam type attributes of each sample are expressed in the form of key–value pairs. The railway bridge design dataset used in this study comprises 12 main control point attribute names and 10 beam type attribute names. Thus, the total number of main control point and beam type attribute names, \(n_{a}\), is equal to 22. A 22-dimensional vector table is then initialised randomly, and each key–value pair is assigned an initialised encoding vector based on its attribute name. Subsequently, the encoding vectors are mapped to 64-dimensional initialised embedding vectors, i.e. the embedding dimension d is set to 64. In the inner attribute interaction process, the dropout rate of the fully connected neural network (MLP) is set to 0.5.
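One possible reading of this initialisation is sketched below in plain Python. It is an interpretation under stated assumptions, not the authors' implementation: the attribute names are placeholders, the Gaussian initialisation and its scales are arbitrary, the mapping to 64 dimensions is taken to be a linear projection, and the way attribute values enter the encoding is not modelled.

```python
import random

N_ATTRS = 22     # 12 main control point + 10 beam type attribute names (from the text)
EMBED_DIM = 64   # embedding dimension d (from the text)

rng = random.Random(0)  # fixed seed, chosen here only for reproducibility

# Placeholder attribute names (assumption); the real names come from the dataset.
attr_names = [f"mcp_attr_{i}" for i in range(12)] + [f"beam_attr_{i}" for i in range(10)]

# One randomly initialised 22-dimensional encoding vector per attribute name.
encoding_table = {name: [rng.gauss(0.0, 1.0) for _ in range(N_ATTRS)]
                  for name in attr_names}

# Randomly initialised 22 x 64 projection; it would be trainable in a real model.
projection = [[rng.gauss(0.0, 0.1) for _ in range(EMBED_DIM)] for _ in range(N_ATTRS)]


def embed(name):
    """Look up the encoding of an attribute name and project it to 64 dimensions."""
    enc = encoding_table[name]
    return [sum(enc[i] * projection[i][j] for i in range(N_ATTRS))
            for j in range(EMBED_DIM)]


# Hypothetical key-value pairs of one sample; the values are not used in this sketch.
sample = {"mcp_attr_0": "value_a", "beam_attr_3": "value_b"}
embedded = {name: embed(name) for name in sample}
print({name: len(vec) for name, vec in embedded.items()})
```

In PyTorch, a lookup followed by a projection of this kind would typically be expressed with `torch.nn.Embedding` and `torch.nn.Linear`.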

In the model training and optimisation stage, the prediction score \(y^{\prime}\) and the actual label of the main control point beam type sample \(y_{i}\) are used as inputs, and the loss function value is calculated using Eq. (15). The parameters of the proposed recommendation algorithm are optimised based on the loss function value. The recommendation results of the proposed model are evaluated using the samples in the test set. The trained model is used to recommend beam types for the main control points, and the differences between the prediction scores and the actual labels of the samples are evaluated.

4.2.2 Effect of Feature Learning on the Embedding Vector

For the main control points in the test set, their embedding vectors are converted into a two-dimensional space through the dimensionality reduction technique t-SNE [24]. Visualisation of the embedding vectors can be realised by regarding the two-dimensional vector as the x-axis and y-axis, and using the corresponding beam types of the main control points as annotations.

A visualisation of the embedding vectors before and after model training is shown in Fig. 5. As shown in Fig. 5a, the embedding vectors prior to model training are intermingled after dimensionality reduction, indicating that they have no clearly distinguishable features. Thus, they cannot be directly used for classification. Figure 5b shows that the embedding vectors after model training form evident clusters after dimensionality reduction. The embedding vectors labelled with the same beam type are close to each other on the graph, whereas the embedding vectors labelled with different beam types are far apart. The visualisation results indicate that the model learns the features of the embedding vectors effectively and achieves excellent classification ability.

Fig. 5 Visualisation of the embedding vectors based on t-SNE

4.2.3 Visualisation of Recommendation Results

The performance of the trained AGOAM model is visualised through a confusion matrix. The predicted and actual beam types of each main control point in the test set are used to construct the confusion matrix, as shown in Fig. 6. Each row of the matrix represents an actual beam type, and each column represents a beam type predicted by the AGOAM model. Each value in the matrix is the number of main control points whose actual beam type corresponds to the row and whose predicted beam type corresponds to the column. For example, the value in the first row and first column indicates that the 32 m simply supported beam is correctly predicted 16 times. Similarly, the value of 1 in the first row and second column indicates that a 32 m simply supported beam is incorrectly predicted as a 48 m continuous beam once. The confusion matrix illustrates that beam types are rarely mispredicted, and that a mispredicted beam type is usually a category adjacent to the actual beam type.
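The row and column convention of the confusion matrix can be made concrete with the following minimal sketch. The three classes and the label lists are toy values chosen for illustration and do not reproduce Fig. 6.

```python
def confusion_matrix(actual, predicted, num_classes):
    """matrix[a][p] counts samples with actual class a (row) and predicted class p (column)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


# Toy labels for three beam type classes (assumed data).
actual = [0, 0, 0, 1, 1, 2]
predicted = [0, 0, 1, 1, 1, 2]
cm = confusion_matrix(actual, predicted, 3)

# Correct predictions lie on the diagonal.
accuracy = sum(cm[i][i] for i in range(3)) / len(actual)
print(cm, accuracy)
```

Here `cm[0][1] == 1` records one sample of class 0 that was predicted as class 1, which mirrors the first-row, second-column entry discussed above.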

Fig. 6 Confusion matrix for the validation dataset after the last round of model training

4.2.4 Analysis of Evaluation Index Results

The AGOAM model is trained using the samples in the training set. After each iteration of the algorithm, the samples in the test set are used to evaluate the performance of the proposed model. Figure 7 shows the changes in each evaluation index with the number of iterations.

Fig. 7 The evaluation index results of different iterations

Figure 7a demonstrates that, as the number of iterations increases, the Logloss value of the model on the test set exhibits a downward trend. In addition, the rate of decline decreases as the number of iterations increases, and the value ultimately stabilises. In the last iteration, the Logloss value falls to 0.34914, indicating that the final model has excellent prediction performance.

Figures 7b and 7c show that the precision and AUC values increase significantly in the first five rounds of training. Both values continue to rise slightly over the following 45 rounds of training and eventually become relatively stable. After the last round of training, the precision of the model on the test set is 0.91883 and the AUC is 0.99323. These results show that the AGOAM model has excellent prediction accuracy.

Figure 7d shows the trend of the NDCG indicators during the iteration process. The curves NDCG@1, NDCG@2 and NDCG@3 represent the recommendation quality of the top-1, top-2 and top-3 items of the recommendation result list, respectively. The three indicators increase significantly during the first five rounds of training, from approximately 0.7 to approximately 0.9. They then increase gradually during the subsequent training and eventually stabilise. At the end of training, NDCG@1, NDCG@2 and NDCG@3 reach high values of 0.85152, 0.95620 and 0.94261, respectively, indicating that the model has excellent ranking performance in the top-k predictions.

4.3 Comparison Study

The proposed method is compared with two classical methods, namely the factorization machine (FM) and attentional factorization machine (AFM) models. The FM model, a popular machine learning approach based on matrix factorization, is widely employed in recommender systems. The AFM model integrates an attention network into the FM model to assign varying weights to feature interactions, resulting in enhanced prediction accuracy. A comparative analysis of the models is presented in Table 5.

Table 5 The evaluation indexes of different models

Table 5 illustrates that the AGM model exhibits higher values for the AUC, NDCG@1, and NDCG@3 indicators compared to the FM and AFM models, whilst demonstrating a lower Logloss indicator. These results suggest superior performance of the AGM model relative to the FM and AFM models. This performance advantage may be attributed to the AGM model’s utilisation of neural networks to capture attribute interaction information, distinguishing it from the FM and AFM models that rely on simplistic dot product operations.

4.4 Parameter Study

The AGOAM model uses a GNN, specifically a GCN, to enhance the inner and cross-attribute interactions between the main control point and the beam type. To study the impact of the number of GCN layers on the recommendation results, the number of layers was set to 1, 2, 3, 4 and 5. Except for the input and output layers, all GCN layers have the same feature dimension. The recommendation accuracy of the AGOAM model corresponding to different numbers of GCN layers is shown in Table 6.

Table 6 The recommendation accuracy of the AGOAM model under different numbers of GCN layers

Table 6 shows that the values of AUC and NDCG@1 are largest when the number of GCN layers is 4, which means that the recommendation accuracy of the AGOAM model is highest under this condition. In a GNN, the feature information of the global attribute graph can be aggregated into the attribute nodes only when the number of convolutional layers is sufficient. When the number of convolutional layers is insufficient, the hidden interaction feature information cannot be aggregated. Consequently, the interaction feature information becomes insufficient, thereby affecting the accuracy of the recommendation result. For the global ontology structure of the main control point beam type, a four-layer GNN is sufficient for aggregating the direct and hidden interaction information of the global attribute graph. Thus, the ontology structure effectively improves the recommendation accuracy. A five-layer GNN aggregates excessive feature interaction information into the attribute nodes, decreasing the recommendation accuracy of the AGOAM model.
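The link between the number of layers and the reach of the aggregation can be illustrated with a small sketch. It assumes that each layer aggregates information from direct neighbours only, so that k layers cover the k-hop neighbourhood of a node; the chain-shaped graph is a hypothetical example and not the ontology structure used in this study. The sketch shows only how far information can travel and does not model the accuracy drop observed with five layers.

```python
def receptive_field(adjacency, node, num_layers):
    """Nodes whose features can reach `node` after `num_layers` rounds of neighbour aggregation."""
    reached = {node}
    for _ in range(num_layers):
        # One layer: every reached node also pulls in its direct neighbours.
        reached |= {nbr for n in reached for nbr in adjacency[n]}
    return reached


# Hypothetical chain-shaped attribute graph: 0 - 1 - 2 - 3 - 4 - 5
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

# Size of the receptive field of node 0 for 1 to 5 layers.
sizes = [len(receptive_field(adjacency, 0, k)) for k in range(1, 6)]
print(sizes)
```

On this chain, node 0 receives information from node 4 only once four layers are stacked, which is the sense in which too few layers leave indirect interactions unaggregated.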

4.5 Ablation Study

An ablation study is conducted for the proposed AGOAM model to evaluate the effectiveness of ontology-enhanced attribute interaction and attention mechanism-based graph pooling. In this study, the recommendation results of several variant models of the AGOAM are compared. The variant that retains only the global ontology structure of the main control point beam type is called AGOM, whereas the variant that retains only the attention mechanism-based graph pooling is called AGAM. Moreover, the variant with neither the global ontology structure nor the attention mechanism-based graph pooling is called AGM. Each model is trained using the samples in the training set, and the samples in the test set are used to evaluate the recommendation results. The calculation results of the evaluation indicators are summarised in Table 7.

Table 7 The evaluation indexes of different models

Table 7 shows that the AGOM and AGAM models achieve certain improvements in AUC, NDCG and other indicators compared with the AGM model. For the AGOM model, the main control point beam type ontology structure enhances the inner and cross-interaction of the corresponding attribute nodes, allowing the model to further extract the feature information. Consequently, the recommendation accuracy of the AGOM model is improved. For the AGAM model, the attention mechanism-based graph pooling assigns different attention coefficients to the attribute nodes, which enhances the recommendation accuracy. In addition, the recommendation results of the AGAM model are remarkably better than those of the AGOM model, indicating that the attention mechanism-based graph pooling has a greater impact on the recommendation results than the ontology structure. Furthermore, the AGOAM model achieves greater improvement than either the AGOM or the AGAM model. This result indicates that ontology-enhanced attribute interaction and attention mechanism-based graph pooling do not conflict and that their benefits can be combined.
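The idea of pooling attribute nodes with different attention coefficients can be sketched generically as a softmax-weighted sum. This is not the pooling formula of the AGOAM model, which is defined in the method section; the attention scores below are fixed toy numbers, whereas in a trained model they would be produced by learned parameters, and the function name `attention_pool` is hypothetical.

```python
import math


def attention_pool(node_vectors, scores):
    """Softmax-normalise the scores and return the weighted sum of the node vectors."""
    m = max(scores)                                  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]              # attention coefficients, summing to 1
    dim = len(node_vectors[0])
    pooled = [sum(w * vec[j] for w, vec in zip(weights, node_vectors))
              for j in range(dim)]
    return pooled, weights


# Three attribute nodes with 2-dimensional features (toy values).
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

mean_pooled, mean_weights = attention_pool(nodes, [0.0, 0.0, 0.0])   # equal scores: plain mean
focus_pooled, focus_weights = attention_pool(nodes, [5.0, 0.0, 0.0]) # first node dominates
print(mean_pooled, focus_pooled)
```

With equal scores the operation reduces to mean pooling, so the attention coefficients are what allow informative attribute nodes to contribute more to the graph-level representation.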

5 Conclusions

In this study, a recommendation algorithm based on GNNs is proposed to efficiently recommend the bridge beam type for the main control points on a railway route. A GNN is utilised to learn from existing railway bridge design plans, and the recommended beam type of each main control point is obtained through graph matching technology.

A real-world railway bridge design dataset is used to demonstrate the accuracy of the algorithm. The analysis of the evaluation index results shows that the algorithm can achieve highly accurate beam type recommendations using the main control point attributes. Moreover, the ablation study results reveal that the ontology-enhanced attribute interaction and attention mechanism-based graph pooling effectively improve the recommendation accuracy. However, this study considers only first-order attribute interactions. Future research could explore higher-order attribute interactions to enhance model performance.