Generalizable surrogate model features to approximate stress in 3D trusses

https://doi.org/10.1016/j.engappai.2018.01.006

Abstract

Existing neural network (NN) models that predict the results of finite element analysis (FEA) of 3D trusses are not generalizable. For example, a model designed for a ten-bar truss cannot accurately predict the analysis results of a 12-bar truss. Such changes require new sample data and model retraining, reducing the time-saving value of the approach. This paper introduces Generalizable Surrogate Models (GSMs) that use a set of feature descriptors of physical structures to aggregate analysis data from various structures, enabling a more general model that predicts performance across a variety of geometric classes, topologies, and boundary conditions. The paper presents the training of generalizable models on parametric dome, wall, and slab structures, and demonstrates the accuracy and generalizability of these GSMs compared to traditional NNs. The results first demonstrate how to combine and use analysis data from various structures to predict the performance of members of structures of the same class with different topologies and boundary conditions. The results further demonstrate that these GSMs predict FEA results more closely than NN models created exclusively for a specific structure. The methodology of this study can be adopted by researchers and engineers to create predictive models for the approximation of FEA.

Introduction

One way to reduce the time of structural optimization is the prediction of the results of FEA using approximation-based models. These simple analytical models, known as “metamodels”, are based on data available from limited analysis runs. These “models of the model” seek to approximate computation-intensive functions within a considerably shorter time than expensive simulation codes that require significant computing power.

To create metamodels for structural optimization, researchers have borrowed techniques from machine learning: for instance, using neural networks (NNs) to predict displacement (Hajela and Berke, 1992), maximum stress (Cho et al., 2007), cross-sectional area (Kaveh and Servati, 2001; Ramasamy and Rajasekaran, 1996), and the optimal design of truss and grid structures (Kamyab-Moghadas et al., 2012). The input parameters (or feature descriptors) of these models are mostly the cross-sectional areas of individual members and some parameters of the overall structure, such as length and height. Given the limited set of input parameters, all the NN models that appear in the literature have been designed for specific structures, so they are not generalizable to predicting the performance of other structures. For instance, if a model is designed for a five-bar truss, it cannot be used to predict the structural performance of another five-bar truss with a different boundary condition (e.g., different applied forces). In addition, these models cannot be re-used if the topology of the structure changes (e.g., from a five-bar truss to a six-bar truss): when the topology changes, new sample data must be generated and the model re-trained. Therefore, the predictive power and time-saving value of these exclusive models are limited. The objective of this study is to introduce and test a set of feature descriptors that can be used to create generalizable metamodels for a range of geometric classes of structures. The scope of this paper is to train and test the metamodels for predicting the stress values of truss members in slab, wall, and dome geometric classes of structures, in linear static mode.

Estimating the performance of design objectives instead of computing them is known as surrogate modeling (Queipo et al., 2005), or, interchangeably in this study, metamodeling. This technique approximates design objectives by determining the continuous function of design variables from a limited set of data. Metamodels can be used to predict results within a considerably shorter period of time than simulation code based on data available from limited analysis runs (Forrester et al., 2008). The efficiency of metamodels becomes imperative as designers seek interactive feedback and wish to construct and explore a design space to investigate the effect of design variables on objectives. Metamodels help formulate optimization problems that are more accurate and easier to solve (Wang and Shan, 2006).

After selecting the optimization variables, the metamodeling process involves choosing the sampling method, selecting the metamodel and fitting it to the sample data, and validating and assessing it for accuracy. Afterwards, a search function finds new design samples for analysis (Forrester and Keane, 2009). The following sections describe various sampling plans and the metamodeling choice of this study.
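As a minimal illustration of this loop, the sketch below replaces the expensive simulation with a hypothetical stand-in function (expensive_fea) and fits a simple quadratic surrogate by least squares; the sampling plan, surrogate choice, and validation split are assumptions for illustration, not those of the paper.

```python
import numpy as np

def expensive_fea(x):
    """Stand-in for an expensive simulation code (hypothetical)."""
    return np.sin(3 * x[..., 0]) + 0.5 * x[..., 1] ** 2

# 1. Sampling plan: analyze only a limited number of designs.
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(50, 2))
y_train = expensive_fea(X_train)            # simulation runs on samples only

# 2. Fit a metamodel to the sample data (here: quadratic least squares).
def features(X):
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 * x2, x1**2, x2**2])

coef, *_ = np.linalg.lstsq(features(X_train), y_train, rcond=None)

# 3. Validate the metamodel on held-out samples.
X_test = rng.uniform(0.0, 1.0, size=(20, 2))
rmse = np.sqrt(np.mean((features(X_test) @ coef - expensive_fea(X_test)) ** 2))
print(f"validation RMSE: {rmse:.4f}")
```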

The first step in the design of metamodels is to develop a sampling plan. During this step, designers select a limited number of designs from a design space for analysis with simulation codes. Although an increased number of samples would improve the accuracy of the metamodels, it would also increase the computational time required to analyze the model. Therefore, selecting an efficient sampling technique is crucial to the success of predictive models.

In classic methods originating from the theory of Design of Experiments, designers pre-select their sampling points so that the evaluation of their hypothesis becomes independent of random errors in their physical experiments (Wang and Shan, 2006). Among these methods, one of the most convenient sampling techniques is "full factorial", in which designers split the design space into rectangular grids from which they uniformly pick their points. To improve this approach, designers generate random sub-samples within each grid to ensure a uniform projection of samples on each axis. This method is called "stratified random sampling", the basis for the Latin square and the random Latin hypercube. In the Latin square, an $n \times n$ grid is created and filled with $(1, 2, 3, \ldots, n)$ so that each number appears only once in each row or column. The Latin hypercube is a multi-dimensional extension of the Latin square (Forrester et al., 2008).
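A random Latin hypercube on the unit cube can be sketched in a few lines of numpy: each axis is split into equal strata, one point is drawn per stratum, and the strata are shuffled independently per dimension. This is a generic illustration, not the sampling code used in the study.

```python
import numpy as np

def latin_hypercube(n_samples: int, n_dims: int, rng=None) -> np.ndarray:
    """Draw a random Latin hypercube sample on the unit cube.

    Each axis is split into n_samples equal strata; every stratum is
    used exactly once per axis, giving a uniform 1-D projection.
    """
    rng = np.random.default_rng(rng)
    # One random offset inside each stratum, per dimension.
    offsets = rng.uniform(size=(n_samples, n_dims))
    strata = (np.arange(n_samples)[:, None] + offsets) / n_samples
    # Shuffle the strata independently along each axis.
    for d in range(n_dims):
        strata[:, d] = rng.permutation(strata[:, d])
    return strata

samples = latin_hypercube(10, 3, rng=42)   # 10 designs, 3 design variables
```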

Even though studies have presented various metamodeling techniques, such as polynomials (linear, quadratic, or higher order), splines (linear, cubic, NURBS), kriging, radial basis functions (RBF), decision trees, random forests, and support vector machines, they have reached no consensus about which model is superior to the others (Wang and Shan, 2006). This study uses artificial NNs because the technique scales well to the thousands of data points generated from structural analysis. This metamodeling technique is explained in the following section.

The artificial NN model was first introduced as a mathematical representation of information processing in biological systems (McCulloch and Pitts, 1943). Later, the technique was broadly adopted for pattern recognition. This section describes a specific class of artificial NNs, the multilayer perceptron, which, as Bishop (2006) demonstrates, has proven to have considerable practical value.

A multilayer perceptron (MLP) is a feed-forward, nonlinear NN function with a vector of input parameters $x_i$, output parameters (or neurons) $y_k$, adjustable control parameters $\mathbf{w}$, and non-linear basis functions $\varphi_j(\mathbf{x})$. Through the combination of non-linear basis functions, the MLP extends the general representation of linear regression and classification models:

$$y_k(\mathbf{x}, \mathbf{w}) = f\left(\sum_{j=1}^{M} w_j \varphi_j(\mathbf{x})\right).$$

The goal of constructing a NN model is to replace the basis function with a set of parameters that can be adjusted during the training process. In other words, a series of functional transformations known as activations $a_j$ are constructed from a linear combination of the input parameters and transformed using a differentiable, nonlinear activation function such as a sigmoid:

$$a_j = \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)},$$

where $w_{ji}^{(1)}$ are the weights and $w_{j0}^{(1)}$ the biases, the superscript $(1)$ indicates weights and biases in the first layer of the network, $D$ is the number of inputs, and $j = 1, \ldots, M$, with $M$ the total number of hidden units. Similarly, for the last layer of a two-layer network, we obtain

$$a_k = \sum_{j=1}^{M} w_{kj}^{(2)} z_j + w_{k0}^{(2)},$$

where $w_{kj}^{(2)}$ are the weights and $w_{k0}^{(2)}$ the biases of the second layer of the network, $k = 1, \ldots, K$, with $K$ the total number of outputs, and $z_j = h(a_j)$ is the activation of the first layer. Using an appropriate activation function for the final layer, we can compute the outputs of the model, $y_k = \sigma(a_k)$, where

$$\sigma(a) = \frac{1}{1 + \exp(-a)}.$$

The following equation represents the full NN model with a sigmoidal activation function (Bishop, 2006):

$$y_k(\mathbf{x}, \mathbf{w}) = \sigma\left(\sum_{j=1}^{M} w_{kj}^{(2)}\, h\!\left(\sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)}\right) + w_{k0}^{(2)}\right).$$
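These equations transcribe almost directly into code. The sketch below assumes illustrative layer sizes and tanh as the hidden-layer function $h$; it is a generic two-layer perceptron, not the paper's trained model.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Two-layer perceptron: y = sigma(W2 @ h(W1 @ x + b1) + b2)."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    a_hidden = W1 @ x + b1       # a_j = sum_i w_ji^(1) x_i + w_j0^(1)
    z = np.tanh(a_hidden)        # z_j = h(a_j), with h chosen as tanh
    a_out = W2 @ z + b2          # a_k = sum_j w_kj^(2) z_j + w_k0^(2)
    return sigmoid(a_out)        # y_k = sigma(a_k)

D, M, K = 4, 8, 2                # inputs, hidden units, outputs (illustrative)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(M, D)), np.zeros(M)
W2, b2 = rng.normal(size=(K, M)), np.zeros(K)
y = mlp_forward(rng.normal(size=D), W1, b1, W2, b2)   # K outputs in (0, 1)
```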

One way to compute the adjustable parameters $\mathbf{w}$ is to use a set of input vectors $x_n$ and target vectors $t_n$ as a training set. The objective is to fit a curve that satisfies the input and output parameters by minimizing the following error function in regression problems:

$$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \left\| y(x_n, \mathbf{w}) - t_n \right\|^2.$$

Minimizing the error function is an iterative process that adjusts the weight matrices in a sequence of steps, propagating error information backwards from the output layer toward the input layer. This method, called "back-propagation (BP)", is one of the most efficient ways of computing the derivative of the error function with respect to the weights (Bishop, 2006).
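For the same two-layer network, one batch gradient-descent step on this sum-of-squares error can be sketched as follows; the shapes, learning rate, and activation choices are illustrative assumptions.

```python
import numpy as np

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

def backprop_step(X, T, W1, b1, W2, b2, lr=0.1):
    """One gradient-descent step on E(w) = 0.5 * sum_n ||y_n - t_n||^2."""
    # Forward pass (same two-layer network as above).
    A1 = X @ W1.T + b1                          # (N, M) hidden pre-activations
    Z = np.tanh(A1)                             # z_j = h(a_j)
    Y = sigmoid(Z @ W2.T + b2)                  # (N, K) network outputs

    # Backward pass: propagate error from the output layer toward the input.
    delta_out = (Y - T) * Y * (1 - Y)           # dE/da_k through the sigmoid
    delta_hid = (delta_out @ W2) * (1 - Z**2)   # dE/da_j through tanh

    W2 -= lr * delta_out.T @ Z                  # in-place weight updates
    b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * delta_hid.T @ X
    b1 -= lr * delta_hid.sum(axis=0)
    return 0.5 * np.sum((Y - T) ** 2)           # current error E(w)

rng = np.random.default_rng(0)
N, D, M, K = 32, 4, 8, 2
W1, b1 = rng.normal(size=(M, D)), np.zeros(M)
W2, b2 = rng.normal(size=(K, M)), np.zeros(K)
X, T = rng.normal(size=(N, D)), rng.uniform(size=(N, K))
for _ in range(100):
    err = backprop_step(X, T, W1, b1, W2, b2)   # err decreases over iterations
```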

Metamodels are applied to optimization problems using three techniques. Traditionally, engineers design global metamodels and use them as surrogates of expensive objective functions. This sequential approach requires a large data set of examples, but it may not allow systematic model validation. Another technique is an adaptive strategy that allows both validation and optimization in a loop: designers update and validate the metamodel as they evaluate more samples in the optimization loop. The adaptive approach can help reduce the computation time in multi-dimensional design spaces, which suffer from the so-called "curse of dimensionality": if a variable is sampled in each cell of a one-dimensional design space divided into $n$ segments, it must be sampled $n^k$ times in a $k$-dimensional space to achieve the same sampling density, significantly increasing the computation time (Forrester et al., 2008). The last approach, direct sampling, uses metamodels as a guide for adaptive sampling and excludes them from the optimization loop (Wang and Shan, 2006).
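The $n^k$ growth is easy to make concrete (illustrative numbers only):

```python
# Full-grid sampling density: n samples per axis in k dimensions costs n**k runs.
n = 10
for k in (1, 2, 3, 6):
    print(f"k={k}: {n**k:,} samples")   # 10; 100; 1,000; 1,000,000
```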

Section snippets

Reviews of metamodels developed for structural analysis or optimization

In recent years, deep neural networks have received an increasing amount of attention. As a result, various models such as generative adversarial networks (Goodfellow et al., 2014) and variational autoencoders (Kingma and Welling, 2013) have been studied and tested in various domains such as image processing (Mescheder et al., 2017), speech recognition (Hsu et al., 2017), and 3D reconstruction (Gwak et al., 2017). These deep models, however, have not been applied to the approximation of linear static

Proposed approach

The GSM methodology has two main phases: feature generation, and model creation and verification (Fig. 1). The goal of the first phase is to generate a set of feature vectors that encode various geometries of space frames or trusses into machine-readable codes. Using samples of feature vectors generated in this phase, the aim of the second phase is to develop, test, and verify NN models that predict structural performance (as validated by the results of an FEA) of various structures.
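A per-member feature vector along the lines the paper describes (see the descriptor list in the Conclusions) might be assembled as below; the specific fields, encodings, and helper names are illustrative assumptions, not the published implementation.

```python
import numpy as np

def member_feature_vector(node_a, node_b, area, supports, loads,
                          n_neighbors, k_local):
    """Encode one truss member as a fixed-length, machine-readable vector.

    Descriptors loosely follow the paper's list (node coordinates,
    proximity to supports and loads, member area, number of neighbors,
    local stiffness); exact encodings are assumed for illustration.
    """
    node_a, node_b = np.asarray(node_a, float), np.asarray(node_b, float)
    midpoint = 0.5 * (node_a + node_b)

    def nearest(points):
        return min(np.linalg.norm(midpoint - np.asarray(p, float))
                   for p in points)

    return np.concatenate([
        node_a, node_b,                 # coordinates of both end nodes
        [nearest(supports)],            # proximity to the nearest support
        [nearest(loads)],               # proximity to the nearest load
        [area, n_neighbors, k_local],   # member area, neighbor count, stiffness
    ])

v = member_feature_vector([0, 0, 0], [1, 0, 1], area=0.01,
                          supports=[[0, 0, 0]], loads=[[1, 0, 1]],
                          n_neighbors=4, k_local=2.1e6)
print(v.shape)   # (11,)
```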

Experiment

We use two experiments to demonstrate how the results of FEA of various structures can be combined to create a metamodel that predicts the stress values of the members of similar structures. In the first experiment, we combine data from the same geometric class, whereas in the second experiment, we combine data from three different geometric classes. Data combination is the mixing of samples of analysis data from various structures; for instance, the first five rows of Fig. 8 are the feature
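In code, data combination amounts to stacking the per-member samples of several structures into a single training set. The sketch below uses randomly generated placeholder data and assumed shapes purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 11   # length of a member feature vector (illustrative)

# Hypothetical FEA samples from three structures of the same class:
# rows are member feature vectors, targets are member stress values.
structures = {
    "dome_a": (rng.normal(size=(200, n_features)), rng.normal(size=200)),
    "dome_b": (rng.normal(size=(240, n_features)), rng.normal(size=240)),
    "dome_c": (rng.normal(size=(320, n_features)), rng.normal(size=320)),
}

# Data combination: stack the samples so a single GSM sees members
# from every structure in the class.
X_train = np.vstack([X for X, _ in structures.values()])
y_train = np.concatenate([y for _, y in structures.values()])
print(X_train.shape, y_train.shape)   # (760, 11) (760,)
```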

Discussion

The results of this paper show that GSMs are superior to ESMs in various ways. First, the results of the first experiment showed that GSMs achieve consistently better prediction accuracy than ESMs. Next, Experiment 1 demonstrated that, unlike ESMs, GSMs can aggregate data from each geometric class and predict the stress values of any structural member within each class. In addition, metamodels generated from the combination of data in each geometric class produced fewer errors than the

Conclusions

The theoretical contribution of this study is the introduction and validation of the feature descriptors that enable the aggregation of structural analysis data. These features include joint type, coordinates of nodes, $x_{\max}$, $y_{\max}$, $z_{\max}$, support on the node, load on the node, proximity to supports, proximity to loads, member area, number of neighbors, and local stiffness, which together convert various physical geometries into machine-readable code. In addition, this study demonstrates a methodology

References (30)

  • Bishop, C.M., Pattern Recognition and Machine Learning (2006)

  • Cheng, J., Application of artificial neural networks to the response prediction of geometrically nonlinear truss structures, Struct. Eng. Mech. (2007)

  • Cho, Y.S., Study of optimized steel truss design using neural network to resist lateral loads, Key Eng. Mater. (2007)

  • Forrester, A., Engineering Design via Surrogate Modelling: A Practical Guide (2008)

  • Gholizadeh, S., Seismic design of double layer grids by neural networks, Int. J. Optim. Civ. Eng. (2012)