
Theoretical study of GDM-SA-SVR algorithm on RAFM steel


Abstract

With the development of society and the exhaustion of fossil energy, new alternative energy sources must be identified. Nuclear energy is an ideal choice, but the successful application of nuclear technology is determined primarily by the behavior of nuclear materials in reactors. We therefore studied the radiation performance of the fusion material RAFM steel. We used the GDM algorithm to improve the annealing stabilization process of the simulated annealing algorithm, and the yield stress of RAFM steel was successfully predicted by a hybrid model that combines simulated annealing with the support vector machine for the first time. The prediction process was as follows: first, we used the improved annealing algorithm to optimize the SVR model after training on a training dataset; next, we established the yield stress prediction model of RAFM steel. Testing the model and conducting a sensitivity analysis show that, compared with similar models such as the ANN, linear regression, the generalized regression neural network, and random forest, the model's predictive attribute variables cover all of the variables in the training set, and its generalization performance on the test set is superior to that of the other prediction models. Thus, this paper introduces a new method for the study of RAFM steel.
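
The following is a minimal sketch of the general idea behind this procedure: a simulated-annealing loop that searches over SVR hyperparameters by repeatedly training and evaluating candidate models. It is not the authors' GDM-SA-SVR implementation; the synthetic data, parameter ranges, and geometric cooling schedule are illustrative assumptions only.

# Minimal sketch: simulated-annealing search over SVR hyperparameters (C, gamma, epsilon).
# NOT the paper's GDM-SA-SVR code; data, ranges, and cooling schedule are assumptions.
import math
import random

import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = random.Random(0)
X = np.random.RandomState(0).rand(200, 6)                             # placeholder attribute matrix
y = X @ np.arange(1, 7) + 0.1 * np.random.RandomState(1).randn(200)   # placeholder target (stand-in for yield stress)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

def cost(params):
    """Validation MSE of an SVR trained with the candidate hyperparameters."""
    C, gamma, eps = params
    model = SVR(kernel="rbf", C=C, gamma=gamma, epsilon=eps).fit(X_tr, y_tr)
    return mean_squared_error(y_val, model.predict(X_val))

def neighbour(params):
    """Perturb each hyperparameter multiplicatively to obtain a nearby candidate."""
    return tuple(max(p * math.exp(rng.gauss(0.0, 0.3)), 1e-6) for p in params)

current = (1.0, 0.1, 0.1)
current_cost = cost(current)
best, best_cost = current, current_cost
T = 1.0
while T > 1e-3:
    cand = neighbour(current)
    cand_cost = cost(cand)
    # Metropolis rule: always accept improvements, sometimes accept worse moves at high temperature.
    if cand_cost < current_cost or rng.random() < math.exp((current_cost - cand_cost) / T):
        current, current_cost = cand, cand_cost
        if cand_cost < best_cost:
            best, best_cost = cand, cand_cost
    T *= 0.9  # plain geometric cooling used here for illustration
print("best (C, gamma, epsilon):", best, "validation MSE:", best_cost)

In the paper's setting, the candidate evaluation would use the RAFM training data rather than synthetic data, and the plain cooling step above is where a GDM-based stabilization would differ.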



Acknowledgements

This research is supported by the National Natural Science Foundation of China under Grant No. 61572526 and by the China Institute of Atomic Energy. We also thank the editor, whose guidance helped improve the experiments, the results, and the structure of this article.

Author information

Corresponding author

Correspondence to Ming Zhao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Hessian matrix

The matrix of second-order partial derivatives of a real-valued function of a real vector is called the Hessian matrix, which is written as \(H[f(x)]\) and is defined in the following form:

$$\begin{aligned} H[f(x)]=\frac{\partial ^{2} f(x)}{\partial x\,\partial x^{\top }}=\frac{\partial }{\partial x}\left( \frac{\partial f(x)}{\partial x^{\top }}\right) \in {\mathbb {R}}^{n\times n}. \end{aligned}$$
(29)

Therefore, the (s, t) element is \(\partial ^{2} f/(\partial x_s\,\partial x_t)\), so that

$$\begin{aligned} H[f(x)]=\frac{\partial ^{2} f(x)}{\partial x \,\partial x^{\top }}= \begin{bmatrix} \frac{\partial ^{2} f}{\partial x_1\,\partial x_1} & \cdots & \frac{\partial ^{2} f}{\partial x_1\,\partial x_n}\\ \vdots & \ddots & \vdots \\ \frac{\partial ^{2} f}{\partial x_n\,\partial x_1} & \cdots & \frac{\partial ^{2} f}{\partial x_n\,\partial x_n}\\ \end{bmatrix}\in {\mathbb {R}}^{n\times n}. \end{aligned}$$
(30)

The Hessian matrix is symmetric because, when the second-order partial derivatives are continuous, the mixed partial derivatives do not depend on the order of differentiation.
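
As a small worked example (not taken from the paper), for a function of two variables the definition above gives

$$\begin{aligned} f(x)=x_1^{2}+3x_1x_2+x_2^{3} \quad \Rightarrow \quad H[f(x)]= \begin{bmatrix} 2 & 3\\ 3 & 6x_2 \end{bmatrix}, \end{aligned}$$

which is indeed symmetric.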

Appendix B: Statistic information of sample data

Refer to Table 4.

Table 4 Basic information of the input parameters

Appendix C: Extreme point discrimination of real variable functions

Theorem 1

If \(U\) is a local extreme point of \(f(x)\), and \(f(x)\) is continuously differentiable in \(P(U,R)\), the neighborhood of point \(U\), then we have

$$\begin{aligned} \nabla _{u} \,f(u)=\frac{\partial f(x)}{\partial x}|_{x=u}=0. \end{aligned}$$
(31)

This first-order stationarity condition is only a necessary condition, not a sufficient one; for example, a saddle point also satisfies the above constraint.

Theorem 2

If the second-order partial derivatives of \(f(x)\) are continuous in an open neighborhood of \(U\) and satisfy

$$\begin{aligned} \nabla _u\, f(u)=\frac{\partial f(x)}{\partial x}|_{x=u}=0 \end{aligned}$$
(32)
$$\begin{aligned} \nabla _{u}^{2}\,f(u)=\frac{\partial ^2 f(x)}{\partial x \partial x^{\top }}|_{x=u}\succ 0, \end{aligned}$$
(33)

then \(U\) is a strict local minimum point of the function \(f(x)\). In this case the Hessian matrix at point \(U\) is a positive definite matrix.
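
As a simple illustration of the two theorems (not taken from the paper): for \(f(x)=x_1^{2}+x_2^{2}\) the origin is a stationary point with a positive definite Hessian and is therefore a strict local minimum, whereas for \(f(x)=x_1^{2}-x_2^{2}\) the origin is also stationary but the Hessian is indefinite, so it is a saddle point rather than an extremum:

$$\begin{aligned} \nabla ^{2}(x_1^{2}+x_2^{2})= \begin{bmatrix} 2 & 0\\ 0 & 2 \end{bmatrix}\succ 0, \qquad \nabla ^{2}(x_1^{2}-x_2^{2})= \begin{bmatrix} 2 & 0\\ 0 & -2 \end{bmatrix}\ \text {(indefinite)}. \end{aligned}$$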

Appendix D: Support vector regression model

The main principle of the SVR model is to map the data features into a high-dimensional space through a kernel function, so that data that are not separable in the original low-dimensional space become linearly separable in the high-dimensional Hilbert space (feature space). Backed by strong mathematical theory (the VC-dimension generalization framework and the principle of structural risk minimization) and its strong performance, SVR quickly became a mainstream machine learning method, and before the rise of deep learning it was one of the preferred models in applied work.

For the regression problem, given a sample \(D=\{(x_1, y_1), (x_2, y_2), \ldots \}\), a model of the following form is expected to be learned:

$$\begin{aligned} \underset{\alpha }{max}J(\alpha )=\frac{\alpha ^{\top }M \alpha }{\alpha ^{\top }N \alpha }. \end{aligned}$$
(34)

Other regression models often compute the loss directly from the difference between the output \(f(x)\) and the true value (for example, the mean square error); SVR, however, tolerates a deviation of up to \(\epsilon\) between \(f(x)\) and the true value. Therefore, the correctly predicted points lie within the closed region shown in Fig. 10.

Fig. 10 Soft interval support vector. The red points indicate correct predictions, and the blue points indicate predictions that do not meet the precision requirements (color figure online)

The SVR problem is further transformed into

$$\begin{aligned} \underset{w,b}{min}\frac{1}{2}\left\| \omega \right\| ^{2}+C\sum _{i=1}^{m}\iota (f(x_i)-y_i). \end{aligned}$$
(35)

Here \(\iota\) is the \(\epsilon\)-insensitive loss function, and \(\omega\) is the corresponding weight vector. Special attention should be paid to the constant C, the regularization constant (also known as the penalty factor). The larger C is, the less the model tolerates errors, which tends to cause over-fitting; the smaller C is, the weaker the fit, which tends to cause under-fitting. A value of C that is too large or too small therefore leads to poor generalization ability, and C is one of the two important parameters that must be tuned when using the SVR model. Because the above problem is often difficult to solve directly, we use relaxation and approximation to transform it into a problem that is easier to handle.
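
For completeness, the textbook form of the \(\epsilon\)-insensitive loss \(\iota\) referred to above is

$$\begin{aligned} \iota _{\epsilon }(z)= {\left\{ \begin{array}{ll} 0, & |z|\le \epsilon ,\\ |z|-\epsilon , & \text {otherwise}, \end{array}\right. } \end{aligned}$$

so training points whose prediction error is within \(\epsilon\) contribute no loss, which is why the correctly predicted points in Fig. 10 lie inside the \(\epsilon\)-tube.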

After the slack variables are introduced, the above formula is transformed into an unconstrained minimization problem via the Lagrange multiplier method. The objective function to be optimized is

$$\begin{aligned} L(\omega ,b,\alpha ,{\hat{\alpha }},\xi ,{\hat{\xi }},\mu ,{\hat{\mu }})&=\frac{1}{2}\left\| \omega \right\| ^{2}+C\sum _{i=1}^{m}(\xi _i+{\hat{\xi }}_i)-\sum _{i=1}^{m}\mu _i \xi _i-\sum _{i=1}^{m}{\hat{\mu }}_i{\hat{\xi }}_i\\&\quad +\sum _{i=1}^{m}\alpha _i\bigl (f(x_i)-y_i-\epsilon -\xi _i\bigr )+\sum _{i=1}^{m}{\hat{\alpha }}_i\bigl (y_i-f(x_i)-\epsilon -{\hat{\xi }}_i\bigr ). \end{aligned}$$
(36)

We then convert the above formula into the SVR dual problem. If the original problem has a solution, it must satisfy the KKT conditions. In the end, the model can be expressed as follows:

$$\begin{aligned} f(x)=\sum _{i=1}^{m}({\hat{\alpha }}_i-\alpha _i)\kappa (x,x_i)+b, \end{aligned}$$
(39)
$$\begin{aligned} \kappa (x_i,x_j)=\varPhi (x_i)^{\top }\varPhi (x_j), \end{aligned}$$
(40)

where \(\kappa (\cdot )\) is a kernel function. For regression problems, the choice of kernel function often has a large influence on the regression results, and multiple-kernel learning is one direction for addressing this problem. In this study we can choose a Gaussian kernel or a sigmoid kernel. Once a kernel function is set, it introduces another important parameter that needs to be tuned, the kernel parameter gamma. This parameter implicitly determines how the attribute variables are distributed after they are mapped into the reproducing-kernel Hilbert space (feature space). The larger gamma is, the fewer support vectors there will be; the smaller gamma is, the more support vectors there will be. The number of support vectors affects the speed of both training and prediction.
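
To make the role of gamma concrete, the short sketch below fits an RBF-kernel SVR at several gamma values and reports how many support vectors each fit uses. It is illustrative only; the synthetic data are an assumption and not the RAFM dataset used in this paper.

# Illustrative only: count the support vectors of an RBF-kernel SVR at different gamma values.
import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(300, 1))          # synthetic 1-D inputs (assumption)
y = np.sin(X).ravel() + 0.1 * rng.randn(300)   # noisy target

for gamma in (0.01, 0.1, 1.0, 10.0):
    model = SVR(kernel="rbf", C=1.0, gamma=gamma, epsilon=0.1).fit(X, y)
    # support_vectors_ holds the training points that lie on or outside the epsilon-tube.
    print(f"gamma={gamma:>5}: {len(model.support_vectors_)} support vectors")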

After the kernel function is determined, the final objective function is converted into the following formula by manipulating the corresponding kernel-matrix equations:

$$\begin{aligned} \underset{\omega }{max}J(\omega )=\frac{\omega ^{\top }S_{b}^{\phi }\omega }{\omega ^{\top }S_{w}^{\phi }\omega }. \end{aligned}$$
(41)

This equation is equivalent to the original objective function to be optimized, which completes the theoretical derivation of SVR.

SVR has been successfully applied in text processing because, during feature engineering, every single word is often treated as an attribute. By VC-dimension theory, this representation is expressive enough to separate different documents, and the resulting data are globally sparse, a characteristic that suits the principles of SVR. Our problem is similar: analysis of the dataset in this system shows that its distribution is also sparse. This is the principal reason for choosing SVR as the main prediction model in the present study.


Cite this article

Long, S., Zhao, M. Theoretical study of GDM-SA-SVR algorithm on RAFM steel. Artif Intell Rev 53, 4601–4623 (2020). https://doi.org/10.1007/s10462-020-09803-y
