1 Introduction

Fracture propagation using variational approaches and phase-field methods is currently an important topic in applied mathematics and engineering. The approach was established in [6, 12], and overview articles and monographs include [7, 8, 11, 31, 32] with numerous further references cited therein. While the majority of work concentrates on forward modeling of phase-field fracture, more recently work has started on parameter identification employing Bayesian inversion [18, 28, 29, 33], topology optimization with shape derivatives obtained with the adjoint method [10], stochastic phase-field modeling [14], and optimal control [15,16,17, 25,26,27]. Due to the irreversibility constraint on the fracture growth, optimization problems subject to such an evolution become mathematical programs with complementarity constraints (MPCC) [4, 21, 22], so that standard constraint qualifications such as those in [30, 34] cannot hold. In [17, 25,26,27] the complementarity constraint was replaced with a suitable penalty term.

Specifically, based on [26, 27], a computational space-time framework for phase-field fracture optimal control problems was designed in [17], including detailed tests with six numerical examples. In particular, we employ Galerkin formulations in time and discuss in detail how the crack irreversibility constraint is formulated using a penalization [23, 26] and an additional viscous regularization [19, 27]. The optimization problem is formulated in terms of the reduced approach by eliminating the state variable with a control-to-state operator. Therein, Newton’s method requires the solution of the state, adjoint, tangent, and adjoint Hessian equations. The latter involves second-order derivatives; see, e.g., [5], for parabolic optimization problems. Based on these settings, concrete time-stepping schemes are derived. As usual, the state and tangent equations run forward in time, whereas the adjoint and adjoint Hessian equations run backward in time.

However, in [17] the directional derivative of the irreversibility constraint was numerically approximated to obtain an efficient numerical scheme. The rigorous mathematical treatment is the key objective of the current work. As it turns out, the justification of differentiability requires higher-order regularity in corresponding space-time function spaces. To this end, we start from an energy functional and propose suitable function spaces for the state and control variables. In addition, higher-order derivatives of the penalty functional are established. Using these results, the weak formulation of the state equation is derived. However, due to the crack irreversibility regularization, a second-order time derivative of the phase-field variable arises, which was not the case in [17] because of the previously mentioned numerical approximation. To deal with this (higher-order) time derivative, we follow a classical approach (see, e.g., [3]) and formulate a mixed first-order-in-time system, which yields an additional equation. For this mixed system, the space-time formulation and the resulting time-stepping scheme are derived. Finally, the weak formulation serves as the constraint in the overall optimal control problem and, using the Lagrangian, the three additional equations are derived in a rigorous space-time fashion along with the resulting time-stepping schemes.

The outline of this paper is as follows: In Sect. 2, we introduce the phase-field fracture forward model and derive the derivatives of the fracture irreversibility penalization. In Sect. 3, a Galerkin space-time discretization for the mixed system in its weak form is provided. Next, in Sect. 4, we state the optimization problem, including the reduced space approach. In the key Sect. 5, the Lagrangian and three auxiliary equations are derived in full detail. Our work is summarized in Sect. 6.

2 Phase-Field Fracture Forward Model

To formulate the state equation, we first introduce some basic notation and then proceed with the energy functional. In addition, we establish higher-order derivatives for the penalty functional representing the crack irreversibility and we finally obtain a regularized energy functional.

2.1 Notation

We consider a bounded domain \(\Omega \subset \mathbb {R}^2\). The boundary is partitioned as \(\partial \Omega = \Gamma _D {\mathop {\cup }\limits ^{.}}\Gamma _N\) where both \(\Gamma _D\) and \(\Gamma _N\) have nonzero Hausdorff measure. Next we define Hilbert spaces \(V \mathrel {{\mathop :}{=}}V_u \times V_\varphi \subset L^2(\Omega ; \mathbb {R}^3)\) with \(V_u \mathrel {{\mathop :}{=}}H_D^1(\Omega ;\mathbb {R}^2)\) for the displacement field u and \(V_\varphi \mathrel {{\mathop :}{=}}H^1(\Omega )\) for the phase-field \(\varphi \), and \(Q \mathrel {{\mathop :}{=}}L^2(\Gamma _N; \mathbb {R}^2)\) for the control force q, where

$$\begin{aligned} H^1_D(\Omega ; \mathbb {R}^2) \mathrel {{\mathop :}{=}}\{v \in H^1(\Omega ; \mathbb {R}^2):v|_{\Gamma _D} = 0\}. \end{aligned}$$

Moreover we consider a compact time interval \(I = [0, T]\) and introduce the spaces

Consequently, \(\varphi \in X_\varphi \mathrel {{\mathop :}{=}}H^2(I, V_\varphi )\) and \(\dot{\varphi }\in X_{\dot{\varphi }}\mathrel {{\mathop :}{=}}H^1(I, V_\varphi )\). We denote natural scalar products and norms with the space as index, such as \((\cdot ,\cdot )_{X_\varphi }\equiv (\cdot ,\cdot )_{H^2(I, H^1(\Omega ))}\), and \(L^2\) scalar products and norms with the domain as index, such as \((\cdot ,\cdot )_{\Gamma _N}:=(\cdot ,\cdot )_{L^2(\Gamma _N; U)}\) (omitting the image space U, which is either some \(\mathbb {R}^d\) or \(\mathbb {R}^{2 \times 2}\)); \(L^p\) norms are written analogously with the domain as index.
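For orientation, one natural choice of space-time spaces that is consistent with the usage in the sequel, stated here as an assumption for the reader's convenience rather than as a definition, is

$$\begin{aligned} W := L^2(I, Q), \qquad X := X_u \times X_\varphi \quad \text {with } X_u := L^2(I, V_u), \end{aligned}$$

so that, in particular, point evaluations in time of a control \(q \in W\) need not be defined; compare Remark 5.2.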

2.2 Energy Functional of Quasi-Static Variational Fracture Modeling

In the next step we introduce a functional \(E_\varepsilon ^{\gamma \eta } :W \times X \rightarrow \mathbb {R}\) from which we derive our forward problem. Here \(E_\varepsilon ^{\gamma \eta }\) is defined as the sum of the regularized total energy of a crack plus penalty and convexification terms for the time-dependent irreversibility constraint \(\dot{\varphi }\le 0\). The regularized total energy of a crack is given by

(1)

where q denotes a force that is applied in the direction orthogonal to \(\Gamma _N \subset \partial \Omega \), \(\mathbb {C}\in {\mathbb {R}}^{2\times 2\times 2\times 2}\) is the elasticity tensor, and e(u) is the symmetric gradient of the displacements. Under the assumption of isotropic linearized elasticity we then have

$$\begin{aligned} \mathbb {C}e(u) = \sigma (u) = 2 \mu e(u) + \lambda {\text {tr}}(e(u)) I, \end{aligned}$$

where \(\mu>0, \lambda > - \frac{2}{3}\mu \) are the Lamé parameters and \(I\in {\mathbb {R}}^{2\times 2}\) is the identity matrix. The so-called degradation function \(g(\varphi )\mathrel {{\mathop :}{=}}(1-\kappa )\varphi ^2 + \kappa >0\) helps to extend the displacements to the entire domain \(\Omega \). Due to the bulk regularization parameter \(\kappa >0\), the first term in (1) is bounded away from zero even for \(\varphi = 0\). The term weighted by \(G_c\) is a regularized form of the Hausdorff measure [2]. Herein \(G_c>0\) denotes the critical energy release rate: for larger \(G_c\), more energy is required to form a new fracture. The phase-field regularization parameter \(\varepsilon \) represents the width of the transition zone \(0 \le \varphi \le 1\), i.e., of the zone between fully broken material (\(\varphi = 0\)) and fully intact material (\(\varphi = 1\)). So far, the problem consists in finding a pair \((u, \varphi )\) that minimizes the regularized total energy (1) subject to the irreversibility constraint \(\dot{\varphi }(t,x)\le 0\) a.e. in \(I \times \Omega \).
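For the reader's orientation, a commonly used form of the regularized total energy at a fixed time \(t \in I\) in the phase-field fracture literature (see, e.g., the overviews [7, 31]) is

$$\begin{aligned} \frac{1}{2}\big (g(\varphi )\,\mathbb {C}e(u), e(u)\big ) + G_c \Big (\frac{1}{2\varepsilon }\Vert 1 - \varphi \Vert ^2 + \frac{\varepsilon }{2}\Vert \nabla \varphi \Vert ^2\Big ) - (q, u)_{\Gamma _N}; \end{aligned}$$

we state this only as a sketch for orientation and do not claim that it coincides term by term with (1).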

2.3 Crack Irreversibility Penalization and its Differentiability

In the sequel, the constraint \(\dot{\varphi }(t,x)\le 0\) is replaced by a penalty term, which is defined as

To ensure Fréchet differentiability in \(X_\varphi \) up to third order, which will be necessary for the adjoint Hessian equation, we work with \(\max \{\dot{\varphi }, 0\}^4\), although \(\max \{\dot{\varphi }, 0\}^{3+\rho }\) for some \(\rho > 0\) might suffice. Observe that the \(L^4(I, L^4(\Omega ))\) norm exists for every \(f \in X_{\dot{\varphi }}\): from \(f(t) \in V_\varphi \) it follows that \(f(t) \in L^q(\Omega )\) for any \(q \in [1, \infty )\) due to the continuous embedding \(V_\varphi \hookrightarrow L^q(\Omega )\), and from \(f \in X_{\dot{\varphi }}\) we similarly have \(f \in C(I, V_\varphi )\). In summary, any \(f \in X_{\dot{\varphi }}\) is also in \(L^4(I,L^4(\Omega ))\). We notice that this part differs from the other penalizations used in [26] and [17]. In addition, we notice other theoretical and numerical work on penalizations for phase-field fracture in [13, 24]. Later we will minimize (1) by solving the corresponding first-order necessary optimality conditions. In order to compute the Fréchet derivatives of R, we first prove an estimate for the \(L^2\) norm of a product and then we use the directional derivative of the maximum operator \(h(f) \mathrel {{\mathop :}{=}}\max \{f, 0\}\) in direction g:

$$\begin{aligned} (D_gh(f))(x) = {\left\{ \begin{array}{ll} g(x), &{} f(x) >0,\\ \max \{g(x), 0\}, &{} f(x) = 0, \\ 0, &{} f(x) <0. \end{array}\right. } \end{aligned}$$
(2)
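For orientation, a fourth-power penalty of the type just described can be written, as a sketch and under the assumption of a standard normalization (the precise constant is not essential here), as

$$\begin{aligned} R(\varphi ) = \int _I \int _\Omega \max \{\dot{\varphi }(t,x), 0\}^4 \, \mathrm {d}x \, \mathrm {d}t . \end{aligned}$$

With (2), and since \(\max \{\dot{\varphi }, 0\}^3\) vanishes on the set \(\{\dot{\varphi }= 0\}\), this formally gives

$$\begin{aligned} R'(\varphi )(\delta \varphi ) = 4 \int _I \int _\Omega \max \{\dot{\varphi }, 0\}^3 \, \delta \dot{\varphi }\, \mathrm {d}x \, \mathrm {d}t , \end{aligned}$$

and each further differentiation replaces one factor \(\max \{\dot{\varphi }, 0\}\) by a direction \(\delta \dot{\varphi }_i\) (up to combinatorial constants).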

Differentiating \(R(\varphi )\) in given directions leads to \(L^2\) norms of products of time derivatives. In order to prove Fréchet differentiability of R we have to bound these \(L^2\) norms appropriately. The following proposition gives the required estimate with respect to the \(X_\varphi \) norm.

Proposition 2.1

Given \(f_1, \dots , f_m \in X_\varphi \), \(m \in \mathbb {N}\), there exists \(C > 0\) such that

Proof

For \(m = 1\), the estimate holds by definition of the norms:

For \(m > 1\), set \(q \mathrel {{\mathop :}{=}}2 m\) and employ Hölder’s inequality (valid for \(r, q_i \in (1, \infty )\) with \(1/r = 1/q_1 + \dots + 1/q_m\); here \(r = 2\), \(q_i = q\)) together with the continuous embedding \(V_\varphi = H^1(\Omega ) \hookrightarrow L^q(\Omega )\) (which holds in dimension 2 with an embedding constant \(c_1 > 0\)) to obtain

The last norm is finite by definition of the spaces involved, and the continuous embeddings \(H^1(I) \hookrightarrow C(I) \hookrightarrow L^\infty (I) \hookrightarrow L^q(I)\) for \(q \in [1, \infty )\) provide the corresponding bound in time. Now we apply Hölder’s inequality (again with \(q = 2 m\)) to the outer norm and bound the resulting \(L^q(I)\) norms by the \(L^\infty (I)\) norm with a constant \(c_2 > 0\):

Finally, we apply again the embedding \(H^1(I) \hookrightarrow L^\infty (I)\) (with an embedding constant \(c_3 > 0\)),

which gives the desired result with \(C \mathrel {{\mathop :}{=}}c_1^m c_2^m c_3^m\). \(\square \)
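In summary, the chain of estimates above can be recorded, as a reading aid and with \(f_1, \dots , f_m\) as in the proposition, in the form

$$\begin{aligned} \Big \Vert \prod _{i=1}^m \dot{f}_i \Big \Vert _{L^2(I, L^2(\Omega ))} \le c_1^m c_2^m c_3^m \prod _{i=1}^m \Vert f_i \Vert _{X_\varphi }, \end{aligned}$$

which is the form in which the result is used in the proof of Proposition 2.2.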

Proposition 2.2

The penalty functional R is three times Fréchet differentiable in \(X_\varphi \). For directions in \(X_\varphi \) the derivatives are given by

Proof

We have to show that the \(\ell \)-linear functionals \(R^{(\ell )}(\varphi )\) are bounded with respect to the \(X_\varphi \) norm and that the resulting remainder terms of the Taylor expansion are of higher order, i.e., \(o(\Vert \delta \varphi \Vert _{X_\varphi }^{\ell })\), where

After deriving the given expressions for the functionals and showing their boundedness, we will actually prove stronger, explicit estimates using integral representations of the remainders.

Substituting the directional derivative (2) gives the first derivative

This is clearly a linear functional on \(X_\varphi \), which is bounded by Proposition 2.1:

The second and third derivatives \(R''(\varphi )\) and \(R'''(\varphi )\) are derived in a similar manner. Again by Proposition 2.1, these bilinear and trilinear functionals are bounded with

It remains to show that the remainders are indeed \(o(\Vert \delta \varphi \Vert _{X_\varphi }^{\ell })\) for \(\ell \in \{1, 2, 3\}\) and any direction \(\delta \varphi \in X_\varphi \), where

see [1, Thm. 5.8]. As already mentioned, we will prove stronger explicit bounds. The first-order remainder reads

A case distinction for the two maximum functions provides the following pointwise estimate for the integrand (except on a set of measure zero):

The term on the last line absorbs the corresponding term from the previous line due to a relation that holds in cases 2 and 3 of the case distinction. Integrating s, \(s^2\), and \(s^3\) over [0, 1] now yields the factors \(\frac{1}{2}\), \(\frac{1}{3}\), and \(\frac{1}{4}\), all of them \(\le \frac{1}{2}\), which gives the claim using Proposition 2.1:

The second-order remainder reads

We start again with a case distinction (using the same relation as before in cases 2 and 3) to obtain the pointwise estimate

Integrating \((1 - s) s^k\) over [0, 1] for \(k \in \{1, 2\}\) yields factors \(\frac{1}{6}\) and \(\frac{1}{12}\), and hence

Finally, the third-order remainder reads

Once again we derive a pointwise estimate for the integrand,

Integrating \((1 - s)^2 s\) over [0, 1] yields the factor \(\frac{1}{12}\) and hence

\(\square \)

2.4 Convexification

One final modification of \(E_\varepsilon \) is needed: we add a convexification term for some \(\eta > 0\). Indeed, in [27], such a term was considered in each time step \([t_{i-1}, t_i]\) for \(\eta > 0\), where \(\varphi ^i - \varphi ^{i-1}\) is the finite difference approximation of \(\dot{\varphi }\) and the term is tested with the phase-field test function. This term corresponds to a potential viscous regularization of a rate-independent damage model [19]. We note that we extend this construction here to the continuous-time case.

2.5 Regularized Energy Functional

Finally, the forward problem consists in finding \({\bar{u}}= (u, \varphi )\) that solves the following optimization problem for given initial data \(\varphi _0, \dot{\varphi }_0 \in V_\varphi \) and given control \(q \in W\):

(3)

with penalty parameter \(\gamma >0\) and convexification parameter \(\eta > 0\). We notice that the fourth powers in \(R(\varphi )\) will yield terms that are cubic in the time derivative in the first-order necessary conditions on the PDE level. Conceptually, the procedure is then similar to the elastic wave equation, see, e.g., [3], and we borrow ideas from there for the formulations in the remainder of this paper. We notice that \(0 \le \varphi \le 1\) holds (without imposing these bounds explicitly) by arguments from [27, Section 2].

Remark 2.1

(Convexification) We notice that strict positivity \(\eta > 0\) improves the numerical solution process of (4). For the quasi-static case of a related formulation with a fourth-order regularization and time-discrete convexification, it was proved in [27] that for sufficiently large values of \(\eta \) the control-to-state mapping associated with (3) is single-valued due to strict convexity of the energy corresponding to the equation. However, the convexification term also penalizes crack growth. To ensure the dominance of the physically motivated term \(\gamma R(\varphi )\), a careful weighting of \(\gamma \) and \(\eta \) is required with \(\gamma \gg \eta \). As is often the case in numerics, this weighting is chosen heuristically or by experience since no rigorous numerical analysis exists. It is indeed a drawback of the (simple) penalization approach that several regularization parameters interact with material parameters and discretization parameters.

3 Space-Time Weak Formulation and Discretization

In this section, a weak formulation is first derived. Due to the second-order time derivative of the phase-field variable, we formulate an equivalent mixed first-order-in-time system, which differs from our prior work [17]. These ideas are based on [3]. Afterwards, our emphasis is on the Galerkin-in-time finite element discretization of this mixed system. Finally, we briefly account for the spatial discretization, employing finite elements there as well.

3.1 Weak Formulation

Before we continue with the spatial discretization and the concrete time-stepping scheme, we state the weak form of (3). To this end, we replace (3) by its first-order optimality conditions, see, e.g., [26], yielding a coupled nonlinear PDE system: given \(\varphi _0, \dot{\varphi }_0 \in V_\varphi \) and \(q \in W\), find \({\bar{u}}= (u, \varphi )\) such that every admissible test function satisfies

(4)

In order to derive the state equation from (4), we first combine the two equations into a single equation,

(5)

To simplify the notation, let us collect the energy-related terms of (5) as follows.

Definition 3.1

The spatial semi-linear form \(a:Q \times V \times V \rightarrow \mathbb {R}\) is defined as

(6)

For every subinterval \(J \subseteq I\), a corresponding semi-linear form \(a^J:W \times X \times X \rightarrow \mathbb {R}\) is defined as time integral over J,

Next, we have to perform integration by parts in order to move the time derivatives in (5) from the test functions to the phase-field \(\varphi \):

and similarly

Thus (5) becomes

(7)

Additional focus has to be put on the second-order time derivative of \(\varphi \). Following the approach of [3, Sect. 3.1], we introduce an additional variable \(\varphi ' \in X_{\dot{\varphi }}\) that represents the first-order time derivative of the phase-field,

$$\begin{aligned} \varphi '(t,x) \mathrel {{\mathop :}{=}}\dot{\varphi }(t,x). \end{aligned}$$
(8)

By this we replace every time derivative of \(\varphi \) in (7) with the new variable \(\varphi '\) and attach the weak form of (8) as an additional equation. Consequently, (7) is replaced by a first-order mixed system in weak form: Given \(\varphi _0, \dot{\varphi }_0 \in V_\varphi \) and \(q\in W\), find \({\bar{u}}\in X\) and \(\varphi ' \in X_{\dot{\varphi }}\) such that the following equations hold for all admissible pairs of test functions:

(9)

Remark 3.2

In contrast to [3, Eq. (3.16)], the starting point for our investigations is given by a weak PDE and not a strong formulation. Consequently, we define the mixed system and already include the initial values as in [3, Eq. (3.17)]. The initial conditions for \(\varphi \) are specified in the first equation of (9) and the initial conditions for \(\varphi ' = \dot{\varphi }\) in the second equation.

3.2 Galerkin Time Discretization

Using a time grid \( 0 = t_0< \cdots < t_M = T, \) we first partition the interval I into M left-open subintervals \(I_m = (t_{m-1}, t_m]\),

$$\begin{aligned} I = \{0\} \cup I_1 \cup \cdots \cup I_M. \end{aligned}$$

By using the discontinuous Galerkin method, here dG(\(r\)), we seek a solution in the space \(X^r_k\) of piecewise polynomials of degree r in time. The subindex k denotes the time-discretized function space, distinguishing it from the continuous space X. To this end, we have

$$\begin{aligned} X^r_k \mathrel {{\mathop :}{=}}\{\varvec{v}\in X:v_\varphi (0) \in V_\varphi \text { and } \varvec{v}|_{I_m} \in \mathbb {P}_r(I_m, V), \, m = 1, \dots , M\}. \end{aligned}$$

To work with the discontinuities in \(X^r_k\), we introduce the standard notation

$$\begin{aligned} \varvec{v}^+_m \mathrel {{\mathop :}{=}}\varvec{v}(t_m+), \qquad \varvec{v}^-_m \mathrel {{\mathop :}{=}}\varvec{v}(t_m-) = \varvec{v}(t_m), \qquad [\varvec{v}]_m \mathrel {{\mathop :}{=}}\varvec{v}^+_m - \varvec{v}^-_m. \end{aligned}$$

We keep following the approach of [3] with special focus on [3, Eq. (3.27)–(3.30)]. Notice that in (9) we work with two terms involving time derivatives, and one of them even involves a maximum. Moreover, all time derivatives are only applied to \(\varphi \) and not to the full variable as in [3]. Since \(\varphi '\) is in \(Y \mathrel {{\mathop :}{=}}X_{\dot{\varphi }}\), we define the corresponding discretized space as

$$\begin{aligned} Y^r_k \mathrel {{\mathop :}{=}}\{v \in X_{\dot{\varphi }}:v(0) \in V_\varphi \text { and } v|_{I_m} \in \mathbb {P}_r(I_m, V_\varphi ), \, m = 1, \dots , M\}. \end{aligned}$$

Consequently, the semi-discretized state equation consists in finding \({\bar{u}}\in X^r_k\) and \(\varphi ' \in Y^r_k\) for a given control q such that for every pair of test functions from \(X^r_k\) and \(Y^r_k\) it holds

(10a)
(10b)
(10c)
(10d)
(10e)
(10f)
(10g)
(10h)

Here the first line (10a) represents the dG(\(r\)) discretization of the term

while (10c) and (10d) result from the dG(\(r\)) discretization of

In (10b) and (10f) we have the respective initial conditions for \(\varphi \) and \(\varphi '\), and in (10e) the physical (energy) terms. The last two lines (10g) and (10h) contain the temporal boundary terms due to the integration by parts performed before.

Remark 3.3

In (10) the sums of jump terms start with \(m = 0\), whereas in [3] they start with \(m = 1\). The reason is that our later derivations follow [5] where the sums also start with \(m = 0\). Both formulations are in fact equivalent, which follows from [20, Remark 3.2].

By shifting the index of the jump terms by one, all summations range from \(m = 1\) to \(m = M\) and can therefore be combined into a single sum for each equation. This yields our final form of the full state equation.

Proposition 3.1

The state equation in dG(\(r\)) form reads: Find \({\bar{u}}\in X^r_k\) and \(\varphi ' \in Y^r_k\) for a given control q such that for every pair of test functions from \(X^r_k\) and \(Y^r_k\) it holds

(11)

The trial and test functions are discontinuous in time, which allows for a sequential decoupling from which we obtain the time-stepping scheme.

Proposition 3.2

The time-stepping for the state equation (10) starts with solving the initial conditions at \(m = 0\):

(12)

Then the equations for \(m = 1, \dots , M-1\) need to be solved in a forward recursion:

(13)

Finally, we have to solve the terminal conditions at \(m = M\):

(14)
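To illustrate the control flow of Proposition 3.2 (initial step, forward recursion, terminal step), we include a small runnable sketch for a scalar model problem. All names in it are ours; it is not an implementation of (12)–(14) and, in particular, does not discretize the phase-field system — it only mirrors the per-interval Newton solves of a forward sweep in time.

```python
import numpy as np

# Toy illustration of a forward sweep: a scalar model problem dy/dt = f(y)
# with one implicit (dG(0)/backward-Euler-type) step per time interval and a
# small Newton iteration per step. Only the control flow "initial step ->
# forward recursion -> terminal state" mirrors Proposition 3.2.

def f(y):
    return -y**3          # toy nonlinearity (placeholder for the PDE residual)

def df(y):
    return -3.0 * y**2

def solve_step(y_prev, k, tol=1e-12, max_iter=30):
    """Solve y - y_prev - k*f(y) = 0 for y by Newton's method."""
    y = y_prev
    for _ in range(max_iter):
        res = y - y_prev - k * f(y)
        if abs(res) < tol:
            break
        y -= res / (1.0 - k * df(y))
    return y

T, M = 1.0, 100
k = T / M
y = np.empty(M + 1)
y[0] = 1.0                 # "initial condition step" (m = 0)
for m in range(1, M + 1):  # forward recursion (m = 1, ..., M)
    y[m] = solve_step(y[m - 1], k)
# the stored trajectory y[0..M] is what a backward (adjoint) sweep would reuse
print(y[-1])
```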

3.3 Spatial Discretization

For the spatial discretization, we again employ a Galerkin finite element scheme by introducing \(H^1\)-conforming discrete spaces. We consider two-dimensional shape-regular meshes with quadrilateral elements K forming the mesh \({\mathcal {T}}_h = \{K\}\); see [9]. The spatial discretization parameter is the diameter \(h_K\) of the element K. On the mesh \({\mathcal {T}}_h\) we construct a finite element space \(V_h \mathrel {{\mathop :}{=}}V_{uh} \times V_{\varphi h}\) as usual:

$$\begin{aligned} V_{uh}&\mathrel {{\mathop :}{=}}\{v \in V_u:v|_K \in Q_s(K) \text { for } K \in {\mathcal {T}}_h\}, \\ V_{\varphi h}&\mathrel {{\mathop :}{=}}\{v \in V_\varphi :v|_K \in Q_s(K) \text { for } K \in {\mathcal {T}}_h\}. \end{aligned}$$

Herein \(Q_s(K)\) consists of shape functions that are obtained as bilinear transformations of functions defined on the master element \({\hat{K}} = (0, 1)^2\), where \({\hat{Q}}_s({\hat{K}})\) is the space of tensor product polynomials up to degree s in dimension d, defined as

$$\begin{aligned} {\hat{Q}}_s ({\hat{K}}) \mathrel {{\mathop :}{=}}{\text {span}} \left\{ \prod _{i=1}^d {\hat{x}}_i^{\alpha _i}:\alpha _i \in \{0, 1 \dots , s\}\right\} . \end{aligned}$$

Specifically, for \(s = 1\) and \(d = 2\) we have

$$\begin{aligned} {\hat{Q}}_1({\hat{K}}) = {\text {span}}\{1, {\hat{x}}_1, {\hat{x}}_2, {\hat{x}}_1 {\hat{x}}_2\}. \end{aligned}$$
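As a small illustration of the case \(s = 1\), the nodal basis of \({\hat{Q}}_1({\hat{K}})\) can be written down explicitly. The following snippet is a generic textbook construction (not code from this work) and checks the partition-of-unity property.

```python
import numpy as np

# The four bilinear Q1 nodal shape functions on the master element
# \hat{K} = (0,1)^2, spanning span{1, x1, x2, x1*x2}.

def q1_shape_functions(x1, x2):
    """Values of the four Q1 nodal basis functions at (x1, x2) in [0,1]^2."""
    return np.array([
        (1.0 - x1) * (1.0 - x2),  # node (0, 0)
        x1 * (1.0 - x2),          # node (1, 0)
        (1.0 - x1) * x2,          # node (0, 1)
        x1 * x2,                  # node (1, 1)
    ])

# Partition of unity: the shape functions sum to 1 at every point.
vals = q1_shape_functions(0.3, 0.7)
assert abs(vals.sum() - 1.0) < 1e-14
```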

Similar to the state variables, the control variables need to be discretized in time and space. The polynomial degrees of the discrete spaces may differ, and for the control we denote them by \(r'\) and \(s'\). To this end, we employ a dG(\(r'\)) discretization in time and a cG(\(s'\)) discretization in space. The semi-discrete control space is denoted by \(W_k\) and the fully discrete space is denoted by \(W_{kh}\).

We note that the derivations in the following two sections are independent of the specific spatial discretization; for this reason, the subindex h is omitted there.

4 Optimization with Phase-Field Fracture

We formulate the following nonlinear optimal control problem with a standard tracking-type cost functional. For given \(\varphi _0, \dot{\varphi }_0 \in V_\varphi \) we seek a solution of

(15)

where \(\varphi _d \in V_\varphi \) is some desired time-independent phase-field and \(q_d \in W\) is a suitable nominal control that can be used for numerical stabilization. The second norm represents a common Tikhonov regularization with the Tikhonov parameter \(\alpha >0\).

Remark 4.1

We notice that the existence of a global solution in \(W \times X\) has been proved for a related problem in [26, Thm. 4.3]. That problem differs from (15) in that it is posed in the primal form with a tracking-type functional for the displacements u and a penalization in discrete time (also of fourth order).

4.1 Reduced Optimization Problem and Solution Algorithm

We solve (15) by a reduced space approach. To this end, we assume the existence of a solution operator \(S:W \rightarrow X \times Y\), \(q \mapsto S(q)\), defined via equation (9). With this solution operator the cost functional reduces to \(j:W \rightarrow \mathbb {R}\), \(j(q) \mathrel {{\mathop :}{=}}{\mathcal {J}}(q, S(q))\). (The component \(\varphi ' \in Y = X_{\dot{\varphi }}\) of S(q) is not used in \({\mathcal {J}}\) but will be needed later.) As a result we can replace (15) by the unconstrained optimization problem

$$\begin{aligned} \min _q \ j(q). \end{aligned}$$
(16)

The reduced problem is solved by Newton’s method applied to \(j'(q) = 0\), and hence we need computable representations of the derivatives \(j'\) and \(j''\). The established approach in [5] expresses the directional derivatives \(j'\) and \(j''\) in terms of the Lagrangian \({\mathcal {L}}\), whose precise form is defined in (23) using the notation

For given \(\delta q \in W\), a state solution \({\bar{u}}= S(q)\), and any \({\bar{z}}\), the directional derivative \(j'(q)(\delta q)\) reads

$$\begin{aligned} j'(q)(\delta q) = {\mathcal {L}}'_q(q,{\bar{u}},{\bar{z}})(\delta q) + {\mathcal {L}}'_{{\bar{u}}}(q,{\bar{u}},{\bar{z}})(S'(q)(\delta q)). \end{aligned}$$
(17)

By determining an adjoint solution \({\bar{z}}\) from (20), the second term in (17) vanishes. For two directions \(\delta q_1, \delta q_2 \in W\) and the solutions \({\bar{u}}\) of (19) and \({\bar{z}}\) of (20), the second derivative of j is expressed in a similar way as

$$\begin{aligned} \begin{aligned} j''(q)(\delta q_1,\delta q_2)&= {\mathcal {L}}''_{qq}(q,{\bar{u}},{\bar{z}})(\delta q_1, \delta q_2) + {\mathcal {L}}''_{q{\bar{u}}}(q,{\bar{u}},{\bar{z}})(\delta q_1,S'(q)(\delta q_2)) \\&+ {\mathcal {L}}''_{q{\bar{z}}}(q,{\bar{u}},{\bar{z}})(\delta q_1, T'(q)(\delta q_2)) + {\mathcal {L}}''_{{\bar{u}}q}(q,{\bar{u}},{\bar{z}})(S'(q)(\delta q_1), \delta q_2) \\&+ {\mathcal {L}}''_{{\bar{u}}{\bar{u}}}(q,{\bar{u}},{\bar{z}})(S'(q)(\delta q_1),S'(q)(\delta q_2)) + {\mathcal {L}}''_{{\bar{u}}{\bar{z}}}(q,{\bar{u}},{\bar{z}})(S'(q)(\delta q_1),T'(q)(\delta q_2)) \\&+ {\mathcal {L}}''_{{\bar{z}}q}(q,{\bar{u}},{\bar{z}})(T'(q)(\delta q_1),\delta q_2) + {\mathcal {L}}''_{{\bar{z}}{\bar{u}}}(q,{\bar{u}},{\bar{z}})(T'(q)(\delta q_1),S'(q)(\delta q_2)). \end{aligned} \end{aligned}$$
(18)

Herein \(T:W \rightarrow X \times Y\) denotes the solution operator of the adjoint equation (20). Below, computing a tangent solution from (21) and an adjoint Hessian solution from (22) eliminates most of the terms in (18) and leads to a short representation of \(j''\). In summary, the following four equations have to be solved:

  1. 1.

    State equation: given \(q \in W\), find \({\bar{u}}\in X \times Y\) such that (9) holds for all \(\bar{\Phi }\in X \times Y\):

    $$\begin{aligned} {\mathcal {L}}'_{{\bar{z}}}(q,{\bar{u}},{\bar{z}})(\bar{\Phi }) = 0. \end{aligned}$$
    (19)
  2. 2.

    Adjoint equation: given \(q \in W\) and \({\bar{u}}= S(q)\) from (19), find \({\bar{z}}\in X \times Y\) such that for all \(\bar{\Phi }\in X \times Y\)

    $$\begin{aligned} {\mathcal {L}}'_{{\bar{u}}}(q,{\bar{u}},{\bar{z}})(\bar{\Phi }) = 0. \end{aligned}$$
    (20)
  3. 3.

    Tangent equation: given \(q \in W\), \({\bar{u}}= S(q)\), and a direction \(\delta q \in W\), find \(\delta {\bar{u}}\in X \times Y\) such that for all \(\bar{\Phi }\in X \times Y\)

    $$\begin{aligned} {\mathcal {L}}''_{q{\bar{z}}}(q,{\bar{u}},{\bar{z}})(\delta q,\bar{\Phi }) + {\mathcal {L}}''_{{\bar{u}}{\bar{z}}}(q,{\bar{u}},{\bar{z}})(\delta {\bar{u}},\bar{\Phi }) = 0. \end{aligned}$$
    (21)
  4. 4.

    Adjoint Hessian equation: given \(q \in W\), \({\bar{u}}= S(q)\), \({\bar{z}}\in X \times Y\) from (20), \(\delta {\bar{u}}\in X \times Y\) from (21), and a direction \(\delta q \in W\), find \(\delta {\bar{z}}\in X \times Y\) such that for all \(\bar{\Phi }\in X \times Y\)

    $$\begin{aligned} {\mathcal {L}}''_{q{\bar{u}}}(q,{\bar{u}},{\bar{z}})(\delta q,\bar{\Phi }) + {\mathcal {L}}''_{{\bar{u}}{\bar{u}}}(q,{\bar{u}},{\bar{z}})(\delta {\bar{u}},\bar{\Phi }) + {\mathcal {L}}''_{{\bar{z}}{\bar{u}}}(q,{\bar{u}},{\bar{z}})(\delta {\bar{z}},\bar{\Phi }) = 0. \end{aligned}$$
    (22)

Solving these equations in a specific order (see for instance [5, 20]) leads to the following representations of the derivatives that we need for Newton’s method:

$$\begin{aligned} j'(q)(\delta q)&= {\mathcal {L}}'_q(q,{\bar{u}},{\bar{z}})(\delta q) \quad \forall \delta q \in W, \\ j''(q)(\delta q_1, \delta q_2)&= {\mathcal {L}}''_{qq}(q,{\bar{u}},{\bar{z}})(\delta q_1, \delta q_2) + {\mathcal {L}}''_{{\bar{u}}q}(q,{\bar{u}},{\bar{z}})(\delta {\bar{u}},\delta q_2) + {} \\&\quad \ {\mathcal {L}}''_{{\bar{z}}q}(q,{\bar{u}},{\bar{z}})(\delta {\bar{z}},\delta q_2) \quad \forall \delta q_1, \delta q_2 \in W. \end{aligned}$$
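The resulting algorithmic structure can be summarized in a schematic, matrix-free Newton–CG sketch. All callbacks (solve_state, solve_adjoint, solve_tangent, solve_adjoint_hessian, gradient, hessian_apply) are hypothetical placeholders for the PDE solves and derivative assemblies described above; they do not refer to any particular software.

```python
import numpy as np

# Schematic reduced Newton method matching the four solves (19)-(22).
# All callback arguments are placeholders to be supplied by a PDE solver.

def reduced_newton(q, solve_state, solve_adjoint, solve_tangent,
                   solve_adjoint_hessian, gradient, hessian_apply,
                   tol=1e-8, max_newton=20, max_cg=50):
    """Matrix-free Newton-CG on j'(q) = 0 for the reduced functional j."""
    for _ in range(max_newton):
        u = solve_state(q)                      # (19): state equation
        z = solve_adjoint(q, u)                 # (20): adjoint equation
        g = gradient(q, u, z)                   # j'(q) via L'_q
        if np.linalg.norm(g) < tol:
            break

        def hess_vec(dq):
            du = solve_tangent(q, u, dq)                  # (21): tangent
            dz = solve_adjoint_hessian(q, u, z, du, dq)   # (22): adjoint Hessian
            return hessian_apply(q, u, z, du, dz, dq)     # j''(q) dq via L''

        # unpreconditioned CG for the Newton system  j''(q) dq = -g
        dq = np.zeros_like(q)
        r = -g - hess_vec(dq)
        p = r.copy()
        rs = r @ r
        for _ in range(max_cg):
            Hp = hess_vec(p)
            alpha = rs / (p @ Hp)
            dq += alpha * p
            r -= alpha * Hp
            rs_new = r @ r
            if np.sqrt(rs_new) < 1e-10:
                break
            p = r + (rs_new / rs) * p
            rs = rs_new
        q = q + dq
    return q
```

In this sketch, each CG iteration requires one tangent and one adjoint Hessian solve, while every Newton iteration additionally requires one state and one adjoint solve.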

5 Lagrangian and Auxiliary Equations

In the following main section, we specify the previously given abstract formulations in detail. We first derive the Lagrangian and then the three auxiliary equations (20)–(22). Specific emphasis is placed on the regularization terms for the crack irreversibility and the convexification. The Lagrangian is defined based on the dG(\(r\)) formulation, and this Galerkin discretization naturally yields the adjoint, tangent, and adjoint Hessian equations. Furthermore, this approach has the property that optimization and discretization interchange, i.e., the discretized optimality system coincides with the optimality system of the discretized problem.

5.1 Lagrangian

The Lagrangian \({\mathcal {L}}:W \times (X\times Y) \times (X\times Y) \rightarrow \mathbb {R}\) corresponding to (9) is defined as

(23)

The discrete time Lagrangian \({\mathcal {L}}_k^r:W_k \times (X_k^r \times Y_k^r) \times (X_k^r \times Y_k^r) \rightarrow \mathbb {R}\) corresponding to (10) is defined directly in the dG(\(r\)) form as

(24)

5.2 Adjoint Equation

In this section we derive a time-stepping scheme for the adjoint equation (20) from the discrete time Lagrangian (24). In the adjoint equation for dG(\(r\)) we seek \({\bar{z}}\in X_k^r \times Y_k^r\) such that

When calculating the partial derivative of \({\mathcal {L}}\) with respect to \({\bar{u}}\), some care has to be taken with the temporal boundary values from the prior integration by parts, i.e., with the last two lines in (24). We first formulate the adjoint equation in the continuous-time weak form and later in the dG(\(r\)) setting:

(25a)
(25b)
(25c)
(25d)
(25e)
(25f)
(25g)
(25h)
(25i)
(25j)

Here, the first term is understood as the corresponding time integral over \(I\); the terms (25c), (25e), and (25f) arise from the integration by parts (i.b.p.) applied to the partial derivatives

where BT denotes the corresponding boundary terms in (25c), (25e), and (25f). Note that the integration by parts is necessary in these cases because, after computing the partial derivative, the time derivative is applied to the direction and not to the adjoint \({\bar{z}}\).

Now we see that the last term in (25b) cancels with the second term in (25c), and so do the terms in (25e), (25f) with those in (25i), (25j). Consequently we obtain the adjoint equation in dG(\(r\)) form as

(26)

Herein, the first term is the time integral of the partial derivative

Next we exploit the separability property of the cost functional \({\mathcal {J}}\) to expand the time integrals into subintegrals over the intervals \(I_m\). After that we distinguish between the terms for \(m = 0, \dots , M-1\) and \(m=M\) and combine them. Finally, we obtain the desired dG(\(r\)) form of the adjoint equation, wherein the respective term denotes the partial derivative of \({\mathcal {J}}_m\) with respect to the corresponding state component.

Proposition 5.1

The adjoint equation in dG(\(r\)) form (where we write \(t_M\) for T) reads: Find \({\bar{z}}\in X_k^r \times Y_k^r\) such that every \(\bar{\Phi }\in X_k^r \times Y_k^r\) satisfies

(27)

This leads immediately to the resulting time-stepping scheme.

Proposition 5.2

The adjoint time-stepping for (27) starts with solving the terminal conditions at \(m = M\) for the adjoint values at \(t_M\):

Then the equations for \(m = M-1, \dots , 1\) need to be solved in a backward recursion:

Finally, we have to solve the initial conditions at \(m = 0\):

5.3 Tangent Equation

The second auxiliary equation is the tangent equation. In this equation we seek \(\delta {\bar{u}}\in X_k^r \times Y_k^r\) such that

$$\begin{aligned} {\mathcal {L}}''_{q{\bar{z}}}(q,{\bar{u}},{\bar{z}})(\delta q,\bar{\Phi }) + {\mathcal {L}}''_{{\bar{u}}{\bar{z}}}(q,{\bar{u}},{\bar{z}})(\delta {\bar{u}},\bar{\Phi }) = 0 \quad \forall \bar{\Phi }\in X_k^r \times Y_k^r. \end{aligned}$$

Here we will apply the same procedure as for the state equation. Recall that \({\mathcal {L}}(q,{\bar{u}},{\bar{z}})\) contains \(a^I\) with \(\varvec{z}\) entering linearly. Hence the partial derivative required for \({\mathcal {L}}''_{{\bar{u}}{\bar{z}}}(q,{\bar{u}},{\bar{z}})(\delta {\bar{u}},\bar{\Phi })\) is obtained by a single differentiation with respect to \({\bar{u}}\), and the partial derivative required for \( {\mathcal {L}}''_{q{\bar{z}}}(q,{\bar{u}},{\bar{z}})(\delta q,\bar{\Phi })\) is derived from (6) as the corresponding time integral:

(28)

Furthermore, the cost functional \({\mathcal {J}}\) does not depend on \({\bar{z}}\), hence \({\mathcal {J}}''_{q{\bar{z}}}\) and \({\mathcal {J}}''_{{\bar{u}}{\bar{z}}}\) vanish. In contrast to the adjoint, we formulate the tangent equation directly including the dG(\(r\)) jump terms:

As for the state equation we shift the summation index for the jump terms by one and combine all sums into a single one.

Proposition 5.3

The tangent equation in dG(\(r\)) form reads:

Again we readily obtain the resulting time-stepping scheme.

Proposition 5.4

The tangent time-stepping starts with solving the initial conditions at \(m = 0\):

(29)

Then the equations for \(m = 1, \dots , M-1\) need to be solved in a forward recursion:

(30)

Finally, we need to solve the terminal conditions at \(m = M\):

(31)

5.4 Adjoint Hessian Equation

The third and last auxiliary equation is the adjoint Hessian equation. In this equation we seek \(\delta {\bar{z}}\in X_k^r \times Y_k^r\) such that for all \(\bar{\Phi }\in X_k^r \times Y_k^r\) the following equation holds:

$$\begin{aligned} {\mathcal {L}}''_{q{\bar{u}}}(q,{\bar{u}}, {\bar{z}})(\delta q,\bar{\Phi }) + {\mathcal {L}}''_{{\bar{u}}{\bar{u}}}(q,{\bar{u}}, {\bar{z}})(\delta {\bar{u}},\bar{\Phi }) + {\mathcal {L}}''_{{\bar{z}}{\bar{u}}}(q,{\bar{u}}, {\bar{z}})(\delta {\bar{z}},\bar{\Phi }) = 0. \end{aligned}$$
(32)

First we see that \({\mathcal {L}}''_{q{\bar{u}}}(q,{\bar{u}}, {\bar{z}})(\delta q,\bar{\Phi })= 0\) since q and \({\bar{u}}\) are decoupled. The derivative of \(a^I\) in \({\mathcal {L}}''_{{\bar{z}}{\bar{u}}}(q,{\bar{u}}, {\bar{z}})(\delta {\bar{z}},\bar{\Phi })\) requires only a single differentiation due to the linearity of \(\varvec{z}\) in \(a^I\). However, a genuine second-order derivative arises in \({\mathcal {L}}''_{{\bar{u}}{\bar{u}}}(q,{\bar{u}},{\bar{z}})(\delta {\bar{u}},\bar{\Phi })\), where

(33)

Now we obtain the adjoint Hessian equation

(34a)
(34b)
(34c)
(34d)
(34e)
(34f)
(34g)
(34h)

The following result guarantees that the least regular term (34b) is well-defined.

Proposition 5.5

The partial derivative appearing in (34b) is in \(L^2(I,L^2(\Omega ))\).

Proof

Similar to (2), an elementary calculation for \(\varphi ' \in X_{\dot{\varphi }}= H^1(I,V_\varphi )\) gives

$$\begin{aligned} \partial _t\max \{\varphi '(t,x), 0\} = {\left\{ \begin{array}{ll} \dot{\varphi }'(t,x) &{} \varphi '(t,x) > 0, \\ \max \{\dot{\varphi }'(t,x), 0\} &{} \varphi '(t,x) = 0, \\ 0, &{} \varphi '(t,x) < 0. \end{array}\right. } \end{aligned}$$

The product rule together with the elementary estimate \(|\max \{c, 0\}| \le |c|\) for \(c \in \mathbb {R}\) then yields

The factors \(\dot{\varphi }', \delta \dot{\varphi }'\) are in \(L^2(I,V_\varphi )\); all other factors in the right-hand side norms are in \(X_{\dot{\varphi }}\hookrightarrow L^\infty (I,V_\varphi )\), while \(V_\varphi = H^1(\Omega ) \hookrightarrow L^q(\Omega )\) for every \(q \ge 2\). Arguing as in Proposition 2.1, this gives the finiteness of the first product norm (and similarly of the other two):

\(\square \)

Remark 5.1

This proposition together with Proposition 2.2 shows why we chose a fourth-order penalization for the fracture irreversibility constraint: it provides just enough regularity for the derivative to be in \(L^2(I,L^2(\Omega ))\). A third-order penalization would instead lead to nonexistent derivatives.

Next we apply integration by parts to (34b), to the first term in (34e), and to the first and second terms in (34f):

Consequently (34) becomes

Finally, we apply dG(\(r\)) to the remaining terms that involve time derivatives of the test functions. We do not need to apply dG(\(r\)) to the contribution of \({\mathcal {L}}''_{{\bar{u}}{\bar{u}}}(q,{\bar{u}},{\bar{z}})(\delta {\bar{u}},\bar{\Phi })\): it acts as a right-hand side since it does not involve the solution variable \(\delta {\bar{z}}\).

Proposition 5.6

The adjoint Hessian equation in dG(\(r\)) form reads:

The resulting time-stepping scheme is again immediate.

Proposition 5.7

The time-stepping for the adjoint Hessian equation starts with solving the terminal conditions at \(m = M\):

(35)

Then the equations for \(m = M-1, \dots , 1\) need to be solved in a backward recursion:

(36)

Finally we have to solve the initial conditions at \(m = 0\):

(37)

Remark 5.2

In this work we have considered a continuous-time cost functional for the phase-field \(\varphi \). A discrete-time cost functional

with weights \(w_m \ge 0\) can easily be added (or be used instead). In the four equations to be solved and in the associated time-stepping schemes this will produce objective terms similar to the current ones for \(m = 1, \dots , M\) plus additional terms at \(t_0 = 0\). The Tikhonov regularization might also be formulated in discrete time, with contributions at the time points \(t_m\). However, then we would have to replace W with \(C(I, Q)\) or a subspace thereof so that the point evaluations of q and \(q_d\) are well-defined.
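For instance, assuming a tracking structure analogous to (15), such a discrete-time phase-field term could take the form

$$\begin{aligned} \sum _{m=1}^{M} \frac{w_m}{2} \Vert \varphi (t_m) - \varphi _d \Vert ^2 , \end{aligned}$$

where the choice of the weights \(w_m\) is a modeling decision; we state this only as an illustration.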

6 Conclusions

In this paper, we established rigorous space-time formulations for the state, adjoint, tangent, and adjoint Hessian equations of phase-field fracture optimal control problems. Due to the crack irreversibility constraint, a higher-order penalization was adopted that allowed us to prove Fréchet differentiability. On the other hand, this higher-order term yielded a second-order time derivative of the phase-field variable. To this end, a mixed first-order-in-time system was formulated. Based on this system, the Lagrangian and the auxiliary equations were established. By using a discontinuous Galerkin discretization in time, the time integrals decouple and we derived the resulting time-stepping schemes.

Possible extensions of the current work include the implementation, as this mixed phase-field system has not yet been addressed in a numerical realization. Along with the implementation, comparisons of gradient methods, quasi-Newton approaches (BFGS), and the proposed full Newton method could be undertaken. However, the main computational challenges are the linear and nonlinear solvers of the forward problem, as they become very expensive due to the numerous forward and backward runs within the optimization loop. This holds true for two-dimensional configurations, as shown in our prior work [17], as well as for three-dimensional extensions. Conceptually, numerically, and implementation-wise, three-dimensional configurations can be addressed by our model, while some mathematical statements must be carefully revisited; the major challenge is the numerical cost. Another interesting extension benefits from the space-time formulation: the temporal part could be solved with higher-order Galerkin finite elements in time, for instance dG(1) methods, in which linear elements are employed on each time interval. A final modification could be the replacement of the penalty method by an augmented Lagrangian or primal-dual active set method, for which the mathematical theory will be more involved; numerically, we have implemented such alternative penalizations of the crack irreversibility constraint in other works (see the references in the monograph [31, Chapter 5]). The broader impact of the current work is reflected by the fact that numerous papers on forward modeling of phase-field fracture have been published to date, but, due to mathematical and numerical challenges, only a few on optimization (which are cited in our introduction). Consequently, the present work provides one possibility of formulating phase-field fracture optimization problems.