1 Introduction

1.1 Problem Statement

Let \(A=A^T\in {\mathbb {R}}^{n\times n}\) be a symmetric positive definite matrix and let \(B\in {\mathbb {R}}^{n\times p}\) be a tall matrix with \(p\ll n\). We refer to p as the block size. We want to approximate

$$\begin{aligned} \mathcal{F}(s) =B^T\phi (A,s)B \end{aligned}$$
(1)

via approximations of \(\phi (A,s)B\) on a block Krylov subspace

$$\begin{aligned} \mathcal {K}_m(A,B)=\textrm{blkspan} \{B, AB, A^2 B, \dots A^{m-1} B \} \end{aligned}$$

where blkspan means that the whole space range\(([B, AB, A^2 B, \dots A^{m-1} B])\) is generated. In the literature various functions \(\phi \) are considered to derive bounds for the scalar case, for instance strictly completely monotonic functions [3, 5] or functions with sufficient derivatives on the spectral interval of A [36]. Here, we mainly consider the simpler problem \(\phi (A,s)=(A+sI)^{-1}\) with \(s\in {\mathbb {R}}_+\), which can be generalized to Stieltjes functions. Our numerical experiments will also illustrate the applicability of our results for exponentials and resolvents with purely imaginary \(s\in \imath {\mathbb {R}}\), which is an important class of problems in the computation of transfer functions of linear time-invariant (LTI) systems with imaginary Laplace frequencies and time domain responses. To simplify notation, without loss of generality, we assume that the matrix B has orthonormal columns.

Gaussian-quadrature rules via Krylov subspace methods [24] have been extended to the block case [18, 40]. Gauss–Radau quadrature rules have also been extended to the block case via the block Krylov subspace [35], where at each block Krylov iteration, p Ritz values of A projected onto the block Krylov subspace are prescribed by a low-rank modification of this projection. These p Ritz values are distinct and larger than the maximum eigenvalue of A, a choice that did not accelerate the convergence of the block Krylov method in [35].

In this paper, we take a different approach and extend Gauss–Radau quadrature to the block case by prescribing p identical Ritz values that are smaller than the minimum eigenvalue of A. When considering the quadratic form \(B^T \phi (A) B\), this allows us to obtain two-sided estimates of the solution at the cost of the standard block Lanczos algorithm in the block Krylov subspace \(\mathcal {K}_m(A,B)\). We further show that this choice leads to a (convergent) monotonically decreasing approximation and allows for computable and tight error bounds. This definition is consistent with the Gauss–Radau bound for the A-norm of the error in CG, which to date has only been developed for the scalar case [37]. The matrices we consider in this work arise as discretizations of PDE operators in unbounded domains whose smallest eigenvalue tends to zero; therefore, the p prescribed Ritz values are zero.

We focus on applications stemming from diffusive electromagnetic problems in an unbounded domain where we show that averaging Gauss–Radau and Gauss quadratures leads to a cost-free reduction in approximation error. This significantly improves the accuracy of the standard Krylov projection approximation, even though matrix-function approximation by Krylov projection is near optimal in global norms [1].

A vast body of literature exists for approximating \(\phi (A)B\) via block Krylov subspace methods for symmetric and non-symmetric matrices A [21]. Ideas based on Gauss–Radau quadratures have motivated the Radau–Lanczos method for the scalar (non-block) case [20], and the Radau–Arnoldi method for the non-symmetric block case [22]. Similarly to the Gauss–Radau definition in [35], the Radau–Arnoldi method in [22] modifies p Ritz values at each iteration to be larger than the largest eigenvalue of A. The considered functions \(\phi (\cdot )\) in these works are Riemann–Stieltjes integrals, and we note that the results obtained for the resolvent in this work carry over straightforwardly to such functions.

Monotonicity results for the ordinary Lanczos algorithm were obtained for the exponential function in [8]; see also similar results in [19]. Two-sided monotonic bounds for scalar Gauss and Gauss–Radau quadratures were first obtained in [36]. In this work we generalize these results to matrix-valued quadratures calculated via the block Lanczos algorithm. Using pairs of Gauss and Gauss–Radau quadratures has also been explored for functions of adjacency matrices in [3]. For the related conjugate gradient algorithm, Gauss–Radau based upper bounds for the scalar case have been developed in [37, 38]. For rational functions of matrices, error bounds for the classical Lanczos algorithms have been obtained in [23]. Recently, error bounds for approximating \(\phi (A)B\) with the Lanczos method have been extended to the block Lanczos method [42]. A connection of the block Lanczos algorithm and matrix continued fractions similar to the one used in this work was described in the context of quantum physics for MIMO transfer function approximations in [30].

The remainder of the article is organized as follows: we review the basic properties of the block Lanczos algorithm in Sect. 2, and we show that the resolvent of the Lanczos block tridiagonal matrix leads to a Stieltjes-matrix continued fraction in Sect. 3. Using these continued fractions we define the notion of Gauss–Radau quadrature for the block case in Sect. 4. This is used in Sect. 5 to derive a two-sided error bound. Convergence acceleration based on averaged Gauss–Radau and Gauss quadrature is presented in the same section with a more detailed treatment in “Appendix A”. Last, numerical examples are presented in Sect. 6.

1.2 Notation

Given two square symmetric matrices \(G_1, G_2\) we use the notation \(G_1<G_2\) to mean that the matrix \(G_2-G_1\) is positive definite. A sequence of matrices \(\{G_m\}_{m\ge 0}\) is said to be monotonically increasing (resp. decreasing) if \(G_m < G_{m+1}\) (resp. \(G_{m+1}<G_m\)) for all m. Positive-definite \(p \times p\) matrices are denoted by Greek letters \(\varvec{\alpha },\varvec{\beta },\varvec{\gamma }\) and matrix-valued functions by calligraphic capital letters such as \(\mathcal{C}(s)\) or \( \mathcal{F}(s)\). Last, for \(\varvec{\alpha }, \varvec{\beta }\in \mathbb {R}^{p\times p}\) with \(\varvec{\beta }\) nonsingular, we use the notation \(\frac{\varvec{\alpha }}{\varvec{\beta }}:= \varvec{\alpha }\varvec{\beta }^{-1}\) (right inversion). The matrix \(E_k \in \mathbb {R}^{mp\times p}\) has zero elements except for the \(p\times p\) identity matrix in the k-th block, \(E_k =[0,\ldots , 0,I, \ldots , 0]^T\).

2 Block Gauss Quadratures Via the Block Lanczos Algorithm

For scalar \( \mathcal{F}(s)\), i.e. \(p=1\), the approximation of \( \mathcal{F}(s)=B^T (A+ sI)^{-1}B\) by means of \( \mathcal{F}_m(s)=E_1^T (T_m+ sI)^{-1}E_1\), where \(T_m\) is a specifically designed s.p.d. tridiagonal matrix, can be classically interpreted as a quadrature formula for integrals [3, 36]. The block setting with \(p>1\) and \(T_m\) a specifically designed s.p.d. block tridiagonal matrix allows for an analogous block quadrature interpretation.

Consider the eigendecompositions \(A=Z\Lambda Z^T\) with eigenpairs \((\lambda _i,z_i)\) and \(T_m=\tilde{Z}_m \tilde{\Lambda }_m \tilde{Z}_m^T\), \(T_m \in \mathbb {R}^{mp\times mp}\) with eigenpairs \((\tilde{\lambda }_i,\tilde{z}_i)\). Then we can rewrite \( \mathcal{F}(s)\) as an integral

$$\begin{aligned} B^T (A+ sI)^{-1}B = \sum _{i=1}^n \frac{(z_i^T B)^T(z_i^T B)}{\lambda _i+s} = \int _{0}^\infty \frac{1}{\lambda +s} \,\textrm{d}\mu (\lambda ), \end{aligned}$$

with symmetric, matrix valued spectral measure

$$\begin{aligned} {\mu (\lambda )= \sum _{i=1}^{n}(z_i^T B)^T(z_i^T B) H(\lambda -\lambda _i),} \end{aligned}$$

with \(H(\cdot )\) the Heaviside step function. The approximation \( \mathcal{F}_m(s)\) of this integral can be written as

$$\begin{aligned} E_1^T (T_m+ sI)^{-1}E_1 = \sum _{i=1}^{mp} \frac{(\tilde{z}_i^T E_1)^T(\tilde{z}_i^T E_1)}{\tilde{\lambda }_i+s}, \end{aligned}$$
(2)

which can be interpreted as a quadrature rule \(\sum _i w_i \frac{1}{x_i+s}\) with weights \(w_i=(\tilde{z}_i^T E_1)^T(\tilde{z}_i^T E_1)\) and nodes \(x_i=\tilde{\lambda }_i\). Such a quadrature rule is said to be a Gauss quadrature rule if it exactly integrates polynomials of degree (strictly) less than 2m. A quadrature rule is called a Gauss–Radau quadrature rule if, additionally, one quadrature node is fixed at either end of the integration interval. In this paper we consider matrices A stemming from the discretization of a differential operator with continuous spectrum approaching 0, hence we will fix a quadrature node at \(x_0=\tilde{\lambda }_0 = 0\) [24].
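As a small illustration of the rule (2), the sketch below (function name and interface are ours) extracts the nodes and matrix-valued weights from an eigendecomposition of a given \(T_m\):

```python
import numpy as np

def block_gauss_from_eig(Tm, s, p):
    """Evaluate E_1^T (T_m + s I)^{-1} E_1 as the quadrature sum (2):
    the nodes are the eigenvalues of T_m, the weights are the rank-one
    p x p matrices (z_i^T E_1)^T (z_i^T E_1)."""
    lam, Z = np.linalg.eigh(Tm)          # nodes x_i = lam[i]
    W = Z[:p, :]                         # z_i^T E_1 = first p entries of z_i
    F = np.zeros((p, p))
    for i in range(len(lam)):
        F += np.outer(W[:, i], W[:, i]) / (lam[i] + s)
    return F
```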

Gauss–Radau quadrature has been extended to the block case in Section 4.3 of [35], where p quadrature nodes are prescribed that are distinct and larger than the maximum eigenvalue of A. In our extension of the Gauss–Radau quadrature to the block case, we propose to prescribe the first quadrature node at 0 with a quadrature weight of rank p.

Algorithm 1: Block Lanczos iteration

To obtain a (block)-symmetric \(T_m\) and a quadrature approximation, the block Lanczos algorithm (Algorithm 1) is used. Let us assume that we can perform m steps, with \(mp \le n\), of the block Lanczos iteration without breakdowns or deflationFootnote 1 [26, 39]. As a result, the orthonormal block vectors \(Q_i\in \mathbb {R}^{n \times p}\) form the matrix \({\varvec{Q}}_m=[Q_1, \ldots , Q_m]\in \mathbb {R}^{n\times mp}\), whose columns form an orthonormal basis for the block Krylov subspace

$$\begin{aligned} \mathcal {K}_m(A,B)=\textrm{blkspan} \{B, AB, A^2 B, \dots A^{m-1} B \}. \end{aligned}$$

The Lanczos iteration can then be compactly written as

$$\begin{aligned} A{\varvec{Q}}_m={\varvec{Q}}_m T_m + Q_{m+1} \varvec{\beta }_{m+1}E_m^T, \end{aligned}$$
(3)

where \(T_m\) is the symmetric positive definite block tridiagonal matrix

$$\begin{aligned} T_m= \begin{pmatrix} \varvec{\alpha }_1 & \varvec{\beta }_2^T & & & & & \\ \varvec{\beta }_2 & \varvec{\alpha }_2 & \varvec{\beta }_3^T & & & & \\ & \ddots & \ddots & \ddots & & & \\ & & \varvec{\beta }_{i}& \varvec{\alpha }_i & \varvec{\beta }_{i+1}^T & & \\ & & & \ddots & \ddots & \ddots & \\ & & & & \varvec{\beta }_{m-1}& \varvec{\alpha }_{m-1}& \varvec{\beta }_m^T\\ & & & & & \varvec{\beta }_m& \varvec{\alpha }_m\\ \end{pmatrix} , \end{aligned}$$
(4)

and \(\varvec{\alpha }_i, \varvec{\beta }_i \in \mathbb {R}^{p \times p}\) are the block coefficients in Algorithm 1.
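Since the pseudocode of Algorithm 1 is not reproduced above, the following sketch shows one standard realization of the block Lanczos three-term recurrence producing the coefficients \(\varvec{\alpha }_i, \varvec{\beta }_i\) of (3)–(4); it assumes no breakdown or deflation and, per Remark 1 below, would benefit from full reorthogonalization in floating point arithmetic:

```python
import numpy as np

def block_lanczos(A, B, m):
    """One realization of Algorithm 1 (block Lanczos): returns the block
    coefficients alpha_1..alpha_m and beta_2..beta_{m+1} of Eqs. (3)-(4).
    Assumes B has orthonormal columns and that no deflation occurs."""
    n, p = B.shape
    Q_prev, Q_cur = np.zeros((n, p)), B
    beta = np.zeros((p, p))                  # beta_1 = 0
    alphas, betas = [], []
    for _ in range(m):
        R = A @ Q_cur - Q_prev @ beta.T      # three-term recurrence
        alpha = Q_cur.T @ R
        R = R - Q_cur @ alpha
        # optionally reorthogonalize R against all previous blocks here
        Q_next, beta = np.linalg.qr(R)       # R = Q_{i+1} beta_{i+1}
        alphas.append(alpha)
        betas.append(beta)
        Q_prev, Q_cur = Q_cur, Q_next
    return alphas, betas                     # betas[-1] is beta_{m+1}
```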

Using the Lanczos decomposition, \( \mathcal{F}(s)\) can be approximated as

$$\begin{aligned} \mathcal{F}(s)\approx \mathcal{F}_m(s)= E_1^T\phi (T_m,s)E_1. \end{aligned}$$
(5)

This approximation is known as a block Gauss quadrature rule. In the following, we show that \( \mathcal{F}_m\) is monotonically increasing with m, i.e. \( \mathcal{F}_m(s)< \mathcal{F}_{m+1}(s)< \mathcal{F}(s)\), and thereby converges monotonically. Analogous results for the non-block case (i.e., \(p=1\)) for a larger class of functions were obtained in [8, 19, 36]. We also introduce a computable upper-bound matrix \( {\tilde{\mathcal{F}}}_m\) that monotonically decays as the iterations increase, that is \( {\tilde{\mathcal{F}}}_m(s)> {\tilde{\mathcal{F}}}_{m+1}(s)> \mathcal{F}(s)> \mathcal{F}_m(s)\). These two-sided bounds then provide an error measure and allow us to reduce the approximation error using an averaging formula. The quantity \( {\tilde{\mathcal{F}}}_m\) may be interpreted as an extension to \(p>1\) of the approximation via a Gauss–Radau quadrature rule introduced in [36].

Remark 1

The simple Lanczos algorithm using only three-term recursions without re-orthogonalization is known to be unstable due to computer round-off. The instability is manifested by the loss of orthogonality of the Lanczos vectors and the appearance of spurious copies of the Lanczos eigenvalues. However, this instability only dampens the spectral adaptation of the approximation of actions of regular enough matrix functions and does not affect the converged approximation [9, 13, 27]. Likewise, even though the spectral nodes and weights of the Gaussian quadratures computed via the simple Lanczos algorithm can be affected by round-off, the converged quadrature rule is not [25, 32]. Moreover, compared to the actions of matrix functions, the quadrature convergence is less affected by the loss of orthogonality of the Lanczos vectors, because these vectors are not explicitly used in the computation of the Gaussian quadratures. We assume that the same reasoning applies to the block extensions as well as to the Gauss–Radau quadrature, which we observe in our numerical experiments. Specifically, we assume that \(T_m\) obtained from the three-term block recursions is s.p.d. and (thereby) that its block Cholesky factorization exists.

3 Gauss Quadrature and the Matrix S-fraction

We represent the block tridiagonal Lanczos matrix \(T_m\) (see Eq. (4)) in a block \(LDL^T\) decomposition.

Lemma 1

Let \(T_m\) be as defined in Eq. (4). Then there exist \(\hat{\varvec{\kappa }}_{i}\in {\mathbb {R}}^{p\times p}\), with \(\hat{\varvec{\kappa }}_1=I_p\), and \(\varvec{\gamma }_{i}\in {\mathbb {R}}^{p\times p}\), all of full rank, such that

$$\begin{aligned} T_{m} :=(\widehat{\varvec{K}}_{m}^{-1})^T {J}_{m} \varvec{\Gamma }^{-1}_m {J}_{m}^T \widehat{\varvec{K}}_{m}^{-1} \end{aligned}$$
(6)

where

$$\begin{aligned} {J}_{m}^T = \begin{bmatrix} I_p & -I_p & ~ & ~ \\ ~ & \ddots & \ddots & ~ \\ ~& ~& \ddots & -I_p \\ ~ & ~& ~& I_p \end{bmatrix}\in {\mathbb {R}}^{pm\times pm}, \begin{array}{ll} \widehat{\varvec{K}}_{m}& =\textrm{blkdiag}(\hat{\varvec{\kappa }}_{1},\dots ,\hat{\varvec{\kappa }}_{m})\\ {\varvec{\Gamma }}_m& =\textrm{blkdiag}(\varvec{\gamma }_{1},\dots ,\varvec{\gamma }_{m}), \end{array} \end{aligned}$$

and \(\varvec{\alpha }_1=(\hat{\varvec{\kappa }}_{1}^{-1})^T\varvec{\gamma }_1^{-1}\hat{\varvec{\kappa }}_{1}^{-1}=\varvec{\gamma }_1^{-1}\), \(\varvec{\alpha }_i=(\hat{\varvec{\kappa }}_{i}^{-1})^T(\varvec{\gamma }_{i-1}^{-1}+\varvec{\gamma }_{i}^{-1})\hat{\varvec{\kappa }}_{i}^{-1}\) and \(\varvec{\beta }_i= {-(\hat{\varvec{\kappa }}_{i}^{-1})^T \varvec{\gamma }_{i-1}^{-1} \hat{\varvec{\kappa }}_{i-1}^{-1}}\) for \(i=2,\ldots ,m\).

Proof

The block \(LDL^T\) decomposition of a positive definite block tridiagonal matrix consists of a block lower bidiagonal L with identity diagonal blocks and a block diagonal s.p.d. D. The parametrization in Eq. (6) is a general parameterization of such an \(LDL^T\) decomposition with \(L=(\widehat{\varvec{K}}_{m}^{-1})^T {J}_{m} (\widehat{\varvec{K}}_{m})^T\) and \(D=(\widehat{\varvec{K}}_{m}^{-1})^T \varvec{\Gamma }^{-1}_m (\widehat{\varvec{K}}_{m}^{-1})\).

The \(p\times p\) matrices must be of full rank, since they are Cholesky factors of the s.p.d. Lanczos matrix (6), which is of full rank. Further, by the same reasoning, the (block)-diagonal factor \({\varvec{\Gamma }}_m\) is s.p.d. since D is s.p.d. \(\square \)

We can extract the matrices \(\varvec{\gamma }_i\) and \(\hat{\varvec{\kappa }}_{i}\) directly during the block Lanczos recurrences using the coefficients \(\varvec{\alpha }_i\) and \(\varvec{\beta }_i\). This is reported in Algorithm 2, which in a different parameterization was derived in Appendix C of [14]. From their construction, we obtain that \(\hat{\varvec{\kappa }}_i\) and \(\varvec{\gamma }_{i}\) are of full rank as long as no deflation occurs in the block Lanczos recurrence, i.e. \(\varvec{\beta }_i^T\varvec{\beta }_i>0\) for \(i=2,\ldots ,m\); nonsingularity of all coefficients \(\varvec{\alpha }_i,\varvec{\beta }_i\) ensures that Algorithm 2 runs without breakdowns. This algorithm is also related to the matrix continued fraction construction via the block Lanczos algorithm used for the computation of quantum scattering problems [30].

Algorithm 2: Extraction algorithm \(\varvec{\gamma }/\hat{\varvec{\kappa }}\) (block \(LDL^T\) Cholesky factorization of the block tridiagonal \(T_m\))
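Since the pseudocode of Algorithm 2 is likewise not reproduced, here is a minimal sketch consistent with the relations of Lemma 1; the paper's version, derived in a different parameterization in Appendix C of [14], may differ in details such as signs absorbed into the \(\hat{\varvec{\kappa }}_i\):

```python
import numpy as np

def stieltjes_parameters(alphas, betas):
    """A sketch of Algorithm 2 via the relations of Lemma 1: from the block
    Lanczos coefficients alpha_1..alpha_m and beta_2..beta_m, compute gamma_i
    and hat_gamma_i = kappa_i^T kappa_i with kappa_1 = I (Eq. (7))."""
    p = alphas[0].shape[0]
    kappas = [np.eye(p)]
    gammas = [np.linalg.inv(alphas[0])]          # gamma_1^{-1} = alpha_1
    for i in range(1, len(alphas)):
        # beta_i = -kappa_i^{-T} gamma_{i-1}^{-1} kappa_{i-1}^{-1}
        #   =>  kappa_i^{-T} = -beta_i kappa_{i-1} gamma_{i-1}
        kappa = np.linalg.inv((-betas[i - 1] @ kappas[i - 1] @ gammas[i - 1]).T)
        # alpha_i = kappa_i^{-T} (gamma_{i-1}^{-1} + gamma_i^{-1}) kappa_i^{-1}
        gamma_inv = kappa.T @ alphas[i] @ kappa - np.linalg.inv(gammas[i - 1])
        kappas.append(kappa)
        gammas.append(np.linalg.inv(gamma_inv))
    hat_gammas = [k.T @ k for k in kappas]
    return gammas, hat_gammas, kappas
```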

To write \( \mathcal{F}_m\) as a matricial continued S-fraction, we first convert \(T_m\) to pencil form using the factors introduced in (6). To this end, we introduce the matrices \(\varvec{\hat{\gamma }}_j\)

$$\begin{aligned} \varvec{\hat{\gamma }}_j =\hat{\varvec{\kappa }}_j^T \hat{\varvec{\kappa }}_j, \quad j=1,\ldots ,m. \end{aligned}$$
(7)

The matrices \(\varvec{\gamma }_j\) and \(\varvec{\hat{\gamma }}_j\) are known as the Stieltjes parameters [16] and they are both s.p.d. if block Lanczos runs without breakdown.

Let \( Z_m:= {J}_m \varvec{\Gamma }_m^{-1} {J}_{m}^T\) and \(\varvec{\widehat{\Gamma }}_m=\textrm{blkdiag}(\varvec{\hat{\gamma }}_{1},\dots ,\varvec{\hat{\gamma }}_{m})\). Then, due to the initial condition \(\varvec{\hat{\gamma }}_1=I_p\), the function \( \mathcal{F}_m\) can be rewritten using the pencil form \((Z_m,\varvec{\widehat{\Gamma }}_m)\), that is

$$\begin{aligned} \mathcal{F}_m(s)=E_1^T(T_m+sI)^{-1}E_1= E_1^T( Z_m+s \widehat{\varvec{\Gamma }}_m)^{-1}E_1. \end{aligned}$$
(8)

Next, we show that \( \mathcal{F}_m\) is a matricial Stieltjes continued fraction (S-fraction) that can be written as

$$\begin{aligned} \mathcal{F}_m(s)= \frac{1}{s\varvec{\hat{\gamma }}_1+ \frac{1}{\varvec{\gamma }_1 + \frac{1}{s\varvec{\hat{\gamma }}_2 +\frac{1}{ \ddots \frac{1}{s \varvec{\hat{\gamma }}_m + \frac{1}{ \varvec{\gamma }_m}} }}}} , \end{aligned}$$
(9)

where the basic building block of these S-fractions is given by the recursion

$$\begin{aligned} \mathcal{C}_i(s)=\frac{1}{s\varvec{\hat{\gamma }}_i + \frac{1}{\varvec{\gamma }_i+\mathcal{C}_{i+1}(s)}} , \qquad \textrm{Re}\{s\}>0. \end{aligned}$$
(10)

Proposition 1

Let the recursion for \(\mathcal{C}_i(s)\), \(i=m, \ldots , 1\), be as in (10). Then for \(s\in {\mathbb {R}}_+\) and \(\mathcal{C}_{m+1}(s) = 0\), the function \( \mathcal{F}_m\) is a continued matrix S-fraction defined as

$$\begin{aligned} \mathcal{F}_m(s):=\mathcal{C}_1(s) . \end{aligned}$$
(11)

This result can be formally related to earlier works by Dyukarev et al. [16, 17] on matricial continued S-fractions. The proof used here is related to the derivation by Ingerman (Section 2.4 of [28]).

Proof

We recall that \( \mathcal{F}_m(s)\) corresponds to the first \(p\times p\) block of the solution \({ U}_m\) of the linear system

$$\begin{aligned} ( Z_m+s \widehat{\varvec{\Gamma }}_m){ U}_m =E_1. \end{aligned}$$

Let \({ U}_m = [\mathcal{U}_1; \ldots ; \mathcal{U}_m]\), with \(\mathcal{U}_i\) a \(p\times p\) block. We notice that \( U_m\) has full column rank. Let us write down the relevant block equations of this system, that is

$$\begin{aligned} (\frac{1}{\varvec{\gamma }_1}+s\hat{\varvec{\gamma }}_1) \mathcal{U}_1 - \frac{1}{\varvec{\gamma }_1} \mathcal{U}_2= & I_p \end{aligned}$$
(12)
$$\begin{aligned} - \frac{1}{\varvec{\gamma }_{i-1}} \mathcal{U}_{i-1} +(\frac{1}{\varvec{\gamma }_i} + \frac{1}{\varvec{\gamma }_{i-1}} +s\hat{\varvec{\gamma }}_i) \mathcal{U}_i - \frac{1}{\varvec{\gamma }_{i}} \mathcal{U}_{i+1}= & 0, \quad i=2,\ldots , m-1 \end{aligned}$$
(13)
$$\begin{aligned} - \frac{1}{\varvec{\gamma }_{m-1}} \mathcal{U}_{m-1} +(\frac{1}{\varvec{\gamma }_{m}}+\frac{1}{\varvec{\gamma }_{m-1}}+s\hat{\varvec{\gamma }}_m) \mathcal{U}_m= & 0. \end{aligned}$$
(14)

As a preliminary observation, we note that we can assume that all \(\mathcal{U}_i\) are of full rank. Indeed, since the coefficient matrix is block tridiagonal and irreducible, if there exists a nonzero vector v such that \(\mathcal{U}_{i}v=0\) for some i, the components of \(\mathcal{U}_1v\) can be obtained by solving the \((i-1)\times (i-1)\) block system with the top-left portion of the coefficient matrixFootnote 2.

For \(1<i \le m\), let us formally define the operation

$$\begin{aligned} \mathcal{C}_i^{-1} \mathcal{U}_i:= - \frac{1}{\varvec{\gamma }_{i-1}} (\mathcal{U}_i-\mathcal{U}_{i-1}). \end{aligned}$$

Note that this allows us to write

$$\begin{aligned} \mathcal{C}_i^{-1}\mathcal{U}_i = (\mathcal{C}_i + \varvec{\gamma }_{i-1})^{-1} \mathcal{U}_{i-1}. \end{aligned}$$
(15)

Rearranging terms in the last block equation (14), we obtain

$$\begin{aligned} -\mathcal{C}_m^{-1} \mathcal{U}_m + s \hat{\varvec{\gamma }}_m \mathcal{U}_m + \frac{1}{\varvec{\gamma }_m} \mathcal{U}_m = 0, \end{aligned}$$

so that, since \(\mathcal{U}_m\) is non-singular, we obtain \(\mathcal{C}_m^{-1}= s \hat{\varvec{\gamma }}_m + \frac{1}{\varvec{\gamma }_m}\). Reordering terms similarly in the i-th equation (13), with \(1<i < m\), we obtain

$$\begin{aligned} -\mathcal{C}_i^{-1}\mathcal{U}_i + s \hat{\varvec{\gamma }}_i \mathcal{U}_i + \mathcal{C}_{i+1}^{-1} \mathcal{U}_{i+1} = 0. \end{aligned}$$

Using (15) with index \(i+1\) in place of i and substituting, we rewrite the equation above as

$$\begin{aligned} -\mathcal{C}_i^{-1}\mathcal{U}_i + s \hat{\varvec{\gamma }}_i \mathcal{U}_i + (\mathcal{C}_{i+1}+\varvec{\gamma }_i)^{-1} \mathcal{U}_{i} = 0, \end{aligned}$$

yielding the following recurrence for \(\mathcal{C}_i\) (for \(\mathcal{U}_i\) nonsingular)

$$\begin{aligned} \mathcal{C}_i = \frac{1}{s \hat{\varvec{\gamma }}_i + \frac{1}{\mathcal{C}_{i+1}+\varvec{\gamma }_i}}. \end{aligned}$$

Rearranging also the first Eq. (12), we have

$$\begin{aligned} s\hat{\varvec{\gamma }}_1 \mathcal{U}_1 - \frac{1}{\varvec{\gamma }_1}(\mathcal{U}_2 - \mathcal{U}_1) = I_p, \end{aligned}$$

so that \(s\hat{\varvec{\gamma }}_1 \mathcal{U}_1 + \mathcal{C}_2^{-1} \mathcal{U}_2 = I_p\). Using (15), we obtain \(s\hat{\varvec{\gamma }}_1 \mathcal{U}_1 + (\mathcal{C}_2+\varvec{\gamma }_1)^{-1} \mathcal{U}_1 = I_p\), from which we get the final relation,

$$\begin{aligned} \mathcal{U}_1 = \frac{1}{s\hat{\varvec{\gamma }}_1+\frac{1}{\mathcal{C}_2+\varvec{\gamma }_1}} \equiv \mathcal{C}_1. \end{aligned}$$

\(\square \)
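The proof translates directly into a backward evaluation of the recursion (10), sketched below (reusing the Stieltjes parameters produced by the sketch after Algorithm 2); the result can be checked numerically against \(E_1^T(T_m+sI)^{-1}E_1\):

```python
import numpy as np

def s_fraction_gauss(gammas, hat_gammas, s):
    """Evaluate F_m(s) as the matrix S-fraction (9): run the recursion (10)
    backwards from the final condition C_{m+1} = 0 down to C_1."""
    p = gammas[0].shape[0]
    C = np.zeros((p, p))                                 # C_{m+1} = 0
    for g, gh in zip(reversed(gammas), reversed(hat_gammas)):
        C = np.linalg.inv(s * gh + np.linalg.inv(g + C))
    return C                                             # C_1 = F_m(s)
```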

4 Block Gauss–Radau Quadrature

Let us modify the lowest floor of (9) by replacing \(\varvec{\gamma }_m\) with \(\frac{\varvec{\gamma }_m}{\epsilon }\), \(\epsilon >0\) and keeping the remaining Stieltjes parameters unchanged. In the limit of \(\epsilon \rightarrow 0\) such a modification yields

$$\begin{aligned} {\tilde{\mathcal{F}}}_m(s) =\underset{{\epsilon \rightarrow 0}}{\lim } \frac{1}{s\varvec{\hat{\gamma }}_1+ \frac{1}{\varvec{\gamma }_1 + \frac{1}{s\varvec{\hat{\gamma }}_2 +\frac{1}{ \ddots \frac{1}{s \varvec{\hat{\gamma }}_m + \frac{\epsilon }{ \varvec{\gamma }_m}} }}}} = \frac{1}{s\varvec{\hat{\gamma }}_1+ \frac{1}{\varvec{\gamma }_1 + \frac{1}{s\varvec{\hat{\gamma }}_2 +\frac{1}{ \ddots \frac{1}{s \varvec{\hat{\gamma }}_m } }}}}. \end{aligned}$$
(16)

Equivalently, such a limiting transition can be written using the resolvent of a block tridiagonal matrix as

$$\begin{aligned} {\tilde{\mathcal{F}}}_m(s)=E_1^T(\tilde{T}_m+sI)^{-1}E_1, \end{aligned}$$

where

$$\begin{aligned} \tilde{T}_m= \begin{pmatrix} \varvec{\alpha }_1 & \varvec{\beta }_2^T & & & & & \\ \varvec{\beta }_2 & \varvec{\alpha }_2 & \varvec{\beta }_3^T & & & & \\ & \ddots & \ddots & \ddots & & & \\ & & \varvec{\beta }_{i}& \varvec{\alpha }_i & \varvec{\beta }_{i+1}^T & & \\ & & & \ddots & \ddots & \ddots & \\ & & & & \varvec{\beta }_{m-1}& \varvec{\alpha }_{m-1}& \varvec{\beta }_m^T\\ & & & & & \varvec{\beta }_m& \tilde{\varvec{\alpha }}_m\\ \end{pmatrix} , \end{aligned}$$
(17)

and \(\varvec{\alpha }_i\), \(i=1,\ldots , m-1\) and \(\varvec{\beta }_i\), \(i=2,\ldots , m\) are the same block coefficients as in Algorithm 1, with

$$\begin{aligned} \tilde{\varvec{\alpha }}_m=\lim _{\epsilon \rightarrow 0}\varvec{\alpha }_m(\epsilon )= \lim _{\epsilon \rightarrow 0} (\hat{\varvec{\kappa }}_{m}^{-1})^T\left[ \varvec{\gamma }_{m-1}^{-1}+ \left( {\frac{\varvec{\gamma }_m}{\epsilon }}\right) ^{-1}\right] \hat{\varvec{\kappa }}_{m}^{-1}= (\hat{\varvec{\kappa }}_{m}^{-1})^T\varvec{\gamma }_{m-1}^{-1}\hat{\varvec{\kappa }}_{m}^{-1}, \end{aligned}$$
(18)

using \(\varvec{\gamma }_{m-1}\) and \(\hat{\varvec{\kappa }}_{m}\) computed via Algorithm 2. The following proposition is fundamental for understanding the distinct properties of \( \mathcal{F}_m\) and \( {\tilde{\mathcal{F}}}_m\) and the relation of \( {\tilde{\mathcal{F}}}_m\) to the Gauss–Radau quadrature.
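In matrix form, the modification (17)–(18) amounts to overwriting a single block of \(T_m\); a sketch (helper name ours, requiring \(m\ge 2\)):

```python
import numpy as np

def radau_matrix(Tm, kappas, gammas, p):
    """Form the Gauss-Radau matrix T~_m of Eq. (17): replace the last diagonal
    block of T_m by alpha~_m = kappa_m^{-T} gamma_{m-1}^{-1} kappa_m^{-1} (18)."""
    Tt = Tm.copy()
    km_inv = np.linalg.inv(kappas[-1])
    Tt[-p:, -p:] = km_inv.T @ np.linalg.inv(gammas[-2]) @ km_inv
    return Tt
```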

Proposition 2

The matrix function \( \mathcal{F}_m(s)\) is analytic and of full rank at \(s=0\), with

$$\begin{aligned} \mathcal{F}_m(0)=\sum _{i=1}^m\varvec{\gamma }_i. \end{aligned}$$
(19)

Further, \( {\tilde{\mathcal{F}}}_m(s)\) has a pole of order one at \(s=0\) with a residue matrix of rank p, specifically

$$\begin{aligned} \lim _{s\rightarrow 0}(s {\tilde{\mathcal{F}}}_m)=\left( \sum _{i=1}^{m} \varvec{\hat{\gamma }}_i\right) ^{-1}. \end{aligned}$$
(20)

Proof

The basic building block of the considered S-fractions is given in (10), which allows us to succinctly define the truncated matrix continued S-fraction (16) recursively as

$$\begin{aligned} {\tilde{\mathcal{F}}}_m(s):=\mathcal{C}_1, \quad \textrm{for}\quad {\mathcal{C}_{m+1}(s) = +\infty }, \end{aligned}$$
(21)

in the same fashion as \( \mathcal{F}_m(s)\) was defined via (11).

We start with (19) and use the recursions (10) and (11) to compute \( \mathcal{F}_m(0)\). First, we notice that \(\mathcal{C}_m(0)=\varvec{\gamma }_m\). Then for \(i\le m\) we get \(\mathcal{C}_{i-1}(0)=\mathcal{C}_{i}(0)+\varvec{\gamma }_{i-1}\), which proves (19) by induction.

Likewise, (20) can be obtained from the recursions (10) and (21). We start by computing \((\lim _{s\rightarrow 0} s\mathcal{C}_{m})^{-1}=\varvec{\hat{\gamma }}_{m}\) and then get the recursion

$$\begin{aligned} (\lim _{s\rightarrow 0} {s\mathcal{C}_{i-1}})^{-1}=\varvec{\hat{\gamma }}_{i-1}+(\lim _{s\rightarrow 0} {s\mathcal{C}_{i}})^{-1} \text { for } i=m,\ldots ,2. \end{aligned}$$

Then (20) follows by substituting \(\mathcal{C}_1\) into the definition (21). The inverse of a sum of s.p.d. matrices is of full rank, which concludes the proof. \(\square \)
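Both identities are easy to probe numerically. The following illustrative check reuses gammas, hat_gammas, p and the matrix Tt = radau_matrix(...) from the sketches above, approximating the \(s\rightarrow 0\) limits by a small shift:

```python
import numpy as np

s = 1e-6                                     # small s probes the s -> 0 limits
err19 = np.linalg.norm(s_fraction_gauss(gammas, hat_gammas, s) - sum(gammas))
mp = Tt.shape[0]
E1 = np.eye(mp)[:, :p]
sFt = s * (E1.T @ np.linalg.solve(Tt + s * np.eye(mp), E1))
err20 = np.linalg.norm(sFt - np.linalg.inv(sum(hat_gammas)))
print(err19, err20)                          # both errors shrink with s
```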

The poles of \( {\tilde{\mathcal{F}}}_m\) (as of any matrix S-fraction) are real non-positive and the pole at \(s=0\) is of first order with residue matrix of rank p, which follows from the second result of Proposition 2. Therefore, \(\tilde{T}_m\) has eigenvalue zero with multiplicity p and we arrive at the following corollary.

Corollary 1

The symmetric block tridiagonal matrix \(\tilde{T}_m\) is positive semi-definite with a p-dimensional null space.

Because it is obtained from \(T_m\) by modifying only the last diagonal block \(\tilde{\varvec{\alpha }}_m\), \(\tilde{ \mathcal{F}}_m\) can be viewed as a natural extension of the Gauss–Radau quadrature introduced in [36] for \(p=1\); thus we will call it the block Gauss–Radau quadrature.

Remark 2

Other choices of \(\tilde{\varvec{\alpha }}_m\) that lead to block Gauss–Radau rules are discussed in [35], where p distinct eigenvalues of the block tridiagonal matrix \(\tilde{T}_m\) are prescribed by solving a block tridiagonal system. Here we consider discretizations of differential operators with a continuous spectrum whose lowest eigenvalue tends to zero, such that we choose p eigenvalues at zero.

Remark 3

The theory presented in this manuscript was developed for s.p.d. matrices. In case A is a nonnegative definite matrix and the intersection of the column space of B with the null space of A is trivial (\(\textrm{col}(B) \cap \textrm{null}(A) = \{0\}\)), we can still use block Lanczos algorithms to approximate the considered quadratic forms. In exact arithmetic \(T_m\) would be positive definite in this case, a property that is lost in computer arithmetic. However, a spectral correction can be applied to the eigenvalues of \(T_m\) that are not positive by extending the spectral projectors introduced for the non-block case in [12] to the block case. Essentially, eigenvalues below a spectral threshold are shifted to the threshold and block Lanczos is rerun on the spectrally corrected matrix to restore the block tridiagonal structure. In the numerical experiments presented in Sects. 6.1.2 and 6.2 the considered matrix A is nonnegative definite, yet no spectral corrections were needed.
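A hedged sketch of such a spectral correction (the thresholding step only; the interface and threshold choice are ours, and [12] treats the non-block case):

```python
import numpy as np

def spectral_correction(Tm, threshold):
    """Shift eigenvalues of T_m lying below a small positive threshold up to
    the threshold (cf. Remark 3); block Lanczos can then be rerun on the
    corrected matrix to restore its block tridiagonal structure."""
    lam, Z = np.linalg.eigh(Tm)
    return Z @ np.diag(np.maximum(lam, threshold)) @ Z.T
```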

5 Two-Sided Error Bounds and Convergence Acceleration Via Averaging of Gauss and Gauss–Radau Rules

5.1 Monotonically Convergent Lower and Upper Bounds

In the absence of deflation, the block Gauss quadrature \( \mathcal{F}_m(s)\) eventually converges to \( \mathcal{F}(s)\), as all Krylov approximations do. The Gauss–Radau quadrature \( {\tilde{\mathcal{F}}}_m(s)\) differs from \( \mathcal{F}_m(s)\) only by the absence of the lowest floor in the S-fraction representation, thus it also converges to the same limit. We next prove that the two sequences give monotonic two-sided bounds and that their difference bounds the approximation error.

Theorem 1

For a given real positive s, the following holds,

(i) The matrix sequence \(\{ \mathcal{F}_m(s)\}_{m\ge 1}\) is increasing, i.e. \( \mathcal{F}_m(s)< \mathcal{F}_{m+1}(s)\);

(ii) The matrix sequence \(\{ {\tilde{\mathcal{F}}}_m(s)\}_{m\ge 1}\) is decreasing, i.e. \( {\tilde{\mathcal{F}}}_m(s)> {\tilde{\mathcal{F}}}_{m+1}(s)\).

Proof

From the recursion (10) with final condition \(\mathcal{C}_{m+1}=0\) (cf. Proposition 1) we get \(\mathcal{C}_i(s)>0\) for all \(s>0\) and \(i<m+1\). From (10) it follows that we get \( \mathcal{F}_{m-1}\) if we impose \(\mathcal{C}_{m}=0\). From (10) it also follows that \(\mathcal{C}_i\) is monotonically increasing with respect to \(\mathcal{C}_{i+1}\) (since \(\mathcal{C}_{i+1}\) appears in the denominator of \(\mathcal{C}_{i}\)'s denominator), so replacing the final condition by \(\mathcal{C}_{m}=0\) decreases \(\mathcal{C}_{m-1}\), which consequently decreases \(\mathcal{C}_{m-2}\), etc., and by induction we obtain \( \mathcal{F}_{m-1}< \mathcal{F}_m\).

For proving (ii), we recall that one can get \( {\tilde{\mathcal{F}}}_{m-1}\) by terminating the S-fraction with \(\mathcal{C}_{m}=+\infty \). Then with the same monotonicity argument as for \( \mathcal{F}_m\), it follows that such a substitution recursively increases \(\mathcal{C}_i\), \(i=m-1,m-2,\ldots , 1 \). \(\square \)

Now we are ready to formulate a two-sided Gauss–Radau-type estimate.

Proposition 3

Let Algorithm 1 produce \(\varvec{\beta }_j^T\varvec{\beta }_j>0\) for \(j\le m\). Then

$$\begin{aligned} 0< \mathcal{F}_{m-1}(s)< \mathcal{F}_{m}(s)< \mathcal{F}(s)< \tilde{\mathcal{F}}_{m}(s)<\tilde{\mathcal{F}}_{m-1}(s) \quad \forall s\in {\mathbb {R}}_+. \end{aligned}$$

Proof

We apply Theorem 1 to \( \mathcal{F}_m\) and \( {\tilde{\mathcal{F}}}_m\), yielding \(0< \mathcal{F}_{m-1}(s)< \mathcal{F}_{m}(s)\) and \( {\tilde{\mathcal{F}}}_{m}(s)< {\tilde{\mathcal{F}}}_{m-1}(s)\). To conclude the proof, we notice that for s satisfying the condition of the proposition \(\lim _{m\rightarrow \infty } \mathcal{F}_m(s)= \mathcal{F}(s)\), and hence, by monotonicity, \( \mathcal{F}_m(s)< \mathcal{F}(s)\). From \(T_m- \tilde{T}_m\ge 0\) we obtain

$$\begin{aligned} \tilde{ \mathcal{F}}_m(s)- \mathcal{F}_m(s)= & E_1^T(\tilde{T}_m+sI)^{-1}E_1-E_1^T (T_m+sI)^{-1}E_1\\= & \int _{0}^{1}\frac{d}{dr}E_1^T(T_m+ r(\tilde{T}_m-T_m)+sI)^{-1}E_1 d r\\= & -\int _{0}^{1}V(r)^T(\tilde{T}_m-T_m)V(r) dr \ge 0, \end{aligned}$$

with \(V(r)=[T_m+ r(\tilde{T}_m-T_m)+sI]^{-1}E_1\), which concludes the proof. \(\square \)

The ordering above provides us with a computable error estimate of the approximation of \( \mathcal{F}\), that is

$$\begin{aligned} \Vert \mathcal{F}- \mathcal{F}_m \Vert < \Vert {\tilde{\mathcal{F}}}_m - \mathcal{F}_m \Vert . \end{aligned}$$
(22)
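In practice, (22) yields an a posteriori stopping criterion at negligible extra cost beyond the Lanczos recurrence itself; a sketch (helper name ours):

```python
import numpy as np

def two_sided_gap(Tm, Tt, s, p):
    """Computable bound (22): the 2-norm gap between the Gauss rule F_m(s)
    and the Gauss-Radau rule F~_m(s) bounds ||F(s) - F_m(s)||."""
    mp = Tm.shape[0]
    E1 = np.eye(mp)[:, :p]
    Fm = E1.T @ np.linalg.solve(Tm + s * np.eye(mp), E1)
    Ft = E1.T @ np.linalg.solve(Tt + s * np.eye(mp), E1)
    return np.linalg.norm(Ft - Fm, 2), Fm, Ft
```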

The two-sided bounds of Proposition 3 indicate the possibility of using their averaging to compute an error correction. Such an approach was considered in the works of Reichel and his co-authors for Gauss–Radau quadrature with \(p=1\) [36] and for \(p>1\) for anti-Gauss quadratures [18]. Here we identify an important class of PDE operators for which the averaging is particularly efficient.

5.2 Acceleration Via Averaged Gauss–Radau and Gauss Rules

We propose two averaging formulas for the developed block Gauss–Radau and block Gauss rules to accelerate convergence. We motivate and analyze these averaging formulas in “Appendix A”.

The first averaging formula, which we refer to as “Averaging 1” in the numerical experiments, is the arithmetic average of the Gauss and Gauss–Radau approximations

$$\begin{aligned} \widehat{\mathcal{F}}_m(s):=\frac{1}{2} \left( \mathcal{F}_m(s)+\tilde{\mathcal{F}}_{m+1}(s)\right) . \end{aligned}$$
(23)

We stress that the computation of \(\tilde{ \mathcal{F}}_{m+1}(s)\) requires the same m steps of the block Lanczos algorithm as \( \mathcal{F}_{m}(s)\).

In “Appendix A” (for \(p=1\)) we show that for cases with linear convergence (corresponding to problems arising from the discretization of operators with continuous spectrum, e.g., PDEs in unbounded domains) such an averaging leads to an approximate error reduction by a factor equal to the linear convergence rate of the Gauss or Gauss–Radau quadratures, so the proposed averaging will be beneficial for problems with slow linear convergence.

The second averaging formula, which we refer to as “Averaging 2” in the numerical experiments, is the geometric mean

$$\begin{aligned} \widehat{\mathcal{F}}_m(s)^{\frac{1}{2}}\left( \widehat{\mathcal{F}}_m(s)^{-\frac{1}{2}}\, \overline{ \mathcal{F}_m}(s)\, \widehat{\mathcal{F}}_m(s)^{-\frac{1}{2}}\right) ^{\frac{1}{2}} \widehat{\mathcal{F}}_m(s)^{\frac{1}{2}} \end{aligned}$$

(24)

of the arithmetic mean \(\widehat{\mathcal{F}}_m(s)\) and the harmonic mean \(\overline{ \mathcal{F}_m}(s)\) defined via

$$\begin{aligned} \overline{ \mathcal{F}_m}(s)^{-1}:=\frac{1}{2} \left[ ( \mathcal{F}_m(s))^{-1}+(\tilde{\mathcal{F}}_{m+1}(s))^{-1}\right] . \end{aligned}$$

(25)

This average is invariant if we substitute \( \mathcal{F}^{-1}\) for \( \mathcal{F}\) and reverse their order, i.e. the averaging formulas for \( \mathcal{F}\) and \( \mathcal{F}^{-1}\) coincide. This is a significant property of relevance for matrices A stemming from discretizations of 2D PDEs as explained in “Appendix A”. The simple averaging formula (23) lacks this property.
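A sketch of both averaging formulas follows, realizing (24) with the standard s.p.d. matrix geometric mean \(X^{1/2}(X^{-1/2} Y X^{-1/2})^{1/2} X^{1/2}\); for \(p=1\) this reduces to \(\sqrt{\widehat{\mathcal{F}}_m \overline{\mathcal{F}_m}}\):

```python
import numpy as np
from scipy.linalg import sqrtm

def averaged_rules(Fm, Ft_m1):
    """Averaging 1 (23) and Averaging 2 (24) from the Gauss rule F_m(s) and
    the Gauss-Radau rule F~_{m+1}(s), both available after m Lanczos steps."""
    F_arith = 0.5 * (Fm + Ft_m1)                                # Eq. (23)
    F_harm = np.linalg.inv(0.5 * (np.linalg.inv(Fm)
                                  + np.linalg.inv(Ft_m1)))      # Eq. (25)
    R = sqrtm(F_arith)                          # s.p.d. matrix square root
    Ri = np.linalg.inv(R)
    F_geo = R @ sqrtm(Ri @ F_harm @ Ri) @ R     # geometric mean, Eq. (24)
    return F_arith, F_geo
```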

6 Numerical Examples

In this section, we consider the application of the developed error bounds and show that the averaging formulas accelerate convergence for MIMO transfer function computations for PDE operators with continuous or very dense spectrum, when Krylov subspace approximations lose their spectral adaptivity. This phenomenon can only be observed for large-scale problems, thus the dimensions of our experiments are of order \(10^5\)–\(10^6\). Though the theory in this manuscript is limited to the resolvent with \(s\in {\mathbb {R}}_+\), we stress that the error bound, monotonicity and error correction via averaging seem to hold for imaginary s and other functions such as \(\phi (A)=\exp (-A t)\). We also explore possible accelerations stemming from adding random vectors to the initial block. All errors reported are measured in the 2-norm.

6.1 Transfer Functions of PDE Operators on Unbounded Domains

The computation of transfer functions of self-adjoint PDE operators on unbounded domains is the main target of the proposed approach. For an accurate PDE discretization of such domains, we adopted the optimal grid approach [29], which allows spectral convergence in the exterior part of the computational domain; see, e.g., [10] for a more detailed explanation of the optimal grid implementation. A trade-off of the spectral approximation of the exterior problem is that A's spectrum approximates the continuum spectrum of the exact problem well, and thus the Krylov subspace approximations lose their adaptability. The proposed averaging attempts to alleviate this problem by adapting to continuous spectra.

6.1.1 2D Diffusion

We discretize the operator

$$\begin{aligned} \sigma (\textbf{x})^{-\frac{1}{2}}\Delta \sigma (\textbf{x})^{-\frac{1}{2}} \end{aligned}$$

on an unbounded domain using second-order finite differences on a \(300\times 300\) grid. In the grid center where the variations of \(\sigma (\textbf{x})\) are supported, we use a grid size of \(\delta _x=1\). This formulation can be associated with both heat transfer and the scalar diffusion of electromagnetic fields. To approximate this, we consider a Dirichlet problem on a bounded domain with \(N_{opt}=10\) exponentially increasing grid steps in the exterior and a uniform grid in the interior part. Utilizing the optimal geometric factor of \(\exp {(\pi /\sqrt{10})}\) as indicated by [29], we achieve spectral accuracy in the exterior region of the domain, thereby effectively approximating the boundary condition at infinity (exponential decay). The resulting grid and function \(\sigma (\textbf{x})\) are shown in Fig. 1. Only the first two steps of the geometrically increasing grid are shown in the figure since they increase rapidly. In this case, B is a vector of discrete delta functions with entries at grid nodes corresponding to the transducer locations shown in the same figure as Trx.

The convergence for \(s=10^{-3}\) is shown in Fig. 2 alongside the convergence curve for the purely imaginary shift \(s=\imath \cdot 10^{-3}\) in Fig. 2a. For imaginary shifts s, block Lanczos converges more slowly. The monotonicity result from Theorem 1 and the error bound from Eq. (22) are only proven for \(s\in \mathbb {R}_+\). Despite this, the convergence for purely imaginary s is monotone in this example, the upper bound holds, and the averaged quadrature rules reduce the approximation error by an order of magnitude. For the chosen shifts both averaging formulas perform equally well.

In Fig. 2b we compare the introduced methods with the averaged anti-Gauss rules developed in [18]. An anti-Gauss rule can be obtained by multiplying the last off-diagonal block of \(T_{m+1}\), which we call \(\varvec{\beta }_{m+1}\), by the factor \(\sqrt{2}\). Let this matrix be called \(T_{m+1}^\mathcal{H}\). There is no guarantee that this matrix is positive definite, and in about 38% of the iterations for the 2D diffusion problem it is not. In this experiment the averaged block anti-Gauss rule

$$\begin{aligned} \widehat{ \mathcal F}^\mathcal{H}_m = \frac{1}{2}\left( \mathcal{F}_m + E_1^T (T_{m+1}^\mathcal{H} +sI)^{-1}E_1 \right) \end{aligned}$$
(26)

has the same error order as the block Gauss rule but is less stable due to the presence of negative eigenvalues.
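For reference, a sketch of the block anti-Gauss modification entering (26) (helper name ours; cf. [18]):

```python
import numpy as np

def anti_gauss_matrix(Tm1, p):
    """T_{m+1}^H for the averaged anti-Gauss rule (26): scale the last
    off-diagonal blocks beta_{m+1} of T_{m+1} by sqrt(2). The result is
    not guaranteed to be positive definite."""
    TH = Tm1.copy()
    TH[-p:, -2 * p:-p] *= np.sqrt(2.0)       # beta_{m+1}
    TH[-2 * p:-p, -p:] *= np.sqrt(2.0)       # beta_{m+1}^T
    return TH
```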

In Fig. 2c the convergence for the matrix exponential \(\phi (A)=\exp {(-At)}\) for \(t=10^{5}\) is shown. The convergence is superlinear, the error bound becomes tight with increased iterations and the averaged quadrature rule from formula (24) performs better for low iteration counts. Numerical results for accelerating convergence further through subspace enrichment by adding random starting vectors to B in \(\mathcal {K}_m(A,B)\) are presented in “Appendix B”.

Fig. 1: Grid, heat conductivity \(\sigma (\textbf{x})\) and transducer locations of the heat diffusion test case

Fig. 2: Convergence and quadrature rule averaging for the 2D heat diffusion problem

6.1.2 3D Electromagnetic Diffusion

In this example we solve the 3D Maxwell equations in the quasi-stationary (diffusive) approximation. Krylov subspace approximation of the action of matrix functions for the solution of this problem was introduced in [11] and remains the method of choice for many important geophysical applications. The diffusive Maxwell equations in \({\mathbb {R}}^{3}\) can be transformed to the form

$$\begin{aligned} (\nabla \times \nabla \times +\sigma (\textbf{x}) \mu _0 s)\mathcal {E}_{(r)}(\textbf{x},s) = - { s}\mathcal {J}^\textrm{ext}_{(r)}(\textbf{x},s), \quad \textbf{x}\in \Omega , \end{aligned}$$
(27)

where \(\sigma \) is the electrical conductivity, \(\mu _0\) the vacuum permeability, and \(\mathcal {J}^{\textrm{ext}}_{(r)}\) an external current density, indexed by the transmitter index (r), that generates the electric field strength \(\mathcal {E}_{(r)}(\textbf{x},s)\). For \(s\notin {\mathbb {R}}_-\) the solution of (27) converges to 0 at infinity, i.e.

$$\begin{aligned} \lim _{\Vert \textbf{x}\Vert \rightarrow \infty } \Vert \mathcal {E}_{(r)}(\textbf{x},s)\Vert =0. \end{aligned}$$
(28)

We approximate (27)–(28) on a conservative Yee grid similarly to [11]. Following the same framework as in the diffusion problem in \({\mathbb {R}}^2\) described earlier, we truncate the computational domain with Dirichlet boundary conditions for the tangential components of \(\mathcal {E}_{(r)}\) on the domain boundary and use an optimally geometrically progressing grid, which gives spectral convergence of the condition (28) at infinity. In our particular realization, we used \(N_{opt}=6\) grid steps geometrically increasing with the geometric scaling factor \(\exp {(\pi /\sqrt{6})}\) according to [29]. The grid has \(N_x=80\), \(N_y=100\) and \(N_z=120\) grid steps, leading to a matrix A of size \(N=2\, 821\, 100\). The configuration has a homogeneous conductivity of \(\sigma =1\) in the entire domain except for two inclusions shown in Fig. 3a where \(\sigma =0.01\). As current densities, we use 6 magnetic dipoles (approximated by electric loops), with 3 dipoles located above the inclusion and 3 dipoles located in the middle of the inclusions, as highlighted by the arrows in Fig. 3a. Such a configuration corresponds to the tri-axial borehole tools used in electromagnetic hydrocarbon exploration [41], and the matrix B thus has 6 columns.

The operator and its discretization are nonnegative definite; however, the columns of B, corresponding to divergence-free magnetic sources, are intrinsically orthogonal to the operator's null space, which consists of potential solutions. Thus, the problem satisfies the conditions of Remark 3.

In Fig. 3b the error after \(m=400\) block iterations is shown for various real values of s. The error bound is tight across all values of s and there is no difference between the averaging formulas. Further, in Fig. 3c the convergence for this test case is shown for a real shift of \(s=0.05\) and in Fig. 3d for an imaginary shift.

Fig. 3: Convergence of the block Lanczos method and averaging for the 3D electromagnetic case

6.2 Graph Laplacian

Efficient computation of SISO and MIMO transfer functions of large graph Laplacians is important for efficient clustering of big data sets, computation of centrality measures and other applications [4, 15, 36]. For our numerical experiments we take the web-BerkStan graph from SNAP (Stanford Network Analysis Project) [33] and turn it into an undirected graph by symmetrization. The original graph has 685,230 nodes and 7,600,595 edges, and we construct the normalized graph Laplacian A as our test matrix. The nodes of the graph represent pages from the berkeley/stanford.edu domains, and directed edges (here converted to undirected edges) represent hyperlinks between them. The data were collected in 2002.

As the matrix B we take \(p=3\) (graph) delta functions chosen at random nodes.Footnote 3 The graph Laplacian is nonnegative definite, and its null space consists of the constant vectors on every connected graph component. The columns of B are graph delta functions in our experiments and thus not orthogonal to the null space; however, excluding the case of connected components of dimension 1 (single disconnected nodes), the intersection of the column space of B with the null space of A is trivial, so the condition of Remark 3 is satisfied.

In Fig. 4a we evaluate the resolvent \(\phi (A,s)=(A+s I)^{-1}\) for \(s=10^{-4}\) showing that the error bound is tight and the averaging formulas give similar results. After \(m=50\) block iterations the error for various values of s is displayed in Fig. 4.

Fig. 4: Convergence, error bound and averaged quadrature rules for the normalized graph Laplacian test case

In contrast to PDE operators on unbounded domains, data-related graphs may exhibit spectral distributions with non-vanishing gaps even at very large dimensions. Consequently, they cannot be regarded as approximations of the problems discussed in Proposition 5. Nevertheless, as illustrated in Fig. 4a, quadrature approximations for such graphs can demonstrate linear convergence, in which case they satisfy tight block Gauss–Radau bounds and their averaging will be efficient.

7 Conclusions

In this paper, we derived a block extension of Gauss–Radau quadrature for the computation of MIMO transfer functions by leveraging the connection between the block Lanczos algorithm and matrix Stieltjes continued fractions. We proved that the block Gauss and Gauss–Radau quadrature rules provide monotonically convergent two-sided bounds for the resolvent, which is important for MIMO transfer functions of LTI dynamical systems with real Laplace frequencies. For problems involving the approximation of operators with a continuous spectrum, we also proved (in the SISO case) that these bounds are tight, thereby justifying the averaging of the Gauss and Gauss–Radau approximations.

Our numerical experiments with large-scale graph Laplacian, 2D diffusion, and 3D Maxwell systems confirmed the tightness of the derived bounds and the efficiency of the averaging formulas, not only for the cases analyzed but also for transfer functions of LTI systems with imaginary Laplace frequencies and time domain responses. Additionally, we explored the potential of using random enrichment of the initial Lanczos block within the developed algorithmic framework. These findings highlight the robustness and broad applicability of our approach targeted at large-scale problems.