An Inverse Potential Problem for Subdiffusion: Stability and Reconstruction

In this work, we study the inverse problem of recovering a potential coefficient in the subdiffusion model, which involves a Djrbashian-Caputo derivative of order $\alpha\in(0,1)$ in time, from the terminal data. We prove that the inverse problem is locally Lipschitz stable for small terminal time, under certain conditions on the initial data. This result extends that of Choulli and Yamamoto (1997) for the standard parabolic case to the fractional case. The analysis relies on refined properties of two-parameter Mittag-Leffler functions, e.g., complete monotonicity and asymptotics. Further, we develop an efficient and easy-to-implement algorithm for numerically recovering the coefficient, based on (preconditioned) fixed point iteration and Anderson acceleration. The efficiency and accuracy of the algorithm are illustrated with several numerical examples.


Introduction
Let $\Omega\subset\mathbb{R}^d$ ($d=1,2,3$) be a smooth open bounded domain with a boundary $\partial\Omega$. Consider the following initial boundary value problem for subdiffusion:
$$\partial_t^\alpha u - \Delta u + qu = 0 \ \text{ in } \Omega\times(0,T], \qquad u(\cdot,0)=u_0 \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega\times(0,T], \tag{1.1}$$
where $T>0$ is the final time and $u_0$ is the initial data. The notation $\partial_t^\alpha u$ denotes the Djrbashian-Caputo derivative of order $\alpha\in(0,1)$ (in time), defined by [22, p. 91]
$$\partial_t^\alpha u(t) = \frac{1}{\Gamma(1-\alpha)}\int_0^t (t-s)^{-\alpha}u'(s)\,\mathrm{d}s,$$
where $\Gamma(z)=\int_0^\infty s^{z-1}e^{-s}\,\mathrm{d}s$, for $z>0$, denotes Euler's Gamma function. For smooth functions $u$, the fractional derivative $\partial_t^\alpha u$ recovers the usual first-order derivative $u'(t)$ as $\alpha\to1^-$. The function $q$ refers to the radiativity or reaction coefficient or potential in the standard parabolic case, depending on the specific application. Throughout, we denote by $u(q)$ the solution of problem (1.1) corresponding to a given potential $q\in L^2(\Omega)$.
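As a quick numerical sanity check on this definition (an illustration of ours, not part of the analysis): for $u(t)=t^2$ one has $\partial_t^\alpha u(t)=2t^{2-\alpha}/\Gamma(3-\alpha)$, which tends to $u'(t)=2t$ as $\alpha\to1^-$. The snippet below evaluates the defining integral by the midpoint rule and compares it with this closed form.

```python
import math

def caputo_t_squared(t, alpha, n=100000):
    """Caputo derivative of u(t) = t^2 via the defining integral
    (1/Gamma(1-alpha)) * int_0^t (t-s)^(-alpha) u'(s) ds, with u'(s) = 2s.
    The midpoint rule copes with the weak singularity of the kernel at s = t."""
    h = t / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += (t - s) ** (-alpha) * 2.0 * s
    return total * h / math.gamma(1.0 - alpha)

t, alpha = 1.5, 0.5
exact = 2.0 * t ** (2.0 - alpha) / math.gamma(3.0 - alpha)
approx = caputo_t_squared(t, alpha)
print(exact, approx)  # the two values agree to a few digits
```

Evaluating the closed form at $\alpha$ near 1 confirms the recovery of $u'(t)=2t$.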
The model (1.1) is a direct extension of the standard subdiffusion model, which has a trivial potential $q$ (i.e., $q\equiv0$), and can faithfully describe anomalously slow diffusion processes. At a microscopic level, standard subdiffusion can be described by a continuous time random walk in which the waiting time distribution between consecutive jumps is heavy tailed with a divergent mean, in a manner similar to Brownian motion for normal diffusion, and the governing equation for the probability density function of a particle appearing at time instant $t$ and spatial location $x$ is of the form (1.1) with $q\equiv0$. Subdiffusion has been observed in several applications in engineering, physics and biology, e.g., thermal diffusion in fractal domains [31] and dispersive ion transport in column experiments [11]; see the review [29] for physical motivation and an extensive list of physical applications, and the works [12,41] for the derivation of reaction-subdiffusion models within the framework of continuous time random walks.
Remark 1.1. The regularity condition $u_0\in D(A^{1+\gamma})$ is to ensure the well-posedness of the direct problem with $q\in L^2(\Omega)$. The condition (1.4) is to ensure pointwise lower and upper bounds on the solution $u(0)(T)$, and the set of $u_0$ satisfying (1.4) is a convex subset of $D(A^{1+\gamma})$. The condition $\lambda_1T^\alpha<\theta$ dictates that either $T$ or $\lambda_1$ should be sufficiently small, the latter of which holds if the domain $\Omega$ is large, since $\lambda_1$ tends to zero as the volume of $\Omega$ tends to infinity [8].
We also develop a simple algorithm to numerically recover the potential $q$. It is based on the preconditioned fixed point iteration given in (4.1), and employs Anderson acceleration [2] to speed up the convergence. It extends an existing scheme proposed in [34] for the standard parabolic problem to subdiffusion, enhanced by the preconditioner $A^{-1}$ for better numerical stability and by Anderson acceleration for faster convergence. The algorithm is straightforward to implement, since it involves solving one direct problem at each iteration, and is generally applicable (no sign restriction, no condition on the initial data); when equipped with the discrepancy principle [9,14], it is also accurate for both subdiffusion and normal diffusion. We provide several numerical experiments to confirm the efficiency and accuracy of the algorithm, and to illustrate the behavior of the inverse problem. The stability result in Theorem 1.1 and the reconstruction algorithm represent the main contributions of this work.

Now we discuss several existing works. Inverse problems for subdiffusion are relatively recent, initiated by the pioneering work [4] on recovering the diffusion coefficient from lateral Cauchy data (in the one-dimensional case) using Sturm-Liouville theory; see the work [19] for an overview. The inverse potential problem for the model (1.1) has also been analyzed in several works [18,30,42,20,21]. Miller and Yamamoto [30] proved the unique recovery from data on a space-time subdomain, using an integral transformation. Zhang and Zhou [42] discussed the unique recovery using a fixed point argument [13], and derived error estimates in the presence of data noise.
Kaltenbacher and Rundell [20] established the well-posedness of the direct problem and also proved the invertibility of the linearized map from $L^2(\Omega)$ to $H^2(\Omega)$ under the conditions $u_0>0$ in $\Omega$ and $q\in L^\infty(\Omega)$ using a Paley-Wiener type result, where the condition $q\in L^\infty(\Omega)$ plays a central role in the proof, which invokes a type of strong maximum principle. Further, they developed (frozen) Newton and Halley type iterative schemes for numerically recovering the coefficient $q$ from the terminal data, and proved their convergence. Kian and Yamamoto [21] derived a stability result for recovering a space-time dependent potential coefficient from lateral Cauchy data. It is worth noting that the parabolic counterpart of the inverse problem (1.2) has been extensively studied [33,13,5,6,23]. Isakov [13] proved existence and uniqueness for the inverse problem using the strong maximum principle, and proposed a constructive algorithm based on fixed point iteration. Choulli and Yamamoto [5] proved a generic well-posedness result in a Hölder space, by introducing a scalar parameter in the leading elliptic term $\Delta u$. Later, they [6] analyzed the inverse problem in a Hilbert space setting. Theorem 1.1 represents an extension of the result in [6] to the subdiffusion case. Note that due to the drastic difference in the solution operators, i.e., the fractional case involves Mittag-Leffler functions, the extension is nontrivial. We refer interested readers to [36,26,25] and the references therein for related inverse source problems, which are often employed to analyze the generic well-posedness of the inverse potential problem.
The rest of the paper is organized as follows. In Section 2, we discuss the well-posedness of the direct problem, and prove that for every $q\in L^2(\Omega)$ there exists a unique classical solution for suitably smooth initial data $u_0$. Then in Section 3, we give the proof of Theorem 1.1. Next, we develop the fixed point algorithm and present its preliminary properties in Section 4. Last, we provide several numerical experiments to illustrate the feasibility of the reconstruction algorithm. Throughout, $(\cdot,\cdot)$ denotes the $L^2(\Omega)$ inner product, and $H^s(\Omega)$ denotes the usual Sobolev space [1]. The notation $c$ denotes a generic constant which may change at each occurrence, but it is always independent of the coefficient $q$.
2 Well-posedness of the Cauchy problem

First we study the well-posedness of the following abstract Cauchy problem:
$$\partial_t^\alpha u + Au + qu = 0, \quad 0<t\le T, \qquad u(0)=u_0, \tag{2.1}$$
with $A=-\Delta$ equipped with a zero Dirichlet boundary condition. It is a reformulation of the direct problem (1.1) in operator form. We prove that for suitably smooth $u_0$ and any $q\in L^2(\Omega)$, problem (2.1) has a unique classical solution $u=u(q)\in C^\alpha([0,T];L^2(\Omega))\cap C([0,T];D(A))$. The analysis is based on a "perturbation" argument, developed recently in [17] for the numerical analysis of nonlinear subdiffusion problems, in which the Banach fixed point theorem and the argument of an equivalent norm family play an important role (see, e.g., [7, Chapter 3]); see also [20] for a well-posedness result under slightly different assumptions on the potential $q$. Specifically, let $\{(\lambda_j,\varphi_j)\}_{j=1}^\infty$ be the eigenpairs of the operator $A$, with the eigenvalues $\lambda_j$ ordered nondecreasingly and multiplicity counted; the eigenfunctions $\{\varphi_j\}_{j=1}^\infty$ form an orthonormal basis of $L^2(\Omega)$. For any $\gamma>0$, the notation $D(A^\gamma)$ denotes the domain of the fractional power $A^\gamma$, with the graph norm $\|v\|_{D(A^\gamma)} = (\sum_{j=1}^\infty \lambda_j^{2\gamma}(v,\varphi_j)^2)^{1/2}$. By viewing $qu(t)$ as the inhomogeneous term and applying Duhamel's principle, we deduce that the solution $u(t)$ satisfies
$$u(t) = U(t) - \int_0^t E(t-s)\,qu(s)\,\mathrm{d}s,$$
where $U(t)=F(t)u_0$, and the solution operators $F(t)$ and $E(t)$ are defined by [17]
$$F(t) = E_{\alpha,1}(-t^\alpha A)\qquad\text{and}\qquad E(t) = t^{\alpha-1}E_{\alpha,\alpha}(-t^\alpha A).$$
Here $E_{\alpha,\beta}(z)$ refers to the two-parameter Mittag-Leffler function, defined by [22]
$$E_{\alpha,\beta}(z) = \sum_{k=0}^\infty \frac{z^k}{\Gamma(k\alpha+\beta)}, \quad z\in\mathbb{C}. \tag{2.2}$$
The next lemma collects smoothing properties of the operators $F$ and $E$. The notation $\|\cdot\|$ denotes the operator norm on $L^2(\Omega)$.
Lemma 2.1. For the operators $F(t)$ and $E(t)$, the following estimates hold for $t>0$: $\|F(t)\|\le 1$, $\|E(t)\|\le t^{\alpha-1}/\Gamma(\alpha)$ and $\|AE(t)\|\le c_{\alpha,\theta}t^{-1}$, where the constant $c_{\alpha,\theta}>0$ depends on $\alpha$ and $\theta$.
Proof. The first estimate follows from the complete monotonicity of $E_{\alpha,1}(-t)$ [32]. The second follows similarly, since $E_{\alpha,\alpha}(-t)$ is also completely monotone, and the last is known from [17, Lemma 3.4].
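Since the two-parameter Mittag-Leffler function pervades the analysis, a direct summation of the series (2.2) is a handy numerical companion for moderate arguments (for large negative arguments one should use a dedicated algorithm such as that of [37]). The following sketch is ours and checks the classical special cases $E_{1,1}(z)=e^z$ and $E_{1,2}(z)=(e^z-1)/z$.

```python
import math

def mittag_leffler(z, alpha, beta, kmax=200, tol=1e-15):
    """Two-parameter Mittag-Leffler function E_{alpha,beta}(z), summed directly
    from the series (2.2); adequate for moderate |z|, whereas large negative
    arguments suffer catastrophic cancellation and need dedicated algorithms."""
    total = 0.0
    for k in range(kmax):
        term = z ** k / math.gamma(k * alpha + beta)
        total += term
        if abs(term) < tol * max(1.0, abs(total)):
            break
    return total

# Special cases: E_{1,1}(z) = exp(z) and E_{1,2}(z) = (exp(z) - 1)/z.
print(mittag_leffler(1.0, 1.0, 1.0))   # approx e
print(mittag_leffler(1.0, 1.0, 2.0))   # approx e - 1
```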
Now we can specify the functional analytic setting. Let $0<\beta<(1-\gamma)\alpha$ be fixed, and let $X$ be the corresponding solution space. Then for every $q\in L^2(\Omega)$, we define an associated operator $L(q)$ by
$$(L(q)f)(t) = -\int_0^t E(t-s)\,qf(s)\,\mathrm{d}s.$$
The next result gives the mapping property of the operator $L(q)$.
Since $f\in C^\beta([0,T];D(A^\gamma))$, by the Sobolev embedding theorem [1], $g\in C^\beta([0,T];L^2(\Omega))$, and thus by [35, Lemma 3.4], $v_1\in C^\beta([0,T];D(A))\subset X$. Next, for $t\in[0,T]$ and $\tau>0$ such that $t+\tau\le T$, the smoothing property of $E$ in Lemma 2.1 yields the desired Hölder bound on $v_2$. It remains to show $Av_2\in C([0,T];L^2(\Omega))$. This follows from the identity $\frac{\mathrm{d}}{\mathrm{d}t}(I-F(t))=AE(t)$ [17]. Since $F(t)-I$ is continuous on $L^2(\Omega)$, the desired assertion follows. This completes the proof of the lemma.

Proof. The proof proceeds by the argument of an equivalent norm family (see, e.g., [7, Chapter 3, Section 3.8]). Specifically, we equip the space $X$ with a family of norms $\|\cdot\|_\lambda$, $\lambda\ge0$, each equivalent to the norm on $X$, and then prove the invertibility by choosing $\lambda$ suitably. For $f\in X$, let $v=L(q)f$. By Sobolev embedding [1] and Lemma 2.1, we first bound $v$, where the last inequality follows from the change of variables $\zeta=\lambda s$; a similar argument bounds $Av$. Meanwhile, for $t\in[0,T)$ and $\tau>0$ with $t+\tau\le T$, the inequality (2.3) and the choice $\beta<(1-\gamma)\alpha$ give the bound on the Hölder seminorm, and in the same way we bound the remaining difference quotient; combining the preceding two estimates controls the Hölder seminorm of $v$. The preceding argument and Lemma 2.1 then bound the remaining terms. Combining the preceding estimates implies that $\|L(q)f\|_\lambda$ tends to zero, uniformly in $f$ with $\|f\|_\lambda\le1$, as $\lambda$ tends to infinity, and thus the operator norm $\|L(q)\|_\lambda<1$ if $\lambda$ is large enough, which shows the lemma. Now we can state the unique solvability of the Cauchy problem (2.1).
Lemma 3.1. The following properties hold for the function $U(t)$.
Part (i) is already proved in [6]. Part (ii) follows from the maximum principle for the subdiffusion model (see, e.g., [27, Theorem 1.1] or [28]). We only prove (iii), which concerns an auxiliary function $w$. By assumption, $\Delta u_0\le0$ in $\Omega$, and thus the claim follows from the maximum principle for subdiffusion [27]. Let $\omega$ be defined as in Theorem 1.1. Lemma 3.1(ii) implies that the reciprocal of $U(T)$, extended by zero outside $\omega$, belongs to $L^\infty(\Omega)$. Now we define the operator $P_T$ on $L^2(\omega)$; this operator arises in the linearization of the forward map.
The next result gives an upper bound on the constant $c_\alpha$ defined in (1.3). In particular, it indicates that $c_\alpha<\alpha<1$, which is crucial for proving Theorem 1.1.
Combining the preceding estimates leads directly to (3.2). By the complete monotonicity of $E_{\alpha,1}(-t)$, the first term $\frac{\alpha\pi}{\sin(\alpha\pi)}E_{\alpha,1}(-t^\alpha)$ in the bracket is monotonically decreasing, whereas the second term $1-E_{\alpha,1}(-t^\alpha)$ is monotonically increasing. Thus, one simple upper bound is obtained by equating these two terms. Upon substituting the resulting value back into (3.2) and noting the complete monotonicity of $E_{\alpha,1}(-t)$, we deduce the claimed bound. This completes the proof of the proposition.

The function $f(\alpha)=\frac{\alpha\pi}{\alpha\pi+\sin\alpha\pi}$ is strictly increasing in $\alpha$ over the interval $(0,1)$. Thus, the factor is strictly less than 1 for any $\alpha\in(0,1)$. Note also that in the limiting case $\alpha=1$, the constant $c_1=\sup_{t\ge0}te^{-t}=e^{-1}$, which is much sharper than the preceding bound. Since the function $E_{\alpha,\alpha}(-t)$ is continuous in $\alpha$, one may refine the bound on $c_\alpha$ slightly for $\alpha$ close to unity. Further, it is worth noting that the integral representation (3.1) for $w(t)$ can also be deduced from a Christoffel-Darboux type formula for Mittag-Leffler functions, valid for any complex numbers $y\neq z$; a limiting argument as $y\to z$ then gives, upon simplification, directly the formula (3.1) for $w(t)$.
Remark 3.2. Proposition 3.1 provides an upper bound on the constant $c_\alpha$. In Fig. 1(a), we plot the function $\alpha^{-1}tE_{\alpha,\alpha}(-t)$ versus $t$ for several different fractional orders, where the Mittag-Leffler function $E_{\alpha,\alpha}(-t)$ is computed using an algorithm developed in [37]. Clearly, for any fixed $\alpha$, the function $tE_{\alpha,\alpha}(-t)$ first increases with $t$ and then decreases, and there is only one global maximum. The maximum is always achieved at some $t^*$ between 0.8 and 1, a fact that remains to be established, and the maximum value decreases with $\alpha$. The optimal constant $c_\alpha/\alpha$ versus the upper bound $\frac{\alpha\pi}{\sin\alpha\pi+\alpha\pi}$ is shown in Fig. 1(b). Note that $c_\alpha/\alpha$ is strictly increasing with respect to $\alpha$, and the upper bound in Proposition 3.1 is about three times larger than the optimal one $c_\alpha/\alpha$. This is attributed to the fact that the derivation employs upper bounds of the Mittag-Leffler function $E_{\alpha,1}(-t)$ that are valid on the entire real line, instead of sharper ones on a finite interval, e.g., $[0,1]$. The fact that the ratio $c_\alpha/\alpha$ increases with $\alpha$ implies that the smaller the fractional order $\alpha$, the larger the degree of freedom for choosing the parameter as well as $\mu_1/\mu_0$ in Theorem 1.1, which partly indicates the potential beneficial effect of subdiffusion on the inverse potential problem.
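The quantities in this remark are easy to reproduce numerically. The sketch below (ours; grid and truncation parameters are illustrative choices) approximates $c_\alpha=\sup_{t\ge0}tE_{\alpha,\alpha}(-t)$ by direct series summation on a grid, which is adequate here since the maximizer sits near $t=1$; for $\alpha=1$ it recovers $c_1=e^{-1}$ with maximizer $t^*=1$.

```python
import math

def ml_series(z, alpha, beta, kmax=200):
    """Direct series for E_{alpha,beta}(z); fine for the moderate |z| used here."""
    s = 0.0
    for k in range(kmax):
        term = z ** k / math.gamma(k * alpha + beta)
        s += term
        if abs(term) < 1e-16:
            break
    return s

def c_alpha(alpha, tmax=3.0, n=1500):
    """Approximate c_alpha = sup_{t>=0} t*E_{alpha,alpha}(-t) on a grid;
    returns the grid maximizer t* and the maximum value."""
    best_t, best_v = 0.0, 0.0
    for i in range(1, n + 1):
        t = tmax * i / n
        v = t * ml_series(-t, alpha, alpha)
        if v > best_v:
            best_t, best_v = t, v
    return best_t, best_v

for a in (0.5, 0.75, 1.0):
    t_star, c = c_alpha(a)
    print(a, t_star, c)   # for alpha = 1: t* = 1 and c_1 = 1/e
```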
The next result gives the invertibility of the operator $I-P_T$ on $L^2(\omega)$.

Proof. First, we bound $\|tAE(t)\|$. Using the eigenpairs $\{(\lambda_j,\varphi_j)\}_{j=1}^\infty$ of the operator $A$, we deduce
$$\|tAE(t)\| \le \sup_j |\lambda_jt^\alpha E_{\alpha,\alpha}(-\lambda_jt^\alpha)|.$$
Since $\sup_{t\in[0,\infty)}|tE_{\alpha,\alpha}(-t)|\le c_\alpha<\alpha$, in view of Proposition 3.1, we obtain $\|tAE(t)\|\le c_\alpha$. Meanwhile, using the governing equation for $U(t)$, together with the fact $\Delta U(x,t)\le0$, we obtain the corresponding pointwise bounds, and similarly for the remaining term. Consequently, the desired estimate holds.
The preceding two estimates and Lemma 2.1 imply the desired bound on $P_T$. Straightforward computation shows $m(0)=1$, and by Proposition 3.1, $m$ satisfies the required bound under the given conditions on $\mu_0$ and $\mu_1$ in Theorem 1.1. Thus, there exists a $\theta>0$ such that $m(x)<1$ whenever $x<\theta$, and accordingly, for $\lambda_1T^\alpha$ sufficiently close to zero, $P_T$ is a contraction on $L^2(\omega)$. Then by a Neumann series expansion, $I-P_T$ is invertible and $(I-P_T)^{-1}$ is bounded. This completes the proof of the lemma.
Now we introduce the trace operator $\mathrm{tr}: X\to D(A)$, $v\mapsto v(T)$. Then $\mathrm{tr}\in B(X,D(A))$ and $\|\mathrm{tr}\|_{B(X,D(A))}\le1$. Finally, we can present the proof of Theorem 1.1.
Proof. With Lemma 3.2 at hand, the proof is identical to that of [6]. We include it only for the convenience of the reader. We define the mapping $K: L^2(\omega)\to L^2(\omega)$, $q\mapsto[-Au(q)(T)]|_\omega = [-A\,\mathrm{tr}(I-L(q))^{-1}U]|_\omega$. Clearly, $K$ is continuously Fréchet differentiable, cf. Lemma 2.3, and its derivative $K'$ at $q\in L^2(\omega)$ in the direction $p$ is given by the linearized expression. We define a multiplication operator $M: L^2(\omega)\to L^2(\omega)$, $p\mapsto U(T)p$. Then $M$ is invertible, and its inverse is exactly the multiplication operator by $u_T$. Consequently, $Q_TM^{-1}=P_T-I$. By Lemma 3.2, $(P_T-I)^{-1}$ belongs to $B(L^2(\omega))$. Therefore, $Q_T$ has a bounded inverse and $Q_T^{-1}=M^{-1}(P_T-I)^{-1}$. By the implicit function theorem, $K$ is locally a $C^1$-diffeomorphism from a neighborhood of $0$ onto a neighborhood of $K(0)$. In particular, $K^{-1}$ is Lipschitz continuous in a neighborhood of $K(0)$. Then Theorem 1.1 follows by noting the following inequality for any $q_1,q_2\in L^2(\omega)$.

Fixed point algorithm
Now we propose a simple fixed point algorithm to find the potential $q$ from the terminal observation. Given a noisy version $g$ of the exact data $u(q^\dagger)(T)$ corresponding to the exact potential $q^\dagger$, and an initial guess $q_0$, we employ the fixed point iteration (4.1), where $\lambda>0$ is a relaxation parameter and $A=-\Delta$ is the negative Laplacian with a zero Dirichlet boundary condition. In the absence of the preconditioning operator $A^{-1}$, the iteration (4.1) was proposed in [34] for the standard parabolic problem. For both normal diffusion and subdiffusion, the unpreconditioned version works very robustly for exact data, but it tends to suffer from severe numerical instability in the presence of data noise. This is attributed to the fact that the noise in the data $g$ is amplified by a factor $\lambda$ at each iteration, in view of the smoothing property of the solution operator, and the noise effect accumulates so rapidly as to completely spoil the reconstruction after a few iterations. The preconditioner $A^{-1}$ mitigates the deleterious effect of noise in the observation $g$ by implicitly filtering out the high-frequency components present in the noise, thereby achieving a form of regularization [9]. Numerically, the scheme is straightforward to implement since it requires only one forward solve per iteration, and the preconditioning step incurs very little extra computational effort.
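To make the scheme concrete, the following self-contained sketch implements our reading of the update (4.1), namely $q_{k+1}=q_k+\lambda A^{-1}(u(q_k)(T)-g)$, in the parabolic limit $\alpha=1$, where a plain backward Euler solver suffices (an L1-type subdiffusion time stepper would slot in the same way). The discretization sizes, the value of $\lambda$, and the use of exact data are illustrative choices of ours, not those of the experiments below.

```python
import numpy as np

# 1D model problem on (0,1): u_t - u_xx + q u = 0 with zero Dirichlet data,
# i.e., the parabolic limit alpha = 1 of (1.1), discretized by finite
# differences in space and backward Euler in time.
M, N, T = 64, 64, 0.1
h, dt = 1.0 / M, T / N
x = np.linspace(h, 1.0 - h, M - 1)                    # interior nodes
A = (2.0 * np.eye(M - 1) - np.eye(M - 1, k=1) - np.eye(M - 1, k=-1)) / h ** 2

def forward(q):
    """Return the terminal value u(q)(T) of the discretized direct problem."""
    u = np.sin(np.pi * x) + 0.01 * x * (1.0 - x)      # u0 as in Example 5.1
    B = np.eye(M - 1) + dt * (A + np.diag(q))
    for _ in range(N):
        u = np.linalg.solve(B, u)
    return u

q_true = np.exp(x) * np.sin(2.0 * np.pi * x)          # potential of Example 5.1
g = forward(q_true)                                   # exact terminal data

lam = 200.0                                           # relaxation parameter
q = np.zeros(M - 1)                                   # zero initial guess
res0 = np.linalg.norm(forward(q) - g)
for _ in range(100):
    r = forward(q) - g                                # terminal residual
    q = q + lam * np.linalg.solve(A, r)               # preconditioned update
res = np.linalg.norm(forward(q) - g)
print(res0, res, np.linalg.norm(q - q_true))
```

The preconditioning step is the dense solve with $A$; replacing it by the identity reproduces the unpreconditioned scheme of [34], which is far less stable under data noise.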
Proposition 4.1. For any nonnegative $u_0\not\equiv0$ and any $q\in S$, the linearized map is contractive on $S$ in the following sense, provided that the relaxation parameter $\lambda$ is sufficiently small.
Proof. For any $q,h\in S$, the Gâteaux derivative is given in terms of $v\equiv v(q,h)$, which satisfies the following inhomogeneous problem. By the "strong" maximum principle for the subdiffusion model [28] and the nonnegativity of $u_0$ and $q$, we deduce the stated positivity. Since $h\in S$, the maximum principle [28] shows the corresponding bound. Now if $h\not\equiv0$, then for any fixed $t>0$, $\operatorname{supp}(f(t))=\operatorname{supp}(h)$, and by the positivity of $\varphi_1$ in $\Omega$, $(f(t),\varphi_1)>0$ for any $t>0$. Thus, $0\le v(T)\not\equiv0$, and by the properties of elliptic problems, $A^{-1}v(T)>0$ in $\Omega$. Thus, by choosing $\lambda$ sufficiently small (depending on $h$), we deduce the desired assertion.
The next result shows that the fixed point iteration (4.1) can actually also be interpreted as a preconditioned gradient descent method, under certain restrictions on $u_0$ and the residual $u(q)(T)-g$. The descent property can be numerically observed in a more general setting, which, however, remains to be proved. Proposition 4.2. If $u_0\not\equiv0$ is nonnegative and $u(q)(T)-g\not\equiv0$ does not change sign, then $A^{-1}(u(q)(T)-g)$ is a descent direction for the functional $J(q)=\frac12\|u(q)(T)-g\|_{L^2(\Omega)}^2$.
Proof. Let $w\equiv w(q)$ solve the adjoint problem, where the notation ${}^R_t\partial_T^\alpha w$ denotes the right-sided Riemann-Liouville fractional derivative (based at $T$), and ${}^R_t\partial_T^{\alpha-1}w(T)$ the Riemann-Liouville fractional integral of order $1-\alpha$. Further, using the solution operator $E_q$ associated with the elliptic operator $\Delta-q$, $w$ can be represented via a Duhamel-type formula, where $\delta_T(s)$ denotes the Dirac delta function at $T$. Then direct computation shows that the gradient $J'(q)$ of the functional $J(q)$ is given by the corresponding adjoint expression.
Algorithm 1 Anderson acceleration for the fixed point iteration (4.1).
1: Give the initial guess q_0, memory parameter m ≥ 1, and the maximum iteration number K. Set m_k = min(m, k).
7: Check the stopping criterion.
8: end for
Numerical experiments indicate that the convergence behavior of the fixed point iteration (4.1) depends very much on the relaxation parameter $\lambda>0$: if $\lambda$ is small, it converges steadily but only slowly, whereas for large $\lambda$, the convergence may be unstable and suffer from large oscillations. In order to accelerate the convergence, we employ the classical Anderson acceleration technique [2], which can be viewed as a version of GMRES for nonlinear problems [40]; see the review [3] for other related extrapolation techniques. The complete procedure for Anderson acceleration is listed in Algorithm 1. The integer $m$ controls the number of memory terms used for the Anderson update. Thus, the acceleration step only involves simple algebraic manipulations, and the associated computational overhead is negligible. In our experiments below, $m=2$ represents a good choice. At line 7 of the algorithm, the stopping criterion of the iteration can employ the standard discrepancy principle, i.e., stop at the first index $k$ such that $\|u(q_k)(T)-g\|_{L^2(\Omega)}\le\tau\delta$, where $\tau>1$ is the tolerance and $\delta=\|g-u(q^\dagger)(T)\|_{L^2(\Omega)}$ is the noise level. The discrepancy principle is a well established early stopping strategy for iterative regularization methods [9]. The fixed point algorithm and its accelerated variant exhibit a very similar behavior in practice when noise is present in the data; see Section 5 for numerical illustrations. Despite the enormous empirical success, the global convergence of Anderson acceleration remains completely open, even for affine linear maps with fixed memory (the case of a linear map with full memory is well understood due to its connection with GMRES [40]). The local convergence of Anderson acceleration for contractive maps was studied recently in [39,10]. However, these results do not apply to the inverse potential problem, since the associated map is not a contraction.
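For reference, a generic, minimal Anderson acceleration routine for an abstract fixed point map $G$ is sketched below. It follows the standard formulation (combine the last $m_k+1$ iterates through a small least-squares problem, solved here with numpy.linalg.lstsq, which is SVD based); the routine and the toy contractive test map are ours, not the exact implementation used in the experiments.

```python
import numpy as np

def anderson(G, x0, m=2, maxit=50, tol=1e-10):
    """Anderson acceleration for the fixed point iteration x = G(x).
    Keeps the last m+1 iterates and residuals, and combines them through a
    small least-squares problem solved by numpy.linalg.lstsq (SVD-based)."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    X, F = [], []                        # histories of G(x_k) and residuals
    for _ in range(maxit):
        gx = G(x)
        f = gx - x
        X.append(gx)
        F.append(f)
        if len(F) > m + 1:               # keep at most m + 1 memory terms
            X.pop(0)
            F.pop(0)
        if np.linalg.norm(f) < tol:
            return x
        if len(F) == 1:
            x = gx                       # plain fixed point step
        else:
            # Minimize ||sum_j theta_j F_j|| subject to sum_j theta_j = 1,
            # eliminating the constraint by working with differences.
            dF = np.stack([F[j] - F[-1] for j in range(len(F) - 1)], axis=1)
            gamma, *_ = np.linalg.lstsq(dF, F[-1], rcond=None)
            theta = np.append(-gamma, 1.0 + gamma.sum())
            x = sum(t * xi for t, xi in zip(theta, X))
    return x

# Toy contractive map: each component of the fixed point solves x = 0.5*cos(x).
x_star = anderson(lambda x: 0.5 * np.cos(x), np.zeros(3))
print(x_star)
```

The least-squares substep is where the SVD-based regularization mentioned below for the ill-conditioned normal-diffusion case would enter.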
Remark 4.1. In the fixed point iteration (4.1), the update does not change the boundary condition of the initial guess q 0 . Thus, it is implicitly assumed that the boundary condition is exactly known. Further, for g ∈ L 2 (Ω), by the standard elliptic regularity result, the update increment A −1 (u(q)(T )−g) belongs to H 2 (Ω), and thus the regularity of the initial guess q 0 essentially determines the regularity of the iterates, and the algorithm is most suitable for recovering a smooth potential.

Remark 4.2. There are alternative choices of fixed point algorithms. One popular choice is due to Isakov [13]: given the initial guess $q_0$, it reads $q_{k+1}=g^{-1}\big(\Delta g-\partial_t^\alpha u(q_k)(T)\big)$, obtained by solving the governing equation at the terminal time for $q$. The convergence of this algorithm in the time-fractional case has been analyzed in [42], provided that the terminal time $T$ is sufficiently large. Anderson acceleration might also be used to accelerate this algorithm.

Numerical reconstructions and discussions
Now we illustrate the accuracy and efficiency of the fixed point algorithm (4.1) with one- and two-dimensional numerical examples. The direct problem is solved by a fully discrete scheme based on the Galerkin finite element method in space and backward Euler convolution quadrature in time, which is first-order accurate in time and second-order accurate in space [15]; see [16] for an overview of existing schemes. The noisy data $g$ is generated by adding pointwise Gaussian noise to the exact data, where the noise $\xi(x)$ follows the standard Gaussian distribution and $\epsilon\ge0$ denotes the (relative) noise level. The exact data $u(q^\dagger)(x,T)$ is generated using a finer space-time mesh in order to avoid the inverse crime. In Anderson acceleration, the memory parameter $m$ is fixed at 2, and the relaxation parameter $\lambda$ is fixed at 1000 and 100 for the one- and two-dimensional problems, respectively. Note that this choice of $\lambda$ is not optimized, since the optimal choice depends strongly on the problem data, e.g., $T$ and $u_0$. Nonetheless, the numerical experiments below indicate that Anderson acceleration is fairly robust with respect to $\lambda$, and works for a broad range of $\lambda$ values. Throughout, the parameter $\tau$ in the discrepancy principle (4.3) is fixed at $\tau=1.01$. Below, for a given reconstruction $q^*$, we compute two metrics, the $L^2(\Omega)$-error $e_q$ and the residual $r_q$, defined, respectively, by $e_q=\|q^\dagger-q^*\|_{L^2(\Omega)}$ and $r_q=\|u(q^*)(T)-g\|_{L^2(\Omega)}$, where $q^\dagger$ denotes the exact potential. Unless otherwise specified, the results presented below are obtained by the fixed point algorithm (4.1) with Anderson acceleration, with a zero initial guess.
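As a toy illustration of the discrepancy principle (4.3) as an early stopping rule, the following sketch runs Landweber iteration on a synthetic, severely smoothing linear problem and stops once the residual drops below $\tau\delta$ with $\tau=1.01$; the blur operator and the data are stand-ins of ours, not the subdiffusion forward map.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
t = np.linspace(0.0, 1.0, n)

# Severely smoothing forward operator: a row-normalized Gaussian blur matrix,
# a stand-in for a smoothing coefficient-to-observation map.
K = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2.0 * 0.03 ** 2))
K /= K.sum(axis=1, keepdims=True)

f_true = np.sin(np.pi * t) * np.exp(t)      # synthetic exact coefficient
g_exact = K @ f_true
noise = rng.standard_normal(n)
delta = 1e-2 * np.linalg.norm(g_exact)      # absolute noise level
g = g_exact + delta * noise / np.linalg.norm(noise)

tau = 1.01
step = 1.0 / np.linalg.norm(K, 2) ** 2      # Landweber step size
f = np.zeros(n)
for k in range(10000):
    r = K @ f - g
    if np.linalg.norm(r) <= tau * delta:    # discrepancy principle (4.3)
        break
    f -= step * (K.T @ r)                   # Landweber update
print(k, np.linalg.norm(r), np.linalg.norm(f - f_true))
```

Stopping at the first crossing of $\tau\delta$ halts the iteration before the semi-convergence turnaround, mirroring the behavior reported for the accelerated fixed point scheme below.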

Results for the one-dimensional case
First we present two one-dimensional examples on the unit interval Ω = (0, 1). In the computation, the domain Ω is divided into M equal subintervals, and the time interval (0, T ) is divided into N subintervals.
To generate data, we take M = 1000 and N = 1000, whereas for the inversion, M = 200 and N = 500. The fixed point iteration (4.1) is run for at most 1000 iterations. The first example is to recover a smooth potential.
Example 5.1. $u_0=\sin\pi x+\frac{1}{100}x(1-x)$ and $q^\dagger(x)=e^x\sin(2\pi x)$. Note that the initial condition $u_0$ is chosen to fulfill the conditions in Theorem 1.1. The numerical results for Example 5.1 are shown in Tables 1-3, for three different final times, $T=0.01$, $T=0.1$ and $T=1$, which also include the results for normal diffusion (i.e., $\alpha=1.00$). In the tables, the numbers refer to the reconstruction error $e_q$, and the numbers in brackets denote the stopping index determined by the discrepancy principle (4.3). It is observed that the error $e_q$ decreases steadily as the noise level tends to zero, for all three fractional orders $\alpha$ and all final times $T$. For each fixed $T$ and $\epsilon$, the accuracy does not change much with respect to $\alpha$, and thus the fractional order $\alpha$ does not influence much the behavior of the reconstruction error. Nonetheless, for any fixed $\alpha$, when the data is noise free, the error $e_q$ increases with the time $T$, although only very slightly. These observations are consistent with the local Lipschitz stability in Theorem 1.1 (and the stability for the parabolic case [6]), which holds for all $\alpha\in(0,1]$, so long as the terminal time $T$ is sufficiently small. The numerical experiments actually indicate that even for much larger $T$, the inverse problem exhibits nearly identical behavior in terms of the reconstruction error $e_q$, indicating a similar degree of ill-posedness. See Fig. 2 for exemplary reconstructions for Example 5.1 with $T=1$ at two noise levels. The reconstructions for different fractional orders are largely comparable with each other, corroborating Table 3. However, the last observation for large $T$ no longer holds for normal diffusion (i.e., $\alpha=1$), for which the numerical reconstruction becomes much more challenging; the fixed point algorithm does not work as well as in the fractional case: it takes many more iterations to reach the discrepancy principle, and yet the reconstruction is generally inferior at all noise levels.
This agrees also with the empirical observations in the last column of Fig. 1 of [20]. Tables 1-3 indicate that with Anderson acceleration and the discrepancy principle, the fixed point algorithm is generally terminated after about 10 iterations for low noise levels, and 5 iterations for high noise levels. In contrast, the unaccelerated fixed point algorithm (4.1) takes far more iterations, by a factor of 10; see Table 4 for related results for Example 5.1 with $T=1$. Nonetheless, with or without acceleration, the obtained reconstruction errors are largely comparable with each other, except in the case $\epsilon=0$, for which the iteration (4.1) requires far more than 1000 iterations to achieve accuracy comparable with that in Table 3. Thus, Anderson acceleration is very effective in speeding up the convergence, while maintaining comparable accuracy. It is worth noting that the results for normal diffusion at $T=1$ are inferior, as manifested by the fact that the convergence of the fixed point algorithm suffers seriously and the least-squares problem in Anderson acceleration exhibits pronounced ill-conditioning, which necessitates proper regularization (done via SVD here). Also the accelerating effect of Anderson acceleration is less dramatic, although it does converge after more iterations, when compared with that for smaller $T$ or smaller $\alpha$. These observations seem to indicate a dramatic difference in the behavior of the inverse potential problem for subdiffusion and normal diffusion at large time $T$; the fractional case is far more amenable to numerical reconstruction. The convergence behavior of the accelerated scheme is shown in Fig. 3. Note that the reconstruction error $e_q$ first decreases, and then starts to increase as the iteration further proceeds. This behavior is very similar to the semi-convergence typically observed for iterative regularization methods (e.g., Landweber iteration).
The discrepancy principle (4.3) can choose a suitable stopping index before the divergence kicks in, indicated by the red circle in the plots, and the attained reconstruction error is only slightly larger than the optimal value (along the trajectory), showing the near optimality of the discrepancy principle. Further, a few extra iterations beyond the stopping index do not greatly deteriorate the reconstruction, i.e., the algorithm enjoys excellent numerical stability. This highly desirable property is attributed to the use of the preconditioner $A^{-1}$ in the iteration (4.1). Surprisingly, the residual $r_q$ is monotonically decreasing as the iteration proceeds, and eventually levels off at the noise level $\delta$. That is, the fixed point iteration is actually a descent method for minimizing the residual $r_q$, an interesting fact that remains to be rigorously established (see Proposition 4.2 for a partial justification). Thus, overall, the algorithm with the discrepancy principle is an effective reconstruction method.
(a) error $e_q$ (b) residual $r_q$ Figure 3: Convergence behavior of the fixed point algorithm with Anderson acceleration for Example 5.1 with $\alpha=0.5$ at $T=1$. In the plots, the red circle refers to the stopping index determined by the discrepancy principle (4.3).
The next example is about recovering a nonsmooth coefficient.

Results for the two-dimensional case
Last, we give a two-dimensional example on the unit square $\Omega=(0,1)^2$ with a smooth coefficient. The domain $\Omega$ is first partitioned into small squares of side length $1/M$, and then a uniform triangulation is obtained by connecting the upper-left and lower-right vertices of each small square. The data is generated using $M=200$ and $N=1000$, and for the inversion, the discretization parameters are taken to be $M=100$ and $N=500$. The fixed point algorithm (4.1) is run for a maximum of 200 iterations, and the relaxation parameter $\lambda$ is fixed at 100, which is very conservative for the scheme (4.1).
The initial condition $u_0$ does not satisfy the condition in Theorem 1.1. The numerical results for Example 5.3 at $T=0.1$ are shown in Table 9 and Fig. 6. With $\lambda=100$, the fixed point method (4.1) converges only very slowly, requiring thousands of iterations to yield a reasonable reconstruction, and thus the corresponding results are not shown. Anderson acceleration can greatly speed up the convergence, so that for any fixed noise level $\epsilon>0$, it converges in two iterations. The method converges steadily, and the reconstruction error $e_q$ decreases steadily as the noise level tends to zero. Up to $\epsilon=$ 1e-2 noise in the data, the result represents an excellent reconstruction of the true potential $q^\dagger$.
There are a few interesting questions on the inverse potential problem awaiting answers. First, the numerical experiments indicate a descent property of the fixed point iteration for the residual, which however remains to be established in the general case. Second, it is of much interest to analyze the regularizing property, e.g., convergence and convergence rates, of the fixed point algorithm (and the accelerated variant) when equipped with the discrepancy principle. Third, it is natural to ask whether it is possible to recover the potential and the fractional order α simultaneously from the terminal data, and if so, also to derive relevant stability estimates. In the case of lateral Cauchy data, it is known that one can recover the diffusion coefficient and fractional order together [4]. We shall explore these issues in future works.