MAGORINO: Magnitude-only fat fraction and R2* estimation with Rician noise modelling

Purpose: Magnitude-based fitting of chemical shift-encoded data enables proton density fat fraction (PDFF) and R2* estimation where complex-based methods fail or when phase data is inaccessible or unreliable, such as in multi-centre studies. However, traditional magnitude-based fitting algorithms suffer from Rician noise-related bias and fat-water swaps. To address these issues, we propose an algorithm for Magnitude-Only PDFF and R2* estimation with Rician Noise modelling (MAGORINO). Methods: Simulations of multi-echo gradient echo signal intensities are used to investigate the performance and behavior of MAGORINO over the space of clinically plausible PDFF, R2* and SNR values. Fitting performance is assessed in terms of parameter bias, precision and fitting error. To gain deeper insights into algorithm behavior, the paths on the likelihood functions are visualized and statistics describing correct optimization are generated. MAGORINO is compared against Gaussian noise-based magnitude fitting and complex fitting. Results: Simulations show that MAGORINO reduces bias in both PDFF and R2* measurements compared to Gaussian fitting, through two main mechanisms: (i) a greater chance of selecting the true (non-swapped) optimum, and (ii) a shift in the position of the optima such that the estimates are closer to ground truth solutions, as a result of the correct noise model. Conclusion: MAGORINO reduces fat-water swaps and Rician noise-related bias in PDFF and R2* estimation, thus addressing key limitations of traditional Gaussian noise-based magnitude-only fitting.


Introduction
5][16] Quantification of PDFF and R2* is therefore a flexible means to quantify a variety of biologic and pathological processes using a fast and relatively simple acquisition.
8][19][20][21][22] A key step in complex-based methods is estimation of the field map, which is a measure of B0 inhomogeneity.However, field map estimation is a challenging optimization problem with multiple solutions in the form of local minima. 23Selection of the incorrect minimum leads to fat-water swaps and inaccuracies in quantification.2][23] However, the smoothness assumption does not fully account for the underlying physics of B0 perturbations, limiting the accuracy of fatwater separation. 24The presence of phase errors due to eddy currents may also cause inaccuracies.Furthermore, to use complex-based methods, phase data must be accessible and reliable.Although this may be realistic in a research setting, it can be challenging in multi-centre studies, where participating sites may not all have access to expensive research agreements or dedicated software packages.In turn, this limits the feasibility and increases the cost of using these measurements in clinical trials.
As a result of the limitations of complex fitting, a number of authors have pursued magnitude-based methods for fat quantification.Magnitude-based fitting can either be employed in isolation or as a final step to refine the results of complex-based fitting (thus mitigating the effects of phase errors). 25With pure magnitude-based fitting, an alternative means to resolve fat-water ambiguity is necessary because phase information has been discarded.Bydder et al. suggested that fat-water ambiguity could be resolved on the basis of the multipeak fat spectrum. 26Hernando also showed that fat-water ambiguity disappears when fitting in-phase echoes alone, provided that a fat model with multiple spectral fat peaks is used. 27More recently, Triay Bagur et al. described MAGO, a multipoint search method for magnitude-based fat-water separation, relying on the fact that the solution space of magnitude methods generally results in only two candidate solutions, one of which can be selected on the basis that the true (non-swapped) solution generally has a lower residual than the incorrect (swapped) solution 25 .MAGO shows good agreement with complex-based methods across a range of scanners and field strengths. 25However, MAGO has important limitations arising from the Rician noise distribution observed in magnitudeonly data, particularly (i) the potential for fat-water swaps due to selection of the incorrect (swapped) solution and (ii) an additional bias resulting shifts in the position of the solution in the parameter space.These limitations are discussed further in the Theory section below.
To address the existing limitations of magnitude-based fitting, we propose a new fitting algorithm combining magnitude-only fitting with Rician noise modelling, known as Magnitude-Only fat fraction and R2* estimation with Rician Noise modelling (MAGORINO).
To demonstrate the benefits of this approach, we perform a series of detailed simulation experiments in which the noise can be controlled and a full range of clinically-plausible PDFF and R2* values can be simulated.

Theory
With a gradient-echo based CSE-MRI acquisition, assuming that the fat and water signals have equal phase at  = 0 (a reasonable assumption for a multi-echo gradient echo sequences), the noise-free complex signal  acquired at echo time  can be modelled as: where (0,  $ ) is Gaussian noise present in both real and imaginary channels and  $ is the noise variance.
For a single measurement, the log likelihood for the measured signal is given by For a set of measured signals, the log likelihood becomes -*+ [4]   where { , -} is the set of measured signals, { -} is the corresponding set of predicted signals based on the parameter estimates and  is the number of measurements (double the number of echotimes for complex data, or the number of echotimes for magnitude data).
The second term is the sum of squared errors (SSE) divided by 2 $ .The maximum value for Equation [4] therefore corresponds to the minimum SSE value; meaning that the maximum likelihood estimate can be obtained minimising the sum of squared errors (SSE), which is the widely-used nonlinear least squares approach.
Having estimated  ' and  !, the proton density fat fraction is calculated using For the signal magnitude, the noise-free signal in equation [1] becomes with only three unknown parameters ( !,  # , and  $ * ).
The corresponding noisy signal becomes where  / (0,  $ ) denotes Rician noise and 0 and  $ denote the mean and variance of the underlying complex Gaussian distribution.Importantly, this distribution has a nonzero mean which depends on  $ .Under Rician noise, the assumption of nonlinear least squares that the minimum SSE corresponds to the maximum likelihood no longer holds; parameter estimates can be obtained directly by maximizing the log likelihood. \ where { , -} is the set of measured magnitude signals at different TEs, { -} is the corresponding set of predicted magnitude signals, where  0 is the 0 th order Bessel function and  is the noise standard deviation.
Traditional magnitude-based fitting has two main limitations.First, phase information has been discarded and so is not available to resolve the fat-water ambiguity problem; some additional information is therefore needed to resolve this ambiguity.One approach is to resolve fat and water on the basis of their differing relaxation times, since fat contains multiple spectral peaks resulting in a shorter T2* due to 'spectral broadening'. 26ternatively, the multispectral nature of fat can be exploited by comparing fitting residuals from two different signal models -single peak fat and multipeak fat -with water-dominant voxels showing lower residuals for the single peak model and fat-dominant voxels showing lower residuals for the multipeak model. 28Note that the implementation in 28 used complex fitting; see Supplementary Figure 1 for further detail on the relationship between complex and magnitude fitting with regards to multipoint search.Most recently, Triay Bagur et al. described a method for magnitude-only fitting (MAGO) relying on multipoint search. 25With this approach, each voxel is initialized as fat-dominant or water-dominant, and it is expected that one initialization will converge to the true solution whilst the other will converge to the swapped solution; the true and swapped solutions can be resolved on the basis that the former should have a lower sum of squared error, or a higher likelihood (see Figure 1).However, this approach to resolving fat-water ambiguity can be confounded in the presence of Rician noise, which can sometimes cause the true optimum to have a higher error than the swapped optimum, resulting in fat-water swaps (see Figure 2).
Secondly, the prevailing approach to magnitude-based fitting has been to use nonlinear least squares, which introduces a noise-related parameter bias arising directly from the nonzero mean of Rician noise.This problem can introduce a downward bias in R2* measurements, which is most severe at high R2* and/or low SNR. 29,30Several approaches have been proposed to address this problem, including baseline fitting, where an additional parameter is introduced into the model to capture the noise floor, 31 and truncation, whereby data from longer echo times are discarded. 32More recently, Hernando et al.
proposed complex fitting as an alternative solution and showed reduced bias compared to magnitude fitting. 29However, as described above, complex fitting assumes reliable and accessible phase data, which can be a limitation at some centres and in multicentre studies.
To address the limitations of traditional Gaussian noise-based magnitude-only fitting, we propose MAGORINO, which explicitly models the Rician noise during parameter estimation and uses two-point search to obtain both true and swapped solutions.We show that the use the Rician noise model can reduce ambiguity between the true and swapped solutions, leading to a reduction in the frequency of fat-water swaps.It can also reduce error related directly to the nonzero mean of the Rician noise after the magnitude operation is performed (independent of swaps), further reducing bias.

Study design
We implement and compare four fitting algorithms: (i) two-point Gaussian fitting, which is an equivalent implementation of the MAGO algorithm described by Triay Bagur et al. 25 , also referred to here as MAGO, (ii) the new two-point Rician fitting method, MAGORINO, (iii) an analogous two-point complex fitting implementation in which  & is estimated alongside  !,  # and  $ * , and (iv) a further two-point complex fitting implementation with fixed  & .
Implementation (iv) provides a valuable 'control' implementation demonstrating the achievable performance of complex fitting under idealized conditions (although this is not likely to be realistic in practice).Note that the two-point complex fitting algorithms (iii and iv) are somewhat similar in principle to the FLAME algorithm described by Yu et al., 28 although their method was based on comparison of residuals from fits with different models, rather than different start points.

Fitting implementation
For each voxel, we perform (i) magnitude fitting with Gaussian noise model (ii) magnitude fitting with Rician noise model, (iii) complex fitting with  & estimated as a parameter and (iv) complex fitting with  & fixed to the correct value (see also Supplementary Table 1 for implementation details).Each method is implemented twice using two different start points Each of these parameters are assigned a lower bound of 0, and  $ * is assigned an upper bound of 2ms -1 .For complex fitting, for implementation (iii)  & is correctly initialized at  & = 0 and is not constrained with either upper or lower bounds; for implementation (iv)  & is fixed to 0 and therefore not estimated.All fitting is performed by maximization of likelihood functions (equivalent to minimization of error functions under Gaussian noise); this approach ensures consistency across noise models the error function is not defined for the Rician case.For each of methods (i)-(iv), the solution providing the highest likelihood is chosen as the fit output.The frequency offsets and relative amplitudes for the multipeak fat spectrum are matched to those used in 25 and 33 , i.e. frequency shifts relative to the water peak of -3.90, -3.50, -2.70, -2.04, -0.49 and +0.50 ppm and relative amplitudes of 0.087, 0.694, 0.128, 0.004, 0.039 and 0.048.All fitting is performed in MATLAB 2020a (Mathworks, Natick, MA) using the fmincon minimizer with an interior point algorithm.All constraints and the objective function (the negated log likelihood) are normalized by their initial values prior to fitting.

Fitting assessment
To determine the effect of varying PDFF and R2* on parameter estimation, simulations were performed across a dense grid of PDFF and R2* combinations, with PDFF values between 0% and 100% (at 2% intervals) and R2* values between 0 and 1 ms -1 (at 0.1ms -1 intervals).For each PDFF / R2* combination, 1000 signals were simulated using Equation [1] and sampled at echo times corresponding to a typical in vivo protocol at 3T using the shortest available echo times (TE1=1.1msand ∆TE=1.1ms) 25 .Gaussian noise was added to the noise-free signals in both the real and imaginary channels according to the SNR.The simulations were performed at 'typical SNR' for a 3 Tesla protocol in vivo (SNR=60) 25 and at 'low SNR' (SNR = 20).Two-point Gaussian magnitude fitting (MAGO), Rician magnitude fitting (MAGORINO) and complex-fitting were then applied to the noisy magnitude images to obtain PDFF and R2* estimates, as described above.
Algorithm performance was assessed in three domains: (a) parameter error, specifically the mean error on PDFF, R2* and S0 estimates, where  0 =  # +  6 , (b) parameter standard deviation for PDFF, R2* and S0, and (c) fitting error, assessed in terms of (i) the sum-ofsquared error (SSE), (ii) the sum-of-squared error relative to the noiseless signal generated directly from the ground truth parameter values in the simulation, referred to here as the 'true SSE', and (iii) the estimated SSE of the noise compared to the true noise SSE.Note that (ii) and (iii) inform on the degree of overfitting, which results in an increase in true SSE and a reduction in the estimated noise.Note also that high performance in PDFF estimation should produce both low parameter error and low standard deviation, and that in some cases consistently poor performance (consistent fat-water swaps) can produce low standard deviation.
These performance metrics differ from those used in 25 where only the median parameter values were taken from the series of simulations before error calculation, presumably to mitigate the effect of noise.

Interrogation of specific PDFF/R2* combinations
To gain further insights into the differences in behavior between Gaussian and Rician fitting, we selected specific PDFF/R2* combinations showing larger error for more detailed analysis.
Specifically, for the selected PDFF/R2* combination, the parameter estimates from fat-and water-dominant initializations were displayed for each simulation instantiation using (i) fit We also generated FF/R2* scatterplots to investigate the distribution of R2* values arising from the true and swapped likelihood optima.

Likelihood function visualization
To gain deeper insights into the behaviour observed using fit success histograms and likelihood difference plots, we computed and visualized the likelihood functions for the chosen PDFF/R2* combinations.First, noise-free data were simulated based on the PDFF/R2* values chosen for interrogation, and Gaussian noise was added in real and imaginary channels.A grid of 'candidate' PDFF/R2* values was generated (PDFF 0-100% and R2* 0-1ms -1 ) and the likelihood at each point on the grid was computed based on Equations [3] and [8].This 2-D likelihood plot was displayed using a colourmap, enabling identification of 'true' optima (corresponding closely to the ground truth parameter values) and swapped optima (typically with a fat fraction in the opposite half of the range to the ground truth value).
Having generated the 2-D likelihood plot, the noisy complex signal was passed to the fitting algorithm as described above (see 'Fitting Implementation').The positions of the two candidate solutions (arising from fat-dominant and water-dominant initializations) were recorded and displayed on the likelihood function, and the chosen candidate solution was highlighted.The paths taken by the optimizer for both initializations were also displayed.A further estimate of the global optimum was obtained using a search over the generated 2-D grid of likelihood values.Note that the values from this search will generally be close to but not exactly match the values obtained from the fitting because the 2-D nature of the search means that the value of  0 is fixed; this provides a useful simplification which reduces the degrees of freedom and thus reduces the potential for overfitting.

Results
Results of the simulation experiments are shown in Figures 3-5 (for typical SNR) and in Supplementary Figures 2-4 (for low SNR).Figure 3 shows the parameter error, Figure 4 shows the parameter standard deviation and Figure 5 shows the fitting error on PDFF, R2* and S0.The subsequent analysis for specific 'interrogated' PDFF/R2* combinations, using fit success histograms, likelihood difference plots and likelihood function visualization, is shown in Figures 6-10.For R2* measurements (middle row), Gaussian fitting carries a substantial negative bias in R2* measurements, which is most severe at high R2* and in the intermediate PDFF range.

Parameter error
This bias is substantially reduced for by Rician and complex fitting, with similar performance for both algorithms.
For S0 measurements (bottom row), the results largely mirror those of R2* error: Gaussian fitting carries a negative bias at high R2* measurements, and the bias is reduced for Rician and complex fitting.
The benefits of Rician fitting over Gaussian fitting in terms of reduced bias are also observed at low SNR, with even more pronounced reductions in bias for PDFF and R2* estimation (Supplementary Figure 2).

Parameter standard deviation
Figure 4 shows the standard deviation of PDFF, R2* and S0 estimates.
For PDFF measurements (top row), as R2* increases, all three algorithms show an increase in PDFF SD.Note that Gaussian fitting shows areas of low PDFF SD in areas of frequent swapping (i.e. the areas of high bias in Figure 3a), whereas there are no corresponding areas of low PDFF SD associated with high bias for either Rician or complex fitting.The PDFF SD is markedly reduced for complex fitting with fixed fB.
For both R2* and S0, parameter variance increases with increasing R2* and is broadly similar between algorithms.
The effect of low SNR on parameter SD can also be seen in Supplementary Figure 3.

Fitting error
Figure 5 shows the SSE (top row), 'true SSE' (i.e.SSE relative to ground truth parameter estimates) (middle row) and estimated noise SD relative to the true noise SD (bottom row).
For Gaussian fitting, there is a substantial increase in the true SSE (e) at high R2* which is not seen in the standard SSE value (a) and is accompanied by a reduction in the noise estimate relative to the true noise (i), indicating overfitting.For Rician and complex fitting, this overfitting is markedly reduced.
For complex fitting, there are areas of increase SSE, true SSE and overestimation of the noise at low R2*.This is eliminated when fB is fixed, suggesting incorrect estimation of fB (despite the correct initialization) as a potential cause.The likelihood function analyses below give further insight into this behaviour.
The beneficial effect of Rician fitting on 'true SSE' (i.e.reducing overfitting) is demonstrated at low SNR in Supplementary Figure 4.  Similarly, Figure 7 plots the difference in likelihood for the true and swapped solutions from the fitting algorithms against the chosen (higher likelihood) estimate.Again, as R2* increases, the likelihood of the swapped solution arising increases, but this increase is mitigated by Rician and complex fitting.For Gaussian fitting, the incorrect (swapped) solution shows greater likelihood in a majority of simulations at high R2* (Fig 7g), whereas the true (non-swapped) solution shows greater likelihood in the majority of simulations for Rician and complex fitting fitting (Fig 7h,i).Note that for complex fitting there are a number of likelihood difference values clustered around 0, indicating that the two-point initialization found the same minimum, and some 'reversed' values (negative at low PDFF or positive at high PDFF), suggesting that both fits reached the opposite solution to their initialization.

Interrogation of specific PDFF/R2* combinations
Figure 8 shows the likelihood functions obtained for a single noise instantiation for the three chosen PDFF/R2* combinations.At low R2* (top and middle rows), all three methods can identify the true (non-swapped) solution.However, at higher R2* (third row), for Gaussian fitting the swapped solution assumes a higher likelihood and is chosen as the fit output by the MAGO algorithm.For Rician and complex fitting, the true solution has a higher likelihood and is chosen as the fit output.For complex fitting, note that only the true (nonswapped) optimum is visible on the plots; this is because the swapped optimum occurs at a different value for fB and is therefore not observed in this 2-D grid of likelihood values which effectively has fixed fB.
Figure 9 provides insight in to the origin of R2* error observed at high R2* values, and includes fit success histograms (top row) and PDFF/R2* scatterplots (bottom row).Figs 9a,d show that Gaussian R2* estimates are negatively shifted relative to the ground truth value.
10d shows again that this arises due to selection of the swapped optimum and also due to a downward shift in the position of both the true and swapped optimum relative to the ground truth.For Rician and complex fitting, both the number of swaps and the negative shift in the positions of the optima are reduced, contributing to a reduction in bias.the swapped optimum due to its higher likelihood, resulting in a further downward bias (this can be considered as an 'R2* swap').For Rician and complex fitting, the local optimum is closer to the ground truth value and has also been correctly chosen as the fit output, both mitigating bias and accounting for the behaviour shown in Fig 9.

Discussion
The choice of fitting algorithm for PDFF and R2* estimation represents a tradeoff.Complexbased fitting enables resolution of fat-water ambiguity based on phase data, has a greater number of datapoints and avoids Rician noise-related parameter bias, but dictates that phase information must be accessible and reliable and can fail in areas of large B0 inhomogeneity.Conversely, magnitude-based fitting can be performed without reliable phase data but requires fat-water ambiguity to be solved by another method and suffers from bias due to the Rician nature of the noise distribution in the magnitude signal.The strengths of the magnitude-based approach have led to its use as a 'final step' in processing of data from multisite studies 33 and recently motivated the development of a pure magnitude-only algorithm known as MAGO, which resolves fat-water ambiguity on the basis of the spectral complexity of fat 25 .Despite producing good agreement values with complexbased fitting, this method still suffers from noise-related bias and fat-water swaps, limiting the performance of the method and necessitating the use of regularization strategies and/or median filtering to achieve satisfactory parameter estimates.Here, we describe a new fitting algorithm known as MAGORINO, which has performance advantages over the MAGO algorithm arising from increased likelihood of selecting the true (non-swapped) optimum and from local shifts in the position of the optimum such that it is closer to the ground truth value.
Our study has two key results.Firstly, whereas the performance of MAGO at low PDFF begins to deteriorate with increasing R2* and/or low SNR (due to increased frequency of fat water swaps), the MAGORINO algorithm retains its performance at substantially higher R2* and/or lower SNR.We show that this behaviour arises because the difference in likelihood between the true (non-swapped) and swapped optima is, on average, greater for MAGORINO than for MAGO because of the use of Rician noise modelling.The use of the Rician noise model dictates that the true (non-swapped) optimum is selected by the algorithm in a greater proportion of cases, resulting in a reduction in PDFF bias and variance.Furthermore, our results demonstrate (Figure 9) that the true and swapped maxima typically occur at different R2* values: effectively the R2* measurement can also be 'swapped'.This problem is also mitigated by the MAGORINO algorithm.Although the overall performance is superior for MAGORINO compared to MAGO, a caveat is that MAGORINO shows some loss of performance at high PDFF and high R2*.This may occur because signal fluctuations occurring due to the spectral complexity of fat can be incorrectly attributed to noise, whereas the MAGO algorithm almost always assumes that signal fluctuations are due to spectral complexity and is therefore strongly biased in favour of high PDFF values.The combination of high PDFF and high R2* is also relatively uncommon in vivo.
Secondly, and separate from its effect on selection of the correct optimum, MAGORINO produces a substantial reduction in bias at in R2* measurements compared to MAGO.This effect is mediated by a shift in the position of the optima in parameter space as a direct result of the use of Rician noise modelling, which can effectively attribute nonzero signal intensities at longer echo times to noise where the Gaussian noise model attributes these to a reduction in decay rate.
The improvement in PDFF and R2* estimation accuracy may be valuable when imaging ironoverloaded tissues, when using iron-based contrast agents and when imaging tissues with natively high R2*, such as bone marrow and cortex (particularly at low field strength/SNR).
In severe iron overload, R2* values are typically greater than 0.58 ms -1 and can be as high as 2 ms -1 34 , beyond the upper end of the range of values simulated in this study, meaning that the biases observed here are biologically and clinically relevant.The superior performance of MAGORINO may also be important in normal tissues with high R2*, such as the bone marrow, where dephasing is caused by inhomogeneity as a result of calcium in bone trabeculae and cortex.When using iron-based contrast agents, R2* values as high as 0.45 ms -1 can be observed even in normal tissue. 16ur knowledge, this is the first study combining explicit modelling of Rician with chemical shift-encoded MRI.Yokoo et al. previously investigated the use of Rician noise modelling in R2* estimation in the liver but did not consider the effect of fat 34 , whereas Triay-Bagur et al.'s MAGO algorithm described a two-point search approach to magnitude-based fitting but did not include Rician noise modelling 25 .A significant contribution of our study is that combining two-point search with Rician noise modelling has a synergistic effect, particularly with regards to resolution of fat-water ambiguity.An additional advantage of MAGORINO is that the initialization values of  6 and  # are set automatically by the algorithm based on knowledge of the underlying physics, rather than set empirically as with MAGO.MAGORINO also automatically scales the objective function and all constraints prior to fitting, ensuring that the step size is not too large and reducing the chances of 'overshooting' the minimum closest to the initialization value.
An important aspect of Rician noise modelling is the need to estimate the noise standard deviation.Preferably, the local variance map should be calculated at the time of the multicoil k-space data combination, using the noise and coil sensitivity information in each channel acquired during the coil calibration scan 34 .Alternatively, for datasets for which this information is not available, noise standard deviation could be estimated from a homogenous region-of-interest within the image.The MAGORINO algorithm can also be easily modified to estimate the noise standard deviation as a fitted parameter.Although this introduces an additional unknown, estimates could potentially be locally smoothed to mitigate the effect of the additional degree of freedom before re-fitting with the smoothed estimate.Importantly, MAGORINO should be robust to inaccuracies in calculation of the noise standard deviation.Underestimation of the noise standard deviation will make the objective function closer to Gaussian (and therefore the performance of MAGORINO will approach that of MAGO), whilst overestimation of the noise standard deviation will exaggerate the differences in performance between the two methods but is unlikely to dramatically harm performance.
There are several aspects of the proposed methodology that are beyond the scope of the current study.Firstly, we only considered R2* estimation using a monoexponential model, whereas in some tissues the true behaviour may be more complex.However, the monoexponential model is broadly accepted to be a good approximation in various tissues including liver and bone marrow.Secondly, we did not explore the effect of variations in imaging parameters such as the choice of field strength, number of TEs, acquisition geometry or volumetric imaging.As these parameters affect SNR, they are likely to impact on the success with which fat and water can be resolved therefore influence PDFF and R2* estimates.Thirdly, even with the Rician noise model, MAGORINO cannot correctly resolve fat-water ambiguity in all voxels; incorporation of spatial regularization may therefore be of value in improving the spatial homogeneity of parameter estimates.Finally, further studies will be required to validate the advantages offered by MAGORINO with in vivo data for specific applications: the results of this study will enable this future work to be designed to focus on regions of parameter space where MAGORINO is likely to be of particular benefit.).This problem is substantially mitigated using Rician fitting (b), which approaches the performance of complex fitting (c), albeit with a tradeoff of some increase in error at high PDFF and high R2*.At low R2*, both Gaussian and Rician magnitude fitting (a,b) show lower error in PDFF than complex fitting (c); the Figures below show that this is because complex fitting does not reach the true (non-swapped) likelihood maximum in every case, resulting in a small positive bias and increase parameter SD.For R2* measurements, Rician fitting (f) substantially reduces the negative bias occurring at high R2* values for Gaussian fitting (e), with similar performance to complex fitting (g).Note that complex fitting with fB fixed (right hand column) is used as a 'control experiment' to demonstrate performance under idealized conditions.Note that this behaviour is eliminated by fixing   (right hand column), although this step is likely to be unrealistic in practice.

Conclusion
Figure 5 -Fitting error.The plots show the grayscale-coded sum of SSE (a-d), 'true SSE' (i.e.SSE calculated relative to the ground truth) (e-h) and estimated noise (SSE / simulated noise SSE) (i-l) for each combination of PDFF and R2* values over all simulations, with SNR=60.For Gaussian fitting, the 'true SSE' (e) increases substantially at higher R2* values, indicating overfitting to the noise.This problem is substantially reduced by Rician magnitude fitting and complex fitting.For complex fitting (third column), SSE and noise estimates are highest at low R2* values because the two-point initialization does not reach the true (non-swapped) likelihood maximum in every case.Note that this behaviour is eliminated by fixing fB (right hand column), although this step is likely to be unrealistic in practice.) were used to demonstrate the effect of varying R2*, with each R2* value on a separate row.Plots have been generated for both Gaussian fitting (a,d,g), Rician fitting (b,e,h) and complex fitting (c, f, i).For Gaussian fitting, as R2* increases, the likelihood of the swapped solution arising increases, but this increase is mitigated by the use of Rician or complex fitting.For Gaussian fitting, at high R2*, the majority of fitted solutions are incorrected (swapped) (e), whereas the majority of solutions are correct (nonswapped) for Rician fitting (h) and complex fitting (i).
Figure 7 -Origin of PDFF error: likelihood difference plots.For each plot, the y-axis shows the difference in likelihood between water-dominant and fat-dominant solutions.A positive likelihood difference indicates that the water-dominant (low PDFF) solution is more likely, whereas a negative likelihood indicates that the fatdominant (high PDFF) solution is more likely.All plots were generated with PDFF = 20%, whilst three different R2* values (0, 0.3 and 0.5 ms -1 ) were used to demonstrate the effect of varying R2*, with each R2* value on a separate row.Plots have been generated for both Gaussian fitting (a,d,g), Rician fitting (b,e,h) and complex fitting (c, f, i).For Gaussian fitting, the incorrect (swapped) solution shows greater likelihood in a majority of simulations at high R2* (g), whereas the true (non-swapped) solution shows greater likelihood in the majority of simulations for Rician fitting (h) and complex fitting (i).Note that for complex fitting the same minimum can sometimes be found from both fat-dominant and water dominant initializations, resulting in a likelihood difference of zero. ) with a fixed PDFF = 20%.Each plot shows the colour-coded likelihood over the clinically-feasible space of possible PDFF and R2* values for a given pair of ground-truth parameter measurements.Plots have been generated for both Gaussian fitting (a,d,g), Rician fitting (b,e,h) and complex fitting (c, f, i).Each plot labels the ground truth fat fraction and R2*, maximum likelihood estimate from two-dimensional PDFF/R2* grid search (MLE grid search), local optimum from grid search, likelihood optima from water-dominant and fatdominant initialisations (opt1 and opt2), with the chosen solution circled as the fit output, and paths on the objective function (path1 and path2 for opt1 and opt2 respectively).Note that all three methods arrive at the true (non-swapped solution) for R2*=0 and R2*=0.3, but at R2*=0.5 only Rician and complex fitting correctly resolve the fat-water ambiguity.Note that there is a small discrepancy between the position of the position of the MLE from the grid search and the fitting outputs; this arises because the grid search was performed in two dimensions (over PDFF and R2* values, in order to match the dimensions of the likelihood plot) whereas the fitting includes   and   as separate parameters, which is more realistic but leads to a greater degree of overfitting than the idealized grid search.For (a-c), the yaxis shows the frequency of R2* estimates relative to the ground truth value for Gaussian (a), Rician (b) and complex (c) fitting.For (d-f), the scatterplots show PDFF and R2* parameter estimates.All plots were generated with PDFF = 20%, R2* = 0.7 ms -1 and SNR = 60.The histograms show a substantial downward shift away from the ground truth R2* value for Gaussian fitting; for Rician and complex fitting this bias is substantially reduced and the majority of estimates cluster around the ground truth value.The scatterplots show that there are two separate reasons for the R2* bias observed with Gaussian fitting: (i) fat-water swaps, with the swapped solution having lower R2* than the true solution (this can be considered as an 'R2* swap') and (ii) a further negative bias for both true and swapped solutions relative to the ground truth and relative to solutions obtained from Rician and complex fitting.

1 |𝑆 1 1 |𝑆 1 | 1 |𝑆 1 |
(i.e. a 'two-point search' method is used); fitting is therefore run eight times in total for each voxel.The initial values of  !(for water-dominant initialization) and  # (for fat-dominant initialization) are set to the maximum signal magnitude from the multiecho data, max |, multiplied by a constant .The constant  compensates for reduction in the signal magnitude due to  $ * decay and chemical shift and avoids the need for empirical manual adjustment of initial values depending on scanner gain, as performed in 25 .Here, we use  = exp( 2(34  $ -.-1 * ), where  2(34 is the echo time corresponding to the maximum signal magnitude and  $ -.-1 * is the initialization  $ * value n.Specifically, for water-dominant initialization, the initial values are { 6 ,  # ,  $ * } = { * max , 0.001, 0.1ms 7+ }; for fatdominant initialization, the initial values are { 6 ,  # ,  $ * } = {0.001, * max , 0.1ms 7+ }. success histograms and (ii) likelihood difference plots.The fit success histograms show the frequency of parameter estimates over all simulation instantiations, displayed on a histogram relative to the ground truth.The likelihood difference plots are scatterplots in which the parameter estimates are plotted against the difference in likelihood for the fatdominant and water-dominant initializations.Assuming that the initialization functions correctly (and both fat-dominant and water-dominant optima are obtained from the fits), there are two main possibilities which can be captured by the likelihood difference plot: (i) the water-dominant solution is more likely than the fat-dominant solution, resulting in a positive likelihood difference for a low FF estimate, (ii) the fat-dominant solution is more likely than the fat-dominant solution, resulting in a negative likelihood difference for a high FF estimate.Additionally, there are several further possibilities which can occur if the optimization unexpectedly finds the opposite optimum to its initialization (i.e. a fatdominant optimization finds a water dominant-solution, or vice versa).These are (iii) both initializations find the same local optimum, resulting a likelihood difference of 0, or (iv) both initializations find the wrong local optimum, resulting in a 'reversed' likelihood difference such that a water-dominant solution has negative likelihood difference or a fat-dominant solution has a positive likelihood difference.

Figure 3
Figure3shows the mean error on PDFF, R2* and S0 relative to the ground truth values.Note that, for fat fraction estimates (top row), areas of positive error at low fat fraction and negative error at high fat fraction arise predominantly from fat-water swaps.

Figures 6 ,
Figures 6, 7 and 8 provide further insight into the behaviour of the algorithms for three PDFF/R2* combinations corresponding to the top half of the PDFF error plots shown in Figure 3(a-d), corresponding to a single PDFF of 20% with three PDFF values R2* (specifically 0, 0.3 and 0.5ms -1 , corresponding to the top, middle and bottom rows respectively for each Figure), which were chosen to illustrate the behaviour observed in Figure 3.

Figure 6
Figure6shows fit success histograms illustrating the frequency with which the two-point Gaussian, Rician and complex fitting algorithms finding the correct likelihood maximum (optimum) at different R2* values.As R2* increases, the frequency with which the swapped solution is selected increases, but this increase is mitigated by the use of Rician and complex fitting.For Gaussian fitting, at high R2*, the majority of fitted solutions are incorrect

Figure 10 shows
Figure 10 shows an example of likelihood functions for Gaussian, Rician and complex fitting at high R2*, and gives further insight into the R2* bias observed in Fig 3e-h and Fig 9.For Gaussian fitting, there are two sources of negative bias in R2*.First, the position of the true optimum (closest to the ground truth) is negatively shifted relative to the ground truth, as evidenced by the position of opt1 and the MLE from grid search.Second, the fit has chosen

Figure 2 -
Figure 2 -Conceptual illustration of failure in magnitude-only resolution of fat-water ambiguity in the presence of noise.The Gaussian likelihood of fat fraction estimates for a truly water-dominant voxel (PDFF=20%) are shown in 1-D (a) and in 2-D (b).In this case, the swapped (incorrect) PDFF maximum has a higher likelihood than the true (non-swapped) solution, leading the algorithm to select the wrong solution.In the 1-D plot, this manifests as a reversal in the size of the two likelihood peaks, whereas in the 2-D plot it manifests as a swap in the positions of the MLE and local optimum (i.e. the black and white diamonds have swapped position compared to Fig 1c, above).

Figure 3 -
Figure 3 -Parameter error.The plots show the colour-coded error in PDFF (a-d), R2* (e-h) and S0 (i-l) estimates for each combination of PDFF and R2* values over all simulations, with SNR=60.For fat fraction measurements (top row), Gaussian fitting suffers from fat-water swaps as R2* increases, particularly in the low PDFF range (producing the bright yellow area at the top of (a)).This problem is substantially mitigated using Rician fitting (b), which approaches the performance of complex fitting (c), albeit with a tradeoff of some increase in error at high PDFF and high R2*.At low R2*, both Gaussian and Rician magnitude fitting (a,b) show lower error in PDFF than complex fitting (c); the Figures below show that this is because complex fitting does not reach the true (non-swapped) likelihood maximum in every case, resulting in a small positive bias and increase parameter SD.For R2* measurements, Rician fitting (f) substantially reduces the negative bias occurring at high R2* values for Gaussian fitting (e), with similar performance to complex fitting (g).Note that complex fitting with fB fixed (right hand column) is used as a 'control experiment' to demonstrate performance under idealized conditions.

Figure 4 -
Figure 4 -Parameter SD.The plots show the colour-coded standard deviation in PDFF (a-d), R2* (e-h) and S0 (il) estimates for each combination of PDFF and R2* values over all simulations, with SNR=60.Note that parameter SD generally increased with increasing R2* because fat-water swaps become more frequent (as shown in the Figure above).At low R2*, both Gaussian and Rician magnitude fitting (a,b) show lower PDFF SD than complex fitting (c); the Figures below show thatthis is because complex fitting does not reach the true (non-swapped) likelihood maximum in every case, resulting in a small positive bias and increase parameter SD.Note that this behaviour is eliminated by fixing   (right hand column), although this step is likely to be unrealistic in practice.

Figure 6 -
Figure 6 -Origin of PDFF error: fit success histograms I.Each plot shows the frequency of fat fraction estimates relative to the ground truth value.All plots were generated with PDFF = 20%, whilst three different R2* values (0, 0.3 and 0.5 ms -1 ) were used to demonstrate the effect of varying R2*, with each R2* value on a separate row.Plots have been generated for both Gaussian fitting (a,d,g), Rician fitting (b,e,h) and complex fitting (c, f, i).For Gaussian fitting, as R2* increases, the likelihood of the swapped solution arising increases, but this increase is mitigated by the use of Rician or complex fitting.For Gaussian fitting, at high R2*, the majority of fitted solutions are incorrected (swapped) (e), whereas the majority of solutions are correct (nonswapped) for Rician fitting (h) and complex fitting (i).

Figure 8 -
Figure 8 -Origin of PDFF error: likelihood function visualization I. To gain further insight into the behaviours observed in Figures3-5, likelihood functions (and fitted solutions) were visualised for three R2* values (0, 0.3 and 0.5 ms -1 ) with a fixed PDFF = 20%.Each plot shows the colour-coded likelihood over the clinically-feasible space of possible PDFF and R2* values for a given pair of ground-truth parameter measurements.Plots have been generated for both Gaussian fitting (a,d,g), Rician fitting (b,e,h) and complex fitting (c, f, i).Each plot labels the ground truth fat fraction and R2*, maximum likelihood estimate from two-dimensional PDFF/R2* grid search (MLE grid search), local optimum from grid search, likelihood optima from water-dominant and fatdominant initialisations (opt1 and opt2), with the chosen solution circled as the fit output, and paths on the objective function (path1 and path2 for opt1 and opt2 respectively).Note that all three methods arrive at the true (non-swapped solution) for R2*=0 and R2*=0.3, but at R2*=0.5 only Rician and complex fitting correctly resolve the fat-water ambiguity.Note that there is a small discrepancy between the position of the position of the MLE from the grid search and the fitting outputs; this arises because the grid search was performed in two dimensions (over PDFF and R2* values, in order to match the dimensions of the likelihood plot) whereas the fitting includes   and   as separate parameters, which is more realistic but leads to a greater degree of overfitting than the idealized grid search.

Figure 9 -
Figure 9 -Origin of R2* error: Fit success histograms II (a-c) and PDFF/R2* scatterplots (d-f).For (a-c), the yaxis shows the frequency of R2* estimates relative to the ground truth value for Gaussian (a), Rician (b) and complex (c) fitting.For (d-f), the scatterplots show PDFF and R2* parameter estimates.All plots were generated with PDFF = 20%, R2* = 0.7 ms -1 and SNR = 60.The histograms show a substantial downward shift away from the ground truth R2* value for Gaussian fitting; for Rician and complex fitting this bias is substantially reduced and the majority of estimates cluster around the ground truth value.The scatterplots show that there are two separate reasons for the R2* bias observed with Gaussian fitting: (i) fat-water swaps, with the swapped solution having lower R2* than the true solution (this can be considered as an 'R2* swap') and (ii) a further negative bias for both true and swapped solutions relative to the ground truth and relative to solutions obtained from Rician and complex fitting.

Figure 10 -
Figure 10 -Origin of R2* error: likelihood function visualization II.Each plot shows the colour-coded likelihood over the clinically-feasible space of possible PDFF and R2* values for a given pair of ground-truth parameter measurements.Each plot labels the ground truth fat fraction and R2*, maximum likelihood estimate from two-dimensional PDFF/R2* grid search (MLE grid search), local optimum from grid search and likelihood optima obtained by fitting from water-dominant and fat-dominant initializations (opt1 and opt2), with the chosen solution circled as the fit output.For clarity of visualization, the fitting paths have been omitted from this figure.For Rician and complex fitting, both the MLE from the grid search and the fit results are closer to the ground truth solution than for Gaussian fitting.Again, the discrepancy between the position of the MLE from the grid search and the fitting outputs arises because the grid search was performed in two dimensions (over PDFF and R2* values, in order to match the dimensions of the likelihood plot) whereas the fitting includes   and   as separate parameters.The plot shows that the distance between the MLE and min1 is reduced for Rician fitting compared to Gaussian fitting, indicating reduced overfitting.