The neural link between subjective value and decision entropy

We evaluated whether the brain organises value and confidence signals in a systematic fashion that reflects the overall desirability of decision options. If so, regions that respond positively to increases in value should also respond positively to increases in confidence. Likewise, regions that respond negatively to both value and confidence should be widespread. We strongly confirmed these predictions through a model-based fMRI analysis of a mixed gambles task that assessed subjective value (SV) and inverse decision entropy (iDE), which is related to confidence. Purported value areas more strongly signalled iDE than SV, underscoring how intertwined value and confidence are. A gradient tied to the desirability of actions transitioned from positive SV and iDE in ventromedial prefrontal cortex to negative SV and iDE in dorsal medial prefrontal cortex. This non-accidental organisation of SV and iDE signals was found across the brain and was strongest in purported value areas.

Figure 1: Behavioral analysis and voxel distribution. a) Three equations describe the behavioral model, in which subjective value (SV) is a weighted combination of gains and losses, $p_{accept}$ is the probability of accepting a gamble, and inverse decision entropy (iDE) is the negative Shannon entropy of $p_{accept}$ and its complement $p_{reject}$. b) $p_{accept}$ is a function of SV. High values of iDE arise from extreme values of SV, whereas iDE is low for middling values of SV, for which $p_{accept}$ is close to 0.5. c) The 2x2 table shows all positive and negative combinations of SV and iDE. Each cell shows the percentage of voxels (whole brain) with that specific combination of SV and iDE effects, followed in parentheses by the percentage expected under the null hypothesis that SV and iDE are independent. The results indicate that SV and iDE tend to be either both positive or both negative. The marginals for the rows and columns are also shown. The gambles in each cell represent different combinations of high and low SV and iDE for a typical participant who exhibits loss aversion.

the blood oxygen level dependent (BOLD) signal. Sample sizes as large as these are uncommon for neuroeconomic experiments, which makes this data set well-suited to answering how value and confidence are related in the brain at large. We pitted inverse decision entropy and subjective value against each other with a focus on a whole-brain corrected analysis of three canonical value areas: nucleus accumbens (NA), vmPFC, and the amygdala. These regions of interest (ROI) were pre-selected in the original NARPS study (see SI), which focused on the analysis of gains and losses but not confidence. The task was a mixed gambling task in which participants either accepted or rejected each gamble (Figure 1c).

2 Results

The results are based on data collected by the NARPS team (Botvinik-Nezer et al., 2020, 2019). After applying exclusion criteria (see Methods), data from 104 participants from the mixed-gambles task were analyzed. In the scanner, they were asked to accept or reject prospects with a 50% chance of gaining or losing a certain amount of money (Figure 1c).

Decision weights for gains and losses were estimated for each participant by logistic regression on the decision to accept or reject the gamble. This approach models how biased a participant is when accepting or rejecting a given gamble, based on properties of that gamble. The logistic regression models the participant's probability, $p_{accept}$, of accepting a gamble on a given trial (see Figure 1a) as

$$p_{accept} = \operatorname{logit}^{-1}(\beta_{gains} \times gains + \beta_{losses} \times losses + intercept). \tag{1}$$

Using our model we computed the subjective value, which is how much a participant values the current gamble, and the inverse decision entropy, which is how certain a participant is about accepting or rejecting the current gamble.

Subjective value for a specific trial was computed using the estimated beta coefficients for gains ($\beta_{gains}$) and losses ($\beta_{losses}$) as

$$SV = \beta_{gains} \times gains + \beta_{losses} \times losses. \tag{2}$$

From $p_{accept}$, we calculate decision (Shannon) entropy as

$$DE = -\left(p_{accept} \log p_{accept} + p_{reject} \log p_{reject}\right), \tag{3}$$

where $p_{reject}$ is $1 - p_{accept}$. Finally, inverse decision entropy (iDE) is simply negative DE. Although simple, this model captures individual differences in both behaviour and brain response. For example, estimated behavioural loss aversion for a participant, $\beta_{losses}/\beta_{gains}$, tracked the ratio of negative and positive SV voxels (see Supplemental Information, SI).
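The behavioural model in Equations (1) and (2), together with the entropy definition, can be illustrated with a minimal Python sketch. The coefficient values, the sign convention for losses (entered as positive magnitudes with a negative weight), and the base-2 logarithm are assumptions made for illustration only; they are not taken from the fitted NARPS data.

```python
import math


def inv_logit(x):
    """Inverse logit (logistic) function, written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def subjective_value(gain, loss, beta_gains, beta_losses):
    """Equation (2): SV as a weighted combination of gains and losses."""
    return beta_gains * gain + beta_losses * loss


def p_accept(gain, loss, beta_gains, beta_losses, intercept):
    """Equation (1): probability of accepting the gamble."""
    sv = subjective_value(gain, loss, beta_gains, beta_losses)
    return inv_logit(sv + intercept)


def inverse_decision_entropy(p):
    """iDE = -DE: negative Shannon entropy of (p, 1 - p).

    Assumption: log base 2 (bits); 0 * log(0) is treated as 0.
    """
    return sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)


# Hypothetical coefficients for a loss-averse participant (illustration only).
# Assumption: losses are entered as positive magnitudes, so beta_losses < 0.
BETA_GAINS, BETA_LOSSES, INTERCEPT = 0.3, -0.6, 0.0

for gain, loss in [(40, 5), (20, 10), (10, 20)]:
    sv = subjective_value(gain, loss, BETA_GAINS, BETA_LOSSES)
    p = p_accept(gain, loss, BETA_GAINS, BETA_LOSSES, INTERCEPT)
    ide = inverse_decision_entropy(p)
    print(f"gain={gain} loss={loss} SV={sv:+.2f} p_accept={p:.3f} iDE={ide:+.3f}")
```

Under these assumptions, iDE is lowest when SV is near zero (where the probability of accepting is near 0.5) and approaches its maximum of zero as SV becomes strongly positive or strongly negative.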

As can be seen in Figure 2a, iDE has a quadratic relation to $p_{accept}$ with a significant (above zero) mean Spearman

To evaluate the robustness of iDE, we considered how it varied for strong vs. weak accepts and rejects. Although

Our second question about preferential coding of SV or iDE is answered through the direct comparison of the effects of

Our final question concerns the relationship between SV and iDE. We predicted that these quantities would be intertwined in a particular way, namely that SV and iDE would co-locate and match in terms of positivity and negativity. We confirmed these predictions in three ways.

First, in Figure 1c, we present the different contingencies for the intersection of voxels where both variables have an effect in the whole brain (masked with task-active voxels), $\chi^2 = 25.59$, p < 0.001. This analysis found that voxels tend to be either both positive or both negative for SV and iDE. Figure 1c shows the expected and observed cell frequencies underlying this analysis. One observation is that there is also a strong tendency for voxels to code negative values for both iDE and SV, which might relate to risk aversion (see SI). The relationship between iDE and SV was even stronger in

) or for f) task-active voxels across the whole brain. Each dot represents beta coefficients from one voxel estimated with FSL's mixed effects model with outlier deweighting (FLAME 1).
Second, rather than dichotomise the data, we present the correlations of beta weights between SV and iDE for these same areas (Figure 4). Frontal medial cortex shows the strongest correlation for these variables (Figure 4e), r = 0.823, p < 0.001, and the correlation remains positive at the whole brain level (Figure 4f), r = 0.379, p < 0.001.

suggested that value and confidence combine into a single quantity such that confidence effectively adds to a basic value signal to yield a combined signal that could be used to evaluate actions. This view is supported by data and is intuitive in that being confident in an option should make it more attractive. In addition to the metacognitive roles confidence can

for confidence to impact future choice. Although appealing, this view seems incomplete in that it neglects negative neural coding of confidence, which is equivalent to showing stronger activations as confidence diminishes.

We evaluated the possibility that the brain organises value and confidence representations in a systematic fashion that reflects the overall desirability of choice options. This view holds that regions that respond positively to increases in value should also respond positively to increases in confidence. Conversely, there should also be regions that respond negatively to both value and confidence. If the brain represents options in terms of a general notion of desirability that combines value and confidence signals, signals reflecting purely positive and purely negative pairings should be more prevalent than mixed pairings of SV and iDE.
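The prediction that concordant pairings should outnumber mixed pairings corresponds to a test of independence on a 2x2 table of voxel counts. The sketch below computes the expected counts under independence and a plain Pearson chi-square statistic. The voxel counts are hypothetical, and the absence of a continuity correction is an assumption; the exact settings used for the reported statistic are not given in this section.

```python
def expected_counts(table):
    """Expected cell counts if row and column factors were independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Pearson chi-square statistic (no continuity correction)."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


# Hypothetical voxel counts (illustration only, not the reported data).
# Rows: SV positive, SV negative. Columns: iDE positive, iDE negative.
observed = [[40, 10],
            [10, 40]]

print("expected:", expected_counts(observed))
print("chi-square:", chi_square(observed))
```

In this toy table the concordant cells (both positive, both negative) exceed their expected counts and the mixed cells fall below them, which is the pattern the desirability account predicts.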

Our view was overwhelmingly supported by the data. As shown in Figure 3, regions that coded for both SV and iDE tended to code both quantities either positively (e.g., vmPFC) or negatively (e.g., dmPFC). Across the whole brain at the individual voxel level (Figure 1c), voxels were over-represented that responded positively or negatively to both iDE

frontal medial cortex. Likewise, across voxels, beta weights (and Z statistics) for SV and iDE positively correlated across the whole brain and in purported value areas, particularly in frontal medial cortex (Figure 4e).

The organisation of positive and negative SV and iDE spans several regions. There appeared to be large gradients in the brain that transition from positive SV and iDE to negative SV and iDE (Figure 5). Traditional value areas, such as

are stronger and more pervasive in these areas than value signals. This result is striking because these areas were selected because they are understood to be value areas.

One suggestion is that these areas should no longer be referred to as value areas given they are more strongly driven by uncertainty (e.g., iDE) when making risky decisions. Indeed, in this task, there is no strong evidence of pure value signals. Of course, even though these areas are strongly driven by iDE, it would also be incorrect to refer to these areas as uncertainty areas given the intertwined and highly non-accidental relationship between SV and iDE signals. Instead, it appears that decision areas reflect a combined signal that is topographically organised from jointly positive to jointly negative measures.

One question is why the brain might organise SV and iDE information in this jointly positive or jointly negative manner.

modulators (i.e., SV and iDE) were mean-centered. We used a spatial smoothing kernel of 5 mm FWHM and FSL's default highpass filter with a 100-second cutoff (i.e., locally linear detrending of data and regressors). We also used FSL's default settings for the locally regularized autocorrelation function. The four runs per subject were pooled with fixed effects at the second level and modelled with FSL FEAT's "FLAME 1" with outlier deweighting at the third level.

For inference on the main effects of subjective value and inverse decision entropy, we ran whole-brain corrected

The third analysis was performed as two one-sample t-tests with FSL randomise (5000 permutations, p < 0.01) on the signed differences (i.e., both inverse decision entropy minus subjective value and subjective value minus inverse decision entropy) between the Z statistics estimated at the second-level GLM after pooling estimates with a fixed effects model across the four runs. We use the Z statistics to avoid spurious results based on differences in variance or range between SV and iDE. To account for the fact that a variable can show a larger effect simply because the other variable shows a strong negative effect, we used the conjunction of the contrasts with the corresponding main effects (of either subjective value or inverse decision entropy, respectively). To facilitate these conjunctions, we converted the p-values from the FSL randomise analysis to Z statistics and further masked the output based on voxels that showed differences in absolute value. Tests for differences between the absolute values of these variables are reported in the SI. We also report the number of voxels in our cluster activations to emphasize their relative size, sampled from MNI152 space at a resolution of 1 mm x 1 mm x 1 mm.
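The conversion of permutation p-values to Z statistics and the conjunction with a main effect can be sketched as follows. This is only an illustration under stated assumptions: the p-values are treated as one-sided, the conjunction is implemented as a minimum-statistic rule, and the function names, threshold, and voxel values are hypothetical. The text does not specify these details.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def p_to_z(p, eps=1e-12):
    """Convert a one-sided p-value to a Z statistic.

    Assumption: one-sided conversion. p is clamped to (eps, 1 - eps)
    because the inverse CDF is undefined at exactly 0 and 1.
    """
    p = min(max(p, eps), 1.0 - eps)
    return _STD_NORMAL.inv_cdf(1.0 - p)


def conjunction(z_contrast, z_main, threshold):
    """Minimum-statistic conjunction (an assumed rule, for illustration).

    A voxel survives only if both the difference contrast and the
    corresponding main effect exceed the threshold; others are set to 0.
    """
    result = []
    for a, b in zip(z_contrast, z_main):
        smallest = min(a, b)
        result.append(smallest if smallest > threshold else 0.0)
    return result


# Hypothetical per-voxel values (illustration only).
p_difference = [0.001, 0.004, 0.200]   # e.g. iDE minus SV contrast
z_main_effect = [3.1, 1.2, 4.0]        # e.g. main effect of iDE

z_difference = [p_to_z(p) for p in p_difference]
print(conjunction(z_difference, z_main_effect, threshold=p_to_z(0.01)))
```

Only the first hypothetical voxel passes both criteria here; the second fails the main-effect criterion and the third fails the difference contrast.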

The fourth analysis focuses on the beta weights and the Z statistics (see SI) to compute correlations between SV and iDE across voxels. The voxel activations were estimated across all participants with FSL's FLAME 1 mixed effects model with outlier deweighting and mapped to the MNI template. FSL's mixed effects model considers between-participant variance when estimating activations (Woolrich et al., 2004). Thus, the correlational analysis involved only two