Trustworthiness detection from faces: Does reliance on facial impressions pay off?

While people readily form and rely on trustworthiness impressions from faces, the question of whether these impressions are accurate remains debated. The present research examines whether having access to the facial appearance of counterparts provides a strategic advantage to participants when making trust decisions. Furthermore, we investigated whether people show above-chance accuracy in trustworthiness detection (a) when they make trust decisions vs. provide explicit trustworthiness ratings, (b) when judging male vs. female counterparts, and (c) when rating cropped images (with non-facial features removed) vs. uncropped images. Results showed that incentivized trust decisions (Study 1, n = 131) and predictions of counterparts’ trustworthiness (Study 2, n = 266) were unrelated to actual trustworthiness. Moreover, accuracy was not moderated by stimulus type (cropped vs. uncropped faces) or counterparts’ gender. Overall, these findings suggest that people are unable to detect the trustworthiness of strangers based on their facial appearance.


Trustworthiness Detection From Faces: Does Reliance on Facial Impressions Pay Off?
Trust is a valuable commodity in romantic relationships (Kim et al., 2015), professional organizations (Kramer, 1999), and society at large (Knack & Keefer, 1997). Yet, as trust only pays off when it is reciprocated and not betrayed, people are often faced with the challenge of identifying counterparts who can be trusted. Previous studies point to one cognitive mechanism that could address this challenge: People readily form impressions of others' trustworthiness based on their facial appearance (Freeman & Johnson, 2016; Jack & Schyns, 2017; Krumhuber et al., 2007). But can people actually detect the trustworthiness of others based on their facial features? Addressing this question is important for two reasons. First, a person's appearance is a readily available cue, and in many situations the only one. If trustworthiness impressions are accurate (at least to some extent), then reliance on these judgments would represent one way in which people can establish cooperative relationships with strangers. Accurate inferences would allow people to make adaptive trust decisions even when little is known about counterparts or when such information would be costly and effortful to obtain. Second, perceptions of trustworthiness influence many important outcomes, including legal sentencing decisions, personnel selection, and economic transactions (Olivola et al., 2014).
If trustworthiness judgments are not accurate, then this would imply that many consequential decisions are biased by irrelevant facial cues.
Previous studies have examined the accuracy of trustworthiness impressions in the context of social dilemma games such as the trust game (Berg et al., 1995). In this dyadic interaction, a participant (i.e., the trustor) decides whether to send a monetary endowment to another participant (i.e., the trustee). In case the endowment is transferred, the money is multiplied and the trustee decides how much to return to the trustor. Trust and reciprocity lead to higher payoffs for both, but trust is risky as trustees face the temptation to keep the transferred money. Bonnefon, De Neys, and Hopfensitz (2013) presented facial photographs of trustees who had either reciprocated or betrayed trust, showing that participants were more likely to transfer money to counterparts that were actually trustworthy. Other studies yielded similar results, leading to the conclusion that people are able to detect the trustworthiness of counterparts at levels slightly above chance (ca. 55%; Bonnefon et al., 2017; De Neys et al., 2013; Tognetti et al., 2013; Verplaetse et al., 2007). Yet, evidence for the accuracy of trustworthiness detection is mixed. Some researchers did not find empirical support for accuracy when examining trust behavior in social dilemma games (Efferson & Vogt, 2013; Yamagishi et al., 2003) or when obtaining explicit ratings of counterparts varying in trustworthiness (Rule et al., 2013). Moreover, accuracy often depended on extraneous factors, which did not replicate across studies. For example, Tognetti and colleagues (2013) found above-chance accuracy for male but not female counterparts, when using images that were uncropped and included non-facial features (e.g., hair style). Bonnefon and colleagues (2013), on the other hand, found higher levels of accuracy for female counterparts, but only with cropped images that occluded all non-facial features.
Various scholars have also criticized the accuracy claim by arguing that the reliability of any facial feature as an indicator of trustworthiness might be easily undermined if individuals exhibit the feature but act selfishly (Efferson & Vogt, 2013; McCullough & Reed, 2016). This could lead to the emergence of imitators who appear trustworthy and garner the benefits of trust without paying the costs of reciprocating it. Furthermore, trustworthiness impressions of the same individual vary substantially across different perceivers (Hehman et al., 2017) and contexts (Brambilla et al., 2018), questioning whether they could be a reliable indicator of any disposition. In sum, evidence on accurate trustworthiness detection from faces has been inconsistent and the topic remains subject to vigorous debate.

Aims of the Present Research
We present the results of two studies on the accuracy of trustworthiness impressions that address three critical limitations of prior work. First, many prior studies relied on the same set of facial photographs (De Neys et al., 2013) and explicitly selected photographs of trustees that were judged with the highest levels of accuracy in prior investigations (De Neys et al., 2015). Thus, these results do not provide unbiased accuracy estimates and it is unclear whether findings generalize to other stimulus sets. Here, we provide a strong test of the generalizability of prior results by examining accuracy using independent samples of participants and stimuli. Second, past research uncovered several moderators (e.g., above-chance accuracy for female, but not male counterparts), which, however, did not consistently emerge across studies (Tognetti et al., 2013). We examine the robustness of the proposed moderators by testing whether participants show above-chance accuracy (a) when they make trust decisions vs. provide explicit trustworthiness ratings, (b) when they rate cropped images (with non-facial features removed) vs. uncropped images, and (c) when the trustee is male vs. female.
Third, several scholars have posited that facial appearance is not indicative of actual trustworthiness (Efferson & Vogt, 2013). Yet, existing studies have exclusively focused on statistical methods that cannot provide evidence for such a null hypothesis. The present research addresses this issue by reporting the results of Bayesian analyses (alongside frequentist statistics), which can quantify evidence in favor of the null hypothesis (Wagenmakers, 2007).
In short, our studies constitute a strong test of the hypothesis that people can detect the trustworthiness of others based on their facial appearance. In Study 1 (n = 131), we test whether participants are more likely to entrust money to counterparts who are in fact trustworthy. We also compare participants' earnings to those expected from simple decision strategies that ignore facial appearance altogether (i.e., trust at random, always trust, never trust).
This allows us to demonstrate whether knowing the facial appearance of counterparts gives participants a strategic advantage in social dilemmas. In Study 2 (n = 266), we examine accuracy using an alternative experimental design. For this, we employ an incentivized prediction task and test whether participants can accurately predict the trustworthiness of counterparts based on facial photographs.
All data and analysis scripts are available at the Open Science Framework (https://osf.io/8wejn/). We report how our sample sizes were determined and all data exclusions and measures for each study.

Study 1
Study 1 consisted of two phases. In the first phase (n = 31), we obtained facial photographs and behavioral data from participants who acted as trustees in the trust game. In the second phase, a separate sample of participants (n = 131) made trust game decisions in the role of trustors while being matched with (and seeing photos of) the trustees of the first phase. All decisions were incentivized and both trustors and trustees received additional payments to control for social preferences (Engelmann, Meyer, et al., 2019). We first examined whether participants relied on their counterparts' facial appearance when making trust decisions. We tested in two ways whether participants exhibited more trust towards counterparts who were perceived as more trustworthy: (1) by identifying the effects of variations in the perceived trustworthiness of faces, and (2) by identifying the causal effects of perceived trustworthiness, which we manipulated using face morphing. The main goal of this study was to examine whether participants could accurately detect the trustworthiness of counterparts. We therefore tested (a) whether participants were more likely to transfer money to trustworthy counterparts and (b) whether knowledge of their counterparts' facial appearance allowed them to accumulate higher earnings in the trust game than simple decision strategies that ignore facial appearance (i.e., trust at random, always trust, always distrust).

Methods
Stimuli (trustees). We first collected facial photographs and behavioral strategies for a sample of trustees. Participants (n = 84) were recruited from the University of Zurich participant pool and received a fixed payment of 20 CHF (ca. $22) and an additional payment that depended on their behavior in the study. At the end of the study, one round of the trust game was selected at random and participants received their earnings from that round. All decisions were therefore fully incentivized. This is important for the main studies because it means that the recorded decisions reflect the true preferences of the trustees.
Participants received a written description of the "decision situation" (i.e., the trust game) and were informed that they would play five rounds with different counterparts in the role of the trustor or the trustee. In each round, both participants received an endowment of 12 CHF and the trustor could decide whether to send 10 CHF to the trustee. If the money was sent, it was tripled and transferred to the trustee. The trustee could then decide how much to send back to the trustor (between 0 and 30 CHF). We recorded trustees' behavior with the strategy method. Trustees indicated how much they wanted to send back in case the trustor decided to send 10 CHF. That is, they indicated their decision without knowing whether the trustor had in fact sent anything.
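The payoff structure of a single round can be made concrete with a small sketch. It is only an illustration of the rules described above, not the software used in the study. The function name and the assumption that the trustee keeps his or her own 12 CHF endowment and returns money only out of the tripled transfer are ours.

```python
from __future__ import annotations

ENDOWMENT = 12.0   # CHF given to each player per round (from the text)
TRANSFER = 10.0    # CHF the trustor may send (from the text)
MULTIPLIER = 3.0   # the transfer is tripled (from the text)


def trust_game_payoffs(trustor_sends: bool, amount_returned: float) -> tuple[float, float]:
    """Return (trustor_payoff, trustee_payoff) in CHF for one round.

    Hypothetical helper: assumes the trustee returns money only from the
    tripled transfer (0 to 30 CHF) and keeps the own endowment.
    """
    if not trustor_sends:
        # No transfer: both players simply keep their endowment.
        return ENDOWMENT, ENDOWMENT
    received = TRANSFER * MULTIPLIER
    if not 0.0 <= amount_returned <= received:
        raise ValueError("amount_returned must lie between 0 and 30 CHF")
    trustor = ENDOWMENT - TRANSFER + amount_returned
    trustee = ENDOWMENT + received - amount_returned
    return trustor, trustee


# The trustor breaks even only if at least 10 CHF come back.
print(trust_game_payoffs(True, 10.0))   # (12.0, 32.0)
print(trust_game_payoffs(True, 0.0))    # (2.0, 42.0)
```

Under these assumptions, trusting is profitable for the trustor only when the trustee returns more than the 10 CHF that were sent.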
Participants played five rounds with anonymous counterparts and they did not receive feedback on their counterpart's behavior, except when they found out about their earnings after the payout-relevant trial was selected at the end of the experiment. This approach precludes learning and history effects from influencing decisions. The average amount of money that trustees returned to trustors (across the five rounds) constituted our measure of trustworthiness.
After completing the trust games, participants filled out a series of unrelated questionnaires and we took photographs of their faces. All photographs were taken from the same distance against a uniform background and participants were instructed to display a neutral facial expression. Similar to previous research, we cropped the photographs to remove all non-facial features, such as hairstyle and earrings (see Figure 1 for an example). Sixty-three participants consented to having their photographs and behavioral data used in future studies. In the current study, we focused on the photographs and behavioral data of trustees. One trustee was removed from analysis for being considerably older (> 3 SD above the mean) than the rest, leaving a final sample of 31 trustees (14 female).

Participants (trustors).
We recruited a separate sample of 273 participants from the University of Zurich participant pool. In the current study, we focus on 131 participants (Mage = 22.85, SDage = 4.45; 45.80% female, 54.20% male) who were assigned the role of the trustor in the trust game. Participants received a fixed payment of 10 CHF (ca. $11) and were informed that they would receive an additional payment that depended on their behavior in the study. At the end of the study, one round of the trust game was selected at random and participants, both the trustor and the trustee, received their earnings from that round.
Procedure. Participants received the same instructions explaining the trust game as in the first phase of the study. They were informed that they would play 31 rounds in the role of the trustor with different counterparts. In each round, participants saw a photo of the trustee and decided whether to transfer nothing, or 10 CHF of their 12 CHF endowment (see Figure S1).
Participants also indicated what they expected the trustee to do (i.e., how much the trustee would send back in case they transferred the money) by designating an amount between 0 and 30 CHF.
They indicated their confidence in the estimate on an eleven-point Likert scale ranging from "not at all certain" to "very certain". Participants did not receive feedback on their counterparts' behavior. After completing the 31 rounds of the trust game, participants saw the photographs of the trustees again and rated them on various characteristics, including trustworthiness, on a seven-point scale (see Table S1 for a description of all measures).
Treatment groups. Participants were randomly assigned to one of two conditions. In the "unmodified" condition (n = 56), participants saw the original facial photographs of the trustees.
In the "modified" condition (n = 75), participants saw photographs of the same 31 trustees, but we used face morphing software to manipulate the perceived trustworthiness of trustees.
Specifically, we used computer-generated face prototypes that reflect the typical appearance of trustworthy-looking or untrustworthy-looking faces (see Figure 1; Oosterhof & Todorov, 2008).
For each trustee, we created a trustworthy-looking and an untrustworthy-looking version by morphing their face, using the software Psychomorph (Tiddeman et al., 2001), with a trustworthy-looking or untrustworthy-looking face prototype. We transformed each trustee's face shape towards the face shape of the computer-generated prototype by 30%. This procedure created subtle differences in facial appearance (without compromising the realistic nature of the face stimuli) that affected the perceived trustworthiness of trustees (see Figure 1). On approximately half of the 31 rounds, participants in the modified condition saw the untrustworthy-looking (vs. trustworthy-looking) version of the trustee. They only played once with each trustee, that is, they only saw one face version for each trustee.
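To give an intuition for what a 30% shape transformation means, the following sketch moves each facial landmark 30% of the way from its original position towards the corresponding landmark of a prototype. This is a simplified illustration under our own assumptions (landmarks as matched (x, y) pairs, plain linear interpolation). It is not Psychomorph's implementation, whose internal procedure may differ, and the function name is hypothetical.

```python
def morph_shape(face, prototype, weight=0.30):
    """Shift each (x, y) landmark of `face` towards `prototype` by `weight`.

    Hypothetical helper: weight=0.0 leaves the face unchanged and
    weight=1.0 yields the prototype shape. Both inputs must list the
    same landmarks in the same order.
    """
    if len(face) != len(prototype):
        raise ValueError("face and prototype need the same number of landmarks")
    return [
        (fx + weight * (px - fx), fy + weight * (py - fy))
        for (fx, fy), (px, py) in zip(face, prototype)
    ]


# Two toy landmarks (made-up coordinates, for illustration only).
original = [(0.0, 0.0), (10.0, 20.0)]
trustworthy_prototype = [(10.0, 10.0), (20.0, 20.0)]
print(morph_shape(original, trustworthy_prototype))
```

A small weight such as 0.30 keeps the result close to the original face, which is consistent with the subtle differences described above.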
Analysis strategy. Analyses were based on 1,736 observations in the unmodified condition (56 participants interacting with 31 trustees) and 2,325 observations in the modified condition (75 participants interacting with 31 trustees), which were analyzed separately. All analyses were conducted in R (R Core Team, 2020). We used the lme4 package (Bates et al., 2015) and the lmerTest package (Kuznetsova et al., 2016) to estimate multilevel regression models with random intercepts and slopes.¹ All continuous predictors were z-standardized prior to analysis (full model results are reported in the Supplemental Materials).
¹ Some models only converged when we implemented simpler random effects structures. Models with maximal and simplified random effects structure yielded very similar effect size estimates and significance levels. We therefore report the results of models with maximal random effects structure throughout the paper.
We followed the approach proposed by Wagenmakers (2007) to compute associated Bayes factors. Specifically, we estimated regression models with and without the variable of interest and computed the Bayesian information criterion (BIC), an indicator of model fit, for both models. By comparing BICs of both models, we can estimate the extent to which the variable of interest increases model fit. We converted this measure to an approximation of the Bayes factor using the following formula: $\mathrm{BF}_{10} \approx \exp\left(\frac{\mathrm{BIC}(H_0) - \mathrm{BIC}(H_1)}{2}\right)$, where BF10 represents the Bayes factor in favor of the alternative hypothesis and BIC(H1) and BIC(H0) denote the fit of the models with and without the variable of interest (Wagenmakers, 2007). We used the BayesFactor package with default priors (i.e., a Cauchy distribution with a width of r = √2/2; Morey & Rouder, 2018) to calculate Bayes factors for t-tests. We always display Bayes factors so that they reflect support for the favored hypothesis (i.e., BF10 when evidence favors the alternative hypothesis and BF01 when evidence favors the null hypothesis). To aid the interpretation of Bayes factors, we classify the evidence as anecdotal, moderate, strong, very strong, or decisive (see Jeffreys, 1961).
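The BIC-based approximation and the verbal classification can be sketched in a few lines. The analyses reported here were run in R, so the Python code below is only an illustration with hypothetical function names. The cut-offs (3, 10, 30, 100) are the conventional boundaries associated with these labels and are an assumption on our part, as the text names only the categories.

```python
import math


def bf10_from_bic(bic_h0: float, bic_h1: float) -> float:
    """Approximate BF10 from the BICs of the null and alternative models."""
    return math.exp((bic_h0 - bic_h1) / 2.0)


def classify_evidence(bayes_factor: float) -> str:
    """Label a Bayes factor expressed in favor of the supported hypothesis.

    Assumed cut-offs: <3 anecdotal, <10 moderate, <30 strong,
    <100 very strong, otherwise decisive.
    """
    if bayes_factor < 1.0:
        raise ValueError("pass the Bayes factor for the favored hypothesis (>= 1)")
    for upper, label in ((3, "anecdotal"), (10, "moderate"),
                         (30, "strong"), (100, "very strong")):
        if bayes_factor < upper:
            return label
    return "decisive"


# A model whose BIC is 10 points lower than the null model's (made-up values):
bf10 = bf10_from_bic(bic_h0=110.0, bic_h1=100.0)
print(round(bf10, 2), classify_evidence(bf10))  # 148.41 decisive
```

Because BF01 = 1 / BF10, evidence for the null hypothesis is obtained by swapping the two BIC arguments.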

Sensitivity analysis.
We conducted sensitivity analyses for our main effect of interest (the relationship between participants' trust decisions and trustees' actual trustworthiness). We used the simr package (Green & Macleod, 2016) in R (R Core Team, 2020) to determine the smallest effect size we were able to detect with 80% power (and α = 5%). The package provides power estimates for fixed effects in multilevel regression models. We varied the effect of interest in our model and calculated power at each level. This showed that we had 80% power to detect an odds ratio of 1.19. That is, for a one standard deviation increase in trustworthiness, we could detect a change in the probability of trust from, for example, 50.00% to 54.29%. Thus, our design had sufficient power to detect even low levels of accuracy.
Figure 1. Exemplary stimuli. The image in the middle shows the original photograph that was displayed to participants in the unmodified condition. This image was morphed with the computer-generated trustworthy-looking and untrustworthy-looking face prototypes on the left and right, respectively, to create realistic faces with increased or decreased perceived facial trustworthiness. These morphed faces were displayed to participants in the modified condition.

Results
Descriptive statistics. Across the five rounds, trustees returned an average of 6.78 CHF (SD = 6.87 CHF) of the transferred money. Nine trustees never returned anything, one trustee always returned half of the transferred money, and no trustee always returned everything. Thus, in the current sample, trust did not pay off on average, as trustees would have to return at least 10 CHF for trustors to break even. In the unmodified condition, trustors sent their 10 CHF [...]
Reliance on facial appearance. First, we examined whether participants who saw the unmodified photographs relied on the facial appearance of trustees when deciding whom to trust.
We estimated a multilevel regression model with random intercepts and slopes per participant in which we regressed participants' trust behavior (0 = did not transfer endowment, 1 = transferred endowment) on their trustworthiness ratings. This yielded a positive effect with very strong evidence in favor of the alternative hypothesis, β = 0.864, SE = 0.173, OR = 2.37, 95% CI [1.65, 3.58], p < .001, BF10 = 41.23 (see Figure 2A, Table S2). Participants were more likely to trust when they perceived their counterparts as trustworthy.
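For readers less familiar with logistic regression output, the sketch below shows how a coefficient on the log-odds scale maps onto an odds ratio and onto a change in probability. The function names are hypothetical and the 50% baseline is an arbitrary choice for illustration. Only the coefficient of 0.864 is taken from the model reported above.

```python
import math


def odds_ratio(beta: float) -> float:
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(beta)


def shifted_probability(baseline: float, ratio: float) -> float:
    """Probability obtained after multiplying the baseline odds by `ratio`."""
    odds = baseline / (1.0 - baseline) * ratio
    return odds / (1.0 + odds)


ratio = odds_ratio(0.864)
print(round(ratio, 2))  # 2.37, matching the reported OR

# Illustrative only: starting from an assumed 50% probability of trust.
print(round(shifted_probability(0.50, ratio), 3))
```

Since all continuous predictors were z-standardized, the odds ratio refers to a one standard deviation increase in the trustworthiness rating.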
The positive relationship between perceived facial trustworthiness and trust behavior may also reflect a consistency effect. Rather than relying on the facial appearance of counterparts when making trust decisions, participants may have rated counterparts as more trustworthy because they trusted them. We addressed this alternative explanation in two ways. First, we computed average trustworthiness ratings of counterparts across all participants. Using this average trustworthiness rating instead of individual ratings, perceived trustworthiness was again positively related to the probability of trust (with decisive evidence in favor of the alternative hypothesis), β = 0.556, SE = 0.093, OR = 1.74, 95% CI [1.44, 2.16], p < .001, BF10 = 601.0 (see Figure 2B, [...] Figure 2C, Table S4). Together, these results show that participants relied on the facial appearance of counterparts when making trust decisions.
[Figure 2. Panel titles: Trustworthiness Perceptions; Causal Effects.]
Trustworthiness detection. The above results clearly show that participants used their estimates of facial trustworthiness in their trust decisions. But is this wise given that decisions have real financial consequences? We therefore asked whether participants were able to detect the true trustworthiness of counterparts based on the facial photographs. To address our main research question, we tested whether participants who saw the unmodified photos were more likely to trust counterparts that were actually more trustworthy. We regressed trust behavior on the behavioral (not the perceived) trustworthiness of counterparts (i.e., the average amount of money that trustees had returned to trustors). This did not yield a significant effect, and there was very strong evidence in favor of the null hypothesis, β = 0.048, SE = 0.075, OR = 1.05, 95% CI [0.89, 1.23], p = .52, BF01 = 34.50 (see Table S4, Model 1). Thus, participants were not able to detect the true trustworthiness of counterparts based on trustworthiness inferences from photographs, although they clearly used these inferences to inform their decisions.
This last result suggests that the reliance on the facial appearance of counterparts should not pay off, which we tested directly by comparing our participants' average performance to that of other strategies. If knowledge about the facial appearance of trustees actually gives trustors a strategic advantage, then participants' earnings across the 31 rounds should be higher than the [...] (Figure 3). Together, these results suggest that having access to the facial appearance of trustees did not give participants a strategic advantage. In fact, knowledge about the base rate of trustworthiness in the current sample of trustees (i.e., the fact that trust did not pay off on average) and a resulting strategy of consistent distrust would have resulted in higher earnings. [...] (Table S4, Model 2). We also explored whether accuracy varied as a function of trustors' gender or trustors' confidence in the accuracy of their expectations of reciprocity (Table S4, Models 3 and 4), but found no significant results and very strong to decisive evidence in favor of the null hypothesis.
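The logic of the benchmark strategies mentioned above can be illustrated with a short expected-value sketch. It is not a reproduction of the reported comparison. It uses only the game parameters and the mean return of 6.78 CHF from the descriptive statistics, and it assumes (our assumption) that a random strategy trusts with probability 0.5 independently of the trustee. The function name is hypothetical.

```python
def expected_earnings(trust_probability: float, mean_return: float,
                      endowment: float = 12.0, transfer: float = 10.0) -> float:
    """Expected per-round earnings (CHF) of a trustor who ignores faces.

    Hypothetical helper: the trustor trusts with a fixed probability,
    independently of who the trustee is.
    """
    if not 0.0 <= trust_probability <= 1.0:
        raise ValueError("trust_probability must lie between 0 and 1")
    earnings_if_trust = endowment - transfer + mean_return
    return (trust_probability * earnings_if_trust
            + (1.0 - trust_probability) * endowment)


MEAN_RETURN = 6.78  # average amount returned by trustees (from the text)

for name, p in (("always trust", 1.0), ("trust at random", 0.5), ("never trust", 0.0)):
    print(f"{name}: {expected_earnings(p, MEAN_RETURN):.2f} CHF per round")
```

Because the mean return lies below the 10 CHF break-even point, expected earnings in this sketch fall as the probability of trusting rises, which is why consistent distrust comes out ahead in this sample.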
Additional analyses. Two further variables were recorded that provide additional insights about participants' knowledge of the trustworthiness of the trustees: participants' expectations of reciprocity and their explicit trustworthiness ratings. We first analyzed whether participants' reciprocity expectations were associated with their counterparts' actual trustworthiness. The relationship between how much participants expected trustees to return and how much they actually returned was not significant, with decisive evidence in favor of the null hypothesis, β = 0.050, SE = 0.104, 95% CI [-0.151, 0.249], p = .63, BF01 = 142.1 (Table S5).
There was, however, a significant positive association between explicit trustworthiness ratings and trustees' actual trustworthiness, β = 0.078, SE = 0.027, 95% CI [0.028, 0.136], p = .004, BF01 = 9.91 (Table S6). It should be noted that this relationship was very small and that Bayesian analyses indicated moderate evidence in favor of the null hypothesis.
Thus, overall, we do not find convincing evidence that participants could detect the trustworthiness of counterparts.

Study 2
Results of Study 1 suggest that participants were not able to detect the true trustworthiness of counterparts based on facial photographs. However, decisions in the trust game may be motivated by considerations other than the expected trustworthiness of counterparts. For instance, people may transfer money not because they think that their counterpart will reciprocate trust, allowing them to maximize their earnings, but because there is an injunctive norm to trust and not to question a counterpart's character (Dunning et al., 2014).
In Study 2, we therefore examined trustworthiness detection accuracy with an incentivized prediction task, in which participants' earnings were tied to the accuracy of their predictions.
Participants viewed the same cropped images used in the unmodified condition of Study 1 and predicted the trustworthiness of trustees. We also examined detection accuracy for uncropped images in a separate condition.

Participants.
We recruited a sample of 266 participants from the University of Zurich [...]
Procedure and treatment groups. Participants received written instructions that explained the trust game played by the stimulus group. They were asked to view photographs of these players and to guess the behavior of the players as accurately as possible. The instructions were followed by a comprehension test that assessed whether participants had understood the game and the manner in which their own payments related to their guessing accuracy. Participants could not begin the study until all comprehension questions had been answered correctly. When viewing images of trustors, participants were asked to guess in what percentage of rounds the person sent 10 CHF, on a scale that ranged from 0% to 100%. When viewing images of trustees, participants were asked to guess the average amount that the trustee sent back, on a scale that ranged from 0 CHF to 30 CHF. For each guess, participants also indicated their confidence in the estimate on an 11-point Likert scale ranging from "not at all certain" to "very certain". Here, we analyze participants' predictions of trustees' behavior.
Of the 266 participants who took part in Study 2, 174 were randomly assigned to the "cropped" condition and 92 to the "uncropped" condition. In the cropped condition, participants viewed the same set of 31 facial photographs as participants in the unmodified condition of Study 1. That is, faces were cropped to remove all non-facial features, such as hairstyle and earrings (see Figure 1). In the uncropped condition, participants viewed the original images without the oval cropping.
Analysis strategy. We followed the same analysis strategy as in Study 1. For all tests, we report the results of frequentist and Bayesian analyses. We estimated cross-classified multilevel regression models with random intercepts and slopes per participant and trustee (full model results are reported in the Supplemental Materials).

Sensitivity analysis.
We again conducted sensitivity analyses for our main effect of interest (the relationship between predicted and actual trustworthiness in the cropped and uncropped conditions). For participants in the cropped condition, we had 80% power to detect an effect of 0.10. In other words, for a one-point increase in actual trustworthiness, we could detect a 0.10-point increase in predicted trustworthiness. For participants in the uncropped condition, we had 80% power to detect an effect of 0.14. In other words, for a one-point increase in actual trustworthiness, we could detect a 0.14-point increase in predicted trustworthiness. Thus, our design had sufficient power to detect even low levels of accuracy.
[...] (Table S10). Together, these results suggest that participants were not able to predict the trustworthiness of counterparts based on facial photographs.

General Discussion
Can people detect the trustworthiness of strangers based on their facial appearance? Prior studies have yielded mixed results and the question remains the subject of vigorous debate (Todorov, Funk, et al., 2015; Wilson & Rule, 2017). Yet, the empirical evidence on the topic is limited. Many studies were based on the same set of stimuli, which limits the generalizability of findings (De Neys et al., 2015).
Conversely, studies providing evidence against accuracy relied on statistical techniques that cannot quantify evidence in favor of such a null hypothesis, which complicates the interpretation of results (Efferson & Vogt, 2013;Rule et al., 2013).
We conducted two studies to address these limitations. Confirming results from previous studies (e.g., Jaeger et al., 2019), we found that participants relied on the perceived trustworthiness of counterparts when making trust decisions. However, participants were not more likely to entrust money to counterparts who were actually more trustworthy. Bayesian analyses yielded very strong support for the null hypothesis, indicating that our participants were not able to accurately detect the trustworthiness of their interaction partners. We also found that participants' earnings were not higher than the expected earnings of a decision strategy that trusts at random. This suggests that knowledge of their counterparts' facial appearance did not give participants a strategic advantage. In fact, participants would have earned more by consistently distrusting all counterparts, as trust did not pay off in the current sample.
Previous studies found evidence in favor of detection accuracy only under specific conditions, and these conditions varied across studies (Tognetti et al., 2013; Verplaetse et al., 2007). Here, we tested these proposed moderators, but found no evidence for better-than-chance trustworthiness detection (a) for male or female counterparts, (b) when making trust decisions or when providing explicit trustworthiness ratings, and (c) when viewing cropped images (in which all non-facial features were removed) or uncropped images. In sum, our results provide consistent evidence against accuracy in trustworthiness detection from faces across various conditions.
Previous investigations have shown that trustworthiness impressions guide decision-making in many domains, including legal sentencing, personnel selection, and financial decision-making (Olivola et al., 2014). People even rely on trustworthiness impressions from faces when more diagnostic cues are available (Jaeger et al., 2019) and when decisions are highly consequential (Wilson & Rule, 2015). Future studies should explore whether some people are more prone to the biasing influence of first impressions, whether this depends on antisocial character traits (Engelmann, Schmid, et al., 2019), and, importantly, how biases could be mitigated (for a first attempt, see Jaeger et al., 2020). An important future task in this line of research will be to delineate how difficult it is to override these biases, particularly when other, more reliable information sources are available that may require more cognitive effort to process.
Several limitations and constraints on the generalizability of the current results should be mentioned. Our results were based on samples of relatively young decision-makers from the University of Zurich. Additional studies are needed to examine the generalizability of our findings with larger and more diverse samples. Future studies should also examine the accuracy of trustworthiness impressions using varying types of stimuli. Cropped images, in which all nonfacial aspects are removed, ensure that impressions are actually based on the facial features of counterparts. However, they do not represent the kinds of stimuli that people actually encounter in real life. Ultimately, we believe that studies using a range of different stimuli are needed to map the accuracy of trustworthiness decisions under varying conditions.