The molecular basis of socially mediated phenotypic plasticity in a eusocial paper wasp

Phenotypic plasticity, the ability to produce multiple phenotypes from a single genotype, represents an excellent model with which to examine the relationship between gene expression and phenotypes. Analyses of the molecular foundations of phenotypic plasticity are challenging, however, especially in the case of complex social phenotypes. Here we apply a machine learning approach to tackle this challenge by analyzing individual-level gene expression profiles of Polistes dominula paper wasps following the loss of a queen. We find that caste-associated gene expression profiles respond strongly to queen loss, and that this change is partly explained by attributes such as age but occurs even in individuals that appear phenotypically unaffected. These results demonstrate that large changes in gene expression may occur in the absence of outwardly detectable phenotypic changes, resulting here in a socially mediated de-differentiation of individuals at the transcriptomic level but not at the levels of ovarian development or behavior.


47
The relationship between gene expression and external phenotype is complex and unresolved.

48
Much research in behavioural and evolutionary ecology is based on the implicit assumption that 49 phenotypic traits can be modelled as though they directly reflect gene expression patterns, and 50 that evolutionary trajectories can therefore be studied while remaining agnostic with regard to the   2018), allowing changes in the behavioural, physiological and molecular traits that define caste 84 identity to be tracked. An additional benefit-and challenge-of studying social insect colonies is 85 that they involve complex social structures. Such interactions can be hard to study, but offer the 86 opportunity to assess the effects of social interactions upon phenotypes and transcriptomes.

88
In the present study, we explore the relationship between transcriptome and phenotype in the   their sensitivity, SVMs should be ideally suited to quantify gene expression variation across the 122 spectrum between differentiated worker and queen roles, but to our knowledge this is the first time 123 SVMs have been used to interrogate the molecular basis of behavioural traits.

125
Applying an SVM approach along with standard differential expression and gene co-expression 126 analyses, we show that brain gene expression responses to queen removal in P. dominula include 127 a colony-wide response that does not match that observed at the phenotypic level. Our results

128
indicate that gene expression in P. dominula colonies reflects both a generalised response to 129 queen loss which is independent of phenotype, and a phenotype-specific response that tracks

143
Ovarian and dominance indices from queens and control workers were used to produce a logistic regression model for caste 144 classification (0=worker, 1=queen). Data from individuals on queenless post-removal nests were then passed through this model to fit 6 caste estimates and thereby identify individuals with high 'queenness', i.e. those that exhibited strongly queen-like phenotypes following 146 queen removal and thus represented possible replacement queens. Of the individuals for which data are shown here, 27 queens, 12 147 workers from control nests, and 62 individuals from queen removal nests were subsequently sequenced to generate the data discussed 148 in the present study.

163
applying a process of iterative feature selection we were able to remove genes that had low 164 weights in the classification and contributed to overfitting ( Figure S1A; Table S1). Doing so 165 allowed us to identify an optimal model containing 1992 genes with a substantially reduced root 166 mean squared classification error of 0.021 (Table S2)

177
The 1992 genes in our optimised SVM represent a much larger set than that which is found to 178 differentiate queens and workers based on a standard differential expression approach. When we 179 applied a DESeq2 analysis with a 1.5 fold-change threshold to the same set of queens and

227
Interestingly, the strong colony-wide perturbation following queen loss that this SVM approach 228 identifies would have been entirely missed using a standard differential expression approach:

229
DESeq2 identified just five genes as differentially expressed between control workers and 230 individuals from manipulated nests, with no associated GO enrichment (Table S4)

268
In order to further discern the factors that shaped individuals' gene expression profiles following 269 queen removal, we performed weighted gene co-expression network analysis (WGCNA) using the 270 full set of annotated genes. We generated 22 consensus modules across all samples, and then 271 determined which traits significantly predicted the expression of a given module within each group 272 (Figure 2D-F). In doing so, we identified three modules of particular interest.

274
Module 11 (Red) consists of 519 genes associated with 14 enriched GO terms, including a number 275 of terms associated with molecular binding (Figure 2G). This module is notable in that its  This study provides one of the most comprehensive analyses to date regarding the transcriptomic 298 signatures of plastic phenotypes, and the first to employ an SVM approach to interrogate the 299 molecular basis of behaviour. We have applied this approach to identify caste-specific gene 300 expression profiles in P. dominula and to explore the relationship between transcriptomic and 301 plastic phenotypic changes following a major social perturbation in paper wasp colonies. We 302 identify a set of nearly 2000 genes that optimally capture gene expression differences between 12 established queens and workers. Using a caste classifier based on these genes, we find that

338
Our discovery of colony-wide responses to queen loss suggests that this social perturbation 339 provokes a significant reaction even from individuals that have little hope of attaining the vacant 340 reproductive role. This is a surprising finding given that P. dominula is thought to express a 13 'conventional' gerontocratic mechanism of dominance and queen succession that mitigates the 342 need for costly intragroup conflicts over the identity of the replacement queen (Pardi 1948

363
To our knowledge, this study represents the first application of a support vector classification 364 approach to behavior-associated transcriptomic data. Using this approach we identified a large 365 group of genes as differing meaningfully between Polistes castes-over 10% of annotated genes, 366 a much larger set than those found using standard analytic approaches in this study and others

451
was run on all groups and contrasts were then calculated for each pair of groups. Differential expression was 452 calculated relative to a baseline fold change of 1.5, i.e. p-values refer to the probability that absolute change 453 between two groups was greater than 50%. Genes were considered differentially expressed between 454 conditions if p<0.05 after false discovery rate correction according to the Benjamini-Hochberg procedure.

456
Gene co-expression network analysis

458
Weighted gene co-expression network analysis was performed in R using the WGCNA package (Langfelder

459
& Horvath 2008). As WGCNA is particularly sensitive to genes with low expression, data were first subjected 460 to a second round of filtering in which genes that had <10 reads in >90% of samples were removed.

461
Consensus gene modules across all samples were then constructed using a soft-threshold power of 9.