Evaluating the Impact of Physiological Variability in Genome-Wide Association Studies of Resting Heart Rate

Genome-wide association studies (GWAS) have discovered hundreds of genetic loci for resting heart rate (RHR). However, the impact of intra-individual variation in RHR on GWAS results is unclear. We evaluated this impact by analyzing two RHR recordings from N ≈ 61,000 subjects from UK Biobank. In addition, we modelled variations in RHR as independent white zero-mean Gaussian noise with a standard deviation of 0.5x, 1x, and 2x the standard deviation of the difference between the original RHR values (4, 8, and 16 bpm, respectively). The two original RHR recordings were highly correlated (ρ = 0.77), but results from the genetic analyses differed slightly: the number of genome-wide significant (p < 5×10⁻⁸) variants at the locus with the strongest reported association (MYH6) was n=39 vs. n=34; the p-value of the corresponding lead variant was 3.6×10⁻²⁴ vs. 2.1×10⁻¹⁹; and the estimated heritability was 20.0% vs. 16.7%. Simulated data showed an inverse relationship between RHR variation and both genetic association strength and heritability. These results formally demonstrate the impact of intra-individual RHR variability on the discovery of genetic variants in single-measurement studies.

Importantly, the reliability and interpretation of GWAS results rely on the reproducibility of the phenotypic data. From this perspective, heart rate is not constant but varies as a result of physiological factors (autonomic and hormonal control) and environmental factors (time of day, mental stress, etc.) [6]. As a result, RHR measurements taken from the same individuals can differ significantly depending on when, where and how the measurements were taken. Studies investigating the genetic associations of blood pressure have estimated a 20% gain in statistical power with long-term average measurements as compared to single-measurement associations. However, it is unclear how the level of variation in RHR affects the discovery of genetic variants. Increased variation may introduce false-negative GWAS results by raising the p-values of associations above the significance threshold. Alternatively, increased variation may promote overestimation of certain associations, leading to false-positive results (i.e. spurious associations). Replication of genetic association results in an independent cohort is important to lessen the likelihood of false positives. However, if the replication data themselves have increased intra-individual variation in RHR, replication of genetic associations may be impaired.
The objective of this work was to evaluate the effect of variability in RHR measurements on genetic association results by conducting a physiological and a simulation study based on a large dataset of two repeated recordings of RHR from the UK Biobank study [7].

Materials
Two automated recordings of RHR were obtained from N=74,725 participants from the UK Biobank study. RHR was recorded both during a blood pressure assessment (referred to as RHR recording 1) and during the pre-exercise resting phase of an ECG exercise stress test (referred to as RHR recording 2). We applied genetic quality control to exclude individuals with poor genotype quality, including high heterozygosity/missingness and sex discordance, as supplied by UK Biobank (N=3,126), and restricted our analysis to individuals with European ancestry (N=65,042). In line with previous studies on RHR, we further excluded individuals with existing cardiovascular conditions known to affect RHR (including atrial fibrillation, history of myocardial infarction or heart failure, and (supra)-ventricular tachycardia) as well as individuals on RHR-altering medications (non-dihydropyridine calcium antagonists). Individuals with extreme RHR measurements (<40 or >120 bpm) were also excluded. The remaining 60,913 individuals were selected for analysis.
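The exclusion steps described above could be sketched as a simple filter. This is an illustrative sketch only: the record field names (`passed_genetic_qc`, `european_ancestry`, `cardiovascular_condition`, `rhr_altering_medication`, `rhr1`, `rhr2`) are hypothetical and do not correspond to UK Biobank field identifiers, and the sketch assumes that an extreme value in either recording leads to exclusion, which the text does not state explicitly.

```python
def select_participants(records, low_bpm=40.0, high_bpm=120.0):
    """Return the records that pass all exclusion criteria.

    Each record is a dict with hypothetical keys (not UK Biobank fields):
    passed_genetic_qc, european_ancestry, cardiovascular_condition,
    rhr_altering_medication (booleans) and rhr1, rhr2 (bpm).
    """
    selected = []
    for rec in records:
        # Genetic quality control and ancestry restriction.
        if not rec["passed_genetic_qc"] or not rec["european_ancestry"]:
            continue
        # Conditions or medications known to affect RHR.
        if rec["cardiovascular_condition"] or rec["rhr_altering_medication"]:
            continue
        # Assumption: an extreme value in either recording excludes the subject.
        if any(not (low_bpm <= rec[key] <= high_bpm) for key in ("rhr1", "rhr2")):
            continue
        selected.append(rec)
    return selected
```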

Simulation study
In addition to the physiological variability observed by comparing the two RHR recordings, we also simulated physiological variability at different levels by creating additive zero-mean Gaussian noise models with standard deviation 0.5x, 1x, and 2x σHR (4, 8, and 16 bpm, respectively), where σHR represents the standard deviation of the difference between the two original RHR measurements. These noise values were added in a random order to the first of the two original RHR recordings, because its estimated heritability was higher than that of the second recording. The simulation was repeated 10 times for each variability level.
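A minimal sketch of this noise model is given below, using only the Python standard library. The function name `simulate_noisy_rhr`, its parameters, and the synthetic input data are assumptions made for illustration; they are not the code used in the study. Drawing an independent noise value per subject is treated here as equivalent to adding pre-generated noise values in a random order.

```python
import random
import statistics

def simulate_noisy_rhr(rhr1, rhr2, scales=(0.5, 1.0, 2.0), n_repeats=10, seed=0):
    """Add zero-mean Gaussian noise to recording 1 at several variability levels.

    Returns (sigma_hr, simulated), where simulated[scale] is a list of
    n_repeats noisy copies of rhr1.
    """
    rng = random.Random(seed)
    # sigma_HR: standard deviation of the difference between the two recordings.
    sigma_hr = statistics.stdev(a - b for a, b in zip(rhr1, rhr2))
    simulated = {}
    for scale in scales:
        simulated[scale] = [
            [value + rng.gauss(0.0, scale * sigma_hr) for value in rhr1]
            for _ in range(n_repeats)
        ]
    return sigma_hr, simulated

# Usage with synthetic (not real) recordings sharing a common underlying value.
_rng = random.Random(1)
_true = [_rng.gauss(70.0, 10.0) for _ in range(2000)]
example_rhr1 = [t + _rng.gauss(0.0, 5.0) for t in _true]
example_rhr2 = [t + _rng.gauss(0.0, 5.0) for t in _true]
example_sigma, example_traits = simulate_noisy_rhr(example_rhr1, example_rhr2)
```

With three variability levels and 10 repeats each, this yields the 30 simulated traits that enter the genetic analysis alongside the two original recordings.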

Genetic studies
We performed a genetic association study of chromosome 14, which harbours the locus with the strongest reported association for RHR (MYH6) [3]. Each RHR trait (two original traits and 30 simulated traits) was analysed using a linear mixed model method (BOLT-LMM) [8] under the additive genetic model, including ~328,000 imputed single nucleotide variants (SNVs) with minor allele frequency ≥ 1% and imputation quality > 0.3. The model included sex, age, and age² as covariates, in line with previous studies [3]. Three methods were then used to compare the genetic results between traits, with the results from the first original RHR recording as the reference:
1) Counting the number of genome-wide significant (P < 5×10⁻⁸) variants at MYH6, and measuring the P-value of the lead variant (rs365990, discovered in RHR recording 1).
2) Counting the number of genome-wide significant loci and the overlap with loci discovered for RHR recording 1.
3) Comparing the estimated heritability.
Individual loci were defined based on a genomic distance of >500 kb to each side of the lead variants. The heritability was estimated using BOLT-REML [8] with the same covariates as in the GWAS.
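One way to turn the distance-based locus definition into a procedure is a greedy pass over the significant variants, sketched below. The greedy ordering (most significant variant first) is an assumption, since the text only specifies the 500 kb distance criterion; the function name `define_loci` and the tuple layout are hypothetical, and this is not part of BOLT-LMM.

```python
GWS_THRESHOLD = 5e-8   # genome-wide significance threshold
WINDOW_BP = 500_000    # 500 kb to each side of a lead variant

def define_loci(variants, threshold=GWS_THRESHOLD, window=WINDOW_BP):
    """Group genome-wide significant variants on one chromosome into loci.

    variants: iterable of (variant_id, position_bp, p_value) tuples.
    Returns a list of dicts with the lead variant, its P-value, and the
    ids of all significant variants within `window` bp of the lead.
    """
    # Keep significant variants only, most significant first.
    remaining = sorted((v for v in variants if v[2] < threshold), key=lambda v: v[2])
    loci = []
    while remaining:
        lead = remaining[0]
        members = [v for v in remaining if abs(v[1] - lead[1]) <= window]
        loci.append({
            "lead": lead[0],
            "lead_p": lead[2],
            "members": [v[0] for v in members],
        })
        # Variants farther than the window from this lead may form new loci.
        remaining = [v for v in remaining if abs(v[1] - lead[1]) > window]
    return loci
```

The number of significant variants at a locus is then the length of its `members` list, and the number of loci is the length of the returned list.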

Physiological study
The correlation between the two original RHR measurements was 0.77. Of the two recordings, the first yielded the strongest genetic association: lead variant rs365990 with P-value 3.6×10⁻²⁴, compared to rs422068 with P-value 2.1×10⁻¹⁹ for the second recording. The two lead variants were located within 3 kb of each other. The number of genome-wide significant SNVs at MYH6 was also higher for the first recording: N=39 compared to N=34 (all included in the N=39) for RHR recording 2. Finally, the estimated heritability was higher for recording 1 than for recording 2: 20.0% versus 16.9%, respectively.

Discussion
This work formally demonstrates the impact of physiological variability in RHR on the discovery of associated genetic variants. The main finding is that increased variation in RHR attenuates the strength of genetic associations and heritability estimates, but we did not observe evidence for spurious (i.e. false-positive) associations.
Variability in RHR was observed between the two original RHR recordings. The fact that the second recording was taken just before an exercise stress test may have increased the variability, as subjects may have experienced an anticipatory stress response to the actual stress test [9]. We observed that the genetic associations were attenuated in the second recording, demonstrating a potential effect of increased RHR variability. This finding supports previous studies showing that averaging blood pressure measurements increased the statistical power to detect genetic associations compared to a single-measurement GWAS [10].
The results from the physiological study were compatible with the results from the simulation study. The simulation study highlights important implications for GWAS results. For example, even for 8 bpm of variation, on average only 2 out of the 3 loci were discovered and the estimated heritability dropped from 20% to approximately 15%. Although our results are based on the analysis of a single chromosome, they suggest a potential loss of discoverable loci over the whole genome because of (increased) physiological variation in heart rate.
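The direction of this effect follows from a simple variance argument, sketched here under assumptions that are not part of the original analysis: the added noise is independent of both the original trait and genotype, and the details of the mixed-model estimation are ignored. The symbols $\sigma_Y^2$, $\sigma_g^2$, $\sigma_\varepsilon^2$ and $h_{\mathrm{sim}}^2$ are introduced for illustration only.

```latex
% Y: RHR recording 1, with total variance \sigma_Y^2 and additive genetic
% variance \sigma_g^2. \varepsilon: simulated noise, assumed independent of
% Y and of genotype.
\begin{align}
Y_{\mathrm{sim}} &= Y + \varepsilon,
  \qquad \varepsilon \sim \mathcal{N}(0, \sigma_\varepsilon^2) \\
h^2 &= \frac{\sigma_g^2}{\sigma_Y^2},
  \qquad
h_{\mathrm{sim}}^2 = \frac{\sigma_g^2}{\sigma_Y^2 + \sigma_\varepsilon^2}
  = \frac{h^2}{1 + \sigma_\varepsilon^2 / \sigma_Y^2} \\
\rho &= \operatorname{corr}(Y, Y_{\mathrm{sim}})
  = \frac{\sigma_Y^2}{\sigma_Y \sqrt{\sigma_Y^2 + \sigma_\varepsilon^2}}
  = \frac{1}{\sqrt{1 + \sigma_\varepsilon^2 / \sigma_Y^2}}
  \quad\Longrightarrow\quad
h_{\mathrm{sim}}^2 = \rho^2 \, h^2
\end{align}
```

Under this idealised model, a drop in heritability from 20% to 15% is a ratio of 0.75, which would correspond to a correlation of $\rho = \sqrt{0.75} \approx 0.87$ between the original and simulated trait. The same factor scales the variance explained by any single variant, which is why association P-values are expected to weaken as the noise level grows.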
This work is a first explorative study of the effect of RHR variation on GWAS results and has some limitations. RHR variations were modelled as independent white Gaussian noise, whereas, physiologically, the magnitude of the variation may depend on the RHR itself. For example, Fig. 2 shows that the simulated data slightly overestimate the decrease in genetic association strength compared to the original data. Furthermore, we did not observe evidence of spurious associations, but the number of simulations was limited and we only investigated a section of the genome (chromosome 14). Future studies may increase the number of genetic variants to further evaluate false-positive associations. However, given the consistency of increasing P-values and decreasing heritability measures, we suspect false-positive results are unlikely to occur.
In summary, this work highlights the important implications of physiological variability of RHR for genetic association results. The importance of averaging measurements to improve accuracy is well established, but our results provide a more detailed overview of the impact of different levels of variation on genetic results for a single-measurement RHR GWAS. Whilst the results focus on RHR, they may have broader implications, as many cardiovascular traits are prone to physiological variation.

Fig. 1: Correlation between original and simulated RHR measurements. Black circles and error bars show the median correlation coefficient and interquartile range for each simulated level of variation. For comparison, the red triangle shows the variation and correlation measured between the two original recordings.

Fig. 2: Effect of simulated and real physiological variation in RHR on GWAS results and heritability estimates. (A) Number of genome-wide significant (GWS) SNVs at the MYH6 locus, (B) -log10 of the P-value of the lead variant, with the genome-wide significance threshold (p = 5×10⁻⁸) shown as a red horizontal line, and (C) estimated heritability, all plotted as a function of the simulated variation level and with respect to the original results. The bottom row shows the same parameters plotted against the phenotypic correlation between the simulated and original RHR data. Black circles represent median values from the simulation results, and the corresponding interquartile ranges are given by error bars. For comparison, red triangles show the results from comparing the two original RHR recordings.