Finding my variants in the GWAS Catalog and calculating my polygenic risk scores

Sequencing my genome

This is one of a series of notebooks that document my attempt to learn more about myself through sequencing. I’m a bioinformatics scientist with lots of experience tinkering with genomic data sets. When I heard that Nebula Genomics would sequence my whole genome at 30x coverage for $300 AND let me download all of the data (raw .fastq, ..bam, and .vcf), I jumped on the chance to take a look at my own source code. Nebula provided me with a vcf file containing 4,785,184 QC passing variants.

I want to prioritize which of these millions of variants might deserve a closer look. In a previous post, I annotated these variants to see if any were predicted to disrupt protein function or play a role in disease.

In this post, I’ll show how I searched the GWAS Catalog for variant-phenotype associations and calculated polygenic risk scores. First, I search the GWAS catalog to see which of my genomic variants have been associated with disease phenotypes in genome-wide association studies. This only provides information about single variants, which may be informative in the context of monogenic disease, but individual variants usually have very small effects on complex traits. So next, in order to understand how multiple variants may act together to influence complex traits, I use the PRS Knowledge Base to calculate polygenic risk scores for various traits.

Disclaimer! It is important to understand the limitations of GWAS and why polygenic risk scores need to be interpreted with caution. It would be foolish to make any medical decisions based solely on these results. See here and here for thoughtful discussions about PRS.

Anyone with a vcf file from Nebula and a little experience with Linux and R should be able to recreate these analyses for themselves. I ran everything below on my little personal laptop (i5, 16G RAM) running Windows 11 with an Ubuntu Virtualbox install.

Publically availible tools and databases

The GWAS Catalog (NHGRI-EBI Catalog of human genome-wide association studies) provides a consistent, searchable, visualisable and, freely available database of SNP-trait associations. GWA studies are identified by literature search and assessed by curators, who then extract the reported trait, significant SNP-trait associations, and sample metadata.
gwascat is an R Bioconductor package written by Vincent Carey that provides convenient tools for working with contents of GWAS Catalog database. I use this package to identify GWA studies that found significant associations for my variants. See here for a vignette that describes more of its features.
The PRS Knowledge Base provides tools to calculate polygenic risk scores using data downloaded and curated from the NHGRI-EBI GWAS Catalog. This is the git repo for the PRSKB software.
GATK is incredible software with many tools for genomic analyses. I only use it here to convert the vcf into something easier to read.

Finding my variants in the GWAS Catalog

Load Libraries

start_time <- Sys.time()

library(gwascat)
library(snakecase)
suppressPackageStartupMessages(library(tidyverse))
library(gt)

# Wong, B. Points of view: Color blindness. Nat Methods (2011).
bla <- '#000000'
blu <- '#0072b2'
grb <- '#56b4e9'
lir <- '#cc79a7'
gre <- '#009e73'
red <- '#d55e00'
org <- '#e69f00'
yel <- '#f0e442'
gry <- '#BBBBBB'

make.hyperlink <-  function(myurl,mytext=myurl) {
  paste('<a href="',myurl,'">',mytext,'</a>')
}

Setup

GATK is quick and easy to install.

sudo apt install default-jre
wget https://github.com/broadinstitute/gatk/releases/download/4.2.5.0/gatk-4.2.5.0.zip
unzip gatk-4.2.5.0.zip
# set alias in bash profile 
# alias gatk='/media/sf_dna/apps/gatk-4.2.5.0/gatk'

I convert the vcf provided by Nebula to a tab delimited text file with GATK. This makes the data a little easier to read and can be fed into R as a data frame. I also replace my Nebula ID with ‘myVariants’ to make downstream scripts more generic.

gatk VariantsToTable -V nebula.vcf -F CHROM -F POS -F TYPE -F ID -F ANN -F LOF -F NMD -GF AD -GF DP -GF GQ -GF GT -O myVariants.ann.txt

nebulaID="##########" 
sed -i "1s/$nebulaID/myVariants/g" myVariants.ann.txt

Search the GAWS Catalog

Create a data frame that joins rsIDs from the Nebula vcf with rsIDs in the catalog. I find 171,943 associations for 81,834 variants.

myVariants_path <- "C:/Users/jmcgirr/dna/nebula/vcf/myVariants.txt"

# Nebula vcf
vcf <- read_tsv(myVariants_path, col_select = c("CHROM","POS","TYPE","ID","myVariants.GT")) |>
       # only check SNPs
       filter(TYPE == "SNP") |> select(-c(TYPE)) |>
       separate(myVariants.GT, c("vcf_allele1","vcf_allele2"), sep = "/|\\|") |>
       # remove deletions labeled as "*"
       filter(vcf_allele1 != "*",vcf_allele2 != "*") |>
       # remove rows with missing rsid
       filter(str_detect(ID,fixed("rs")))
head(as.data.frame(vcf))

##   CHROM   POS           ID vcf_allele1 vcf_allele2
## 1  chr1 10904   rs10218493           G           A
## 2  chr1 10927   rs10218527           A           G
## 3  chr1 13757  rs879805273           G           A
## 4  chr1 13813 rs1213979446           T           G
## 5  chr1 13838   rs28428499           C           T
## 6  chr1 15274    rs2758118           G           T

gwtrunc <- makeCurrentGwascat()
#topTraits(gwtrunc)

catalog_matches <-  as.data.frame(gwtrunc) |> 
                    filter(SNPS %in% vcf$ID) |>
                    select(-c(DATE.ADDED.TO.CATALOG, FIRST.AUTHOR, width, STUDY, REGION, CHR_ID,
                              UPSTREAM_GENE_ID, DOWNSTREAM_GENE_ID, SNP_GENE_IDS, UPSTREAM_GENE_DISTANCE,
                              DOWNSTREAM_GENE_DISTANCE,MERGED, SNP_ID_CURRENT,PVALUE_MLOG,PLATFORM..SNPS.PASSING.QC.,
                              STUDY.ACCESSION,P.VALUE..TEXT.,PUBMEDID)) |>
                    separate(STRONGEST.SNP.RISK.ALLELE, c("STRONGEST_SNP","RISK_ALLELE"), sep = "-" ,remove = TRUE) |>
                    select(-c(STRONGEST_SNP)) |>
                    arrange(P.VALUE) |>
                    inner_join(vcf, by = c("SNPS" = "ID"))

names(catalog_matches) <- to_snake_case(names(catalog_matches))

match_stats <- data.frame(column = c("rsIDs",
                                   "phenotypes",
                                   "journals"),
                          n = c(length(unique(catalog_matches$snps)),
                                length(unique(catalog_matches$mapped_trait)),
                                length(unique(catalog_matches$journal))))

match_stats |> gt() |> tab_header(title = "Catalog Matches")

Catalog Matches
column	n
rsIDs	81834
phenotypes	3733
journals	598

catalog_matches[4:9,] |> gt() |> tab_header(title = "Examples of Catalog Matches")

Examples of Catalog Matches
seqnames	start	end	strand	date	journal	link	disease_trait	initial_sample_size	replication_sample_size	chr_pos	reported_gene_s	mapped_gene	risk_allele	snps	context	intergenic	risk_allele_frequency	p_value	or_or_beta	x_95_ci_text	cnv	mapped_trait	mapped_trait_uri	genotyping_technology	chrom	pos	vcf_allele_1	vcf_allele_2
4	71744491	71744491	*	2016-08-17	PLoS Genet	www.ncbi.nlm.nih.gov/pubmed/27532455	Blood protein levels	1,340 European ancestry individuals	NA	71744491	NR	GC	?	rs222047	intron_variant	0	NA	1e-307	1.153000	[NR] unit decrease	N	blood protein measurement	http://www.ebi.ac.uk/efo/EFO_0007937	Genome-wide genotyping array	chr4	71744491	C	A
16	56951227	56951227	*	2020-03-23	PLoS Med	www.ncbi.nlm.nih.gov/pubmed/32203549	HDL cholesterol levels	403,943 European ancestry individuals	NA	56951227	CETP	HERPUD1 - CETP	A	rs9989419	regulatory_region_variant	1	0.394080	1e-307	0.143745	[0.14-0.15] unit decrease	N	high density lipoprotein cholesterol measurement	http://www.ebi.ac.uk/efo/EFO_0004612	Genome-wide genotyping array	chr16	56951227	A	G
11	116778201	116778201	*	2020-03-23	PLoS Med	www.ncbi.nlm.nih.gov/pubmed/32203549	HDL cholesterol levels	403,943 European ancestry individuals	NA	116778201	ZNF259	ZPR1	G	rs964184	3_prime_UTR_variant	0	0.133633	1e-307	0.105482	[0.1-0.11] unit decrease	N	high density lipoprotein cholesterol measurement	http://www.ebi.ac.uk/efo/EFO_0004612	Genome-wide genotyping array	chr11	116778201	C	C
16	89968733	89968733	*	2018-05-08	Nat Commun	www.ncbi.nlm.nih.gov/pubmed/29739929	Low tan response	46,768 European ancestry low tanning cases, 74,528 European ancestry moderate and high tanning controls	15,547 European ancestry low tanning cases, 39,835 European moderate and high tanning ancestry controls	89968733	NR	DEF8 - CENPBD1	A	rs4785752	intergenic_variant	1	NA	1e-307	0.345000	[0.24-0.45] unit increase	N	suntan	http://www.ebi.ac.uk/efo/EFO_0004279	Genome-wide genotyping array	chr16	89968733	G	G
16	89579029	89579029	*	2018-05-08	Nat Commun	www.ncbi.nlm.nih.gov/pubmed/29739929	Low tan response	46,768 European ancestry low tanning cases, 74,528 European ancestry moderate and high tanning controls	15,547 European ancestry low tanning cases, 39,835 European moderate and high tanning ancestry controls	89579029	MC1R	CPNE7	G	rs369230	intron_variant	0	NA	1e-307	0.469000	[0.34-0.6] unit increase	N	suntan	http://www.ebi.ac.uk/efo/EFO_0004279	Genome-wide genotyping array	chr16	89579029	T	T
20	34077942	34077942	*	2018-05-08	Nat Commun	www.ncbi.nlm.nih.gov/pubmed/29739929	Low tan response	46,768 European ancestry low tanning cases, 74,528 European ancestry moderate and high tanning controls	15,547 European ancestry low tanning cases, 39,835 European moderate and high tanning ancestry controls	34077942	RALY, ASIP	RALY	A	rs6059655	intron_variant	0	NA	1e-307	0.527000	[0.36-0.69] unit increase	N	suntan	http://www.ebi.ac.uk/efo/EFO_0004279	Genome-wide genotyping array	chr20	34077942	G	G

Heterozygous for risk allele

As a first pass I’ll look at variants for which I have a copy of the allele implicated as the risk allele. I filter and order by p-value to see the strongest associations.

p_value < 1e-20

het_or_hom_for_risk_allele <- filter(catalog_matches,vcf_allele_1 == risk_allele | vcf_allele_2 == risk_allele, p_value < 1e-20) |> 
                              arrange(risk_allele_frequency, p_value) |>
                              select(c(disease_trait, p_value, or_or_beta, risk_allele, snps, link))

head(het_or_hom_for_risk_allele) |> gt() |> 
  tab_header(title = "Heterozygous for risk allele") |> 
  fmt(columns = 'link',fns = make.hyperlink)

Heterozygous for risk allele
disease_trait	p_value	or_or_beta	risk_allele	snps	link
Total cholesterol levels	2e-55	0.1210000	C	rs964184	www.ncbi.nlm.nih.gov/pubmed/28334899
Monocyte count	4e-136	0.6380717	C	rs140221307	www.ncbi.nlm.nih.gov/pubmed/27863252
Monocyte percentage of white cells	6e-114	0.5812110	C	rs140221307	www.ncbi.nlm.nih.gov/pubmed/27863252
Granulocyte percentage of myeloid white cells	3e-78	0.4813244	C	rs140221307	www.ncbi.nlm.nih.gov/pubmed/27863252
Platelet distribution width	5e-56	0.3910234	T	rs142680548	www.ncbi.nlm.nih.gov/pubmed/27863252
Monocyte count	1e-307	0.6809273	C	rs140221307	www.ncbi.nlm.nih.gov/pubmed/32888494

Homozygous for rare risk allele

Next I’ll be more strict and look at variants for which I am homozygous for the risk allele. I further filter to only include those that are low frequency and showed strong associations with the phenotype being studied.

risk_allele_frequency < 0.1 and p_value < 1e-20

homozygous_for_risk_allele <- filter(catalog_matches,vcf_allele_1 == risk_allele, vcf_allele_2 == risk_allele, p_value < 1e-20) |> 
                              arrange(risk_allele_frequency, p_value)

homozygous_for_risk_allele_low_freq <- filter(catalog_matches,vcf_allele_1 == risk_allele, vcf_allele_2 == risk_allele, risk_allele_frequency < 0.1, p_value < 1e-20) |> 
                                       arrange(p_value) |>
                                       select(c(disease_trait, p_value, or_or_beta, risk_allele, snps, link))

p1 <- catalog_matches |> 
      ggplot(aes(risk_allele_frequency)) +
      geom_histogram(bins = 100) +
      theme_minimal() +
      theme(axis.title.x=element_text(size=14),
       axis.title.y=element_text(size=12),
       axis.title=element_text(size=14),
       axis.text=element_text(size=12),
       plot.title=element_text(size=18)) +
      xlim(0,1) +
      ggtitle("Distribution of risk allele frequencies") +
      geom_vline(xintercept = 0.1, color = red, linetype = 2)
print(p1)

head(homozygous_for_risk_allele_low_freq) |> gt() |> 
  tab_header(title = "Homozygous for risk allele") |>
  fmt(columns = 'link',fns = make.hyperlink)

Homozygous for risk allele
disease_trait	p_value	or_or_beta	risk_allele	snps	link
Serum levels of protein BTD	3e-126	1.2828700	G	rs71627145	www.ncbi.nlm.nih.gov/pubmed/35078996
Systemic lupus erythematosus	4e-68	2.2820000	A	rs34723256	www.ncbi.nlm.nih.gov/pubmed/33272962
Total cholesterol levels	2e-55	0.1210000	C	rs964184	www.ncbi.nlm.nih.gov/pubmed/28334899
Hematology traits	5e-47	0.0612000	G	rs4657616	www.ncbi.nlm.nih.gov/pubmed/23263863
Urinary metabolites (H-NMR features)	9e-37	0.6700000	G	rs40200	www.ncbi.nlm.nih.gov/pubmed/24586186
Male-pattern baldness	2e-27	0.0604074	C	rs17250872	www.ncbi.nlm.nih.gov/pubmed/30573740

Polygenic risk scores

Setup

The polygenic risk score calculator from the PRS Knowledge Base inputs either a vcf or a txt with lines formatted as rsID:allele1,allele2. I chose to create the txt input. The following uses the genotype information output by GATK to create a list of all variants.

Table is written out and used as input for the PRSKB command line tool.

myVariants_path <- "C:/Users/jmcgirr/dna/nebula/vcf/myVariants.txt"

# Nebula vcf
rsid_and_genotype <- read_tsv(myVariants_path, col_select = c("ID","myVariants.GT")) |>
  mutate(myVariants.GT = gsub("/|\\|", "," ,myVariants.GT)) |>
  # remove deletions labeled as "*"
  #filter(vcf_allele1 != "*",vcf_allele2 != "*") |>
  # remove rows with missing rsid
  filter(str_detect(ID,fixed("rs"))) |>
  unite("ID_genotype", c("ID", "myVariants.GT"), sep = ":")
head(as.data.frame(rsid_and_genotype))

##                            ID_genotype
## 1 rs376342519:CCGCCGTTGCAAAGGCGCGCCG,C
## 2                       rs10218493:G,A
## 3                       rs10218527:A,G
## 4                      rs879805273:G,A
## 5                     rs1213979446:T,G
## 6                       rs28428499:C,T

# write.table(rsid_and_genotype,"C:/Users/jmcgirr/dna/prskb/rsid_and_genotype.txt", row.names = FALSE, quote = FALSE, sep = "\t", col.names = FALSE)

PRSKB command line

Required installed programs: Bash and jq for bash, Python3 and the PyVCF, filelock, and requests Python modules.

wget https://prs.byu.edu/download_cli

./runPrsCLI.sh -f /media/sf_dna/prskb/rsid_and_genotype.txt -o PRS_default.tsv -r hg38 -c 0.05 -p EUR

Identifying the percentile rank of my PRS

Since PRS is a relative measure, percentile rank is a way to understand where I fall within the population of individuals included in the study.

I filter to only look at entries with percentiles calculated and used more than 10 SNPs in the calculation.

prs_output_path <- "C:/Users/jmcgirr/dna/prskb/PRS_default.tsv"

prs <- read_tsv(prs_output_path) |>
  filter(!is.na(Percentile)) |>
  filter(`SNP Overlap` > 10) |>
  arrange(desc(Percentile))
#view(prs)

# high_percentile <- tail(prs,(nrow(prs)-8572)) |>
#   filter(`SNP Overlap` > 10)
# nrow(high_percentile)

head(select(prs,c(`Reported Trait`, `Trait`, `Citation`, `Included SNPs`, `Polygenic Risk Score`, `Percentile`))) |> 
  gt() |> 
  tab_header(title = "PRSKB command line tool output")

PRSKB command line tool output
Reported Trait	Trait	Citation	Included SNPs	Polygenic Risk Score	Percentile
Phosphatidylcholine-O(39:1)_[M+H]1+/Phosphatidylethanolamine-P(39:0)_[M+H]1+ Levels	Phosphatidylcholine Measurement	Harshfield et al. 2021	17	0.02047195489920295	99-100
Cholesterol_[M+H-H2o]1+ Levels	Sterol Measurement	Harshfield et al. 2021	21	-0.005463744170670547	99-100
Bipolar I Disorder	Bipolar I Disorder	Stahl et al. 2019	59	1.051042489158195	99
Waist-To-Hip Ratio Adjusted For Body Mass Index	Bmi-Adjusted Waist-Hip Ratio	Shungin et al. 2015	23	0.018457758580973937	99
Waist-To-Hip Ratio Adjusted For Body Mass Index	Bmi-Adjusted Waist-Hip Ratio	Shungin et al. 2015	23	0.01816876299264591	99
Dupuytren's Disease	Dupuytren Contracture	Ng et al. 2017	18	1.1085373026007015	99

A cautionary example

You can see that the percentile I fall within can vary widely for some traits between different studies

trait_prs <- filter(prs, str_detect(Trait, "Rheumatoid Arthritis"))
trait_prs <- filter(prs, str_detect(`Reported Trait`, "Prostate Cancer"))


head(select(trait_prs,c(`Reported Trait`, `Trait`, `Citation`, `Included SNPs`, `Polygenic Risk Score`, `Percentile`))) |> 
  gt() |> 
  tab_header(title = "PRSKB command line tool output")

PRSKB command line tool output
Reported Trait	Trait	Citation	Included SNPs	Polygenic Risk Score	Percentile
Prostate Cancer	Prostate Carcinoma	Hoffmann et al. 2015	27	1.057623148439582	7
Prostate Cancer	Prostate Carcinoma	Eeles et al. 2013	22	1.0380169625339535	68
Prostate Cancer	Prostate Carcinoma	Rashkin et al. 2020	61	1.0566025143469873	63
Prostate Cancer	Prostate Carcinoma	Schumacher et al. 2018	132	1.0374492178389345	51
Prostate Cancer	Prostate Carcinoma	Emami et al. 2020	47	1.0765225210586062	50
Prostate Cancer	Prostate Carcinoma	Takata et al. 2019	36	1.078003117178711	31

Column descriptions for PRSKB output

Copied from https://github.com/kauwelab/PolyRiskScore

Study ID – The study identifier assigned by the GWAS Catalog (or the user if they uploaded their own GWAS summary statistics)
Reported Trait – Trait based on the phenotype being studied, as described by the authors
Trait – Trait assigned by the GWAS Catalog, standardized from the Experimental Factor Ontology
Citation – The citation of the study
P-Value Annotation – Additional information about the p-values
Beta Annotation – Additional information about the beta values
Score Type – This indicates if the study used odds ratios or beta values
Units (if applicable) – This column will contain the beta units if the Score Type is beta.
SNP Overlap – Details the number of SNPs that are in the sample vcf/txt file which are 1. in the study, 2. not excluded from the calculation (see below), and 3. not removed from the calculation due to linkage-disequilibrium clumping.
SNPs Excluded Due To Cutoffs – Details the number of snps excluded from the study calculation due to p-value cutoff or minor allele frequency threshold
Included SNPs – The total number of SNPs included in the calculation
Used Super Population – The super population used for linkage disequillibrium
Columns Available Only In The Full Version
Percentile – Indicates the percentile rank of the samples polygenic risk score *(also included in the condensed version of .txt input files)
Protective Variants – Variants that are protective against the phenotype of interest
Risk Variants – Variants that add risk for the phenotype of interest
Variants Without Risk Alleles – Variants that are present in the study, but the sample does not possess the allele reported with association. Note that a SNP may be in this list and also in the Protective Variants or Risk Variants list. This is caused by an individual being heterozygous for the alleles at that point.
Variants in High LD – Variants that are not used in the calculation, due to them being in high linkage disequillibrium with another variant in the study.

Notes

Git repo

https://github.com/joemcgirr/joe_genome/

R run time and session info

end_time <- Sys.time()
print(end_time - start_time)

## Time difference of 6.870454 mins

sessionInfo()

## R version 4.1.3 (2022-03-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 22000)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] gt_0.4.0         forcats_0.5.1    stringr_1.4.0    dplyr_1.0.8     
##  [5] purrr_0.3.4      readr_2.1.2      tidyr_1.2.0      tibble_3.1.6    
##  [9] ggplot2_3.3.5    tidyverse_1.3.1  snakecase_0.11.0 gwascat_2.26.0  
## 
## loaded via a namespace (and not attached):
##   [1] colorspace_2.0-3            rjson_0.2.21               
##   [3] ellipsis_0.3.2              XVector_0.34.0             
##   [5] GenomicRanges_1.46.1        fs_1.5.2                   
##   [7] rstudioapi_0.13             farver_2.1.0               
##   [9] bit64_4.0.5                 AnnotationDbi_1.56.2       
##  [11] fansi_1.0.3                 lubridate_1.8.0            
##  [13] xml2_1.3.3                  splines_4.1.3              
##  [15] snpStats_1.44.0             cachem_1.0.6               
##  [17] knitr_1.38                  jsonlite_1.8.0             
##  [19] Rsamtools_2.10.0            broom_0.7.12               
##  [21] dbplyr_2.1.1                png_0.1-7                  
##  [23] compiler_4.1.3              httr_1.4.2                 
##  [25] backports_1.4.1             assertthat_0.2.1           
##  [27] Matrix_1.4-0                fastmap_1.1.0              
##  [29] cli_3.2.0                   htmltools_0.5.2            
##  [31] prettyunits_1.1.1           tools_4.1.3                
##  [33] gtable_0.3.0                glue_1.6.2                 
##  [35] GenomeInfoDbData_1.2.7      rappdirs_0.3.3             
##  [37] Rcpp_1.0.8.3                Biobase_2.54.0             
##  [39] cellranger_1.1.0            jquerylib_0.1.4            
##  [41] vctrs_0.4.0                 Biostrings_2.62.0          
##  [43] rtracklayer_1.54.0          xfun_0.30                  
##  [45] rvest_1.0.2                 lifecycle_1.0.1            
##  [47] restfulr_0.0.13             XML_3.99-0.9               
##  [49] zlibbioc_1.40.0             scales_1.1.1               
##  [51] vroom_1.5.7                 BSgenome_1.62.0            
##  [53] VariantAnnotation_1.40.0    hms_1.1.1                  
##  [55] MatrixGenerics_1.6.0        parallel_4.1.3             
##  [57] SummarizedExperiment_1.24.0 yaml_2.3.5                 
##  [59] curl_4.3.2                  memoise_2.0.1              
##  [61] sass_0.4.1                  biomaRt_2.50.3             
##  [63] stringi_1.7.6               RSQLite_2.2.11             
##  [65] highr_0.9                   S4Vectors_0.32.4           
##  [67] BiocIO_1.4.0                checkmate_2.0.0            
##  [69] GenomicFeatures_1.46.5      BiocGenerics_0.40.0        
##  [71] filelock_1.0.2              BiocParallel_1.28.3        
##  [73] GenomeInfoDb_1.30.1         rlang_1.0.2                
##  [75] pkgconfig_2.0.3             matrixStats_0.61.0         
##  [77] bitops_1.0-7                evaluate_0.15              
##  [79] lattice_0.20-45             labeling_0.4.2             
##  [81] GenomicAlignments_1.30.0    bit_4.0.4                  
##  [83] tidyselect_1.1.2            magrittr_2.0.3             
##  [85] bookdown_0.25               R6_2.5.1                   
##  [87] IRanges_2.28.0              generics_0.1.2             
##  [89] DelayedArray_0.20.0         DBI_1.1.2                  
##  [91] pillar_1.7.0                haven_2.4.3                
##  [93] withr_2.5.0                 survival_3.2-13            
##  [95] KEGGREST_1.34.0             RCurl_1.98-1.6             
##  [97] modelr_0.1.8                crayon_1.5.1               
##  [99] utf8_1.2.2                  BiocFileCache_2.2.1        
## [101] tzdb_0.3.0                  rmarkdown_2.13             
## [103] progress_1.2.2              grid_4.1.3                 
## [105] readxl_1.4.0                blob_1.2.2                 
## [107] rmdformats_1.0.3            reprex_2.0.1               
## [109] digest_0.6.29               stats4_4.1.3               
## [111] munsell_0.5.0               bslib_0.3.1

Finding my variants in the GWAS Catalog and calculating my polygenic risk scores