Finding my variants in the GWAS Catalog and calculating my polygenic risk scores

Sequencing my genome

This is one of a series of notebooks that document my attempt to learn more about myself through sequencing. I’m a bioinformatics scientist with lots of experience tinkering with genomic data sets. When I heard that Nebula Genomics would sequence my whole genome at 30x coverage for $300 AND let me download all of the data (raw .fastq, ..bam, and .vcf), I jumped on the chance to take a look at my own source code. Nebula provided me with a vcf file containing 4,785,184 QC passing variants.

I want to prioritize which of these millions of variants might deserve a closer look. In a previous post, I annotated these variants to see if any were predicted to disrupt protein function or play a role in disease.

In this post, I’ll show how I searched the GWAS Catalog for variant-phenotype associations and calculated polygenic risk scores. First, I search the GWAS catalog to see which of my genomic variants have been associated with disease phenotypes in genome-wide association studies. This only provides information about single variants, which may be informative in the context of monogenic disease, but individual variants usually have very small effects on complex traits. So next, in order to understand how multiple variants may act together to influence complex traits, I use the PRS Knowledge Base to calculate polygenic risk scores for various traits.

Disclaimer! It is important to understand the limitations of GWAS and why polygenic risk scores need to be interpreted with caution. It would be foolish to make any medical decisions based solely on these results. See here and here for thoughtful discussions about PRS.

Anyone with a vcf file from Nebula and a little experience with Linux and R should be able to recreate these analyses for themselves. I ran everything below on my little personal laptop (i5, 16G RAM) running Windows 11 with an Ubuntu Virtualbox install.

Publically availible tools and databases

  • The GWAS Catalog (NHGRI-EBI Catalog of human genome-wide association studies) provides a consistent, searchable, visualisable and, freely available database of SNP-trait associations. GWA studies are identified by literature search and assessed by curators, who then extract the reported trait, significant SNP-trait associations, and sample metadata.

  • gwascat is an R Bioconductor package written by Vincent Carey that provides convenient tools for working with contents of GWAS Catalog database. I use this package to identify GWA studies that found significant associations for my variants. See here for a vignette that describes more of its features.

  • The PRS Knowledge Base provides tools to calculate polygenic risk scores using data downloaded and curated from the NHGRI-EBI GWAS Catalog. This is the git repo for the PRSKB software.

  • GATK is incredible software with many tools for genomic analyses. I only use it here to convert the vcf into something easier to read.

Finding my variants in the GWAS Catalog

Load Libraries

start_time <- Sys.time()

library(gwascat)
library(snakecase)
suppressPackageStartupMessages(library(tidyverse))
library(gt)

# Wong, B. Points of view: Color blindness. Nat Methods (2011).
bla <- '#000000'
blu <- '#0072b2'
grb <- '#56b4e9'
lir <- '#cc79a7'
gre <- '#009e73'
red <- '#d55e00'
org <- '#e69f00'
yel <- '#f0e442'
gry <- '#BBBBBB'

make.hyperlink <-  function(myurl,mytext=myurl) {
  paste('<a href="',myurl,'">',mytext,'</a>')
}

Setup

GATK is quick and easy to install.

sudo apt install default-jre
wget https://github.com/broadinstitute/gatk/releases/download/4.2.5.0/gatk-4.2.5.0.zip
unzip gatk-4.2.5.0.zip
# set alias in bash profile 
# alias gatk='/media/sf_dna/apps/gatk-4.2.5.0/gatk'

I convert the vcf provided by Nebula to a tab delimited text file with GATK. This makes the data a little easier to read and can be fed into R as a data frame. I also replace my Nebula ID with ‘myVariants’ to make downstream scripts more generic.

gatk VariantsToTable -V nebula.vcf -F CHROM -F POS -F TYPE -F ID -F ANN -F LOF -F NMD -GF AD -GF DP -GF GQ -GF GT -O myVariants.ann.txt

nebulaID="##########" 
sed -i "1s/$nebulaID/myVariants/g" myVariants.ann.txt

Search the GAWS Catalog

Create a data frame that joins rsIDs from the Nebula vcf with rsIDs in the catalog. I find 171,943 associations for 81,834 variants.

myVariants_path <- "C:/Users/jmcgirr/dna/nebula/vcf/myVariants.txt"

# Nebula vcf
vcf <- read_tsv(myVariants_path, col_select = c("CHROM","POS","TYPE","ID","myVariants.GT")) |>
       # only check SNPs
       filter(TYPE == "SNP") |> select(-c(TYPE)) |>
       separate(myVariants.GT, c("vcf_allele1","vcf_allele2"), sep = "/|\\|") |>
       # remove deletions labeled as "*"
       filter(vcf_allele1 != "*",vcf_allele2 != "*") |>
       # remove rows with missing rsid
       filter(str_detect(ID,fixed("rs")))
head(as.data.frame(vcf))
##   CHROM   POS           ID vcf_allele1 vcf_allele2
## 1  chr1 10904   rs10218493           G           A
## 2  chr1 10927   rs10218527           A           G
## 3  chr1 13757  rs879805273           G           A
## 4  chr1 13813 rs1213979446           T           G
## 5  chr1 13838   rs28428499           C           T
## 6  chr1 15274    rs2758118           G           T
gwtrunc <- makeCurrentGwascat()
#topTraits(gwtrunc)

catalog_matches <-  as.data.frame(gwtrunc) |> 
                    filter(SNPS %in% vcf$ID) |>
                    select(-c(DATE.ADDED.TO.CATALOG, FIRST.AUTHOR, width, STUDY, REGION, CHR_ID,
                              UPSTREAM_GENE_ID, DOWNSTREAM_GENE_ID, SNP_GENE_IDS, UPSTREAM_GENE_DISTANCE,
                              DOWNSTREAM_GENE_DISTANCE,MERGED, SNP_ID_CURRENT,PVALUE_MLOG,PLATFORM..SNPS.PASSING.QC.,
                              STUDY.ACCESSION,P.VALUE..TEXT.,PUBMEDID)) |>
                    separate(STRONGEST.SNP.RISK.ALLELE, c("STRONGEST_SNP","RISK_ALLELE"), sep = "-" ,remove = TRUE) |>
                    select(-c(STRONGEST_SNP)) |>
                    arrange(P.VALUE) |>
                    inner_join(vcf, by = c("SNPS" = "ID"))

names(catalog_matches) <- to_snake_case(names(catalog_matches))

match_stats <- data.frame(column = c("rsIDs",
                                   "phenotypes",
                                   "journals"),
                          n = c(length(unique(catalog_matches$snps)),
                                length(unique(catalog_matches$mapped_trait)),
                                length(unique(catalog_matches$journal))))

match_stats |> gt() |> tab_header(title = "Catalog Matches")
Catalog Matches
column n
rsIDs 81834
phenotypes 3733
journals 598
catalog_matches[4:9,] |> gt() |> tab_header(title = "Examples of Catalog Matches")
Examples of Catalog Matches
seqnames start end strand date journal link disease_trait initial_sample_size replication_sample_size chr_pos reported_gene_s mapped_gene risk_allele snps context intergenic risk_allele_frequency p_value or_or_beta x_95_ci_text cnv mapped_trait mapped_trait_uri genotyping_technology chrom pos vcf_allele_1 vcf_allele_2
4 71744491 71744491 * 2016-08-17 PLoS Genet www.ncbi.nlm.nih.gov/pubmed/27532455 Blood protein levels 1,340 European ancestry individuals NA 71744491 NR GC ? rs222047 intron_variant 0 NA 1e-307 1.153000 [NR] unit decrease N blood protein measurement http://www.ebi.ac.uk/efo/EFO_0007937 Genome-wide genotyping array chr4 71744491 C A
16 56951227 56951227 * 2020-03-23 PLoS Med www.ncbi.nlm.nih.gov/pubmed/32203549 HDL cholesterol levels 403,943 European ancestry individuals NA 56951227 CETP HERPUD1 - CETP A rs9989419 regulatory_region_variant 1 0.394080 1e-307 0.143745 [0.14-0.15] unit decrease N high density lipoprotein cholesterol measurement http://www.ebi.ac.uk/efo/EFO_0004612 Genome-wide genotyping array chr16 56951227 A G
11 116778201 116778201 * 2020-03-23 PLoS Med www.ncbi.nlm.nih.gov/pubmed/32203549 HDL cholesterol levels 403,943 European ancestry individuals NA 116778201 ZNF259 ZPR1 G rs964184 3_prime_UTR_variant 0 0.133633 1e-307 0.105482 [0.1-0.11] unit decrease N high density lipoprotein cholesterol measurement http://www.ebi.ac.uk/efo/EFO_0004612 Genome-wide genotyping array chr11 116778201 C C
16 89968733 89968733 * 2018-05-08 Nat Commun www.ncbi.nlm.nih.gov/pubmed/29739929 Low tan response 46,768 European ancestry low tanning cases, 74,528 European ancestry moderate and high tanning controls 15,547 European ancestry low tanning cases, 39,835 European moderate and high tanning ancestry controls 89968733 NR DEF8 - CENPBD1 A rs4785752 intergenic_variant 1 NA 1e-307 0.345000 [0.24-0.45] unit increase N suntan http://www.ebi.ac.uk/efo/EFO_0004279 Genome-wide genotyping array chr16 89968733 G G
16 89579029 89579029 * 2018-05-08 Nat Commun www.ncbi.nlm.nih.gov/pubmed/29739929 Low tan response 46,768 European ancestry low tanning cases, 74,528 European ancestry moderate and high tanning controls 15,547 European ancestry low tanning cases, 39,835 European moderate and high tanning ancestry controls 89579029 MC1R CPNE7 G rs369230 intron_variant 0 NA 1e-307 0.469000 [0.34-0.6] unit increase N suntan http://www.ebi.ac.uk/efo/EFO_0004279 Genome-wide genotyping array chr16 89579029 T T
20 34077942 34077942 * 2018-05-08 Nat Commun www.ncbi.nlm.nih.gov/pubmed/29739929 Low tan response 46,768 European ancestry low tanning cases, 74,528 European ancestry moderate and high tanning controls 15,547 European ancestry low tanning cases, 39,835 European moderate and high tanning ancestry controls 34077942 RALY, ASIP RALY A rs6059655 intron_variant 0 NA 1e-307 0.527000 [0.36-0.69] unit increase N suntan http://www.ebi.ac.uk/efo/EFO_0004279 Genome-wide genotyping array chr20 34077942 G G

Heterozygous for risk allele

As a first pass I’ll look at variants for which I have a copy of the allele implicated as the risk allele. I filter and order by p-value to see the strongest associations.

p_value < 1e-20

het_or_hom_for_risk_allele <- filter(catalog_matches,vcf_allele_1 == risk_allele | vcf_allele_2 == risk_allele, p_value < 1e-20) |> 
                              arrange(risk_allele_frequency, p_value) |>
                              select(c(disease_trait, p_value, or_or_beta, risk_allele, snps, link))

head(het_or_hom_for_risk_allele) |> gt() |> 
  tab_header(title = "Heterozygous for risk allele") |> 
  fmt(columns = 'link',fns = make.hyperlink)
Heterozygous for risk allele
disease_trait p_value or_or_beta risk_allele snps link
Total cholesterol levels 2e-55 0.1210000 C rs964184 www.ncbi.nlm.nih.gov/pubmed/28334899
Monocyte count 4e-136 0.6380717 C rs140221307 www.ncbi.nlm.nih.gov/pubmed/27863252
Monocyte percentage of white cells 6e-114 0.5812110 C rs140221307 www.ncbi.nlm.nih.gov/pubmed/27863252
Granulocyte percentage of myeloid white cells 3e-78 0.4813244 C rs140221307 www.ncbi.nlm.nih.gov/pubmed/27863252
Platelet distribution width 5e-56 0.3910234 T rs142680548 www.ncbi.nlm.nih.gov/pubmed/27863252
Monocyte count 1e-307 0.6809273 C rs140221307 www.ncbi.nlm.nih.gov/pubmed/32888494

Homozygous for rare risk allele

Next I’ll be more strict and look at variants for which I am homozygous for the risk allele. I further filter to only include those that are low frequency and showed strong associations with the phenotype being studied.

risk_allele_frequency < 0.1 and p_value < 1e-20

homozygous_for_risk_allele <- filter(catalog_matches,vcf_allele_1 == risk_allele, vcf_allele_2 == risk_allele, p_value < 1e-20) |> 
                              arrange(risk_allele_frequency, p_value)

homozygous_for_risk_allele_low_freq <- filter(catalog_matches,vcf_allele_1 == risk_allele, vcf_allele_2 == risk_allele, risk_allele_frequency < 0.1, p_value < 1e-20) |> 
                                       arrange(p_value) |>
                                       select(c(disease_trait, p_value, or_or_beta, risk_allele, snps, link))

p1 <- catalog_matches |> 
      ggplot(aes(risk_allele_frequency)) +
      geom_histogram(bins = 100) +
      theme_minimal() +
      theme(axis.title.x=element_text(size=14),
       axis.title.y=element_text(size=12),
       axis.title=element_text(size=14),
       axis.text=element_text(size=12),
       plot.title=element_text(size=18)) +
      xlim(0,1) +
      ggtitle("Distribution of risk allele frequencies") +
      geom_vline(xintercept = 0.1, color = red, linetype = 2)
print(p1)

head(homozygous_for_risk_allele_low_freq) |> gt() |> 
  tab_header(title = "Homozygous for risk allele") |>
  fmt(columns = 'link',fns = make.hyperlink)
Homozygous for risk allele
disease_trait p_value or_or_beta risk_allele snps link
Serum levels of protein BTD 3e-126 1.2828700 G rs71627145 www.ncbi.nlm.nih.gov/pubmed/35078996
Systemic lupus erythematosus 4e-68 2.2820000 A rs34723256 www.ncbi.nlm.nih.gov/pubmed/33272962
Total cholesterol levels 2e-55 0.1210000 C rs964184 www.ncbi.nlm.nih.gov/pubmed/28334899
Hematology traits 5e-47 0.0612000 G rs4657616 www.ncbi.nlm.nih.gov/pubmed/23263863
Urinary metabolites (H-NMR features) 9e-37 0.6700000 G rs40200 www.ncbi.nlm.nih.gov/pubmed/24586186
Male-pattern baldness 2e-27 0.0604074 C rs17250872 www.ncbi.nlm.nih.gov/pubmed/30573740

Polygenic risk scores

Setup

The polygenic risk score calculator from the PRS Knowledge Base inputs either a vcf or a txt with lines formatted as rsID:allele1,allele2. I chose to create the txt input. The following uses the genotype information output by GATK to create a list of all variants.

Table is written out and used as input for the PRSKB command line tool.

myVariants_path <- "C:/Users/jmcgirr/dna/nebula/vcf/myVariants.txt"

# Nebula vcf
rsid_and_genotype <- read_tsv(myVariants_path, col_select = c("ID","myVariants.GT")) |>
  mutate(myVariants.GT = gsub("/|\\|", "," ,myVariants.GT)) |>
  # remove deletions labeled as "*"
  #filter(vcf_allele1 != "*",vcf_allele2 != "*") |>
  # remove rows with missing rsid
  filter(str_detect(ID,fixed("rs"))) |>
  unite("ID_genotype", c("ID", "myVariants.GT"), sep = ":")
head(as.data.frame(rsid_and_genotype))
##                            ID_genotype
## 1 rs376342519:CCGCCGTTGCAAAGGCGCGCCG,C
## 2                       rs10218493:G,A
## 3                       rs10218527:A,G
## 4                      rs879805273:G,A
## 5                     rs1213979446:T,G
## 6                       rs28428499:C,T
# write.table(rsid_and_genotype,"C:/Users/jmcgirr/dna/prskb/rsid_and_genotype.txt", row.names = FALSE, quote = FALSE, sep = "\t", col.names = FALSE)

PRSKB command line

Required installed programs: Bash and jq for bash, Python3 and the PyVCF, filelock, and requests Python modules.

wget https://prs.byu.edu/download_cli

./runPrsCLI.sh -f /media/sf_dna/prskb/rsid_and_genotype.txt -o PRS_default.tsv -r hg38 -c 0.05 -p EUR

Identifying the percentile rank of my PRS

Since PRS is a relative measure, percentile rank is a way to understand where I fall within the population of individuals included in the study.

I filter to only look at entries with percentiles calculated and used more than 10 SNPs in the calculation.

prs_output_path <- "C:/Users/jmcgirr/dna/prskb/PRS_default.tsv"

prs <- read_tsv(prs_output_path) |>
  filter(!is.na(Percentile)) |>
  filter(`SNP Overlap` > 10) |>
  arrange(desc(Percentile))
#view(prs)

# high_percentile <- tail(prs,(nrow(prs)-8572)) |>
#   filter(`SNP Overlap` > 10)
# nrow(high_percentile)

head(select(prs,c(`Reported Trait`, `Trait`, `Citation`, `Included SNPs`, `Polygenic Risk Score`, `Percentile`))) |> 
  gt() |> 
  tab_header(title = "PRSKB command line tool output")
PRSKB command line tool output
Reported Trait Trait Citation Included SNPs Polygenic Risk Score Percentile
Phosphatidylcholine-O(39:1)_[M+H]1+/Phosphatidylethanolamine-P(39:0)_[M+H]1+ Levels Phosphatidylcholine Measurement Harshfield et al. 2021 17 0.02047195489920295 99-100
Cholesterol_[M+H-H2o]1+ Levels Sterol Measurement Harshfield et al. 2021 21 -0.005463744170670547 99-100
Bipolar I Disorder Bipolar I Disorder Stahl et al. 2019 59 1.051042489158195 99
Waist-To-Hip Ratio Adjusted For Body Mass Index Bmi-Adjusted Waist-Hip Ratio Shungin et al. 2015 23 0.018457758580973937 99
Waist-To-Hip Ratio Adjusted For Body Mass Index Bmi-Adjusted Waist-Hip Ratio Shungin et al. 2015 23 0.01816876299264591 99
Dupuytren's Disease Dupuytren Contracture Ng et al. 2017 18 1.1085373026007015 99

A cautionary example

You can see that the percentile I fall within can vary widely for some traits between different studies

trait_prs <- filter(prs, str_detect(Trait, "Rheumatoid Arthritis"))
trait_prs <- filter(prs, str_detect(`Reported Trait`, "Prostate Cancer"))


head(select(trait_prs,c(`Reported Trait`, `Trait`, `Citation`, `Included SNPs`, `Polygenic Risk Score`, `Percentile`))) |> 
  gt() |> 
  tab_header(title = "PRSKB command line tool output")
PRSKB command line tool output
Reported Trait Trait Citation Included SNPs Polygenic Risk Score Percentile
Prostate Cancer Prostate Carcinoma Hoffmann et al. 2015 27 1.057623148439582 7
Prostate Cancer Prostate Carcinoma Eeles et al. 2013 22 1.0380169625339535 68
Prostate Cancer Prostate Carcinoma Rashkin et al. 2020 61 1.0566025143469873 63
Prostate Cancer Prostate Carcinoma Schumacher et al. 2018 132 1.0374492178389345 51
Prostate Cancer Prostate Carcinoma Emami et al. 2020 47 1.0765225210586062 50
Prostate Cancer Prostate Carcinoma Takata et al. 2019 36 1.078003117178711 31

Column descriptions for PRSKB output

Copied from https://github.com/kauwelab/PolyRiskScore

  • Study ID – The study identifier assigned by the GWAS Catalog (or the user if they uploaded their own GWAS summary statistics)
  • Reported Trait – Trait based on the phenotype being studied, as described by the authors
  • Trait – Trait assigned by the GWAS Catalog, standardized from the Experimental Factor Ontology
  • Citation – The citation of the study
  • P-Value Annotation – Additional information about the p-values
  • Beta Annotation – Additional information about the beta values
  • Score Type – This indicates if the study used odds ratios or beta values
  • Units (if applicable) – This column will contain the beta units if the Score Type is beta.
  • SNP Overlap – Details the number of SNPs that are in the sample vcf/txt file which are 1. in the study, 2. not excluded from the calculation (see below), and 3. not removed from the calculation due to linkage-disequilibrium clumping.
  • SNPs Excluded Due To Cutoffs – Details the number of snps excluded from the study calculation due to p-value cutoff or minor allele frequency threshold
  • Included SNPs – The total number of SNPs included in the calculation
  • Used Super Population – The super population used for linkage disequillibrium
  • Columns Available Only In The Full Version
  • Percentile – Indicates the percentile rank of the samples polygenic risk score *(also included in the condensed version of .txt input files)
  • Protective Variants – Variants that are protective against the phenotype of interest
  • Risk Variants – Variants that add risk for the phenotype of interest
  • Variants Without Risk Alleles – Variants that are present in the study, but the sample does not possess the allele reported with association. Note that a SNP may be in this list and also in the Protective Variants or Risk Variants list. This is caused by an individual being heterozygous for the alleles at that point.
  • Variants in High LD – Variants that are not used in the calculation, due to them being in high linkage disequillibrium with another variant in the study.

Notes

R run time and session info

end_time <- Sys.time()
print(end_time - start_time)
## Time difference of 6.870454 mins
sessionInfo()
## R version 4.1.3 (2022-03-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 22000)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] gt_0.4.0         forcats_0.5.1    stringr_1.4.0    dplyr_1.0.8     
##  [5] purrr_0.3.4      readr_2.1.2      tidyr_1.2.0      tibble_3.1.6    
##  [9] ggplot2_3.3.5    tidyverse_1.3.1  snakecase_0.11.0 gwascat_2.26.0  
## 
## loaded via a namespace (and not attached):
##   [1] colorspace_2.0-3            rjson_0.2.21               
##   [3] ellipsis_0.3.2              XVector_0.34.0             
##   [5] GenomicRanges_1.46.1        fs_1.5.2                   
##   [7] rstudioapi_0.13             farver_2.1.0               
##   [9] bit64_4.0.5                 AnnotationDbi_1.56.2       
##  [11] fansi_1.0.3                 lubridate_1.8.0            
##  [13] xml2_1.3.3                  splines_4.1.3              
##  [15] snpStats_1.44.0             cachem_1.0.6               
##  [17] knitr_1.38                  jsonlite_1.8.0             
##  [19] Rsamtools_2.10.0            broom_0.7.12               
##  [21] dbplyr_2.1.1                png_0.1-7                  
##  [23] compiler_4.1.3              httr_1.4.2                 
##  [25] backports_1.4.1             assertthat_0.2.1           
##  [27] Matrix_1.4-0                fastmap_1.1.0              
##  [29] cli_3.2.0                   htmltools_0.5.2            
##  [31] prettyunits_1.1.1           tools_4.1.3                
##  [33] gtable_0.3.0                glue_1.6.2                 
##  [35] GenomeInfoDbData_1.2.7      rappdirs_0.3.3             
##  [37] Rcpp_1.0.8.3                Biobase_2.54.0             
##  [39] cellranger_1.1.0            jquerylib_0.1.4            
##  [41] vctrs_0.4.0                 Biostrings_2.62.0          
##  [43] rtracklayer_1.54.0          xfun_0.30                  
##  [45] rvest_1.0.2                 lifecycle_1.0.1            
##  [47] restfulr_0.0.13             XML_3.99-0.9               
##  [49] zlibbioc_1.40.0             scales_1.1.1               
##  [51] vroom_1.5.7                 BSgenome_1.62.0            
##  [53] VariantAnnotation_1.40.0    hms_1.1.1                  
##  [55] MatrixGenerics_1.6.0        parallel_4.1.3             
##  [57] SummarizedExperiment_1.24.0 yaml_2.3.5                 
##  [59] curl_4.3.2                  memoise_2.0.1              
##  [61] sass_0.4.1                  biomaRt_2.50.3             
##  [63] stringi_1.7.6               RSQLite_2.2.11             
##  [65] highr_0.9                   S4Vectors_0.32.4           
##  [67] BiocIO_1.4.0                checkmate_2.0.0            
##  [69] GenomicFeatures_1.46.5      BiocGenerics_0.40.0        
##  [71] filelock_1.0.2              BiocParallel_1.28.3        
##  [73] GenomeInfoDb_1.30.1         rlang_1.0.2                
##  [75] pkgconfig_2.0.3             matrixStats_0.61.0         
##  [77] bitops_1.0-7                evaluate_0.15              
##  [79] lattice_0.20-45             labeling_0.4.2             
##  [81] GenomicAlignments_1.30.0    bit_4.0.4                  
##  [83] tidyselect_1.1.2            magrittr_2.0.3             
##  [85] bookdown_0.25               R6_2.5.1                   
##  [87] IRanges_2.28.0              generics_0.1.2             
##  [89] DelayedArray_0.20.0         DBI_1.1.2                  
##  [91] pillar_1.7.0                haven_2.4.3                
##  [93] withr_2.5.0                 survival_3.2-13            
##  [95] KEGGREST_1.34.0             RCurl_1.98-1.6             
##  [97] modelr_0.1.8                crayon_1.5.1               
##  [99] utf8_1.2.2                  BiocFileCache_2.2.1        
## [101] tzdb_0.3.0                  rmarkdown_2.13             
## [103] progress_1.2.2              grid_4.1.3                 
## [105] readxl_1.4.0                blob_1.2.2                 
## [107] rmdformats_1.0.3            reprex_2.0.1               
## [109] digest_0.6.29               stats4_4.1.3               
## [111] munsell_0.5.0               bslib_0.3.1