10 GO term enrichment analysis

Slides

You can download the slides for this tutorial below.

10.1 Setting up for GO term enrichment

Create a new script or R Markdown file in the same project you used for the DESeq2 tutorial, then install and load the required packages. These are Bioconductor packages, so install them with BiocManager::install() rather than install.packages().
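The installation step is not shown in this tutorial, so here is a minimal sketch of one way to do it. The helper names `missing_packages()` and `install_missing()` are hypothetical, introduced only for illustration; the sketch assumes you want to install only the packages that are not already present.

```r
# Packages used in this tutorial.
required_pkgs <- c("AnnotationDbi", "org.Mm.eg.db", "GO.db", "GOstats")

# Hypothetical helper: return the packages that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install whatever is missing through BiocManager.
install_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install(to_install)
  }
  invisible(to_install)
}

# Uncomment to run the installation (needs an internet connection):
# install_missing(required_pkgs)
```

Checking first avoids reinstalling large annotation packages such as org.Mm.eg.db every time the script is run.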

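Load the necessary packages. The later steps also use the pipe (%>%), filter(), and pull() from dplyr, so load it as well.

library(AnnotationDbi)
library(org.Mm.eg.db)
library(GO.db)
library(GOstats)
# dplyr provides %>%, filter(), and pull() used below
library(dplyr)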

Copy the significant results from the DESeq2 tutorial into a new object that we will modify.

annotated_significant_results <- significant_res

The row names of your significant genes from the DESeq2 tutorial are mouse Ensembl Gene IDs. Convert the Ensembl Gene IDs in the row names to Entrez IDs and store them in a new column (and add the gene symbols too!).

annotated_significant_results$symbol <- mapIds(
  org.Mm.eg.db,
  keys = rownames(annotated_significant_results),
  keytype = "ENSEMBL",
  column = "SYMBOL",
  multiVals = "first"
)

## 'select()' returned 1:many mapping between keys and columns

annotated_significant_results$entrez <- mapIds(
  org.Mm.eg.db,
  keys = rownames(annotated_significant_results),
  keytype = "ENSEMBL",
  column = "ENTREZID",
  multiVals = "first"
)

## 'select()' returned 1:many mapping between keys and columns
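The 1:many message means that some Ensembl IDs map to more than one value; with multiVals = "first", only the first match is kept, and IDs with no match come back as NA. The sketch below mimics that behaviour in base R on a made-up mapping table. The IDs are hypothetical, and `match()` stands in for the database lookup here; it is not what mapIds() does internally.

```r
# Made-up mapping table: ENSMUSG_A maps to two Entrez IDs.
toy_map <- data.frame(
  ENSEMBL  = c("ENSMUSG_A", "ENSMUSG_A", "ENSMUSG_B"),
  ENTREZID = c("101", "102", "201"),
  stringsAsFactors = FALSE
)

# Keys to look up; ENSMUSG_C is absent from the table.
keys <- c("ENSMUSG_A", "ENSMUSG_B", "ENSMUSG_C")

# match() returns the position of the first hit, or NA if there is none.
first_hit <- toy_map$ENTREZID[match(keys, toy_map$ENSEMBL)]
names(first_hit) <- keys
first_hit

# Count the keys that could not be mapped.
n_unmapped <- sum(is.na(first_hit))
n_unmapped
```

On the real data, `sum(is.na(annotated_significant_results$entrez))` gives the same kind of count and is a quick way to see how many genes were lost in the conversion.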

Create a non-redundant list of Entrez IDs from your significant results. This list serves as the gene universe for the enrichment test.

all_genes <- annotated_significant_results %>% 
  as.data.frame() %>% 
  pull(entrez) %>% 
  unique()

Filter your significant genes by log2FoldChange (here, greater than 4) to pull out strongly upregulated genes.

genes_upregulated <- annotated_significant_results %>% 
  as.data.frame() %>% 
  filter(log2FoldChange > 4) %>% 
  pull(entrez) %>% 
  unique()
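Because unmapped Ensembl IDs come back as NA, both gene lists can end up containing an NA entry. The sketch below shows one way to drop those and to confirm that the upregulated list is a subset of the universe. It uses base R on a made-up data frame (`toy_results`, with invented values) so that it runs without the DESeq2 results.

```r
# Toy stand-in for annotated_significant_results (values are made up).
toy_results <- data.frame(
  log2FoldChange = c(5.2, -3.1, 4.8, 6.0, 0.9),
  entrez         = c("101", "102", NA, "101", "103"),
  stringsAsFactors = FALSE
)

# Universe: unique Entrez IDs, with unmapped (NA) entries removed.
entrez_all   <- toy_results$entrez
toy_universe <- unique(entrez_all[!is.na(entrez_all)])

# Upregulated genes: same threshold as above, again skipping NA IDs.
is_up  <- toy_results$log2FoldChange > 4 & !is.na(toy_results$entrez)
toy_up <- unique(toy_results$entrez[is_up])

# Sanity check: every tested gene must also be in the universe.
stopifnot(all(toy_up %in% toy_universe))

toy_universe
toy_up
```

In the dplyr pipelines above, adding `filter(!is.na(entrez))` before `pull(entrez)` would have the same effect.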

Run hyperGTest() on a new GOHyperGParams object built from your upregulated gene IDs and the gene universe, looking in the Biological Process (“BP”) ontology.

go_bp_upregulated <- hyperGTest(new("GOHyperGParams",
                                    geneIds = genes_upregulated,
                                    universeGeneIds = all_genes,
                                    annotation = "org.Mm.eg.db",
                                    ontology = "BP",
                                    pvalueCutoff = 0.01,
                                    conditional = FALSE,
                                    testDirection = "over"))
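With testDirection = "over", each GO term is tested for over-representation with a one-sided hypergeometric test. To make that concrete, the sketch below computes such a p-value by hand for a single hypothetical term using base R. All counts are invented for illustration and do not come from the tutorial data.

```r
# Invented counts for one hypothetical GO term.
universe_size <- 1000  # genes in the universe (cf. all_genes)
term_size     <- 50    # universe genes annotated to the term
selected_size <- 100   # genes in the test list (cf. genes_upregulated)
overlap       <- 12    # test-list genes annotated to the term

# Overlap expected by chance alone.
expected_overlap <- selected_size * term_size / universe_size

# P(overlap or more) under the hypergeometric distribution.
p_over <- phyper(overlap - 1, term_size, universe_size - term_size,
                 selected_size, lower.tail = FALSE)

# The same test written as a one-sided Fisher's exact test.
# Rows: in term / not in term; columns: selected / not selected.
contingency <- matrix(
  c(overlap, selected_size - overlap,
    term_size - overlap,
    universe_size - term_size - selected_size + overlap),
  nrow = 2
)
p_fisher <- fisher.test(contingency, alternative = "greater")$p.value

expected_overlap
p_over
p_fisher
```

The two p-values agree because the one-sided Fisher's exact test is the same hypergeometric calculation. For the real analysis, calling `summary(go_bp_upregulated)` should return a table of the GO terms that pass pvalueCutoff, including the observed and expected counts for each term.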

MICB 405 Bioinformatics: 2021.22
