This function determines which Gene-Pair Signatures (GPS) from a collection of GPS data (GPSrepo argument) for the specified pathway repository are present in the specified list of genes of interest (queryList argument). It then uses the hypergeometric distribution function to identify the pathways whose GPS are over-represented among the present GPS and, if the saveFile argument is provided, saves the results to that file.
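The over-representation test described above can be illustrated with base R's phyper. This is a minimal sketch on made-up numbers, not the package's internal code: the variable names and the count of 200 tested pathways are assumptions chosen for illustration, and the Bonferroni step is a plain multiplication by the number of pathways.

```r
# Toy numbers for illustration only (not taken from any real GPS repository).
N           <- 1000  # total (weighted) number of GPS in the repository
pathwaySize <- 50    # GPS that belong to the pathway of interest
sampleSize  <- 40    # GPS that are present in the query list
successes   <- 10    # present GPS that belong to the pathway

# Upper tail P(X >= successes) of the hypergeometric distribution.
pval <- phyper(successes - 1, pathwaySize, N - pathwaySize, sampleSize,
               lower.tail = FALSE)

# Bonferroni correction, assuming a hypothetical 200 pathways were tested.
nPathways  <- 200
bonferroni <- min(1, pval * nPathways)

print(c(pvalue = pval, Bonferroni = bonferroni))
```

A pathway is reported as over-represented when far more of its GPS are present than the roughly sampleSize * pathwaySize / N expected by chance.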

sigora(
  GPSrepo,
  level,
  markers = FALSE,
  queryList = NULL,
  saveFile = NULL,
  weighting.method = "invhm",
  idmap = load_data("idmap")
)

Arguments

GPSrepo

An object created by makeGPS, or one of the precompiled GPS data collections provided with this package (currently for KEGG and Reactome), e.g. reaH for human Reactome GPS, kegH for human KEGG GPS, and reaM and kegM for the corresponding mouse GPS. See the Examples section for creating and using your own GPS.

level

In hierarchical repositories (e.g. Reactome), the number of levels to consider. Recommended value for KEGG: 2, for Reactome: 4.

markers

Whether to take single genes that are uniquely associated with only one pathway into account (i.e. should pathway unique genes/PUGs be considered GPS?). Recommended value: TRUE (1).

queryList

A user-specified list of genes of interest ('query list'), as a vector of ENSEMBL/ENTREZ IDs or gene symbols (HGNC/MGI).

saveFile

If provided, the results are saved here as a tab-delimited file (including, for each pathway, a list of genes ordered by their contribution to the statistical significance of the pathway).

weighting.method

The weighting method for GPS. The default weighting scheme is the reciprocal of the harmonic mean of the degrees of the two component genes of a GPS. Currently, the following schemes are pre-implemented:
'noweights', 'cosine', 'topov', 'reciprod', 'jac', 'justPUGs' and 'invhm'.
Additional user-defined weighting schemes are also supported (see the Examples section).
'noweights': assigns a constant weight of 1 to all GPS.
'cosine': all GPS are weighted by the cosine of the degrees of their constituent genes.
'topov': all GPS are weighted by the topological overlap of their constituent genes.
'reciprod': all GPS are weighted by the reciprocal of the product of the number of pathway annotations of their constituent genes.
'jac': all GPS are weighted by the Jaccard similarity of the pathway annotations of their constituent genes.
'justPUGs': the analysis is performed using PUGs only.
'invhm': all GPS are weighted by the reciprocal of the harmonic mean of the degrees of their constituent genes (default).
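To make the arithmetic behind some of these schemes concrete, the sketch below re-implements three of them as plain two-argument functions of the gene degrees a and b, the same shape as the user-defined function in the Examples section. These are illustrative re-implementations written from the descriptions above, not the package's own code, and they assume that a and b are numeric vectors of equal length.

```r
# a, b: degrees of the two genes that form a GPS (assumed numeric vectors).

# Reciprocal of the harmonic mean; the harmonic mean of a and b is 2ab / (a + b).
invhm_sketch <- function(a, b) (a + b) / (2 * a * b)

# Reciprocal of the product.
reciprod_sketch <- function(a, b) 1 / (a * b)

# Constant weight of 1 for every GPS.
noweights_sketch <- function(a, b) rep(1, length(a))

a <- c(1, 2, 4)
b <- c(1, 2, 12)
weights <- data.frame(
  a = a, b = b,
  invhm     = invhm_sketch(a, b),
  reciprod  = reciprod_sketch(a, b),
  noweights = noweights_sketch(a, b)
)
print(weights)
```

In both weighted schemes a GPS made of genes with few pathway annotations receives a higher weight than one made of widely annotated genes.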

idmap

A dataframe for converting between different gene-identifier types (e.g. ENSEMBL, ENTREZ and HGNC-Symbols of genes). Most users do not need to set this argument, as there is a built-in conversion table.

Value

summary_results

A dataframe listing the analysis results.

detailed_results

A dataframe describing the detailed evidence (present Gene-Pair Signatures) for each pathway.
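
The sketch below shows one way to work with these results, assuming the returned object is a list whose elements carry the names given above. To keep it self-contained it uses a hand-built stand-in rather than a real sigora() call; the values are copied from the KEGG output in the Examples section, and the 1e-10 cut-off is an arbitrary choice for illustration.

```r
# Hand-built stand-in for the object returned by sigora();
# values copied from the KEGG example output.
res <- list(
  summary_results = data.frame(
    pathwy.id   = c("hsa04060", "hsa04630", "hsa05330"),
    description = c("Cytokine-cytokine receptor interaction",
                    "Jak-STAT signaling pathway",
                    "Allograft rejection"),
    pvalues     = c(0, 3.962e-17, 8.856e-11),
    Bonferroni  = c(0, 1.208e-14, 2.701e-08),
    stringsAsFactors = FALSE
  )
)

# Keep pathways whose Bonferroni-corrected p-value is below the chosen cut-off.
top <- subset(res$summary_results, Bonferroni < 1e-10)
print(top[, c("pathwy.id", "description", "Bonferroni")])
```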

References

Foroushani AB, Brinkman FS and Lynn DJ (2013). “Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures.” PeerJ, 1.

Examples


##query list
ils<-grep("^IL",load_data('idmap')[["Symbol"]],value=TRUE)
## using precompiled GPS repositories:
sigRes.ilreact<-sigora(queryList=ils,GPSrepo=load_data('reaH'),level=4)
#> [1] "Mapped identifiers from Symbol  to  EntrezGene.ID ..."
#>        pathwy.id                                     description    pvalues
#> 1   R-HSA-446652                  Interleukin-1 family signaling 3.847e-243
#> 2  R-HSA-8854691                 Interleukin-20 family signaling  2.827e-98
#> 3  R-HSA-6785807      Interleukin-4 and Interleukin-13 signaling  1.716e-78
#> 4   R-HSA-451927                  Interleukin-2 family signaling  1.550e-60
#> 5   R-HSA-448424                        Interleukin-17 signaling  1.471e-46
#> 6  R-HSA-5673001                          RAF/MAP kinase cascade  2.309e-25
#> 7   R-HSA-449836                     Other interleukin signaling  8.310e-24
#> 8  R-HSA-6783783                        Interleukin-10 signaling  1.503e-19
#> 9  R-HSA-9020702                         Interleukin-1 signaling  2.176e-17
#> 10 R-HSA-6788467 IL-6-type cytokine receptor ligand interactions  5.636e-15
#> 11  R-HSA-447115                 Interleukin-12 family signaling  4.494e-13
#> 12 R-HSA-9020591                        Interleukin-12 signaling  3.018e-10
#> 13 R-HSA-5684996                           MAPK1/MAPK3 signaling  8.049e-09
#> 14 R-HSA-8983432                        Interleukin-15 signaling  5.950e-08
#> 15  R-HSA-110056                         MAPK3 (ERK1) activation  3.581e-07
#>    Bonferroni successes PathwaySize        N sample.size
#> 1  3.851e-240    114.71      642.09 603749.5      391.96
#> 2   2.830e-95     42.03      149.33 603749.5      391.96
#> 3   1.718e-75     54.32     1301.05 603749.5      391.96
#> 4   1.552e-57     25.82       78.45 603749.5      391.96
#> 5   1.472e-43     24.00      204.16 603749.5      391.96
#> 6   2.311e-22     34.00     4439.66 603749.5      391.96
#> 7   8.318e-21     22.05     1335.85 603749.5      391.96
#> 8   1.505e-16     12.83      230.96 603749.5      391.96
#> 9   2.178e-14     10.00      158.19 603749.5      391.96
#> 10  5.642e-12      8.28      100.92 603749.5      391.96
#> 11  4.498e-10      7.89       94.29 603749.5      391.96
#> 12  3.021e-07      6.50      124.19 603749.5      391.96
#> 13  8.057e-06      5.26      100.03 603749.5      391.96
#> 14  5.956e-05      3.30       12.19 603749.5      391.96
#> 15  3.585e-04      3.00       21.17 603749.5      391.96
sigRes.ilkeg<-sigora(queryList=ils,GPSrepo=load_data('kegH'),level=2)
#> [1] "Mapped identifiers from Symbol  to  EntrezGene.ID ..."
#>   pathwy.id                            description   pvalues Bonferroni
#> 1  hsa04060 Cytokine-cytokine receptor interaction 0.000e+00  0.000e+00
#> 2  hsa04630             Jak-STAT signaling pathway 3.962e-17  1.208e-14
#> 3  hsa05330                    Allograft rejection 8.856e-11  2.701e-08
#>   successes PathwaySize        N sample.size
#> 1    942.26    14774.33 452219.2      984.55
#> 2     27.29     1406.10 452219.2      984.55
#> 3     15.00      703.00 452219.2      984.55
## user created GPS repository:
nciH<-makeGPS(pathwayTable=load_data('nciTable'))
#> Time difference of 0.9443071 secs
sigRes.ilnci<-sigora(queryList=ils,GPSrepo=nciH,level=2)
#> [1] "Mapped identifiers from Symbol  to  Ensembl.Gene.ID ..."
#>       pathwy.id                    description   pvalues Bonferroni successes
#> 1   il23pathway IL23-mediated signaling events 5.494e-64  1.049e-61     36.27
#> 2   il27pathway IL27-mediated signaling events 3.164e-34  6.043e-32     18.14
#> 3 il12_2pathway IL12-mediated signaling events 3.188e-12  6.089e-10     13.20
#> 4    il1pathway  IL1-mediated signaling events 1.115e-09  2.130e-07      8.42
#> 5  il4_2pathway  IL4-mediated signaling events 1.070e-05  2.044e-03      9.03
#>   PathwaySize        N sample.size
#> 1      172.95 46257.95       93.08
#> 2       65.51 46257.95       93.08
#> 3      420.16 46257.95       93.08
#> 4      156.05 46257.95       93.08
#> 5      687.89 46257.95       93.08
## user defined weighting schemes :
myfunc<-function(a,b){1/log(a+b)}
sigora(queryList=ils,GPSrepo=nciH,level=2, weighting.method = myfunc)
#> [1] "Mapped identifiers from Symbol  to  Ensembl.Gene.ID ..."
#>       pathwy.id                    description   pvalues Bonferroni successes
#> 1   il23pathway IL23-mediated signaling events 4.951e-72  9.456e-70     41.51
#> 2   il27pathway IL27-mediated signaling events 1.188e-39  2.269e-37     21.16
#> 3 il12_2pathway IL12-mediated signaling events 3.589e-21  6.855e-19     21.30
#> 4    il1pathway  IL1-mediated signaling events 4.429e-12  8.459e-10     10.44
#> 5  il4_2pathway  IL4-mediated signaling events 6.339e-06  1.211e-03     10.34
#>   PathwaySize        N sample.size
#> 1      196.67 57510.17      116.58
#> 2       76.34 57510.17      116.58
#> 3      526.02 57510.17      116.58
#> 4      179.14 57510.17      116.58
#> 5      804.61 57510.17      116.58