Given a repository of gene-pathway associations, either in a tab-delimited
file with three columns (pathwayID, pathway description, Gene) or a
corresponding data frame, this function identifies all Gene Pair Signatures
(pairs of genes that, as a combination, are unique to a single pathway) and
Pathway Unique Genes (genes that are uniquely associated with a single
pathway) and stores them in a format that is usable by sigora.
Please also see the "details" and "note" sections below.
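To make the two definitions concrete, here is a minimal base R sketch on a made-up table. It only illustrates the definitions given above; it is not sigora's implementation, and it ignores the weighting and hierarchy handling that makeGPS performs. All pathway and gene IDs are invented.

```r
# Toy gene-pathway table in the three-column layout (all IDs are made up).
tab <- data.frame(
  pathwayId   = c("P1", "P1", "P1", "P2", "P2", "P2", "P3", "P3", "P3"),
  pathwayName = c("one", "one", "one", "two", "two", "two",
                  "three", "three", "three"),
  gene        = c("A", "B", "C", "B", "C", "D", "A", "D", "E"),
  stringsAsFactors = FALSE
)

# Pathway Unique Genes: genes annotated to exactly one pathway.
n_pathways <- table(tab$gene)
pug <- names(n_pathways)[n_pathways == 1]

# Enumerate all within-pathway gene pairs (sorted, so A|B equals B|A).
pairs_by_pathway <- lapply(split(tab$gene, tab$pathwayId), function(g) {
  g <- sort(unique(g))
  if (length(g) < 2) return(character(0))
  apply(combn(g, 2), 2, paste, collapse = "|")
})
all_pairs <- stack(pairs_by_pathway)  # columns: values (pair), ind (pathway)

# Gene Pair Signatures: pairs that occur together in exactly one pathway.
pair_counts <- table(all_pairs$values)
gps <- all_pairs[all_pairs$values %in% names(pair_counts)[pair_counts == 1], ]
```

In this toy table, gene E is the only pathway-unique gene, and the pair B|C is not a signature because both P1 and P2 contain it.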
makeGPS(
  pathwayTable = NULL,
  fn = NULL,
  maxLevels = 5,
  saveFile = NULL,
  repoName = "userrepo",
  maxFunperGene = 100,
  maxGenesperPathway = 500,
  minGenesperPathway = 10
)
| Argument | Description |
|---|---|
| pathwayTable | A data frame describing gene-pathway associations in the following format: pathwayID, pathwayName, Gene. Either pathwayTable or fn should be provided. |
| fn | Where to find the repository. Either pathwayTable or fn should be provided. |
| maxLevels | For hierarchical repositories, the number of levels to consider. |
| saveFile | Where to save the object as an rda file. |
| repoName | Repository name. |
| maxFunperGene | A cutoff threshold: genes with more than this number of associated pathways are excluded to speed up the GPS identification process. |
| maxGenesperPathway | A cutoff threshold: pathways with more than this number of associated genes are excluded to speed up the GPS identification process. |
| minGenesperPathway | A cutoff threshold: pathways with fewer than this number of associated genes are excluded to speed up the GPS identification process. |
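The three cutoffs can be pictured as filters on the input table. The sketch below is a hypothetical base R helper (`filter_pathway_table` is not part of sigora) applied to a made-up table. How makeGPS applies the cutoffs internally, including the order of the filters and whether the boundaries are inclusive, is an assumption here. The column names follow the `nciTable` example output (`pathwayId`, `pathwayName`, `gene`).

```r
# Hypothetical helper illustrating the three cutoffs; not sigora's own code.
filter_pathway_table <- function(tab,
                                 maxFunperGene = 100,
                                 maxGenesperPathway = 500,
                                 minGenesperPathway = 10) {
  tab <- unique(tab)
  # Pathway sizes are measured on the unfiltered table (an assumption).
  genes_per_pathway <- table(tab$pathwayId)
  keep_pathways <- names(genes_per_pathway)[
    genes_per_pathway >= minGenesperPathway &
      genes_per_pathway <= maxGenesperPathway
  ]
  # Drop genes annotated to more than maxFunperGene pathways.
  pathways_per_gene <- table(tab$gene)
  keep_genes <- names(pathways_per_gene)[pathways_per_gene <= maxFunperGene]
  tab[tab$pathwayId %in% keep_pathways & tab$gene %in% keep_genes, ,
      drop = FALSE]
}

# Made-up table: P1 has 3 genes, P2 has 2, P3 has 4; gene A is in all three.
toy <- data.frame(
  pathwayId   = c("P1", "P1", "P1", "P2", "P2", "P3", "P3", "P3", "P3"),
  pathwayName = c("one", "one", "one", "two", "two",
                  "three", "three", "three", "three"),
  gene        = c("A", "B", "C", "A", "D", "A", "B", "E", "F"),
  stringsAsFactors = FALSE
)

# Tiny cutoffs so that each filter removes something from the toy table.
filtered <- filter_pathway_table(toy, maxFunperGene = 2,
                                 maxGenesperPathway = 3,
                                 minGenesperPathway = 3)
```

With these toy cutoffs only P1 survives the size filters, and gene A is dropped because it is annotated to three pathways.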
A GPS repository, to be used by sigora and ora.
The primary purpose of makeGPS is to convert a user-supplied
gene-pathway association table to a repository of weighted Gene Pair
Signatures (GPS) that are unique features of pathways. Such GPS can then be
used for signature (gene-pair) based analyses using sigora.
Additionally, the resulting object retains the original "single
gene"-"pathway" associations for follow-up analyses, such as
comparison of sigora results to traditional methods. ora is an
implementation of the traditional (individual gene) Overrepresentation
Analysis.
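For orientation, traditional overrepresentation analysis is commonly formulated as a one-sided hypergeometric test on the overlap between a query list and a pathway. The sketch below shows that general technique in base R with made-up counts. Whether ora uses exactly this test, and which gene universe it uses, is not stated on this page, so treat the sketch as background rather than as a description of ora.

```r
# Toy numbers, chosen only for illustration.
universe_size <- 1000  # genes in the background
pathway_size  <- 50    # background genes annotated to the pathway
query_size    <- 40    # genes in the query list
overlap       <- 8     # query genes annotated to the pathway

# P(X >= overlap): lower.tail = FALSE gives P(X > q), hence q = overlap - 1.
p_value <- phyper(overlap - 1, pathway_size, universe_size - pathway_size,
                  query_size, lower.tail = FALSE)

# The same test expressed as a one-sided Fisher exact test on the 2x2 table.
contingency <- matrix(c(overlap, query_size - overlap,
                        pathway_size - overlap,
                        universe_size - pathway_size - query_size + overlap),
                      nrow = 2)
p_fisher <- fisher.test(contingency, alternative = "greater")$p.value
```

Both calls compute the same tail probability; the Fisher form is included only because it makes the underlying 2x2 table explicit.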
This function relies on the package slam, which should be installed
from CRAN. It is fairly memory-intensive, so running it on a machine
with at least 6 GB of RAM is recommended. Also, make sure to save and reuse the
resulting GPS repository in future analyses!
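One way to follow the save-and-reuse advice is a small caching wrapper. The sketch below is a hypothetical helper (`cached_repo` is not part of sigora) that stores the object with `saveRDS()` and uses a stand-in builder so that it runs without sigora installed. In real use the builder would wrap a call such as `makeGPS(pathwayTable = myTable)`, where `myTable` is a placeholder for your own table. The `saveFile` argument of makeGPS instead writes an rda file, which would be read back with `load()` rather than `readRDS()`.

```r
# Hypothetical caching helper: build the repository once, reuse it afterwards.
# `build` is any zero-argument function that returns the repository.
cached_repo <- function(path, build) {
  if (file.exists(path)) {
    return(readRDS(path))  # reuse the stored repository
  }
  repo <- build()          # expensive step, run only on a cache miss
  saveRDS(repo, path)
  repo
}

# Stand-in builder so the sketch runs without sigora; it counts its calls.
calls <- 0
fake_build <- function() {
  calls <<- calls + 1
  list(repoName = "userrepo")
}

cache_file <- tempfile(fileext = ".rds")
first  <- cached_repo(cache_file, fake_build)  # builds and saves
second <- cached_repo(cache_file, fake_build)  # loads from disk
```

The second call returns the stored object without invoking the builder again, which is the behaviour you want for a step that takes minutes and several gigabytes of memory.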
Foroushani AB, Brinkman FS and Lynn DJ (2013). “Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures.” PeerJ, 1
data(nciTable); data(idmap)
## what the input looks like:
head(nciTable)
#>           pathwayId                                           pathwayName
#> 1 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#> 2 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#> 3 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#> 4 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#> 5 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#> 6 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#>              gene
#> 1 ENSG00000140992
#> 2 ENSG00000196689
#> 3 ENSG00000142208
#> 4 ENSG00000145675
#> 5 ENSG00000138741
#> 6 ENSG00000152495
## create a SigObject. use the saveFile parameter for reuse.
nciH<-makeGPS(pathwayTable=load_data('nciTable'))
#> Time difference of 1.044506 secs
ils<-grep("^IL",idmap[,"Symbol"],value=TRUE)
ilnci<-sigora(queryList=ils,GPSrepo=nciH,level=3)
#> [1] "Mapped identifiers from Symbol to Ensembl.Gene.ID ..."
#>       pathwy.id                    description   pvalues Bonferroni successes
#> 1   il23pathway IL23-mediated signaling events 5.494e-64  1.049e-61     36.27
#> 2   il27pathway IL27-mediated signaling events 3.164e-34  6.043e-32     18.14
#> 3 il12_2pathway IL12-mediated signaling events 3.188e-12  6.089e-10     13.20
#> 4    il1pathway  IL1-mediated signaling events 1.115e-09  2.130e-07      8.42
#> 5  il4_2pathway  IL4-mediated signaling events 1.070e-05  2.044e-03      9.03
#>   PathwaySize        N sample.size
#> 1      172.95 46257.95       93.08
#> 2       65.51 46257.95       93.08
#> 3      420.16 46257.95       93.08
#> 4      156.05 46257.95       93.08
#> 5      687.89 46257.95       93.08