makeGPS.Rd
Given a repository of gene-pathway associations, either in a tab-delimited
file with three columns (pathway ID, pathway description, gene) or as a
corresponding data frame, this function identifies all Gene Pair Signatures
(pairs of genes that, as a combination, are unique to a single pathway) and
Pathway Unique Genes (genes that are associated with only a single
pathway) and stores them in a format that is usable by sigora.
Please also see the "details" and "note" sections below.
makeGPS( pathwayTable = NULL, fn = NULL, maxLevels = 5, saveFile = NULL, repoName = "userrepo", maxFunperGene = 100, maxGenesperPathway = 500, minGenesperPathway = 10 )
pathwayTable | A data frame describing gene-pathway associations in the following format: pathwayID, pathwayName, Gene. Either pathwayTable or fn should be provided. |
---|---|
fn | Where to find the repository (a tab-delimited file). Either pathwayTable or fn should be provided. |
maxLevels | For hierarchical repositories, the number of levels to consider. |
saveFile | Where to save the object as an rda file. |
repoName | Repository name. |
maxFunperGene | A cutoff threshold: genes with more than this number of associated pathways are excluded to speed up the GPS identification process. |
maxGenesperPathway | A cutoff threshold: pathways with more than this number of associated genes are excluded to speed up the GPS identification process. |
minGenesperPathway | A cutoff threshold: pathways with fewer than this number of associated genes are excluded to speed up the GPS identification process. |
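To make the three cutoffs concrete, the following is a minimal base R sketch of the kind of filtering they describe, applied to a made-up table. It is not sigora's internal code: the data frame `toy`, the tiny cutoff values, and the choice to compute both counts on the unfiltered table are all assumptions made for illustration.

```r
# Hypothetical toy table in the pathwayId, pathwayName, gene layout.
toy <- data.frame(
  pathwayId   = c("p1", "p1", "p1", "p2", "p2", "p3", "p3", "p3", "p3"),
  pathwayName = c("Path 1", "Path 1", "Path 1", "Path 2", "Path 2",
                  "Path 3", "Path 3", "Path 3", "Path 3"),
  gene        = c("g1", "g2", "g3", "g1", "g4", "g1", "g5", "g6", "g7"),
  stringsAsFactors = FALSE
)

# Illustrative cutoffs, far smaller than the defaults (100, 500, 10).
maxFunperGene      <- 2
maxGenesperPathway <- 3
minGenesperPathway <- 3

# Count pathways per gene and genes per pathway (assumed: on the raw table).
funPerGene      <- table(toy$gene)
genesPerPathway <- table(toy$pathwayId)

# Genes annotated to too many pathways are dropped.
keepGenes <- names(funPerGene)[funPerGene <= maxFunperGene]

# Pathways that are too small or too large are dropped.
keepPaths <- names(genesPerPathway)[
  genesPerPathway >= minGenesperPathway &
    genesPerPathway <= maxGenesperPathway
]

filtered <- toy[toy$gene %in% keepGenes & toy$pathwayId %in% keepPaths, ]
print(filtered)
```

In this toy case g1 is dropped because it belongs to three pathways, p2 is dropped as too small and p3 as too large, so only the p1 rows for g2 and g3 remain.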
A GPS repository, to be used by sigora and ora.
The primary purpose of makeGPS is to convert a user-supplied
gene-pathway association table to a repository of weighted Gene Pair
Signatures (GPS) that are unique features of pathways. Such GPS can then be
used for signature (gene-pair) based analyses using sigora.
Additionally, the resulting object retains the original "single
gene"-"pathway" associations for follow-up analyses, such as
comparison of sigora results to traditional methods. ora is an
implementation of the traditional (individual gene) Overrepresentation
Analysis.
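The two definitions above (Pathway Unique Genes and Gene Pair Signatures) can be illustrated on a tiny example. The sketch below uses base R only and a made-up table `toy`; it ignores the weighting and hierarchy handling that makeGPS performs, and it is not the package's implementation, just a plain reading of the definitions.

```r
# Hypothetical gene-pathway associations (not real annotation data).
toy <- data.frame(
  pathwayId = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  gene      = c("g1", "g2", "g3", "g2", "g3", "g4", "g1", "g4", "g5"),
  stringsAsFactors = FALSE
)

# Pathway Unique Genes: genes annotated to exactly one pathway.
pathsPerGene <- tapply(toy$pathwayId, toy$gene,
                       function(p) length(unique(p)))
pugs <- names(pathsPerGene)[pathsPerGene == 1]

# All within-pathway gene pairs, written as "geneX|geneY".
pairsByPathway <- lapply(split(toy$gene, toy$pathwayId), function(g) {
  g <- sort(unique(g))
  if (length(g) < 2) return(character(0))
  apply(combn(g, 2), 2, paste, collapse = "|")
})
pairTable <- stack(pairsByPathway)  # columns: values (pair), ind (pathway)

# Gene Pair Signatures: pairs that co-occur in exactly one pathway.
pairCounts <- table(pairTable$values)
gps <- pairTable[pairTable$values %in% names(pairCounts)[pairCounts == 1], ]

print(pugs)
print(gps)
```

Here g5 is the only gene tied to a single pathway, and the pair g2|g3 is not a signature because it occurs in both A and B. Note that this naive reading also counts pairs containing a Pathway Unique Gene as signatures; how makeGPS treats such pairs is not covered by this sketch.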
This function relies on the package slam, which should be installed
from CRAN. It is fairly memory intensive, so running it on a machine with
at least 6 GB of RAM is recommended. Also, make sure to save the
resulting GPS repository and reuse it in future analyses!
Foroushani AB, Brinkman FS and Lynn DJ (2013). “Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures.” PeerJ, 1.
data(nciTable); data(idmap)
## what the input looks like:
head(nciTable)
#>           pathwayId                                           pathwayName
#> 1 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#> 2 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#> 3 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#> 4 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#> 5 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#> 6 pi3kplctrkpathway Trk receptor signaling mediated by PI3K and PLC-gamma
#>              gene
#> 1 ENSG00000140992
#> 2 ENSG00000196689
#> 3 ENSG00000142208
#> 4 ENSG00000145675
#> 5 ENSG00000138741
#> 6 ENSG00000152495
## create a SigObject. use the saveFile parameter for reuse.
nciH <- makeGPS(pathwayTable = load_data('nciTable'))
#> Time difference of 1.044506 secs
ils <- grep("^IL", idmap[, "Symbol"], value = TRUE)
ilnci <- sigora(queryList = ils, GPSrepo = nciH, level = 3)
#> [1] "Mapped identifiers from Symbol to Ensembl.Gene.ID ..."
#>       pathwy.id                    description   pvalues Bonferroni successes
#> 1   il23pathway IL23-mediated signaling events 5.494e-64  1.049e-61     36.27
#> 2   il27pathway IL27-mediated signaling events 3.164e-34  6.043e-32     18.14
#> 3 il12_2pathway IL12-mediated signaling events 3.188e-12  6.089e-10     13.20
#> 4    il1pathway  IL1-mediated signaling events 1.115e-09  2.130e-07      8.42
#> 5  il4_2pathway  IL4-mediated signaling events 1.070e-05  2.044e-03      9.03
#>   PathwaySize        N sample.size
#> 1      172.95 46257.95       93.08
#> 2       65.51 46257.95       93.08
#> 3      420.16 46257.95       93.08
#> 4      156.05 46257.95       93.08
#> 5      687.89 46257.95       93.08