TileReadCustom {TileShuffle} | R Documentation |
Reads tiling array data from custom-formatted files.
TileReadCustom(custom.filename, custom2.filename, custominc.filename, pmonly=TRUE, mmonly=FALSE, normalize=TRUE, mod.tstat=FALSE, gc=TRUE, verbose=FALSE)
custom.filename |
A vector of one or more filenames of custom-formatted files
(as character) that contain the probe intensities in the first
cellular condition and are formatted as described in the details.
Note that replicates are simply defined by more than one filename
and are used according to mod.tstat. |
custom2.filename |
A vector of one or more filenames of custom-formatted files
(as character) that contain the probe intensities in the second
cellular condition and are formatted as described in the details.
Note that replicates are simply defined by more than one filename
and are used according to mod.tstat. |
custominc.filename |
A vector of one or more filenames of custom-formatted files
(as character) containing probe intensities that should be
included in the normalization. This may be desirable if tiling
array data of more than two different cellular states are
available and multiple transitions between them are being
analyzed. |
pmonly |
Indicates whether only the intensities of perfect match (PM)
probes on the tiling array are incorporated in the probe intensity
estimation. If neither pmonly nor mmonly is set to TRUE, the
specific hybridization effect of a probe is estimated as PM-MM. |
mmonly |
Indicates whether only the intensities of mismatch (MM) probes
are incorporated in the probe intensity estimation. If neither
pmonly nor mmonly is set to TRUE, the specific hybridization
effect of a probe is estimated as PM-MM. This option is mutually
exclusive with the pmonly parameter and is recommended only for
investigating the behaviour of mismatch probes, not for common
(differential) expression analysis. |
normalize |
Indicates whether the probe intensities of the files given in
custom.filename, custom2.filename, and custominc.filename are
normalized by full-quantile normalization. Normalization is
recommended if replicates are available or if a differential
analysis is executed and, hence, the transition between cellular
states is analyzed. Note that PM probe intensities are not
included in the normalization if mmonly is set to TRUE, and MM
probe intensities are not included if pmonly is set to TRUE. |
mod.tstat |
Indicates the use of replicate information. If TRUE, the score
is the value of the moderated t-statistic (see eBayes in the limma
package for further details). Otherwise, the median probe
log2-intensity among the given replicates or the median of all
pairwise log2-fold changes between both states is used as the
estimate of the probe differential score. Note that the moderated
t-statistic can only be used if replicate information is
available. |
gc |
Indicates whether the GC content of the probe sequences is
calculated. It is defined as the fraction of Gs and Cs in the
probe sequence. The probe sequences may be set arbitrarily if
gc is disabled. |
verbose |
Indicates whether progress information is printed. |
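The full-quantile normalization named under normalize can be illustrated in base R. The following is a minimal sketch of the textbook procedure, not the code used by TileShuffle; the helper name quantile_normalize and the toy intensities are assumptions made for illustration, and ties are simply broken by order of appearance.

```r
## Hypothetical helper, not part of TileShuffle: full-quantile
## normalization of a probes x arrays intensity matrix.
quantile_normalize <- function(x) {
  stopifnot(is.matrix(x), nrow(x) > 1, !any(is.na(x)))
  ## reference distribution: mean of the k-th smallest values
  ## across all arrays
  ref <- rowMeans(apply(x, 2, sort))
  ## replace each value by the reference value of its rank
  ## (ties receive distinct ranks, which is a simplification)
  out <- apply(x, 2, function(col) ref[rank(col, ties.method = "first")])
  dimnames(out) <- dimnames(x)
  out
}

## toy intensities: 4 probes on 3 arrays (made-up numbers)
x <- cbind(a = c(5, 2, 3, 4), b = c(4, 1, 4, 2), c = c(3, 4, 6, 8))
xn <- quantile_normalize(x)

## after normalization every array has the same sorted values
apply(xn, 2, sort)
```

Each column keeps the rank order of its probes, but all columns end up with the same empirical distribution, which is what makes replicates and cellular states comparable.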
Reads tiling array data from custom-formatted files that may be created
from any tiling array platform. The files contain probe information in
tab-separated columns. Except for comment lines (indicated by '#' at the
beginning) and empty lines, each line must contain the following columns:
a probe identifier that is unique within each file and consistent among
the given files in terms of probe coordinates, the name of the reference
sequence, the start and end position of the probe on the reference
sequence (both 0-based, as integer), the intensity value of the PM probe
(on non-log scale), the intensity value of the MM probe (on non-log
scale), and the probe sequence, which is used to calculate the GC content.
The data must not contain 'NA' values. Moreover, the probe sequences may
be set arbitrarily if gc is disabled. The function generates a data.frame
comprising all probe data that are necessary for the subsequent shuffling
analysis.
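As an illustration of the file format described above, the following base R sketch writes a small tab-separated file with one comment line and checks the stated constraints (seven columns, no 'NA' values, unique probe identifiers). All probe values, the file name custom_sketch.txt, and the helper gc_fraction are made up for this example. Whether the end position is inclusive is not stated on this page; the coordinates below assume that it is.

```r
## Made-up probes; columns follow the order of the description:
## id, reference sequence, start, end, PM, MM, probe sequence.
probes <- data.frame(
  id    = c("probe1", "probe2", "probe3"),
  seq   = c("chr1", "chr1", "chr1"),
  start = c(0L, 25L, 50L),        # 0-based
  end   = c(24L, 49L, 74L),       # assumed inclusive (not stated)
  pm    = c(812.5, 1530.0, 240.25),
  mm    = c(310.0, 420.5, 198.75),
  probe = c("ACGTACGTACGTACGTACGTACGTA",
            "GGGCCCGGGCCCATATGGGCCCGGG",
            "ATATATATATTTTTAAAAATATATA"),
  stringsAsFactors = FALSE
)

## write a comment line first, then the tab-separated probe lines
out <- file.path(tempdir(), "custom_sketch.txt")
writeLines("# id\tseq\tstart\tend\tPM\tMM\tsequence", out)
write.table(probes, file = out, append = TRUE, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

## read the file back, skipping comment lines, and check it
chk <- read.delim(out, header = FALSE, comment.char = "#",
                  stringsAsFactors = FALSE)
stopifnot(ncol(chk) == 7, !any(is.na(chk)), !anyDuplicated(chk[[1]]))

## hypothetical helper: GC content as the fraction of Gs and Cs
gc_fraction <- function(s) {
  chars <- strsplit(toupper(s), "", fixed = TRUE)
  vapply(chars, function(ch) mean(ch %in% c("G", "C")), numeric(1))
}
gc_values <- gc_fraction(chk[[7]])

## cleanup
unlink(out)
```

Assuming the description above is complete, a file written this way could be passed as custom.filename; this has not been checked against the package itself.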
Returns a data.frame containing information on all probes, i.e., the
name of the reference sequence, the chromosome name (here: both are
equal), the probe center position, the length of the probe sequence,
the GC content of the probe sequence, the match score (if matchscore
is enabled), and the log2 probe score.
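For the case mod.tstat=FALSE, the two median-based scores described for that argument can be sketched in base R as follows. This is an illustration under assumptions and not the package's implementation: the helper names, the toy intensities, and the direction of the fold change (second state over first state) are not taken from this page.

```r
## made-up PM intensities: 3 probes x 2 replicates per state
state1 <- cbind(rep1 = c(200, 800, 50), rep2 = c(250, 760, 40))
state2 <- cbind(rep1 = c(400, 790, 10), rep2 = c(500, 820, 20))

## one state: median log2 intensity of each probe across replicates
median_log2 <- function(m) apply(log2(m), 1, median)

## two states: median over all pairwise log2 fold changes between
## replicates of both states (direction state2/state1 is an assumption)
median_pairwise_lfc <- function(m1, m2) {
  pairs <- expand.grid(i = seq_len(ncol(m1)), j = seq_len(ncol(m2)))
  lfc <- log2(m2[, pairs$j, drop = FALSE]) -
         log2(m1[, pairs$i, drop = FALSE])
  apply(lfc, 1, median)
}

expr_score <- median_log2(state1)
diff_score <- median_pairwise_lfc(state1, state2)
```

With two replicates per state, each probe has four pairwise fold changes, so the median is less sensitive to a single outlying replicate than a ratio of means would be.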
## This example requires the custom-formatted files
## in the extdata folder of this package. Otherwise,
## it aborts with an error.
path <- system.file("extdata", package = "TileShuffle")
stopifnot(path != "")

## define filename to custom-formatted file
custom.filename <- file.path(path, "custom.txt")
stopifnot(file.exists(custom.filename))

## read custom-formatted file and return data.frame
## with information such as genomic localization
## of probes, GC content of probe sequences, and
## probe intensities.
custom <- TileReadCustom(custom.filename=custom.filename,
                         pmonly=TRUE, gc=TRUE, verbose=FALSE)

## getting an overview on the reported data.frame
str(custom)

## investigating data
## e.g. plot density of intensities
pdf(file="custom_int_density.pdf")
plot(density(custom$intensity), main="", xlab="Intensity")
dev.off()

## or GC bias with three GC content bins
pdf(file="custom_gc_bias.pdf")
boxplot(custom$intensity ~ cut(custom$gc,
                               breaks=c(0,0.36,0.52,Inf),right=FALSE),
        xlab="GC content", ylab="Intensity")
dev.off()

## cleanup
rm(path, custom.filename, custom)