TileReadCustom {TileShuffle} | R Documentation |
Reads tiling array data from custom-formatted files.
TileReadCustom(custom.filename, custom2.filename, custominc.filename, pmonly=TRUE, mmonly=FALSE, normalize=TRUE, gc=TRUE, verbose=FALSE)
custom.filename |
A vector of one or more filenames of custom-formatted files
(as character) that contain the probe intensities in the first
cellular condition and are formatted as described below. Note that
replicates are simply defined by more than one filename. In case
custom2.filename is set to NULL, the median probe log2-intensity
among the given replicates is reported as the estimate of the probe
intensity. On the other hand, if custom2.filename is given, the
median of all possible pairwise log2-fold changes between files
from the first cellular condition (custom.filename) and ones from
the second cellular condition (custom2.filename) of a probe is
reported as the estimate of the probe intensity change. |
custom2.filename |
A vector of one or more filenames of custom-formatted files
(as character) that contain the probe intensities in the second
cellular condition and are formatted as described below. Note that
replicates are simply defined by more than one filename. This
parameter is only required for differential expression analysis,
where the median of all possible pairwise log2-fold changes between
files from the first cellular condition (custom.filename) and ones
from the second cellular condition (custom2.filename) of a probe is
reported as the estimate of the probe intensity change. Otherwise,
this parameter must be set to NULL. |
custominc.filename |
A vector of one or more filenames of custom-formatted files
(as character) containing probe intensities that should be included
in the normalization. This may be desirable if tiling array data of
more than two different cellular states are available and multiple
transitions between them are being analyzed. In the analysis of any
of these transitions, the files corresponding to the remaining
cellular states, i.e., those not given by custom.filename or
custom2.filename, may be passed as custominc.filename. The
full-quantile normalization is then always done on the entire set of
available intensity data, so that the log2-fold changes of the
different analyzed transitions are comparable and not biased by the
normalization procedure. This parameter has no effect if
normalization is disabled. |
pmonly |
Indicates whether only intensities of perfect match (PM)
probes on the tiling array are incorporated in the probe intensity
estimation. If neither pmonly nor mmonly is set to TRUE, the
specific hybridization effect of a probe is estimated by taking
PM-MM. |
mmonly |
Indicates whether only intensities of mismatch (MM) probes
are incorporated in the probe intensity estimation. If neither
pmonly nor mmonly is set to TRUE, the specific hybridization
effect of a probe is estimated by taking PM-MM. This option is
mutually exclusive with the pmonly parameter and is only recommended
for investigating the behaviour of mismatch probes, not for common
(differential) expression analysis. |
normalize |
Indicates whether the probe intensities of the files given
in custom.filename, custom2.filename, and custominc.filename are
normalized by full-quantile normalization. Normalization is
recommended if replicates are available or if a differential
analysis is executed, i.e., the transition between cellular states
is analyzed. Note that PM and MM probe intensities are not included
in the normalization if mmonly and pmonly are set to TRUE,
respectively. |
gc |
Indicates whether the GC content of the probe sequences is
calculated. It is defined as the fraction of G and C bases in the
probe sequence. The probe sequences may be set arbitrarily if
gc is disabled. |
verbose |
Indicates whether progress information is printed. |
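The estimates described for custom.filename, custom2.filename, and normalize can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper names (quantile_normalize, median_log2, median_pairwise_lfc), the toy intensities, the tie handling, and the sign of the fold change (first condition minus second condition) are all assumptions made for illustration.

```r
## Full-quantile normalization sketch: every column (file) is mapped onto the
## same distribution, namely the rank-wise mean across all columns.
## Ties are broken by order of appearance, which is a simplification.
quantile_normalize <- function(x) {
  ranks <- apply(x, 2, rank, ties.method = "first")
  reference <- rowMeans(apply(x, 2, sort))
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

## Single condition: median log2 intensity of each probe across replicates.
median_log2 <- function(log2_a) {
  apply(log2_a, 1, median)
}

## Two conditions: median over all pairwise log2-fold changes of each probe.
## ASSUMPTION: fold change is computed as first condition minus second.
median_pairwise_lfc <- function(log2_a, log2_b) {
  pairs <- expand.grid(a = seq_len(ncol(log2_a)), b = seq_len(ncol(log2_b)))
  lfc <- log2_a[, pairs$a, drop = FALSE] - log2_b[, pairs$b, drop = FALSE]
  apply(lfc, 1, median)
}

## Toy PM intensities (non-log scale); rows are probes, columns are files.
first  <- cbind(a1 = c(100, 400, 1600), a2 = c(200, 400, 800))
second <- cbind(b1 = c(50, 400, 3200), b2 = c(100, 800, 3200))
extra  <- cbind(c1 = c(300, 300, 300))  # plays the role of custominc.filename

## Normalize all files together, then move to log2 scale.
normalized <- log2(quantile_normalize(cbind(first, second, extra)))
n_first  <- normalized[, colnames(first), drop = FALSE]
n_second <- normalized[, colnames(second), drop = FALSE]

median_log2(n_first)                    # estimate without a second condition
median_pairwise_lfc(n_first, n_second)  # estimate for a differential analysis
```

Normalizing the extra file together with the two analyzed conditions mirrors the purpose of custominc.filename: all transitions share one reference distribution.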
Reads tiling array data from custom-formatted files, which may be created
from the data of any tiling array platform. Each file contains tab-separated
probe information. Except for comment lines (starting with '#') and empty
lines, each line must contain the following columns: a probe identifier that
is unique within each file and consistent among the given files in terms of
probe coordinates, the name of the reference sequence, the start and end
position of the probe on the reference sequence (both 0-based, as integer),
the intensity value of the PM probe (on non-log scale), the intensity value
of the MM probe (on non-log scale), and the probe sequence, which is used to
calculate the GC content. The data must not contain 'NA' values. The probe
sequences may be set arbitrarily if gc is disabled. The method generates a
data.frame comprising all probe data required for the subsequent shuffling
analysis.
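As a rough illustration of the file layout described above, the following base R sketch writes a small custom-formatted file to a temporary location, reads it back, and computes the GC content of each probe sequence. The probe values, the column names used inside R, the absence of a header line, and the treatment of the end position as inclusive are assumptions; the function gc_content is a hypothetical helper, not part of the package.

```r
## Hypothetical probes in the documented column order:
## identifier, reference sequence, start, end, PM, MM, probe sequence.
## ASSUMPTION: end positions are inclusive (25-mers spanning 0..24 etc.).
probes <- data.frame(
  id       = c("probe1", "probe2", "probe3"),
  seqname  = "chr1",
  start    = c(0L, 25L, 50L),
  end      = c(24L, 49L, 74L),
  pm       = c(812.5, 1530.0, 96.2),
  mm       = c(240.1, 310.7, 88.0),
  sequence = c(strrep("ACGTA", 5), strrep("GCGCA", 5), strrep("ATATC", 5)),
  stringsAsFactors = FALSE
)

## Write a comment line followed by tab-separated rows without a header.
path <- tempfile(fileext = ".txt")
writeLines("# id, reference, start, end, PM, MM, sequence", path)
write.table(probes, path, append = TRUE, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

## Read the file back; comment lines and empty lines are skipped.
tab <- read.delim(path, header = FALSE, comment.char = "#",
                  col.names = names(probes),
                  colClasses = c("character", "character", "integer",
                                 "integer", "numeric", "numeric", "character"))

## Sanity checks matching the stated requirements.
stopifnot(!any(is.na(tab)))             # no 'NA' values
stopifnot(anyDuplicated(tab$id) == 0)   # identifiers unique within the file

## GC content: fraction of G and C bases in the probe sequence.
gc_content <- function(s) {
  s <- toupper(s)
  nchar(gsub("[^GC]", "", s)) / nchar(s)
}
tab$gc <- gc_content(tab$sequence)

unlink(path)
```

A file written this way could then be passed as custom.filename, provided the assumptions about header and end position match the format the package expects.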
Returns a data.frame containing information on all probes, i.e., the name
of the reference sequence, the chromosome name (here: both are equal), the
probe center position, the length of the probe sequence, the GC content of
the probe sequence, the match score (if matchscore is enabled), and the
log2 probe intensity or the probe log2-fold change.
## This example requires the custom-formatted files
## in the extdata folder of this package. Otherwise,
## it aborts with an error.
library(TileShuffle)
path <- system.file("extdata", package = "TileShuffle")
stopifnot(path != "")

## define filename to custom-formatted file
custom.filename <- file.path(path, "custom.txt")
stopifnot(file.exists(custom.filename))

## read custom-formatted file and return data.frame
## with information such as genomic localization
## of probes, GC content of probe sequences, and
## probe intensities.
custom <- TileReadCustom(custom.filename=custom.filename,
    pmonly=TRUE, gc=TRUE, verbose=FALSE)

## getting an overview on the reported data.frame
str(custom)

## investigating data
## e.g. plot density of intensities
pdf(file="custom_int_density.pdf")
plot(density(custom$intensity), main="", xlab="Intensity")
dev.off()

## or GC bias with three GC content bins
pdf(file="custom_gc_bias.pdf")
boxplot(custom$intensity ~ cut(custom$gc,
    breaks=c(0,0.36,0.52,Inf),right=FALSE),
    xlab="GC content", ylab="Intensity")
dev.off()

## cleanup
rm(path, custom.filename, custom)
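The example above covers a single condition only. A differential analysis with replicates and additional files for normalization might be set up as sketched below, using the argument names from the usage line. All file names are hypothetical placeholders, and the call is skipped unless the package and every file are available.

```r
## ASSUMPTION: all file names below are placeholders, not files shipped
## with the package.
call_args <- list(
  custom.filename    = c("stateA_rep1.txt", "stateA_rep2.txt"),
  custom2.filename   = c("stateB_rep1.txt", "stateB_rep2.txt"),
  custominc.filename = c("stateC_rep1.txt"),
  pmonly = TRUE, mmonly = FALSE, normalize = TRUE, gc = TRUE, verbose = FALSE
)

## Collect all input files so their existence can be checked up front.
input_files <- unlist(call_args[c("custom.filename", "custom2.filename",
                                  "custominc.filename")], use.names = FALSE)

if (requireNamespace("TileShuffle", quietly = TRUE) &&
    all(file.exists(input_files))) {
  ## median pairwise log2-fold change between state A and state B,
  ## normalized together with the state C file
  diffexpr <- do.call(TileShuffle::TileReadCustom, call_args)
  str(diffexpr)
} else {
  message("TileShuffle or the placeholder input files are not available; call skipped.")
}
```

Passing the state C file as custominc.filename keeps the normalization identical across the A to B, A to C, and B to C analyses, so their fold changes remain comparable.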