Example data set


For a quick start with BAT and its modules, we have assembled a small example data set, adapted from data used in a recent lymphoma publication (link). It is a subset of a paired-end human WGBS data set comprising 8 samples (S1-S8), each with two sequencing runs. The samples are split into two groups (control: S1-S4, case: S5-S8). Either start with the minimal example data set or use the run script and additional files to test BAT.
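The group assignment described above can be sketched as a small shell helper, which may be handy when looping over samples in your own scripts. The function name `sample_group` is hypothetical and not part of BAT; only the sample names and the control/case split are taken from the description above.

```shell
# Hypothetical helper (not part of BAT): map a sample name to its group.
# Grouping follows the text above: S1-S4 are controls, S5-S8 are cases.
sample_group() {
    case "$1" in
        S[1-4]) echo "control" ;;
        S[5-8]) echo "case" ;;
        *)      echo "unknown"; return 1 ;;
    esac
}

# Print a tab-separated sample/group table for all 8 samples.
for i in 1 2 3 4 5 6 7 8; do
    printf 'S%s\t%s\n' "$i" "$(sample_group "S$i")"
done
```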

Minimum input data

$ tar xvf BAT_example_input.tar.gz

This minimum example data set comprises the raw reads of one sample, together with the already called, but not yet filtered, methylation data of that sample and of 7 further samples. The samples belong to two groups of four samples each. The unmapped sample consists of two sequencing runs, whose reads can be mapped to a reduced genome and merged prior to methylation calling. With both the raw reads and the called methylation data provided, you can run the entire toolkit on a small example region.
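If you prefer to check what an archive contains before unpacking it, the extraction step can be wrapped as in the following minimal sketch. The function name `unpack_example` and the optional target directory are assumptions made for illustration; only the archive name BAT_example_input.tar.gz comes from this page.

```shell
# Hypothetical helper (not part of BAT): list the contents of a gzip-compressed
# tarball, then unpack it into a target directory (default: current directory).
# Usage: unpack_example ARCHIVE [TARGET_DIR]
unpack_example() {
    ue_archive="$1"
    ue_target="${2:-.}"
    if [ ! -f "$ue_archive" ]; then
        echo "archive not found: $ue_archive" >&2
        return 1
    fi
    mkdir -p "$ue_target" || return 1
    # Show the file listing first so you know what will be written.
    tar -tzf "$ue_archive" || return 1
    # Unpack into the target directory.
    tar -xzf "$ue_archive" -C "$ue_target"
}
```

For the minimum input data, this would be called as `unpack_example BAT_example_input.tar.gz`.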

The example pages show the tool calls in a basic form, state all output files, and present any plots that are produced.

Extended input data

We recommend downloading the entire BAT example directory (985 MB), since it provides a variety of additional files needed to run all BAT tools, e.g., a reduced reference genome, some gene annotations, and gene expression data. The directory BAT_example_structure contains a basic folder structure.

Extract the example directory using

$ tar xvf BAT_example_structure.tar.gz
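After extraction, it can help to look at the top levels of the folder structure before running any tool. The following is a minimal sketch; the function name `list_dirs` is hypothetical, and it assumes the archive unpacks into a directory named BAT_example_structure in the current working directory. It relies on the `-maxdepth` option of `find`, which GNU and BSD find support.

```shell
# Hypothetical helper (not part of BAT): print all directories up to two
# levels below the given root (default: BAT_example_structure), sorted.
list_dirs() {
    ld_root="${1:-BAT_example_structure}"
    if [ ! -d "$ld_root" ]; then
        echo "directory not found: $ld_root" >&2
        return 1
    fi
    find "$ld_root" -maxdepth 2 -type d | sort
}
```

Calling `list_dirs` without arguments inspects BAT_example_structure; pass another path to inspect a different location.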

Using the example data, given the directory structure and provided files described above, the following scripts can be tested.

For each script, we provide a link to a more detailed explanation (including a description of all parameters), the example run command, the output, and a short glimpse at the output files and plots.

All calls for running the example data are given in the run script, which is based on the given directory structure.

