bbq Implementation Documentation
Main Page | Namespace List | Class List | File List | Namespace Members | Class Members | File Members

bbq::weighted_footprint_detector< w_bit_set > Class Template Reference

Class weighted_footprint_detector implements most crucial parts of the algorithms underlying bbq, including computation of the interval arrangements and finding best intersections between cells. More...

#include <weighted_footprint_detector.h>

List of all members.

Public Types

typedef std::set< w_bit_set > arrangement
 Procedure enumerate_differences needs fast finding of elements in the interval arrangements, which is achieved by this type of an stl multiset.

typedef std::pair< typename
arrangement::iterator, typename
arrangement::iterator > 
arrangement_range
typedef std::set< w_bit_set > cell_set

Public Member Functions

 weighted_footprint_detector ()
 default constructor.

 weighted_footprint_detector (int, int, char **, char **, char **, char **, int *, int *, int *)
 ~weighted_footprint_detector ()
 destructor.

void clear ()
 free all memory used.

void init_arrangement_cells ()
 based on the occurences of the fragments s in the genomes T, compute the cells C[i] of the interval arrangement induced by the occurences in each T[i].

bool bound (const w_bit_set &M, const unsigned int &j)
 bound function for branch-and-bound procedure enumerate_subsets (Algorithm (A2))

bool w_bound (const w_bit_set &M, const unsigned int &j)
 bound function for the weighted version of the branch-and-bound procedure enumerate_subsets (Algorithm (A2))

bool h_best_w_bound (const w_bit_set &M, const unsigned int &j)
 bound function for the h-best-match and weighted version of the branch-and-bound procedure enumerate_subsets (Algorithm (A2)).

void enum_w_sets (w_bit_set M, int j)
 needed for the weighted version of the branch-and-bound-version of Algorithm (A1) implemented in enumerate_subsets()

void enum_h_best_w_sets (w_bit_set M, int j)
 weighted version for computing the h best solutions using Algorithm (A2).

void enum_h_best_sets (w_bit_set M, int j)
 unweighted version of computing the h best solutions using Algorithm (A2).

void enum_sets (w_bit_set M, int j)
 needed for the branch-and-bound-version of Algorithm (A1) implemented in enumerate_subsets()

void enumerate_subsets ()
 implements the branch-and-bound-version of Algorithm (A2)

void enumerate_h_best_subsets ()
 implements the branch-and-bound-version of Algorithm (A2) for finding the h best matches.

void compute_intersections ()
 implements the branch-and-bound-version of Algorithm (A1)

void update_h_best_list (const float &wght, int *nu, w_bit_set &intersection)
 if a new hit was found that might belong to the list of the h best solutions, this procedure is called by compute_h_best_intersections().

void compute_h_best_intersections ()
 compute the h best hits using the branch-and-bound-version of algorithm (A1)

w_bit_set difference (w_bit_set M, w_bit_set &D)
 return M setminus D

bool test_diff_set (w_bit_set &M)
 test whether M setminus D is a suitable barbque.

void get_element_array (w_bit_set M, int *E, int &l)
 return l=M.cardinality(), with E[0],...E[l-1] containing the l elements contained in M.

void enumerate_differences (int delta)
 compute a best barbeque with delta-bounded difference based on a parameterized version of the bbq problem.

void get_frag_occurences (int i, int j, match_list &)
 compute the occurences of s[j] in T[i], and produce match_list as a result.

void get_occurences (int i, int j)
 compute the occurences of s[j] in T[i]

void get_group_occurences (int i, int j)
 compute the occurences of the fragment in group j in T[i]

void run (int, float)
 start the matching process

void reverse_compl ()
 compute the reverse complement of the patterns, so that in effect

float get_result ()
 return the best solution found so far.

void print_h_results ()
 print the best h solution found so far into stdout.

float print_single_result (int *nu, w_bit_set &intersection, const float &weight)
 print the solution specified by intersection into stdout.

float print_result ()
 print the result of the whole alignment procedure into stdout.

void print_postscript (std::string filename)
 print the result of the whole alignment procedure into a postscript file named filename.

void print_ps_labels (std::string filename)
 print labels of the current instance into a postscript file named filename.

void print_ps_h_results (std::string &)
 print the best h results of the whole alignment procedure into a postscript file named filename.

void print_ps_single_result (int *nu, std::ostream &o, w_bit_set &intersection)
 print the solution specified by nu into a postscript file associated with o

void print_with_names (w_bit_set &)
 print the names of the sequences (or groups) contained in the argument passed.

void set_number_of_hits (int)
 set the number h of hits to be computed, i.e., the best h (substantially different) solutions are computed.

void set_weighted (bool)
 switch weighted intersections on and off.

void set_count_results (bool)
 switch between couting and printing results for thresholded instances.

void set_weight_type (int wt)
 choose a weighting scheme.

void set_reverse (bool)
 if set to true, the fragments s[k/2],...,s[k-1] are regarded as fragments on the 5' strand, while s[1],...,s[k/2-1] are regarded as fragments on the 3' strand.

void set_postscript (bool)
 switch weighted intersections on and off.

void set_ps_filename (std::string)
 switch weighted intersections on and off.

void set_progress_report (bool)
 switch printing of progress report during execution.

void set_algorithm (int A)
 Choose the algorithm to be applied.

void set_grouping (bool)
 switch grouping of fragments on and off.

void set_group_characters (int)
 switch number of leading chars used for grouping of fragments on and off.

void get_groups ()
 compute group names form fragment names as well as the corresponding mapping group_indices that assigns a set of fragment index to each group index.

void set_output_detail (bool)
 pass true if more detailled output is desired.

void set_delta (int)
 set the bound for enumerate_differences().

bool increase (int *nu, int I)
 increase the index vector nu at position I.

bool differs (int *, int *)
 When calling compute_h_best_intersections(), we want to obtain substantially different solutions.

double p_value (int i, int j, int nu)
 Compute a first approximation to a p-value like measure of significance for the nu-th occurence of s[j] in T[i].

void get_freq ()
 We count neucleotide frequencies and dinucleotide frequencies in the genome, so that we can determine (approximations of) p- and E-values later on.

float get_weight (int i, int j, int nu)
 compute a weight for the nu-th occurence of s[j] in T[i] (with position p[i][j][nu] )

void compute_index_table ()
 do some preprocessing for p_value() and get_freq()


Detailed Description

template<class w_bit_set>
class bbq::weighted_footprint_detector< w_bit_set >

Class weighted_footprint_detector implements most crucial parts of the algorithms underlying bbq, including computation of the interval arrangements and finding best intersections between cells.

It takes a class w_bit_set that implements all set operations needed by weighted_footprint_detector. Different classes provided for w_bit_set implement weighted/unweighted sets or multisets with different contraints on the cardinality of the sets (and hence on the number of motifs that bbq can handle).


Member Typedef Documentation

template<class w_bit_set>
typedef std::set< w_bit_set > bbq::weighted_footprint_detector< w_bit_set >::arrangement
 

Procedure enumerate_differences needs fast finding of elements in the interval arrangements, which is achieved by this type of an stl multiset.

template<class w_bit_set>
typedef std::pair< typename arrangement::iterator , typename arrangement::iterator > bbq::weighted_footprint_detector< w_bit_set >::arrangement_range
 

template<class w_bit_set>
typedef std::set< w_bit_set > bbq::weighted_footprint_detector< w_bit_set >::cell_set
 


Constructor & Destructor Documentation

template<class w_bit_set>
bbq::weighted_footprint_detector< w_bit_set >::weighted_footprint_detector  ) 
 

default constructor.

template<class w_bit_set>
bbq::weighted_footprint_detector< w_bit_set >::weighted_footprint_detector int  ,
int  ,
char **  ,
char **  ,
char **  ,
char **  ,
int *  ,
int *  ,
int * 
 

template<class w_bit_set>
bbq::weighted_footprint_detector< w_bit_set >::~weighted_footprint_detector  ) 
 

destructor.


Member Function Documentation

template<class w_bit_set>
bool bbq::weighted_footprint_detector< w_bit_set >::bound const w_bit_set &  M,
const unsigned int &  j
 

bound function for branch-and-bound procedure enumerate_subsets (Algorithm (A2))

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::clear  ) 
 

free all memory used.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::compute_h_best_intersections  ) 
 

compute the h best hits using the branch-and-bound-version of algorithm (A1)

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::compute_index_table  ) 
 

do some preprocessing for p_value() and get_freq()

by default, we want to use asymmetric match tables, i.e., wildcard Ns are only wildcards if they occur in a pattern fragment, but not in the text.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::compute_intersections  ) 
 

implements the branch-and-bound-version of Algorithm (A1)

template<class w_bit_set>
w_bit_set bbq::weighted_footprint_detector< w_bit_set >::difference w_bit_set  M,
w_bit_set &  D
 

return M setminus D

template<class w_bit_set>
bool bbq::weighted_footprint_detector< w_bit_set >::differs int *  ,
int * 
 

When calling compute_h_best_intersections(), we want to obtain substantially different solutions.

The procedure differs() decides whether the two solutions given as an argument are "substantially different". In fact, there are numerous ways of formally defining solutions to be different. Hence, this procedure may get outsorced in a template class in later versions, so that several difference policies can be implemented.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::enum_h_best_sets w_bit_set  M,
int  j
 

unweighted version of computing the h best solutions using Algorithm (A2).

Uses a difference policy in the same way as compute_h_best_intersections().

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::enum_h_best_w_sets w_bit_set  M,
int  j
 

weighted version for computing the h best solutions using Algorithm (A2).

Uses a difference policy in the same way as compute_h_best_intersections().

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::enum_sets w_bit_set  M,
int  j
 

needed for the branch-and-bound-version of Algorithm (A1) implemented in enumerate_subsets()

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::enum_w_sets w_bit_set  M,
int  j
 

needed for the weighted version of the branch-and-bound-version of Algorithm (A1) implemented in enumerate_subsets()

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::enumerate_differences int  delta  ) 
 

compute a best barbeque with delta-bounded difference based on a parameterized version of the bbq problem.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::enumerate_h_best_subsets  ) 
 

implements the branch-and-bound-version of Algorithm (A2) for finding the h best matches.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::enumerate_subsets  ) 
 

implements the branch-and-bound-version of Algorithm (A2)

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::get_element_array w_bit_set  M,
int *  E,
int &  l
 

return l=M.cardinality(), with E[0],...E[l-1] containing the l elements contained in M.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::get_frag_occurences int  i,
int  j,
match_list match_pos
 

compute the occurences of s[j] in T[i], and produce match_list as a result.

ATTENTION: DYNAMIC PROGRAMMING HAS BEEN DISABLED IN THE FOLLOWING TWO LINES! MISMATCHES INSTEAD OF EDIT-DISTANCE ARE USED; TO SWITCH BACK TO EDIT DISTANCE, REPLACE INT_MAX BY THE COMMENTED CODE.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::get_freq  ) 
 

We count neucleotide frequencies and dinucleotide frequencies in the genome, so that we can determine (approximations of) p- and E-values later on.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::get_group_occurences int  i,
int  j
 

compute the occurences of the fragment in group j in T[i]

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::get_groups  ) 
 

compute group names form fragment names as well as the corresponding mapping group_indices that assigns a set of fragment index to each group index.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::get_occurences int  i,
int  j
 

compute the occurences of s[j] in T[i]

template<class w_bit_set>
float bbq::weighted_footprint_detector< w_bit_set >::get_result  ) 
 

return the best solution found so far.

template<class w_bit_set>
float bbq::weighted_footprint_detector< w_bit_set >::get_weight int  i,
int  j,
int  nu
 

compute a weight for the nu-th occurence of s[j] in T[i] (with position p[i][j][nu] )

template<class w_bit_set>
bool bbq::weighted_footprint_detector< w_bit_set >::h_best_w_bound const w_bit_set &  M,
const unsigned int &  j
 

bound function for the h-best-match and weighted version of the branch-and-bound procedure enumerate_subsets (Algorithm (A2)).

Uses a difference policy in the same way as compute_h_best_intersections().

template<class w_bit_set>
bool bbq::weighted_footprint_detector< w_bit_set >::increase int *  nu,
int  I
 

increase the index vector nu at position I.

If nu[I]>Lambda[I], then an overflow is handled, i.e., nu[I+1] is increased. This procedure is needed by compute_intersections() for the branch-and-bound version of Algorithm (A1).

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::init_arrangement_cells  ) 
 

based on the occurences of the fragments s in the genomes T, compute the cells C[i] of the interval arrangement induced by the occurences in each T[i].

std::cout<<"*** "<<i<<"\n";

template<class w_bit_set>
double bbq::weighted_footprint_detector< w_bit_set >::p_value int  i,
int  j,
int  nu
 

Compute a first approximation to a p-value like measure of significance for the nu-th occurence of s[j] in T[i].

The log of this p-value is used for an advanced weighting scheme.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::print_h_results  ) 
 

print the best h solution found so far into stdout.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::print_postscript std::string  filename  ) 
 

print the result of the whole alignment procedure into a postscript file named filename.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::print_ps_h_results std::string &   ) 
 

print the best h results of the whole alignment procedure into a postscript file named filename.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::print_ps_labels std::string  filename  ) 
 

print labels of the current instance into a postscript file named filename.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::print_ps_single_result int *  nu,
std::ostream &  o,
w_bit_set &  intersection
 

print the solution specified by nu into a postscript file associated with o

template<class w_bit_set>
float bbq::weighted_footprint_detector< w_bit_set >::print_result  ) 
 

print the result of the whole alignment procedure into stdout.

template<class w_bit_set>
float bbq::weighted_footprint_detector< w_bit_set >::print_single_result int *  nu,
w_bit_set &  intersection,
const float &  weight
 

print the solution specified by intersection into stdout.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::print_with_names w_bit_set &   ) 
 

print the names of the sequences (or groups) contained in the argument passed.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::reverse_compl  ) 
 

compute the reverse complement of the patterns, so that in effect

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::run int  ,
float 
 

start the matching process

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_algorithm int  A  ) 
 

Choose the algorithm to be applied.

A=1 chooses (A1) (default), A=2 chooses (A2). All other values are ignored and leave the current method unchanged.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_count_results bool   ) 
 

switch between couting and printing results for thresholded instances.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_delta int   ) 
 

set the bound for enumerate_differences().

If set_delta() is called, matching will automatically be performed by enumerate_differences().

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_group_characters int   ) 
 

switch number of leading chars used for grouping of fragments on and off.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_grouping bool   ) 
 

switch grouping of fragments on and off.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_number_of_hits int   ) 
 

set the number h of hits to be computed, i.e., the best h (substantially different) solutions are computed.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_output_detail bool   ) 
 

pass true if more detailled output is desired.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_postscript bool   ) 
 

switch weighted intersections on and off.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_progress_report bool   ) 
 

switch printing of progress report during execution.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_ps_filename std::string   ) 
 

switch weighted intersections on and off.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_reverse bool   ) 
 

if set to true, the fragments s[k/2],...,s[k-1] are regarded as fragments on the 5' strand, while s[1],...,s[k/2-1] are regarded as fragments on the 3' strand.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_weight_type int  wt  ) 
 

choose a weighting scheme.

See weight_type for weighting schemes available.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::set_weighted bool   ) 
 

switch weighted intersections on and off.

template<class w_bit_set>
bool bbq::weighted_footprint_detector< w_bit_set >::test_diff_set w_bit_set &  M  ) 
 

test whether M setminus D is a suitable barbque.

template<class w_bit_set>
void bbq::weighted_footprint_detector< w_bit_set >::update_h_best_list const float &  wght,
int *  nu,
w_bit_set &  intersection
 

if a new hit was found that might belong to the list of the h best solutions, this procedure is called by compute_h_best_intersections().

If the new candidate is sufficiently different from the h best solutions found so far, the list of the h best solutions is updated by the new solution correspondingly.

template<class w_bit_set>
bool bbq::weighted_footprint_detector< w_bit_set >::w_bound const w_bit_set &  M,
const unsigned int &  j
 

bound function for the weighted version of the branch-and-bound procedure enumerate_subsets (Algorithm (A2))


The documentation for this class was generated from the following files:
bbq Implementation Documentation
Axel Mosig, Bioinformatics, University of Leipzig