3. Quick Start
3.1. Input files
The most important input file is a DNA sequence. This can be a multi-FASTA
file whose sequences belong to a single species (e.g. a complete genome or a
group of particular sequences). Before running miRNAture, you also have to
download a pre-calculated dataset (as indicated in the Pre-calculated datasets section),
which contains default data such as CMs, HMMs, and the files required to perform mature prediction.
Once the dataset is stored on your computer, indicate its path with the flag -dataF.
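Before launching a run, it can help to confirm that both inputs are in place. The following is a minimal sketch, not part of miRNAture: the function names count_fasta_records and check_inputs are hypothetical helpers, and the only assumption about the dataset is that it is a folder on disk.

```shell
# Count FASTA records: each record starts with a header line beginning with '>'.
count_fasta_records() {
    # grep -c prints 0 and returns a non-zero status when nothing matches;
    # '|| true' keeps that case from being treated as an error.
    grep -c '^>' "$1" || true
}

# Check the two inputs described above.
# Arguments: <genome multi-FASTA> <pre-calculated dataset folder>
check_inputs() {
    [ -s "$1" ] || { echo "missing or empty sequence file: $1" >&2; return 1; }
    [ -d "$2" ] || { echo "missing dataset folder: $2" >&2; return 1; }
    [ "$(count_fasta_records "$1")" -gt 0 ] || { echo "no FASTA headers in: $1" >&2; return 1; }
}
```

For example, check_inputs <Target Genome> <Precalculated_folder> returns a non-zero status and prints the reason if either input is unusable.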
3.1.1. Activate the mirnature environment
conda activate mirnature
3.1.2. Run miRNAture
The complete mode is run as follows:
./miRNAture -stage complete -dataF <Precalculated_folder> \
-speG <Target Genome> -speN <Specie_name> \
-speT <Tag_specie> -w <Output_dir> \
-m <Mode> (-str <Blast_strategy>) \
-blastq <Blast_queries_folder>
It is always recommended to review the specific parameters for each run; do not rely on the default parameters for all experiments.
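When the same command is reused across several genomes, one option is to assemble it from variables and inspect it before running. The sketch below is a hypothetical helper (build_mirnature_cmd is not part of miRNAture) that only prints the command line using the flags shown above. It assumes that the parentheses around -str mean the flag is optional, and it does not quote values, so paths containing spaces would need extra care.

```shell
# Print a miRNAture "complete" command line built from the given values.
# Arguments: data_dir genome species_name species_tag out_dir mode blast_queries [blast_strategy]
# The body runs in a subshell, so str_part does not leak into the caller.
build_mirnature_cmd() (
    str_part=""
    # Assumption: -str is optional, so it is added only when an 8th argument is given.
    if [ -n "${8:-}" ]; then
        str_part=" -str $8"
    fi
    printf './miRNAture -stage complete -dataF %s -speG %s -speN %s -speT %s -w %s -m %s%s -blastq %s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$str_part" "$7"
)
```

The printed line can be reviewed and then pasted into the terminal once the mirnature environment is active.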
3.1.3. Output files
The final predicted miRNAs are written to the <Output_dir> indicated with the -w flag.
The final candidates are stored in the folder Final_miRNA_evaluation/, which is
organized as follows:
Final_miRNA_evaluation/
├── Fasta/
├── MFE/
├── miRNA_annotation_<Tag_specie>_accepted_conf.bed
├── miRNA_annotation_<Tag_specie>_accepted_conf.gff3
├── miRNAture_summary_<Tag_specie>.txt
└── Tables/
Inside this folder, miRNAture creates three subfolders with the corresponding
results: sequences in FASTA format (Fasta/), minimum free energies and
lengths of the reported sequences (MFE/), and supporting information for each
annotated candidate, organized in tables (Tables/). Additionally, the genomic positions
of the miRNA candidates are reported in BED and GFF3 formats, and a summary file,
miRNAture_summary_<Tag_specie>.txt, provides overall descriptive statistics for the miRNA
families found.
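After a run finishes, the layout described above can be checked with a few lines of shell. This is a minimal sketch: check_mirnature_outputs is a hypothetical helper, not a miRNAture command, and it only tests that the listed entries exist, not that their contents are valid.

```shell
# Report which expected entries are missing from Final_miRNA_evaluation/.
# Arguments: <Output_dir> <Tag_specie>
# The body runs in a subshell, so variables stay local and 'exit' only ends the function.
check_mirnature_outputs() (
    final="$1/Final_miRNA_evaluation"
    tag="$2"
    missing=0
    for entry in \
        "Fasta" "MFE" "Tables" \
        "miRNA_annotation_${tag}_accepted_conf.bed" \
        "miRNA_annotation_${tag}_accepted_conf.gff3" \
        "miRNAture_summary_${tag}.txt"
    do
        if [ ! -e "$final/$entry" ]; then
            echo "missing: $final/$entry" >&2
            missing=1
        fi
    done
    # Status 0 when everything is present, 1 otherwise.
    exit "$missing"
)
```

Calling check_mirnature_outputs <Output_dir> <Tag_specie> prints one line per missing entry and returns a non-zero status if any is absent.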
3.1.4. Pre-calculated datasets
Pre-calculated data, composed of miRNA CMs, HMMs and the input files required to perform mature annotation, has
to be downloaded before running the full miRNAture pipeline. Available datasets are listed below:
NEW: Curated metazoan families from miRBase v.22.1, available for the structural validation stage.
Required data to re-annotate human miRNAs: includes CMs and HMMs built from miRBase without human sequences. Stored in Zenodo here.