Accurately detecting functional genes in metagenomes.
For installation instructions, see INSTALL.md. For license information, see LICENSE.txt.
Once you have installed ROCker, the easiest way to use it is to search your metagenomes with pre-existing models. We maintain a list of precomputed models that you're free to use.
1. Obtain the model of interest, either by downloading it from our repository or by creating one yourself (see below).
2. Execute ROCker search. The minimum required parameters are:
   $> ROCker search -q input.fasta -k model.rocker -o output.blast
   Where input.fasta is the input metagenome in FastA format, model.rocker is the ROCker model, and output.blast is the output file to be created in tabular BLAST format. For additional supported options, execute ROCker search -h.
3. If you have a pre-computed BLAST file, you can execute instead:
   $> ROCker filter -x input.blast -k model.rocker -o output.blast
   Where input.blast is the input search to be filtered in tabular BLAST format, model.rocker is the ROCker model, and output.blast is the output file to be created in tabular BLAST format. For additional supported options, execute ROCker filter -h.
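Once a search or filter run has finished, a quick tally of hits per reference sequence can help confirm that the output looks sensible. The following is a minimal sketch in Python and is not part of ROCker. It assumes the output follows the common tab-separated BLAST layout, with the query ID in the first column and the subject (reference) ID in the second; the function name count_hits_per_subject is ours, chosen for illustration.

```python
from collections import Counter


def count_hits_per_subject(lines):
    """Count hits per subject (reference) ID in tabular BLAST lines.

    Assumption: columns are tab-separated, with the query ID first and
    the subject ID second. Blank lines and '#' comment lines are skipped.
    `lines` can be an open file handle or any iterable of strings.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"not a tabular BLAST line: {line!r}")
        counts[fields[1]] += 1
    return counts


# Usage sketch (assumes output.blast exists in the working directory):
# with open("output.blast") as handle:
#     for subject, n in count_hits_per_subject(handle).most_common(10):
#         print(subject, n, sep="\t")
```

Running the same tally on input.blast and on the filtered output.blast gives a rough view of how many hits the model discarded per reference.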
Start by assembling a good reference collection for the gene of interest. This is the most important step, but there are some resources to help you. In general, we find the resources at UniProt very useful.
1. Create a list of UniProt identifiers (IDs and/or accessions) representing proteins of the family of interest, in a raw text file (one per line).
2. If you want to explicitly exclude certain proteins from the model (e.g., if there are very similar proteins with distinct functional properties), create a similar list of those proteins. We refer to this list as the negative set; it is optional.
3. Build the model files. The minimum required parameters are:
   $> ROCker build -P positive.txt -o prep
   Where positive.txt is the set from step 1, and prep is the base name for the output files. You can also pass the negative set from step 2 using -N (or -n). For additional supported options, execute ROCker build -h. This is by far the most computationally expensive step, so you might want to consider using multiple threads (-t) or even re-using files in case the run fails (--reuse-files and --nocleanup). Also, consider setting the simulated read length to match that of your metagenomes (-l).
4. Compile the model. The minimum required parameters are:
   $> ROCker compile -a prep.aln -b prep.blast -k model.rocker
   Where prep.aln is the alignment generated in step 3 (manual curation is strongly encouraged), prep.blast is the reference BLAST generated in step 3, and model.rocker is the model to compile.
5. Register your model (optional). If you would like to share your model with the community, please contact us. We'll need the final ROCker model and the reference BLAST, and will add your model to our curated list.
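Because the build step is expensive, it can be worth checking the identifier lists before launching it. The sketch below is a hypothetical Python helper, not part of ROCker: it deduplicates the one-per-line lists, refuses an empty positive set or one that overlaps the negative set, and assembles the build and compile commands shown above as argument lists. The function names are ours. It assumes that -N takes a file path in the same way as -P, and that the build outputs are named after the base name (prep.aln and prep.blast for the base name prep), as in the examples above.

```python
def read_id_list(text):
    """Return the unique, non-empty identifiers in `text` (one per line),
    preserving their original order."""
    seen = set()
    ids = []
    for raw in text.splitlines():
        ident = raw.strip()
        if ident and ident not in seen:
            seen.add(ident)
            ids.append(ident)
    return ids


def check_sets(positive, negative=()):
    """Raise ValueError if the positive set is empty or shares
    identifiers with the negative set."""
    if not positive:
        raise ValueError("the positive set is empty")
    shared = sorted(set(positive) & set(negative))
    if shared:
        raise ValueError("identifiers in both sets: " + ", ".join(shared))


def build_and_compile_commands(positive_file, base, model, negative_file=None):
    """Assemble the 'ROCker build' and 'ROCker compile' argument lists.

    Assumptions: -N accepts a file path like -P does, and the build step
    writes <base>.aln and <base>.blast.
    """
    build = ["ROCker", "build", "-P", positive_file, "-o", base]
    if negative_file is not None:
        build += ["-N", negative_file]
    compile_cmd = [
        "ROCker", "compile",
        "-a", base + ".aln",
        "-b", base + ".blast",
        "-k", model,
    ]
    return build, compile_cmd
```

Each returned list can be passed to subprocess.run(..., check=True) from the Python standard library. Running the two commands separately leaves room to curate the alignment by hand between the build and compile steps.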