pybarrnap is a Python implementation of barrnap (Bacterial ribosomal RNA predictor). It provides a CLI compatible with barrnap and a Python API for running rRNA prediction and retrieving the predicted rRNA. In default mode, pybarrnap depends only on Python libraries and does not require the external command-line tools nhmmer and bedtools. As a feature not found in barrnap, an accurate mode is available when the external command-line tool cmscan (infernal) is installed.
Note
Barrnap v0.9 uses an HMM profile database created from older releases of Rfam and SILVA, whereas pybarrnap default mode uses an HMM profile database created from Rfam (14.10). Results from Barrnap v0.9 and pybarrnap default mode will therefore differ somewhat.
Python 3.8 or later is required for installation.
pybarrnap depends on the pyhmmer and biopython Python libraries.
If accurate mode is required, please also install infernal.
Install PyPI package:
pip install pybarrnap
Install bioconda package:
conda install -c conda-forge -c bioconda pybarrnap
Use Docker (Image Registry):
docker run -it --rm ghcr.io/moshi4/pybarrnap:latest pybarrnap -h
pybarrnap genome.fna > genome_rrna.gff
$ pybarrnap --help
usage: pybarrnap [options] genome.fna[.gz] > genome_rrna.gff
Python implementation of barrnap (Bacterial ribosomal RNA predictor)
positional arguments:
fasta Input fasta file (or stdin)
optional arguments:
-e , --evalue E-value cutoff (default: 1e-06)
-l , --lencutoff Proportional length threshold to label as partial (default: 0.8)
-r , --reject Proportional length threshold to reject prediction (default: 0.25)
-t , --threads Number of threads (default: 1)
-k , --kingdom Target kingdom [bac|arc|euk|all] (default: 'bac')
kingdom='all' is available only when set with `--accurate` option
-o , --outseq Output rRNA hit seqs as fasta file (default: None)
-i, --incseq Include FASTA input sequences in GFF output (default: OFF)
-a, --accurate Use cmscan instead of pyhmmer.nhmmer (default: OFF)
-q, --quiet No print log on screen (default: OFF)
-v, --version Print version information
-h, --help Show this help message and exit
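The `--lencutoff` and `--reject` options are both described as proportional length thresholds, which can be read as fractions of the expected rRNA length. The sketch below illustrates that reading only; it is not pybarrnap's code. The helper name `classify_hit` and the 1585 bp model length are hypothetical, and the exact comparison and rounding pybarrnap applies are assumptions.

```python
def classify_hit(
    hit_len: int,
    model_len: int,
    lencutoff: float = 0.8,
    reject: float = 0.25,
) -> str:
    """Classify a hit by its length relative to the expected rRNA length.

    Hypothetical helper for illustration; the exact comparison and
    rounding used by pybarrnap may differ.
    """
    if model_len <= 0:
        raise ValueError("model_len must be positive")
    ratio = hit_len / model_len
    if ratio < reject:
        return "rejected"  # too short to report at all
    if ratio < lencutoff:
        return "partial"  # reported, but labelled as partial
    return "full"


# Assumed model length of 1585 bp, chosen only for this example
for hit_len in (300, 1000, 1500):
    print(hit_len, classify_hit(hit_len, 1585))
```

With the default thresholds, a hit shorter than a quarter of the expected length is dropped, and one shorter than 80% of it is kept but marked as partial.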
Tip
If the --accurate option is set, cmscan (infernal) is used for rRNA search instead of pyhmmer.nhmmer.
Although cmscan is slower than pyhmmer.nhmmer, it is expected to give more accurate results because it performs rRNA searches using RNA secondary structure profiles.
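Because accurate mode needs an external tool and `kingdom='all'` is only available with `--accurate`, a caller may want to check the combination before starting a run. The following is a minimal sketch of such a pre-flight check. `check_mode` is a hypothetical helper, not part of pybarrnap, and how pybarrnap itself reports these situations is not covered here.

```python
import shutil
from typing import Callable, Optional

VALID_KINGDOMS = ("bac", "arc", "euk", "all")


def check_mode(
    kingdom: str,
    accurate: bool,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> None:
    """Raise ValueError if the requested options cannot work together.

    Hypothetical helper; `which` is injectable so the PATH lookup
    can be replaced in tests.
    """
    if kingdom not in VALID_KINGDOMS:
        raise ValueError(f"kingdom must be one of {VALID_KINGDOMS}")
    if kingdom == "all" and not accurate:
        raise ValueError("kingdom='all' requires accurate mode")
    if accurate and which("cmscan") is None:
        raise ValueError("accurate mode needs cmscan (infernal) on PATH")


# Default mode with the bacterial kingdom needs no external tool
check_mode("bac", accurate=False)
```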
Click here to download examples dataset.
Print rRNA prediction result on screen
pybarrnap examples/bacteria.fna
Output rRNA prediction result to file
pybarrnap examples/archaea.fna -k arc --outseq rrna.fna --incseq > rrna_incseq.gff
Read input from stdin via a pipe
cat examples/fungus.fna | pybarrnap -q -k euk | grep 28S
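The `grep 28S` step above can also be done in Python when the GFF text needs further processing. This is a minimal sketch using only the standard library. The sample lines and the `Name` and `product` attribute keys are assumptions modelled on barrnap-style output, and `iter_gff_features` is a hypothetical helper.

```python
from typing import Dict, Iterator


def iter_gff_features(gff_text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per 9-column GFF feature line (hypothetical helper)."""
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # embedded sequences follow (as with --incseq); stop here
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a feature line
        attrs = dict(
            item.split("=", 1) for item in cols[8].split(";") if "=" in item
        )
        yield {
            "seqid": cols[0],
            "start": cols[3],
            "end": cols[4],
            "strand": cols[6],
            **attrs,
        }


# Illustrative GFF text; real pybarrnap output may differ in detail
SAMPLE_GFF = (
    "##gff-version 3\n"
    "contig1\tpybarrnap\trRNA\t100\t1600\t0\t+\t.\t"
    "Name=18S_rRNA;product=18S ribosomal RNA\n"
    "contig1\tpybarrnap\trRNA\t2000\t5300\t0\t+\t.\t"
    "Name=28S_rRNA;product=28S ribosomal RNA\n"
)

hits_28s = [
    feat
    for feat in iter_gff_features(SAMPLE_GFF)
    if feat.get("Name", "").startswith("28S")
]
print(hits_28s)
```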
pybarrnap provides a simple API for running rRNA prediction and retrieving the predicted rRNA.
from pybarrnap import Barrnap
from pybarrnap.utils import load_example_fasta_file
# Get example fasta file path
fasta_file = load_example_fasta_file("bacteria.fna")
# Run pybarrnap rRNA prediction
barrnap = Barrnap(
    fasta_file,
    evalue=1e-6,
    lencutoff=0.8,
    reject=0.25,
    threads=1,
    kingdom="bac",
    accurate=False,
    quiet=False,
)
result = barrnap.run()
# Output rRNA GFF file
result.write_gff("bacteria_rrna.gff")
# Output rRNA GFF file (Include input fasta sequence)
result.write_gff("bacteria_rrna_incseq.gff", incseq=True)
# Output rRNA fasta file
result.write_fasta("bacteria_rrna.fna")
# Get rRNA GFF text and print
print("\n========== Print rRNA GFF ==========")
print(result.get_gff_text())
# Get rRNA features and print
print("\n========== Print rRNA features ==========")
for rec in result.seq_records:
    for feature in rec.features:
        print(feature.id, feature.type, feature.location, feature.qualifiers)
# Get rRNA sequences and print
print("\n========== Print rRNA sequences ==========")
for rec in result.get_rrna_seq_records():
    print(f">{rec.id}\n{rec.seq}")
pybarrnap was reimplemented in Python based on the Perl implementation of Barrnap v0.9. The HMM (Hidden Markov Model) and CM (Covariance Model) profile databases for pybarrnap were created from Rfam (14.10).