A general-purpose program to manipulate and parse information from FASTA/FASTQ files, supporting gzipped input files. Includes functions to interleave and de-interleave FASTQ files, to rename sequences and to count and print statistics on sequence lengths.
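To make "interleaving" concrete, here is a minimal sketch using only standard POSIX tools. It is not SeqFu's implementation and does not use SeqFu's command line. The function name `interleave_fastq` is hypothetical, and the sketch assumes uncompressed input, exactly four lines per record, and the same pair order in both files. SeqFu itself is the better choice for real data, since it handles gzipped input.

```shell
# Illustration only: interleave two paired FASTQ files (R1, R2) into one stream.
# Assumptions: plain-text input, four lines per record, pairs in the same order.
interleave_fastq() {
  # $1 = forward reads (R1), $2 = reverse reads (R2)
  awk -v r2="$2" '
    {
      print                      # emit the current R1 line
      if (NR % 4 == 0) {         # a full R1 record has just been printed
        for (i = 0; i < 4; i++) {
          # emit the four lines of the matching R2 record
          if ((getline line < r2) > 0) print line
        }
      }
    }
  ' "$1"
}
```

Calling `interleave_fastq reads_R1.fq reads_R2.fq > interleaved.fq` (file names are placeholders) writes each forward record followed immediately by its mate.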
SeqFu can be installed via Miniconda:
conda install -y -c conda-forge -c bioconda "seqfu>1.10"
Building the Nim programs alone only requires running nimble build, but this would leave out some other utilities. The repository also provides a make-based build system (Makefile). Since Nim is not so popular, I describe a full installation from scratch:
# Do you have build tools? You will need a C compiler and make; on Ubuntu:
sudo apt install build-essential
# Install zlib
sudo apt install zlib1g-dev
# Install Nim 2.0
curl https://nim-lang.org/choosenim/init.sh -sSf | sh
# Clone this repo
git clone https://github.com/telatin/seqfu2
# Compile and test
cd seqfu2
make
make test
# All binaries are in bin (move them to a location in your $PATH)
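The last step can be sketched as follows. This is one possible approach, not a procedure prescribed by the project: the helper name `install_binaries` is hypothetical, and `$HOME/.local/bin` in the usage line is an assumed destination (any directory on your `$PATH` works).

```shell
# Copy every file from a build directory into a destination directory
# and make sure that destination is on PATH for the current shell session.
install_binaries() {
  src_dir="$1"    # e.g. the bin directory produced by make
  dest_dir="$2"   # assumed install location, chosen by the user
  mkdir -p "$dest_dir"
  cp "$src_dir"/* "$dest_dir"/
  case ":$PATH:" in
    *":$dest_dir:"*) ;;                          # already on PATH, nothing to do
    *) PATH="$dest_dir:$PATH"; export PATH ;;    # prepend for this session only
  esac
}
```

From inside the cloned repository, `install_binaries bin "$HOME/.local/bin"` would copy the compiled programs. To make the `PATH` change permanent, add the corresponding export line to your shell startup file.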
Telatin A, Fariselli P, Birolo G. SeqFu: A Suite of Utilities for the Robust and Reproducible Manipulation of Sequence Files. Bioengineering 2021, 8, 59. doi.org/10.3390/bioengineering8050059
@article{seqfu,
title = {SeqFu: A Suite of Utilities for the Robust and Reproducible Manipulation of Sequence Files},
author = {Telatin, Andrea and Fariselli, Piero and Birolo, Giovanni},
year = 2021,
journal = {Bioengineering},
volume = 8,
number = 5,
doi = {10.3390/bioengineering8050059},
issn = {2306-5354},
url = {https://www.mdpi.com/2306-5354/8/5/59},
article-number = 59,
pubmedid = 34066939
}
The full documentation is available at: telatin.github.io/seqfu2