Error in "Per Tile Sequence Quality"
4 months ago
Umer ▴ 120

Hi,

Background: We got WGS data of fungal samples.

Coverage: 100x

Platform: Illumina NovaSeq X (paired-end, 150 bp)

Goal: De novo genome assemblies

In the initial FastQC analysis of the raw data, everything seems to be really good.

      Total_Sequences     Total_Bases    Seq_Length     GC%
Read1    90264516           13.5 Gbp         150         47
Read2    90264516           13.5 Gbp         150         48

The Per Base Sequence Quality plot looks really good: all the bases are in the green zone, and the blue line (the mean quality in FastQC) stays between 38 and 40 across the graph (from base 1 to 150). You can look at the image here.

Problem 1: For the next graph, Per Tile Sequence Quality, the FastQC report shows a failure (red cross). The mostly blue heatmap has some red streaks. You can look at the image here. After looking at QC Fail, I found out that this is related to flow-cell issues, here.

Questions:

  1. Is there anything that should be done about this when performing data QC with fastp or some other tool?
  2. For de novo genome assembly and reference-guided genome assembly, is it important to remove reads from these regions, or can I just ignore these errors and move on with downstream analysis?

Your help and views on this are welcome.

Illumina NOvaSeqx genome-assembly FastQC • 258 views

Do not lose a lot of sleep over this. There should be enough data from the blue regions for the assemblies. But if you must, you could scan for and remove reads whose quality scores fall below those represented by the blue regions (that is, reads from the poorly performing tiles).
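If you do want to try the scan-and-remove route, here is a minimal Python sketch of the idea. It assumes standard Illumina read names (`@INSTR:RUN:FLOWCELL:LANE:TILE:X:Y ...`, so lane and tile are the 4th and 5th colon-separated fields) and Phred+33 qualities. The function names and the `max_drop` cutoff are made up for illustration, and this is a simplification of what FastQC does (FastQC compares tiles per base position, while this compares whole-read means per tile).

```python
from collections import defaultdict


def parse_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of FASTQ lines."""
    it = iter(lines)
    for header in it:
        seq = next(it)
        next(it)  # skip the '+' separator line
        qual = next(it)
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def tile_of(header):
    """Return (lane, tile) from an Illumina-style read name (assumed format)."""
    fields = header.split(" ")[0].split(":")
    return fields[3], fields[4]


def mean_phred(qual, offset=33):
    """Mean Phred score of one read, assuming Phred+33 encoding."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def per_tile_stats(records):
    """Map (lane, tile) -> [sum of read mean qualities, number of reads]."""
    stats = defaultdict(lambda: [0.0, 0])
    for header, _seq, qual in records:
        entry = stats[tile_of(header)]
        entry[0] += mean_phred(qual)
        entry[1] += 1
    return stats


def flag_tiles(stats, max_drop=5.0):
    """Flag tiles whose mean quality is more than max_drop below the overall mean.

    max_drop is an arbitrary illustrative cutoff, not a FastQC setting.
    """
    total = sum(s for s, _n in stats.values())
    count = sum(n for _s, n in stats.values())
    overall = total / count
    return {tile for tile, (s, n) in stats.items() if s / n < overall - max_drop}


def drop_flagged(records, bad_tiles):
    """Yield only the records that do not come from a flagged tile."""
    for rec in records:
        if tile_of(rec[0]) not in bad_tiles:
            yield rec


# Tiny in-memory example: three good reads on tile 1101, one poor read on tile 1102.
example = [
    "@A:1:FC:1:1101:10:10 1:N:0:ACGT", "ACGT", "+", "IIII",
    "@A:1:FC:1:1101:11:10 1:N:0:ACGT", "ACGT", "+", "IIII",
    "@A:1:FC:1:1101:12:10 1:N:0:ACGT", "ACGT", "+", "IIII",
    "@A:1:FC:1:1102:10:10 1:N:0:ACGT", "ACGT", "+", "####",
]
records = list(parse_fastq(example))
bad = flag_tiles(per_tile_stats(records))
kept = list(drop_flagged(records, bad))
print(sorted(bad), len(kept))
```

For paired-end data, the same set of flagged tiles should be applied to both Read1 and Read2 so that the pairs stay in sync. On a real run you would stream the (decompressed) FASTQ files through `parse_fastq` rather than holding everything in a list.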
