Question

How to find the adapter sequence of an RNAseq dataset?

5 hours ago

Maryam • 0

I am analyzing an RNA-seq dataset (GSE174302) produced on the HiSeq X Ten platform (GPL20795) for Homo sapiens. The FastQC report shows a failed Adapter Content check, indicating that adapter sequences may be present in the reads. I would like to trim the adapter sequences but am unable to identify them.

The overrepresented sequences in the FastQC report all show "No Hit".

I read up on the HiSeq X Ten platform and found that HiSeq X Reagent Kits support the TruSeq DNA PCR-Free Library Preparation Kit. I then looked for the adapter sequence in the Illumina adapter sequence sheet.

However, none of the adapters there were identical to the overrepresented sequences in the FastQC report.

Is there a way to determine the adapter sequence?

fastQC trimming adapter RNAseq • 70 views

updated 1 hour ago by GenoMax 146k • written 5 hours ago by Maryam • 0

score 1 · Answer 1 · 2024-11-01

It looks like you are running this analysis in Galaxy, so the following may not help immediately. You can always post to the Galaxy help forum for specific help: https://help.galaxyproject.org/

A "failure" on a FastQC test does not mean that you can't proceed with further data analysis. It looks like you have standard Illumina adapters, which can be easily identified and trimmed by fastp (LINK). You could also use bbduk.sh from the BBMap suite, which provides a file of commonly used adapter sequences in the "resources" folder of the software distribution.
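If you want to confirm the adapter type yourself before trimming, a rough check is to count how many reads contain the 13-base sequence that FastQC lists as "Illumina Universal Adapter". The sketch below is only an illustration, not how fastp or bbduk.sh work internally. The function names and the `max_reads` cutoff are my own assumptions, and it only looks for exact matches.

```python
import gzip
from itertools import islice

# Assumption: the short sequence FastQC reports as "Illumina Universal Adapter",
# shared by the start of TruSeq-style read-through adapters.
ILLUMINA_UNIVERSAL = "AGATCGGAAGAGC"


def read_sequences(path):
    """Yield the sequence line of each 4-line FASTQ record (plain or .gz)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of every record is the sequence
                yield line.strip()


def adapter_fraction(path, adapter=ILLUMINA_UNIVERSAL, max_reads=100_000):
    """Return the fraction of the first max_reads reads containing the adapter."""
    total = 0
    hits = 0
    for seq in islice(read_sequences(path), max_reads):
        total += 1
        if adapter in seq:  # exact substring match only, no mismatches allowed
            hits += 1
    return hits / total if total else 0.0
```

Calling `adapter_fraction("sample_R1.fastq.gz")` (a hypothetical file name) returns a value between 0 and 1. A clearly non-zero fraction suggests standard Illumina adapters, which the trimming tools mentioned above should handle.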

none of the adapters there were identical to the overrepresented sequences in the FastQC report.

Overrepresented sequences do not need to be adapter sequences. They could be parts of genes or other segments that are present in high copy number in the starting material.
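To sort the overrepresented sequences into "probably adapter" and "probably biological", you could test each one against the adapter sequence in both orientations. This is a minimal sketch under my own assumptions: the helper names are hypothetical, the adapter is the 13-base sequence FastQC calls "Illumina Universal Adapter", and only exact matches are considered.

```python
# Assumption: the sequence FastQC reports as "Illumina Universal Adapter".
ILLUMINA_UNIVERSAL = "AGATCGGAAGAGC"

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def looks_like_adapter(seq, adapter=ILLUMINA_UNIVERSAL):
    """True if seq contains the adapter or its reverse complement exactly."""
    seq = seq.upper()
    return adapter in seq or reverse_complement(adapter) in seq
```

Sequences for which `looks_like_adapter` returns False are candidates for highly abundant transcripts rather than adapters. Searching them with BLAST is the usual way to find out what they are.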