TMM on a targeted RNA-Seq dataset
2.6 years ago
oakhamwolf ▴ 20

Hi all

Hope someone can help with this.

We have some targeted RNA-seq data as part of a qualitative pilot study we're doing. We're not looking to do DE on this data and do not have replicates.

While we've done some simple within-sample, transcript-level comparisons using Salmon TPM values, we would also like to do some between-sample comparisons to assess the levels of particular transcripts across the cell line samples.

Ordinarily I'd be bringing these into R, computing TMM and looking at the normalised CPM values. However, given this is effectively a subset of the transcriptome, I was wondering whether this method is still appropriate in this context, and whether there is anything I need to keep in mind when computing TMM on transcript-level counts?

Many thanks in advance for any help provided.

Cheers

tmm edger Targeted_rnaseq • 1.1k views
2.6 years ago
ATpoint 85k

You need to run the TMM calculation on a set of genes that you are confident are non-DE. The edgeR procedure is a two-step process: first you normalize by total library size, then you correct this with the TMM factors. This is all done internally, e.g. when running calcNormFactors() and then cpm(). The total-library part is the same whether the data are targeted or full-transcriptome, but the TMM part should be based on non-DE genes.

In a full-transcriptome setting TMM tries to find these genes automatically by trimming away genes with extreme M values, and it is good at doing that as long as there are no extreme shifts in your data and a good portion of the genes are non-DE. In targeted approaches that is not guaranteed.

Are there any controls in there that you can run it on? Technically you would run calcNormFactors() on the count matrix containing only the control genes and then feed the factors back to the full DGEList object. You can also just run it on the whole thing and then make an MA-plot, checking whether individual genes that are supposed to be non-DE center roughly at y=0.
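To make the two-step idea concrete, here is a minimal sketch of the arithmetic in plain Python. It is not edgeR's implementation: the real TMM also trims on A values and uses precision weights, whereas this version only takes an unweighted trimmed mean of the control genes' M values. The gene names, the counts and the helper names (`trimmed_mean`, `control_norm_factors`, `normalized_cpm`) are all made up for illustration, and the library sizes are assumed to be the totals over all genes, not just the controls.

```python
import math

def trimmed_mean(values, trim=0.3):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    if not values:
        raise ValueError("no usable control genes (all had zero counts)")
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k] or ordered
    return sum(kept) / len(kept)

def control_norm_factors(counts, control_genes, ref=0, trim=0.3):
    """Scaling factors per sample, estimated from control genes only.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    Library sizes are the column totals over ALL genes (an assumption here).
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    factors = []
    for j in range(n_samples):
        m_values = []
        for gene in control_genes:
            y, r = counts[gene][j], counts[gene][ref]
            if y > 0 and r > 0:  # log ratios are undefined for zero counts
                m_values.append(math.log2((y / lib_sizes[j]) / (r / lib_sizes[ref])))
        factors.append(2 ** trimmed_mean(m_values, trim))
    # Rescale so the factors multiply to 1, as is conventional for such factors.
    geo_mean = math.exp(sum(math.log(f) for f in factors) / len(factors))
    return lib_sizes, [f / geo_mean for f in factors]

def normalized_cpm(counts, lib_sizes, factors):
    """CPM using the effective library size (library size times factor)."""
    return {
        gene: [c / (n * f) * 1e6 for c, n, f in zip(row, lib_sizes, factors)]
        for gene, row in counts.items()
    }

# Toy data: three hypothetical control genes and one target, two samples.
counts = {
    "ctrl1": [100, 200],
    "ctrl2": [50, 100],
    "ctrl3": [80, 160],
    "target": [770, 540],
}
lib_sizes, factors = control_norm_factors(counts, ["ctrl1", "ctrl2", "ctrl3"])
cpm = normalized_cpm(counts, lib_sizes, factors)
```

In this toy case both samples have the same total count, so plain CPM would make the controls look twofold different. Dividing by the effective library size (total times factor) brings the controls back to equal values, and the target is then compared on that scale.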


I see. Thank you. That makes a lot of sense. This experiment is really just a look-see at the expression of a bunch of transcripts across a bunch of cell lines. There aren't any control transcripts in there that we are confident won't change across cell lines. However, we have some ERCCs in the mix, so provided they have behaved themselves (they haven't in other runs, so we haven't looked at them yet for this one) we could use them as the non-DE "genes". I might take the MA-plot route too, though, and see what we have. Thanks again for your time. It is very much appreciated, and including control genes in the next run will be a must. Cheers
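For the MA-plot route, the check boils down to computing M (log ratio) and A (average log abundance) for a pair of samples and looking at where the spike-ins sit. A rough sketch in plain Python follows, with no plotting. The CPM values are invented, the `ERCC-` ID prefix is assumed to match how the spike-ins are named in the quantification output, and `ma_values` is a hypothetical helper, not a library function.

```python
import math
from statistics import median

def ma_values(x, y, pseudo=0.5):
    """Per-gene (M, A) for two samples given as dicts of gene -> normalised CPM.

    M = log2(y) - log2(x), A = mean of the two log2 values.
    A small pseudo-count avoids log2(0); 0.5 is an arbitrary choice.
    """
    out = {}
    for gene in x:
        lx = math.log2(x[gene] + pseudo)
        ly = math.log2(y[gene] + pseudo)
        out[gene] = (ly - lx, (lx + ly) / 2)
    return out

# Invented normalised CPM values for two cell lines.
cpm_a = {"ERCC-00002": 1000.0, "ERCC-00004": 250.0, "ERCC-00009": 60.0, "TX1": 400.0}
cpm_b = {"ERCC-00002": 1040.0, "ERCC-00004": 240.0, "ERCC-00009": 62.0, "TX1": 1600.0}

ma = ma_values(cpm_a, cpm_b)

# Spike-ins are expected to be non-DE, so their M values should centre near 0.
control_m = [m for gene, (m, a) in ma.items() if gene.startswith("ERCC-")]
control_shift = median(control_m)
```

If `control_shift` sits clearly away from zero, or the spike-in M values are widely scattered, that would suggest the normalisation (or the spike-ins themselves) should not be trusted for that pair of samples.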


Also, how can I accept your comment as an answer? Thank you.


moved to answer, glad it helped you.
