Question

DESeq2 lfcShrink: apeglm vs. normal shrinkage

12 months ago

joonhong kwon ▴ 70

Hi all,

I have some questions about the lfcShrink() function in DESeq2.

When I ran DESeq2 without shrinkage, some fold changes were overestimated (log2FC of about 30), so we are trying to apply the shrinkage methods in DESeq2. The DESeq2 vignette introduces three methods (apeglm, ashr, normal), and I would like to use the 'normal' method, because it gives fold changes closer to what we expect.

But people, including the developer (Michael Love), don't seem to recommend the 'normal' method. I know that apeglm is the better method, but with apeglm the log2FC values are almost 0. My understanding is that apeglm is the recommended method, but that this does not mean the normal method should never be used.

Is using the 'normal' method wrong in itself?
Is there a way to get more moderate FC values (values that are not close to 0) while using the 'apeglm' method?

Thanks in advance,

Joonhong

DESeq2 RNA-seq DEG shrinkage • 1.5k views

Hi,

Sorry, what do you mean by:

the Log2FC value is almost 0 in the apeglm method

Shrinkage aims to deal with the "inflation" of the log2 fold change for lowly expressed genes. It should therefore strongly affect the log2 fold changes of lowly expressed genes and have much less effect on those of highly expressed genes.

Take this example:

log2(0.2/0.02)=3.321928 # lowly expressed genes

log2(200/50)=2 # highly expressed genes

Which one are you more confident about?

You can be less confident in the quantification of lowly expressed genes than in that of highly expressed ones. Small changes in lowly expressed genes have a big effect on the log2 fold change. Shrinkage attempts to control for this.
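To make the point concrete, here is a minimal Python sketch that reuses the two ratios from the example above and adds the same small amount of noise to the denominator of each. The noise value (0.05) and the helper name log2_fold_change are assumptions chosen purely for illustration; this is plain arithmetic, not a DESeq2 calculation.

```python
import math


def log2_fold_change(treated, control):
    """Plain log2 ratio of two expression values (hypothetical helper)."""
    return math.log2(treated / control)


# The two ratios from the example above.
low = log2_fold_change(0.2, 0.02)   # lowly expressed gene
high = log2_fold_change(200, 50)    # highly expressed gene

# Add the same small absolute noise to the control value of each gene.
noise = 0.05  # assumed value, for illustration only
low_noisy = log2_fold_change(0.2, 0.02 + noise)
high_noisy = log2_fold_change(200, 50 + noise)

print(f"low:  {low:.3f} -> {low_noisy:.3f}")
print(f"high: {high:.3f} -> {high_noisy:.3f}")
```

The lowly expressed gene's log2 fold change moves by more than one unit, while the highly expressed gene's barely changes, which is why the first estimate deserves less trust.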

I hope this helps to clarify.

Best,

António

Replied 12 months ago by antonioggsousa 3.2k

Thank you for your reply.

I applied DESeq2 to a pseudobulk DEG analysis of snRNA-seq data. However, in some clusters the fold change is estimated close to 0 even when the average expression is high (see figure, cluster 2). The fold changes of all genes in cluster 2 are close to 0.

Comparing the unshrunken, normal, and apeglm results, I think the fold changes were corrected appropriately in cluster 1, but shrunk very aggressively in cluster 2.

Are these results reliable?

[Figure: image not available]

Thank you!

Joonhong

Replied 12 months ago by joonhong kwon ▴ 70

Two takeaways:

1. A high mean, i.e., baseMean, does not imply a high log2 fold change. As far as I remember, baseMean is calculated across all samples, so a gene can be highly expressed in both of the conditions you are comparing and therefore have a high baseMean but a low log2 fold change.
2. I think the difference between your cluster 1 and cluster 2 is the adjusted p-value, i.e., padj. In cluster 1 it is significant for all genes except the first and, if you look at the log2 fold changes after apeglm shrinkage, they are very similar to the unshrunken values. In cluster 2, on the other hand, padj is not significant for any of the genes, and the shrinkage is more aggressive because the confidence in those log2 fold change estimates is lower.
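Both takeaways can be illustrated with a small Python sketch. It is a toy model under loudly stated assumptions: the counts, standard errors, and prior width are invented, and the shrinkage formula is the textbook posterior mean for a normal likelihood with a zero-centred normal prior. It is not the computation that lfcShrink() performs (apeglm in particular uses a heavier-tailed prior), but it shows the same qualitative behaviour: the less precise the estimate, the harder it is pulled towards 0.

```python
import math
from statistics import mean

# Takeaway 1: a high mean across all samples with almost no fold change.
# Invented normalized counts for one gene, two samples per condition.
control = [1000, 1100]
treated = [1050, 1020]
base_mean = mean(control + treated)              # mean over all samples
lfc = math.log2(mean(treated) / mean(control))   # close to 0


def shrink_normal(lfc_hat, se, prior_sd):
    """Posterior mean for a normal likelihood and a N(0, prior_sd^2) prior.

    Toy stand-in for shrinkage, not the DESeq2 or apeglm estimator.
    """
    weight = prior_sd**2 / (prior_sd**2 + se**2)
    return lfc_hat * weight


# Takeaway 2: same unshrunken estimate, different precision.
prior_sd = 1.0  # assumed prior width
precise = shrink_normal(2.0, se=0.2, prior_sd=prior_sd)  # stays near 2
noisy = shrink_normal(2.0, se=2.0, prior_sd=prior_sd)    # pulled towards 0

print(f"baseMean = {base_mean:.1f}, log2FC = {lfc:.3f}")
print(f"precise gene: 2.0 -> {precise:.3f}")
print(f"noisy gene:   2.0 -> {noisy:.3f}")
```

In this toy setting, a cluster whose genes all have large standard errors would see every fold change shrunk close to 0, whatever the unshrunken values were.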

I hope this helps.

António

Replied 12 months ago by antonioggsousa 3.2k