Hi all,
I have a question about the lfcShrink() function in DESeq2.
When we analyzed our data in DESeq2 without shrinkage, the fold changes were overestimated (Log2FC = ~30), so we are trying to apply one of the shrinkage methods in DESeq2. The DESeq2 vignette introduces three methods (apeglm, ashr, normal), and among them I would like to use the 'normal' method, because it gives fold changes that are closer to what we expect.
But people, including the developer (Michael Love), don't seem to recommend the 'normal' method. I know that apeglm is the better method, but with apeglm the Log2FC values are almost 0. As I understand it, apeglm is the recommended method, but that does not mean the normal method must not be used.
- Is it wrong to use the 'normal' method at all?
- Is there a way to get less heavily shrunken FC values (values that are not close to 0) while using the 'apeglm' method?
Thanks in advance,
Joonhong
Hi,
Sorry, what do you mean by:
Shrinkage aims to deal with the "inflation" of the log2 fold change for lowly expressed genes. Therefore, shrinkage should strongly affect the log2 fold change of lowly expressed genes, but have less effect on highly expressed genes.
Take this example:
log2(0.2/0.02) = 3.321928  # lowly expressed gene
log2(200/50) = 2  # highly expressed gene
Which one are you more confident about?
The confidence that you have in the quantification of lowly expressed genes is lower than for highly expressed ones. Small changes in lowly expressed genes will have a big effect on the log2 fold change. Shrinkage attempts to control for this.
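To make that sensitivity concrete, here is a small sketch in Python. It is plain arithmetic, not DESeq2 code: the helper `log2_fold_change` and the perturbation of 0.05 are my own illustrative assumptions, and the expression values are the ones from the example above.

```python
import math

def log2_fold_change(treated, control):
    """Plain log2 ratio of two expression values (no model, no shrinkage)."""
    return math.log2(treated / control)

# Hypothetical measurement noise, added to the control value of each gene.
noise = 0.05

# Lowly expressed gene: 0.2 vs 0.02
low_before = log2_fold_change(0.2, 0.02)
low_after = log2_fold_change(0.2, 0.02 + noise)

# Highly expressed gene: 200 vs 50
high_before = log2_fold_change(200, 50)
high_after = log2_fold_change(200, 50 + noise)

print(f"low:  {low_before:.3f} -> {low_after:.3f}")
print(f"high: {high_before:.3f} -> {high_after:.3f}")
```

The same absolute perturbation moves the log2 fold change of the lowly expressed gene by more than one unit, while the highly expressed gene barely changes.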
I hope this helps to clarify.
Best,
António
Thank you for the reply.
I applied DESeq2 to a pseudobulk DEG analysis of snRNA-seq data. However, in some clusters the fold change is calculated as close to 0 even when the average expression value is high (see figure, cluster2). The fold change of all genes in cluster2 is close to 0.
When comparing the unshrunken, normal, and apeglm results, I think the fold change was corrected appropriately in cluster1. However, I think it was shrunk very aggressively in cluster2.
Are these results reliable?
Thank you!
Joonhong
Two takeaways:
- The fact that the mean, i.e., baseMean, is high does not mean that the log2 fold change is high too. As far as I remember, baseMean is calculated across all the samples, meaning that you may have high expression values in both conditions you're comparing and, as such, low log2 fold changes but a high baseMean.
- I think the difference between your cluster 1 and cluster 2 is the adjusted p-value, i.e., padj. In cluster 1 it is significant for all genes except the first and, if you look at the log2 fold change values after apeglm shrinkage, they look very similar to the unshrunken values. On the other hand, padj is not significant for any of the genes in cluster 2, and the shrinkage is more aggressive because the confidence in those log2 fold change estimates is lower.
I hope this helps.
António
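P.S. A quick sketch of the first takeaway, in Python for the arithmetic only. The counts below are made up, and the ratio of group means is just a rough stand-in: DESeq2 estimates the log2 fold change from its model fit, not from this naive ratio.

```python
import math
from statistics import mean

# Hypothetical normalized counts for one gene, three samples per condition.
control = [980, 1010, 1005]
treated = [1000, 995, 1020]

# baseMean: average of the normalized counts over all samples, both conditions.
base_mean = mean(control + treated)

# Naive log2 ratio of the group means (an approximation, not DESeq2's estimate).
naive_lfc = math.log2(mean(treated) / mean(control))

print(f"baseMean = {base_mean:.1f}, naive log2FC = {naive_lfc:.3f}")
```

The gene is highly expressed in both conditions, so baseMean is large while the log2 fold change stays close to 0.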