HI!
In GWAS, LD pruning is commonly used to remove SNPs that are in high linkage disequilibrium, so that the SNPs being analysed are approximately independent. However, this process reduces the number of SNPs significantly. This is more a conceptual question than a technical/bioinformatics one, but I think it is important.
So, what are the rationale and implications of LD pruning?
Doesn't LD pruning result in a significant loss of genetic variability by discarding many SNPs? And if the answer is that it retains the independent variants, how does keeping only independent SNPs affect the ability to capture meaningful genetic variability? Also, after LD pruning, the SNP count may drop to only about 1 lakh (100,000).
In what ways does focusing on independent SNPs enhance the statistical power and reliability of GWAS results?
Any insights with respect to GWAS would be greatly appreciated.
This question has been addressed many times on this forum. I'd recommend starting with similar questions that received an accepted answer.
Ignore unhelpful comments like these. Biostars isn't exactly super-searchable and sometimes the terminology/implementations can be confusing to new users.
I would read over: https://www.nature.com/articles/s43586-021-00056-9 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6001694/
But to answer your question:
With GWAS, we are only testing for associations, which may or may not be causal. And given the number of tests, we are essentially guaranteed some highly significant results by chance alone. Thus, we need to do some sort of multiple-testing correction. And here lies the problem. Many SNPs are going to be in LD with each other based on ancestry, so they are not independent tests. If we test every association and correct for all of them, the correction is more stringent than it needs to be and the chance of false negatives increases. Thus, a lot of researchers will LD prune to reduce the chance of false negatives, and then look for ways to prioritize SNPs within significant GWAS hits to increase the chance of finding something causal.
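To make the multiple-testing point concrete, here is a minimal Python sketch, not a real pipeline. It assumes a Bonferroni correction, a toy genotype matrix, an r^2 cutoff of 0.2, and a pre-pruning count of 1,000,000 SNPs; all of those, and the function names, are illustrative assumptions rather than anything from the papers above. Real tools such as PLINK prune within sliding windows rather than comparing every pair of SNPs.

```python
import numpy as np


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def greedy_ld_prune(genotypes: np.ndarray, r2_max: float = 0.2) -> list:
    """Keep a SNP only if its r^2 with every already-kept SNP is <= r2_max.

    genotypes: (n_samples, n_snps) array of 0/1/2 allele counts.
    Simplified illustration: all pairs are compared, with no windowing.
    """
    # Squared Pearson correlation between SNP columns, used as a proxy for LD.
    r2 = np.corrcoef(genotypes, rowvar=False) ** 2
    kept = []
    for j in range(genotypes.shape[1]):
        if all(r2[j, k] <= r2_max for k in kept):
            kept.append(j)
    return kept


# Toy data (hypothetical): 9 samples, 4 SNPs.
a = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2])
b = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
# SNP 1 duplicates SNP 0; SNP 3 is SNP 2 with the alleles flipped (r^2 = 1).
genotypes = np.column_stack([a, a, b, 2 - b])

kept = greedy_ld_prune(genotypes, r2_max=0.2)
print("SNPs kept after pruning:", kept)

# Assumed counts: 1,000,000 SNPs before pruning, 100,000 (1 lakh) after.
threshold_all = bonferroni_threshold(0.05, 1_000_000)
threshold_pruned = bonferroni_threshold(0.05, 100_000)
print("threshold, all SNPs:   ", threshold_all)
print("threshold, pruned SNPs:", threshold_pruned)
```

In the toy matrix, the two redundant SNPs are dropped because each is perfectly correlated with a SNP that was already kept, so no independent signal is lost. With the assumed counts, the per-test threshold is ten times less stringent after pruning, which is the reduction in false negatives described above.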
Would you care to propose functionality to enhance the searchability? As with other similar sites, building a repository is meant to be a core feature of this site. Istvan Albert
Search is more complicated to implement than one might think.
In general, I would much prefer that instead of just stating "there are many questions like this" and leaving it at that, the reply would also list some of those threads. Part of being an expert is knowing which answers are valid and accurate.
Not so long ago, a Biostar moderator made a somewhat cocky "Let Me Google This For You" post in reply to a question, but then, amusingly, it turned out that the first hit in Google (also posted on Biostar) was an incorrect answer.
The moderator was so caught up in their perceived righteousness that they did not even bother checking the actual answer ... so there ... that is a teachable moment.
Fair enough, though this particular post would almost certainly receive a correct answer through Perplexity, ChatGPT, etc. off the top. And I don't think it's particularly difficult to find posts in this forum regarding multiple testing or SNP pruning based on the number of independent tests. Honestly, the links to the right are more or less all OP would need to read to get started.