Hello Dr. @MichaelHiller,

Greetings, I have some post-TOGA queries.

I am struggling with the "L" and "UL" datasets. Please consider the following tree, where my research interest is not a single species but a whole lineage (the branch marked with an arrow). I am interested in the common set of genes that are lost (L) on that branch.
1. If I consider only genes that are "L" in every species on that branch, I skip a lot of data. For example, a gene may be Lost in 11 of the 14 in-group species but UL in the other 3. Is it okay to say that this gene is lost?
2. If a gene is clearly "L" in the TOGA output, is it necessary to check transcriptome data to see whether the gene is transcribed? Or, given TOGA's robustness, can we call it a clear loss, meaning there is no functional protein for that gene?
3. For "UL", I went through some discussions in the TOGA GitHub issues, but their status is still not clear to me. Is it right to say that if a gene has 10 transcripts, I could get true hits in the transcriptome for, say, 6 of them if they carry inactivating mutations but do not fulfill the "loss" criteria?
4. What would be a good way to analyze the "UL" data in more detail: a transcriptome dataset or a RELAX selection test?
5. In the provided sample tree, if a gene is Intact up to Query3 and then, on my focal branch, is either L or UL in all species (so never Intact), what would be the best status for that gene in your view?
I know this is a lot to ask, but I really appreciate your thoughtful suggestions; they will help me sort my data in a more logical way.
Looking forward to hearing from you.
Best Regards
Vinita
Not sure I fully understand all the questions, but I'll try my best.
For 3), please have a look at the TOGA supplementary materials. We have images illustrating examples.
For 2) and 4), both RELAX and transcriptomics data are good ideas. Transcriptomics data could tell you whether the inactivating mutation is potentially a base error (the RNA reads don't have the genomic mutation), or whether the exon with the mutation has shifted splice sites or is skipped, such that the mutation is actually not part of a transcript.
The picture looks to me like a gene loss (or UL) on your focal branch, provided that no species in the group has an Intact gene.
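A minimal sketch of that rule in Python, assuming the per-species TOGA status of one gene has already been collected into plain dictionaries. The function name, the return labels, and the species names `sp1` ... `sp14` are hypothetical. The requirement that at least one outgroup is Intact is an added assumption, so that the loss can be placed on the focal branch rather than earlier in the tree.

```python
from typing import Dict


def classify_focal_branch(ingroup: Dict[str, str], outgroup: Dict[str, str]) -> str:
    """Label one gene from per-species TOGA statuses (I, PI, UL, L, M, ...).

    ingroup:  species below the focal branch -> status
    outgroup: species outside the focal branch -> status
    """
    if not ingroup:
        raise ValueError("need at least one ingroup species")

    in_statuses = set(ingroup.values())

    # Any Intact ingroup copy argues against a loss on the focal branch.
    if "I" in in_statuses:
        return "not_lost"

    # Statuses such as M or PI do not say whether the gene is lost.
    if not in_statuses <= {"L", "UL"}:
        return "inconclusive"

    # Assumption: an Intact outgroup is needed to place the loss on this branch.
    if "I" not in set(outgroup.values()):
        return "inconclusive"

    # All ingroups are L or UL; report whether at least one is a clear L.
    return "candidate_loss" if "L" in in_statuses else "candidate_uncertain_loss"


# Example from the question: 11 ingroup species with L, 3 with UL.
ingroup = {f"sp{i}": "L" for i in range(1, 12)}
ingroup.update({f"sp{i}": "UL" for i in range(12, 15)})
outgroup = {"Query1": "I", "Query2": "I", "Query3": "I"}
result = classify_focal_branch(ingroup, outgroup)
print(result)
```

For the 11 L / 3 UL example with Intact outgroups this prints `candidate_loss`. A "candidate" label is only a starting point for the checks on shared mutations discussed in this thread, not a final call.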
Here are a few more considerations. Some ingroup species with UL can have a second frameshift that restores the ancestral reading frame (the first frameshift would then be shared with other species). Some ingroups can also have M (or in principle PI), e.g. if exons 3 and 7 have the frameshifts and both exons are missing (assembly gap).
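The compensating-frameshift case comes down to simple arithmetic: the reading frame downstream of a set of indels is restored when their net length is a multiple of 3. A small sketch, assuming indels are given as signed lengths (positive for insertions, negative for deletions); this representation and the function name are hypothetical, not a TOGA output format.

```python
from typing import Iterable


def restores_reading_frame(indel_lengths: Iterable[int]) -> bool:
    """Return True if the net indel length keeps the downstream frame intact.

    indel_lengths: signed lengths in bp, +n for an insertion, -n for a deletion.
    """
    return sum(indel_lengths) % 3 == 0


# A single 1-bp deletion shifts the frame.
single = restores_reading_frame([-1])

# A 1-bp deletion followed by a 1-bp insertion returns to the ancestral frame.
compensated = restores_reading_frame([-1, +1])

print(single, compensated)
```

Note that the codons between the two frameshifts are still read in a shifted frame, so the stretch between them needs a separate look, for example for premature stop codons.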
Probably best to assess with a multiple alignment whether inactivating mutations are shared among the species in your group and whether they can be assigned to the ancestral branch you labeled.
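One way to start on this, sketched under the assumption that the inactivating mutations of each species have already been read off a multiple alignment as (exon, alignment column, mutation type) tuples. That representation, the species names, and the function name are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

# (exon number, alignment column, mutation type) -- a hypothetical encoding
Mutation = Tuple[int, int, str]


def shared_mutations(
    mutations_by_species: Dict[str, Iterable[Mutation]],
    min_fraction: float = 1.0,
) -> Set[Mutation]:
    """Return mutations present in at least min_fraction of the species."""
    n_species = len(mutations_by_species)
    counts = Counter(
        mutation
        for mutations in mutations_by_species.values()
        for mutation in set(mutations)  # count each mutation once per species
    )
    return {m for m, c in counts.items() if c / n_species >= min_fraction}


example = {
    "sp1": [(3, 120, "frameshift"), (7, 45, "stop")],
    "sp2": [(3, 120, "frameshift")],
    "sp3": [(3, 120, "frameshift"), (5, 10, "frameshift")],
}
in_all = shared_mutations(example)
print(in_all)
```

A mutation found in every ingroup species, like the exon 3 frameshift in this toy input, is a candidate for having occurred on the ancestral branch. Lowering `min_fraction` tolerates species where the affected exon is missing from the assembly.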