Hi everyone!
I´m trying to plot and compare count data of specific genes, using the plotCounts() function of DeSeq2. I´m using ENSEMBL annotation/ IDs. I find transcript IDs of my genes of interest using:
grch38annot[grch38annot$symbol == "EZH2", "ensgene"]
The resulting gene ID is then used as input for the plotCounts() function like this:
plotCounts(dds_lrt, gene="ENSG00000106462", intgroup="sampletype")
For the first five genes, everything worked out perfectly fine. For the gene "HCG18" I get an output of seven different gene IDs from the first function. Using any of those gene IDs for the second function causes an error "indexing outside of limit" (translated from german "Indizierung außerhalb der Grenzen"). I double-checked that the gene IDs are listed within my "grch38annot" dataframe. Does anyone know how to circumnavigate that error/ how it is caused?
Thanks a lot in advance,
Ella
Are you sure that particular Ensembl ID is still present in your
dds_lrt
object? If you have removed lowly expressed genes you might not have that gene in your object anymore.Thank you for the fast answer. Sometimes the explanation is too obvious to see it, you were right, that particular ID was not in my dds_lrt object; and even not present in datasets of previous steps.
Still, I´m wondering where that particular gene was lost. In another analysis (variant calling) I also see a fairly low coverage, but at least some counts. Strange.
When it comes to bulk RNA-seq, the only things that I can think of are: i) the gene was not annotated in the annotation file to being with or ii) is lost while removing genes with low counts. You can add
"ENS..." %in% rownames(dds_lrt)
here and there in your code and see after which step it turns toFALSE
. And if you spot the particular step where that gene is "lost", please let us know.Sorry, I oversaw your message. I investigated why the ID was lost and you were right, it was due to low coverage. Thanks again :)