Hello,
Suppose I want to get a nucleotide sequence from a specific transcript isoform of EGFR. I could do something fairly manual: navigate to https://www.ncbi.nlm.nih.gov/nuccore/NM_001346941.2, scroll down, count the nucleotides, then cut and paste.
However, I feel there have got to be (probably many) programmatic ways to extract, for example, the 1101st to 1217th nucleotides from this transcript.
I looked around and found things like biomartr::is.genome.available()
but this appears to be for higher level downloading, like getting all the transcripts by organism.
I must be missing something. Is there a tool out there that, given something like download_refseq_nt_sequence(NM_001346941.2, '1110','1217'), will return the actual sequence?
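One possible route is NCBI's E-utilities `efetch` endpoint, which accepts `seq_start` and `seq_stop` (1-based, inclusive) for nuccore records. Below is a minimal Python sketch using only the standard library. The helper names `build_efetch_url`, `parse_fasta`, and `fetch_refseq_subsequence` are made up for illustration, and the network call assumes the service is reachable and returns plain FASTA.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, start, stop):
    """Build an efetch URL for a 1-based, inclusive slice of a nuccore record."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
        "seq_start": start,
        "seq_stop": stop,
    }
    return EFETCH + "?" + urlencode(params)


def parse_fasta(text):
    """Return the sequence of the first FASTA record, with the header dropped."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    seq = []
    for ln in lines[1:]:  # skip the '>' header line
        if ln.startswith(">"):
            break  # stop at a second record, if any
        seq.append(ln)
    return "".join(seq)


def fetch_refseq_subsequence(accession, start, stop):
    """Download the slice from NCBI (requires network access)."""
    with urlopen(build_efetch_url(accession, start, stop)) as resp:
        return parse_fasta(resp.read().decode())


# Example call (hits the network, so it is left commented out):
# print(fetch_refseq_subsequence("NM_001346941.2", 1101, 1217))
```

If the call behaves as assumed, the slice from 1101 to 1217 would be 117 nt long (1217 - 1101 + 1), which is an easy sanity check on the result.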
Could be R, Python, bash, or a web tool; I can use any of them.
thank you very much
Thank you!!!!!!!!!!! This was so helpful. Makes me think we need convenient browser-based tools etc. I really appreciate you.
Suppose I wish to start from an amino acid position instead, but then still pull nucleotides (or vice versa).
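The amino acid to nucleotide conversion is plain arithmetic once the CDS start on the transcript is known (for example from the CDS feature of the GenBank record). A rough Python sketch follows. `codon_range` and `residue_for_nt` are hypothetical helper names, and the CDS start of 101 in the usage lines is a placeholder, not the real value for any EGFR transcript.

```python
def codon_range(cds_start, aa_pos):
    """Transcript coordinates (1-based, inclusive) of the codon for residue aa_pos.

    cds_start is the 1-based transcript position of the first base of the
    start codon (an input you must look up; it is not computed here).
    """
    if cds_start < 1 or aa_pos < 1:
        raise ValueError("positions are 1-based and must be >= 1")
    first = cds_start + 3 * (aa_pos - 1)
    return first, first + 2


def residue_for_nt(cds_start, nt_pos):
    """Residue number (1-based) whose codon contains transcript position nt_pos."""
    if nt_pos < cds_start:
        raise ValueError("position lies upstream of the CDS start")
    return (nt_pos - cds_start) // 3 + 1


# Placeholder CDS start of 101, for illustration only.
print(codon_range(101, 10))      # (128, 130)
print(residue_for_nt(101, 129))  # 10
```

The resulting coordinate pair can then be used as the start and stop of a subsequence request against the transcript.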
Is there any way to grab variant positions neatly?
Looks like I may want something like this (piping through efetch): esearch -db gene -query "BRCA2 [GENE] AND human [ORGN]" | efetch -format docsum |
Can you provide an example?
Hey Geno!! Sure. Thanks so much for following up.
Suppose what I have is:
.. but what I want is:
or better yet NM_001346941 (some # of NTs before)C(some # of NTs after) e.g.
or even rsID:
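For the "some number of NTs before, the base, some number of NTs after" layout, here is a small Python sketch that works on a transcript sequence already in hand. The function name `flank_variant` and the bracketed output style are assumptions for illustration, not a standard format.

```python
def flank_variant(seq, pos, flank=5):
    """Return left[ref]right around the 1-based position pos in seq.

    Flanks are clipped at the ends of the sequence.
    """
    if not 1 <= pos <= len(seq):
        raise ValueError("pos outside sequence")
    i = pos - 1  # convert to a 0-based index
    left = seq[max(0, i - flank):i]
    right = seq[i + 1:i + 1 + flank]
    return f"{left}[{seq[i]}]{right}"


# Toy sequence, not real EGFR data.
print(flank_variant("AACCGGTT", 4, flank=2))  # AC[C]GG
```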
How about (truncated for space). The first column is the rsID.
I really owe you one, Geno. Working on a deadline here and appreciate you.
Looks like this does a lot of it??
https://github.com/zwdzwd/transvar