5.0 years ago
noodle
Let's say I have a large NCBI query which I get disconnected from (example below). Is there a way to restart the query from a known position in the list? Or to break the esearch query apart into a few lists to pipe into esummary or efetch? Thanks!
esearch -db 'protein' -query 'CRISPR' | esummary -db 'protein' -format fasta > output_fasta.txt
Break the query into smaller chunks. Also sign up for an NCBI API key and use it, if you have not done so already.
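One way to do the chunking is sketched below. It is a rough outline rather than a tested recipe: it assumes the UIDs were saved once to a file (called ids.txt here, a made-up name), fetches them a few hundred at a time with efetch, and writes the last completed record number to a checkpoint file so a rerun picks up where the previous one stopped. The function names, file names, and default chunk size of 500 are all illustrative. EDirect picks up an API key from the NCBI_API_KEY environment variable.

```shell
#!/usr/bin/env bash
# Sketch only. Assumes the UID list was saved once, for example:
#   esearch -db protein -query 'CRISPR' | efetch -format uid > ids.txt
# and that the API key is exported beforehand:
#   export NCBI_API_KEY=your_key_here

# Print "start end" record ranges (1-based, inclusive) that cover
# records start..total in chunks of at most `size` records.
chunk_ranges() {
  local total=$1 size=$2 start=${3:-1}
  while [ "$start" -le "$total" ]; do
    local end=$(( start + size - 1 ))
    [ "$end" -gt "$total" ] && end=$total
    echo "$start $end"
    start=$(( end + 1 ))
  done
}

# Fetch FASTA for every UID in ids_file, one chunk at a time.
# ids_file.done (hypothetical checkpoint file) holds the number of the
# last record fetched successfully; a rerun resumes right after it.
# ids_file must end with a newline so that wc -l counts every UID.
fetch_in_chunks() {
  local ids_file=$1 out_file=$2 size=${3:-500}
  local done_file="$ids_file.done" plan_file="$ids_file.plan"
  local total last start end ids
  total=$(( $(wc -l < "$ids_file") ))
  last=$(cat "$done_file" 2>/dev/null || echo 0)
  chunk_ranges "$total" "$size" "$(( last + 1 ))" > "$plan_file"
  while read -r start end; do
    # Join this chunk's UIDs into a comma-separated list.
    ids=$(sed -n "${start},${end}p" "$ids_file" | paste -sd, -)
    # Write to a scratch file first so a failed chunk never leaves
    # partial output in out_file. stdin is /dev/null so efetch does
    # not consume the plan file that the loop is reading.
    efetch -db protein -id "$ids" -format fasta < /dev/null \
      > "$out_file.part" || return 1
    cat "$out_file.part" >> "$out_file"
    echo "$end" > "$done_file"
  done < "$plan_file"
}
```

Running `fetch_in_chunks ids.txt output_fasta.txt 500` after a disconnect just repeats the call; chunks already recorded in ids.txt.done are skipped. To restart from a known position by hand, write that record number into ids.txt.done first.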
Is this your real query or just something that you picked as an example? Something is not right: it returns 127 million hits for this query, which I think is pretty much the entire Protein database. If you search for the term CRISPR in the NCBI Protein portal you do not get any results.
EDirect is good for a relatively small number of records: a few thousand, or depending on the data even a few tens of thousands, but not more than that. It will be quicker for you to just download the entire protein dataset from the NCBI FTP site and filter for the specific accessions of interest to you.
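For the filtering step, something along these lines could work once the FASTA file is on disk. This is a minimal sketch and makes two assumptions: each header starts with the accession (including version) as its first whitespace-delimited token, for example `>WP_000001.1 description`, and the wanted accessions sit one per line in a separate file. The function name and file names are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: print the FASTA records whose accession appears in acc_file.
# Assumes headers look like ">ACCESSION.VERSION description" and that
# acc_file lists one accession (with version) per line.
filter_fasta() {
  local acc_file=$1 fasta_file=$2
  awk '
    # First file: remember every wanted accession.
    FILENAME == ARGV[1] { want[$1] = 1; next }
    # Header line: strip the leading ">" and decide whether to keep the record.
    /^>/ { acc = substr($1, 2); keep = (acc in want) }
    # Print header and sequence lines of kept records.
    keep
  ' "$acc_file" "$fasta_file"
}
```

Usage would be `filter_fasta wanted.txt proteins.fasta > subset.fasta`. Passing `-` as the second argument reads the FASTA from standard input, so a compressed download can be streamed through `gzip -dc` without unpacking it first.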
Thanks for the reply - this is just a ridiculous example. Downloading and filtering may be the best approach for me.