Tag: NCBI Datasets

Access and Download Sequence Data and Metadata Using NCBI Datasets

Goodbye Assembly and Genome, hello NCBI Datasets!

Exciting news! NCBI has streamlined and modernized how you access and download genome, taxonomy, and gene information with NCBI Datasets. As previously announced, NCBI Datasets is replacing the legacy Genome and Assembly resources, giving you a single entry point to genome datasets. Effective today, the legacy pages are retired and no longer available.

Please note there will be no changes to how you programmatically access the databases using E-Utilities or EDirect.
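
For readers who script against E-Utilities, here is a minimal sketch of building an ESearch request URL. The helper name `build_esearch_url`, the choice of the `assembly` database, and the example query are illustrative assumptions, not something this post specifies; the snippet only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Public E-utilities base URL.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL for the given Entrez database and query.

    Hypothetical helper for illustration; it only assembles the query string.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example query (an assumption): assembly records for human.
url = build_esearch_url("assembly", "Homo sapiens[Organism]")
print(url)
```

The resulting URL can be passed to any HTTP client. NCBI asks heavy users to identify themselves and throttle requests, so check the E-utilities usage guidelines before running this in a loop.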

RefSeq Release 225 Now Available!

Check out RefSeq release 225, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of July 8, 2024, this full release incorporates genomic, transcript, and protein data containing:

  • 448,507,905 records
  • 334,845,613 proteins
  • 63,542,774 RNAs
  • Sequences from 152,668 organisms

The release is provided in several directories as a complete dataset and also as divided by logical groupings.

Upcoming Changes to NCBI Taxonomy Classifications

NCBI is continually making improvements to our Taxonomy resource in response to new data and changes in biological nomenclature and classification. In the coming months, we will update the higher-level classification of birds (Aves), budding yeasts (Saccharomycotina), prokaryotes (Bacteria and Archaea) and Viruses. This update will also change the formal ranks of several high-level taxonomic names including Eukaryota. Except for the new species names for Viruses, none of these changes will affect organism names at the species level or below.  

Here is a brief overview of changes to each group in the order we plan to make them. Stay tuned for upcoming posts, which will describe the changes for each category in more detail.

New Data Available! Access Avian Influenza A (H5N1) Virus Sequences at NCBI

Sequence data from the ongoing avian influenza A (H5N1) virus outbreak in cattle are now available through NLM’s NCBI resources NCBI Virus and NCBI Datasets.

These data were submitted by the U.S. Department of Agriculture (USDA), U.S. Centers for Disease Control and Prevention (CDC), the World Health Organization (WHO), Iowa State University, and St. Jude Children’s Research Hospital.

Ortholog Groups Added for ~2 Million Insect Genes

Find evolutionarily related genes across insects and other arthropods on our new Ortholog webpages

NCBI recently released a set of orthologs for approximately 2 million insect genes. You can now find and access the orthologous genes, transcripts, and proteins by searching a species and gene name in NCBI All Databases, NCBI Gene, or NCBI Datasets. As previously described, these orthologs are based on comparisons to the Drosophila melanogaster annotated genome. Using Drosophila gene nomenclature for orthologs should lead to more informative gene symbols for insects and other arthropods.

New! RefSeq Release 224

Check out RefSeq release 224, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of May 6, 2024, this full release incorporates genomic, transcript, and protein data containing:

  • 435,879,646 records
  • 324,246,652 proteins
  • 62,348,147 RNAs
  • Sequences from 150,742 organisms

The release is provided in several directories as a complete dataset and also as divided by logical groupings.
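
To put the two sets of release totals on this page side by side, the short sketch below computes the change from release 224 to release 225. The numbers are copied from the two announcements; the dictionary layout and the `growth` helper are illustrative assumptions.

```python
# Totals as stated in the release 224 and release 225 announcements.
release_224 = {
    "records": 435_879_646,
    "proteins": 324_246_652,
    "RNAs": 62_348_147,
    "organisms": 150_742,
}
release_225 = {
    "records": 448_507_905,
    "proteins": 334_845_613,
    "RNAs": 63_542_774,
    "organisms": 152_668,
}


def growth(old: dict, new: dict) -> dict:
    """Return {category: (absolute change, percent change)} between releases."""
    return {
        key: (new[key] - old[key], 100 * (new[key] - old[key]) / old[key])
        for key in old
    }


changes = growth(release_224, release_225)
for name, (delta, pct) in changes.items():
    print(f"{name}: +{delta:,} ({pct:.2f}%)")
```

By this count, release 225 adds 12,628,259 records, 10,598,961 proteins, 1,194,627 RNAs, and 1,926 organisms relative to release 224.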

Browse Taxonomy Records with NCBI Datasets

New & improved NCBI Datasets Taxonomy pages and command-line service 

NCBI Datasets is excited to introduce new features to our Taxonomy pages making it easier for you to access, browse, and download taxonomic information about organisms at any taxonomic level.  

What’s new?
  • Explore Taxonomy records with an updated look and feel  
  • Access and download taxonomic metadata from the web or with our updated command-line interface (CLI) tools

New RefSeq Annotations Now Available!

In February and March, the NCBI Eukaryotic Genome Annotation Pipeline released forty-six new annotations in RefSeq!

New Annotations
  • Aedes albopictus (Asian tiger mosquito)
  • Anolis carolinensis (green anole)
  • Armigeres subalbatus (mosquito)
  • Bacillus rossius redtenbacheri (walking stick)
  • Bolinopsis microptera (comb jelly)
  • Bombyx mori (domestic silkworm)
  • Bubalus kerabau (carabao)
  • Candoia aspera (snake)
  • Cavia porcellus (domestic guinea pig) 

Join NCBI at TAGC 2024

March 6-10 in Washington, D.C. 

We look forward to seeing you in person at The Allied Genetics Conference (TAGC), March 6-10, 2024, in the Washington, D.C. metro area. NCBI staff will participate in a variety of activities and events, including hosting a hands-on workshop: Exploring and downloading NCBI data with NCBI Datasets. We’re also excited to share our recent efforts on the NIH Comparative Genomics Resource (CGR) in a talk during Sunday’s Technology, Tools, and Resources session.

Check out NCBI’s schedule of activities and events:

Significant Updates Coming to the NCBI Datasets APIs and Command-Line Tools

As part of our ongoing effort to enhance your experience, we are updating the NCBI Datasets application programming interfaces (APIs). Beginning in June 2024, the v2alpha APIs will be promoted to the stable v2 version. At that time, the v1 API, the command-line interface (CLI) version 13 and older, and the Python library v1 will be deprecated and will no longer receive bug fixes or updates. Effective December 31, 2024, these will no longer be available for use.

Our updated APIs and CLI tools include new features and functionality based on your feedback. We’re committed to making this transition as smooth as possible and encourage you to review our FAQs for more details. If you use NCBI Datasets web pages and command-line tools v14+, no action is required.
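
If your scripts call the v2alpha REST endpoints directly, the promotion to v2 may only require changing the version segment of each URL. The sketch below shows one way to do that. The API host, the example path, and the `to_v2` helper are assumptions for illustration; confirm the actual endpoints against the NCBI Datasets API documentation. v1 endpoints are not covered, since they may differ in more than the version segment.

```python
from urllib.parse import urlsplit, urlunsplit


def to_v2(url: str) -> str:
    """Rewrite a 'v2alpha' path segment to 'v2', leaving everything else intact.

    Hypothetical helper; assumes only the version segment differs.
    """
    parts = urlsplit(url)
    segments = ["v2" if seg == "v2alpha" else seg for seg in parts.path.split("/")]
    return urlunsplit(parts._replace(path="/".join(segments)))


# Example URL (host and path are assumptions, shown for illustration only).
old_url = (
    "https://api.ncbi.nlm.nih.gov/datasets/v2alpha/"
    "genome/accession/GCF_000001405.40/dataset_report"
)
new_url = to_v2(old_url)
print(new_url)
```

Matching on whole path segments, not on a plain substring, avoids touching accessions or query values that happen to contain the text "v2alpha".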