NVIDIA Cosmos for Developers

NVIDIA Cosmos™ is a platform of state-of-the-art generative world foundation models, advanced tokenizers, guardrails, and an accelerated data processing and curation pipeline for autonomous vehicle (AV) and robotics developers.

Build, evaluate, deploy, and simulate physical AI models faster while minimizing testing and validation risks in the real world.


See Cosmos World Foundation Models in Action

Cosmos world foundation models (WFMs) generate high-fidelity, physics-aware video from simple inputs, simulating and predicting real-world outcomes for robotics and autonomous systems.


NVIDIA Cosmos World Foundation Models

The first wave of pre-trained models for generating physics-aware videos and world states is now openly available to developers.

NVIDIA Cosmos has built-in guardrails to filter brands, unsafe content, and harmful prompts within Cosmos-generated outputs. Cosmos also has guardrails to blur human faces, post-guards to remove questionable scenarios, and digital watermarks on synthetic videos generated from NVIDIA NIM™ microservices.

Different types of NVIDIA Cosmos models

Autoregressive

Predict future frames in a video sequence, leveraging temporal dependencies to generate coherent and realistic motion.
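The autoregressive idea can be illustrated with a toy rollout loop. This is only a sketch and not the Cosmos implementation: the `extrapolate` function is a hypothetical stand-in for a learned next-frame predictor, and each "frame" is just a short list of numbers. What it shows is the feedback structure, where every predicted frame is appended to the sequence and becomes context for the next prediction.

```python
from typing import Callable, List

Frame = List[float]  # toy "frame": a flat list of pixel intensities


def extrapolate(context: List[Frame]) -> Frame:
    """Hypothetical stand-in for a learned next-frame model.

    Linearly extrapolates each pixel from the last two frames.
    """
    prev, last = context[-2], context[-1]
    return [2 * b - a for a, b in zip(prev, last)]


def rollout(
    context: List[Frame],
    predict: Callable[[List[Frame]], Frame],
    steps: int,
    window: int = 2,
) -> List[Frame]:
    """Generate `steps` new frames, feeding each prediction back as context."""
    frames = list(context)
    for _ in range(steps):
        # Only the most recent `window` frames are passed to the predictor.
        frames.append(predict(frames[-window:]))
    return frames


# Two observed frames, then three predicted ones.
video = rollout([[0.0, 1.0], [1.0, 1.5]], extrapolate, steps=3)
print(video[-1])  # [4.0, 3.0]
```

Because each prediction depends on earlier predictions, errors can compound over long rollouts, which is one reason temporal coherence matters for this model family.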

Diffusion

Create videos by progressively refining random noise into coherent video frames through iterative denoising guided by learned temporal and spatial patterns.
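The iterative denoising loop can be sketched in a few lines. This is a deliberately simplified illustration and not the Cosmos sampler: `toy_denoiser` is a hypothetical stand-in for a trained network (it simply returns a memorized clean signal), the linear noise schedule is an assumption, and a short list of numbers stands in for video frames.

```python
import random
from typing import Callable, List

Signal = List[float]  # toy stand-in for a video tensor

# Hypothetical "clean" sample that the toy denoiser has memorized.
CLEAN: Signal = [0.0, 0.5, 1.0, 0.5]


def toy_denoiser(x: Signal, t: float) -> Signal:
    """Stand-in for a trained network: estimates the clean signal from noisy x."""
    return list(CLEAN)


def sample(
    denoise: Callable[[Signal, float], Signal],
    size: int,
    steps: int,
    seed: int = 0,
) -> Signal:
    """Start from Gaussian noise and refine it over `steps` denoising steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(size)]
    for i in range(steps):
        t = 1.0 - i / steps             # current noise level, starts at 1.0
        t_next = 1.0 - (i + 1) / steps  # next noise level, reaches 0.0
        estimate = denoise(x, t)
        keep = t_next / t               # fraction of remaining noise to keep
        x = [e + keep * (v - e) for v, e in zip(x, estimate)]
    return x


result = sample(toy_denoiser, size=len(CLEAN), steps=10)
```

In a real diffusion model the denoiser is a learned network conditioned on the noise level (and typically on a text or video prompt), so each step only partially removes noise rather than jumping to a known answer.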

Workflow Enablers

Essential models that simplify the development and deployment of world models in physical AI applications.

  • Cosmos-1.0-Guardrail
    State-of-the-art model combining pre- and post-generation guards to ensure safety and consistency.
    Download from NGC or Hugging Face.
  • Cosmos-1.0-PromptUpsampler-12B-Text2World
    Enhances prompt quality by improving text prompt descriptions and details automatically.
    Download from NGC or Hugging Face.
  • Cosmos-1.0-Diffusion-7B-Decoder
    Decodes autoregressive video sequences for augmented reality.
    Download from NGC or Hugging Face.

Fine-Tuned Samples

  • Cosmos-1.0-Diffusion-7B-Text2World-Sample-MultiviewDriving
    Fine-tuned for AV multi-sensor driving views. Coming soon.

Introducing Cosmos for Physical AI Development

Get an introduction to the models, tools, and capabilities of the Cosmos platform to accelerate the development of physical-AI-embodied systems such as robots and autonomous vehicles.

Building Custom World Models With NVIDIA NeMo

New NVIDIA NeMo capabilities for customizing video foundation models, from data curation and model tuning to the inference pipeline.

Coming Soon

Open Cosmos World Foundation Models

Open Cosmos world foundation models and tokenizers are enabling developers to build physical AI without high entry costs.


Starter Kits

Start developing world models with Cosmos by accessing open models, fine-tuning tutorials, and how-to guides for downstream applications and the various stages of physical AI development.

Starter Kits by Use Case

Synthetic Data Generation

Build and deploy world models for infinite domain-specific synthetic data.

Policy Model Development

Fine-tune Cosmos WFMs to build policy models for mapping a physical AI system’s states to optimal actions based on learned behavior or rules.

Policy Model Validation

Cosmos WFMs fine-tuned on validation data can accelerate the initialization, evaluation, validation, and benchmarking of policy models before real-world deployment.

Starter Kits by Model Development Stage

Process and Curate Video Data

NeMo Curator generates high-quality training data with scalable pipelines that efficiently handle 100+ PB of data. With out-of-the-box-optimized performance that delivers a 35X speedup, NeMo Curator minimizes processing costs and accelerates time-to-market.

Tokenize Training Data

Cosmos tokenizers for images and videos offer up to 8X better compression and 12X faster speeds than open tokenizers, reducing computational costs.

Train and Customize

NVIDIA NeMo accelerates the development of world models by efficiently training and fine-tuning multimodal models at scale with popular customization techniques like LoRA and SFT.
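For readers unfamiliar with LoRA, the core idea can be shown with plain Python lists. This is a conceptual sketch, not NeMo's API: the helper names `matmul` and `lora_weight` are hypothetical, and the matrices are tiny illustrative values. LoRA keeps the pre-trained weight W frozen and learns a low-rank update, so the effective weight is W + (alpha / r) · B · A, where r is the rank.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, adequate for tiny illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w: Matrix, a: Matrix, b: Matrix, alpha: float) -> Matrix:
    """Effective weight W + (alpha / r) * B @ A, with A of shape (r, d_in)
    and B of shape (d_out, r)."""
    r = len(a)  # rank of the update
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # frozen pre-trained weight (2 x 3)
A = [[1.0, 2.0, 3.0]]                   # trainable, shape (r=1, d_in=3)
B_init = [[0.0], [0.0]]                 # B starts at zero: no change at first
B_tuned = [[0.5], [-1.0]]               # illustrative values after tuning

print(lora_weight(W, A, B_init, alpha=2.0))   # identical to W
print(lora_weight(W, A, B_tuned, alpha=2.0))  # [[2.0, 2.0, 3.0], [-2.0, -3.0, -6.0]]
```

Only A and B are trained, which is r × (d_in + d_out) parameters instead of d_in × d_out; the saving becomes significant when the layer dimensions are much larger than the rank.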


NVIDIA Cosmos Learning Library


More Resources

NVIDIA Developer Forums

Sign Up for the Developer Newsletter

NVIDIA Training and Certification

Read Cosmos FAQ

NVIDIA Inception Program for Startups

Accelerate Your Startup


    Ethical Considerations

    NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

    For more detailed information on ethical considerations for this model, please see the System Card, Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI concerns here.

    Get Started With NVIDIA Cosmos Today

    Try Now