Versatile Data Kit (VDK) is an open-source framework that enables anyone with basic SQL or Python knowledge to build, run, and manage their own data workflows.
Data processing instructions are written as plain-text SQL or Python files, which are executed sequentially in alphanumeric order of their file names, so a workflow is simply an ordered set of files.
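The alphanumeric ordering described above can be illustrated with a minimal sketch. It is not VDK's own implementation; it assumes plain string sorting of file names, and the file names and the `order_steps` helper are hypothetical.

```python
from pathlib import PurePath

# Assumption: only SQL and Python files count as steps.
STEP_SUFFIXES = {".sql", ".py"}


def order_steps(filenames):
    """Return the step files sorted by name, skipping non-step files."""
    steps = [f for f in filenames if PurePath(f).suffix in STEP_SUFFIXES]
    return sorted(steps)


# Hypothetical contents of a data job directory.
job_files = ["20_transform.sql", "README.md", "10_ingest.py", "30_export.py"]
print(order_steps(job_files))
# ['10_ingest.py', '20_transform.sql', '30_export.py']
```

Under plain string sorting, `10_a.sql` sorts before `2_b.sql`, so zero-padded or evenly spaced numeric prefixes (such as `10_`, `20_`, `30_`) keep the order predictable.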
VDK is built for resiliency: it can recover mid-process or restart entirely from the beginning.
VDK creates data processing workflows to:
- Ingest data (extract)
- Transform data (transform)
- Export data (load)
- Ingest data from different sources, including CSV files, JSON objects, and data from REST API services.
- Use Python/SQL and VDK templates to transform data.
- Ensure data applications are packaged, versioned, and deployed correctly while dealing with credentials, retries, and reconnects.
- Provide built-in monitoring and smart notification capabilities.
- Track both code and data modifications and the relationship between them, allowing quicker troubleshooting and version rollback.

VDK has two main components:
- Software Development Kit (SDK):
  - Tools to automate the extraction, transformation, and loading of data.
  - A plugin framework that allows users to extend the framework according to their specific requirements.
- Control Service: Allows users to create, deploy, manage, and execute data jobs in a Kubernetes runtime environment.
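
As a rough illustration of the ingest and transform items listed above, a Python step might look like the sketch below. It assumes the `run(job_input)` entry point and the `send_object_for_ingestion` and `execute_query` methods of VDK's job input interface. The table names, the sample records, and the `FakeJobInput` stand-in (used only so the sketch runs without VDK installed) are hypothetical.

```python
import json

# Hypothetical sample input; a real job might read a file or call a REST API.
RAW_JSON = '[{"id": 1, "country": "BG"}, {"id": 2, "country": "US"}]'


def run(job_input):
    """Step entry point. In a real data job, VDK supplies job_input."""
    # Ingest: send each JSON record to a destination table (hypothetical name).
    for record in json.loads(RAW_JSON):
        job_input.send_object_for_ingestion(
            payload=record, destination_table="raw_visits"
        )
    # Transform: hand a SQL statement to the job's database connection.
    job_input.execute_query(
        "CREATE TABLE IF NOT EXISTS visits_by_country AS "
        "SELECT country, COUNT(*) AS visits FROM raw_visits GROUP BY country"
    )


class FakeJobInput:
    """Stand-in that only records calls; it is not part of VDK."""

    def __init__(self):
        self.ingested = []
        self.queries = []

    def send_object_for_ingestion(self, payload, destination_table, **kwargs):
        self.ingested.append((destination_table, payload))

    def execute_query(self, sql):
        self.queries.append(sql)
        return []


fake = FakeJobInput()
run(fake)
print(len(fake.ingested), "records ingested;", len(fake.queries), "query executed")
```

Because the step only talks to `job_input`, the same function can be exercised with a stand-in like the one here before it is run as part of a real data job.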
(Figure: a preview of the VDK CLI commands.)
VDK installs with a single pip command. See the Getting Started guide to install VDK and create your first data job.
- See use case examples that show how VDK fits into the data workflow.
- See the documentation for VDK.
- Read the article about using the Versatile Data Kit and Trino DB.
- Join us at a community meeting.
Create an issue or pull request on GitHub to submit suggestions or changes. If you are interested in contributing as a developer, visit the contributing page.
- Connect on Slack by:
  - Joining the CNCF Slack workspace.
  - Joining the #versatile-data-kit channel.
- Follow us on Twitter.
- Subscribe to the Versatile Data Kit YouTube Channel.
- Join our development mailing list, used by developers and maintainers of VDK.
Everyone who works on the project's source code or participates in its issue trackers, Slack channels, or mailing lists is expected to be familiar with and follow the Code of Conduct.