A curated list of awesome big data frameworks, resources and other awesomeness.
Updated May 7, 2024
Open-Source Web UI for Apache Kafka Management
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Fancy stream processing made operationally mundane
The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.
🌊 Online machine learning in Python
Readyset is a MySQL and Postgres wire-compatible caching layer that sits in front of existing databases to speed up queries and horizontally scale read throughput. Under the hood, Readyset caches the results of SELECT statements and incrementally updates these results as the underlying data changes.
Lean and mean distributed stream processing system written in Rust and WebAssembly. Alternative to Kafka + Flink in one.
Utils for streaming large files (S3, HDFS, gzip, bz2...)
Open-source graph database, tuned for dynamic analytics environments. Easy to adopt, scale and own.
Pravega - Streaming as a new software defined storage primitive
A lightweight stream processing library for Go
Python Stream Processing
Python stream processing for Kafka
Trill is a single-node query processor for temporal or streaming data.
Real-time stream processing for Python
📐 Pushing the boundaries of simplicity
⚡ Single-pass algorithms for statistics
A machine learning package for streaming data in Python. The other ancestor of River.
HStreamDB is an open-source, cloud-native streaming database for IoT and beyond. Modernize your data stack for real-time applications.