eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Random Sampling in Clojure
Efficient reservoir sampling implementation for PyTorch
Performs memory-efficient reservoir sampling on very large input files delimited by newlines
A collection of algorithms in Java 8 for the problem of random sampling with a reservoir
Sampling methods for data streams
Produce a sample of lines from files.
Sample documents from MongoDB collections.
Python implementation of fast approximate reservoir sampling.
SAT'18 Paper: SPUR - Satisfying Perfectly Uniform Random sampler (Winner Best Student Paper)
Reservoir sampling implementation with akka-streams support
Stream sampler that picks a random (representative) sample of size k from a stream of values with unknown and possibly very large length.
A fast implementation of Reservoir Sampling with Immutable Persistent data structures.
Data and processor parallelism for fast weighted sampling
Output randomly sampled lines from input stream or file
Reservoir sampling for group-by queries on the Flink platform, for effectively answering single-aggregate queries.
USC DSCI 553 - Foundations & Applications of Data Mining - Spring 2024 - Prof. Wei-Min Shen
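Several of the entries above describe the same core task: drawing a uniform random sample of size k from a stream whose length is unknown and possibly very large. The following is a minimal sketch of the classic single-pass approach (commonly called Algorithm R). It is not taken from any of the listed repositories; the function name `reservoir_sample` and its parameters are hypothetical and chosen for illustration only.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to k items chosen uniformly at random from stream.

    Hypothetical helper for illustration; makes a single pass and keeps
    only k items in memory.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number i + 1 replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 5 values from a generator without materializing it.
    print(reservoir_sample((n * n for n in range(1_000_000)), 5))
```

If the stream holds fewer than k items, the sketch simply returns all of them. Passing a seeded `random.Random` instance makes the result reproducible.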