[datagen] Integer range of VARCHAR column results in large memory allocation #3353

Open
@snkas

Description

Reproduction steps

-- The datagen connector is given an integer range for a column of type VARCHAR.
-- The data generator then allocates memory proportional to the bounds of the
-- range rather than to its width. For large bounds, this results in an out-of-memory error.
-- The range below results in about 4 GiB of stable memory usage (the peak is about 2-3x that).
-- Adding extra zeros to the bounds will result in out-of-memory on most systems.
CREATE TABLE t1 (
  val1 VARCHAR(100)
) WITH (
  'connectors' = '[{
    "transport": {
      "name": "datagen",
      "config": {
        "plan": [{
          "limit": 1,
          "rate": 1,
          "fields": {
            "val1": {
              "strategy": "uniform",
              "range": [ 4000000000, 4000000001 ]
            }
          }
        }]
      }
    }
  }]'
);

Expected behavior

The generator should either allocate memory proportional only to the width of the range (not to its bounds), or reject integer ranges for string types.
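
To make the two options concrete, here is a minimal Python sketch. It is not the datagen implementation, and the function names are hypothetical. It also assumes that the range is half-open, `[lo, hi)`, and that a string value drawn from an integer range is simply the decimal rendering of the sampled integer; neither assumption is confirmed by this issue.

```python
import random


def uniform_string_from_range(lo: int, hi: int, rng: random.Random) -> str:
    """Option 1 (hypothetical): sample directly from the range.

    Assumes a half-open range [lo, hi). Memory use is constant: no
    collection sized by `hi` (or by `hi - lo`) is ever built.
    """
    if lo >= hi:
        raise ValueError(f"empty range: [{lo}, {hi})")
    # randrange picks one integer in [lo, hi) without materializing the range.
    return str(rng.randrange(lo, hi))


def reject_integer_range_for_string(column_type: str, field_settings: dict) -> None:
    """Option 2 (hypothetical): refuse the configuration up front."""
    string_types = {"VARCHAR", "CHAR", "STRING"}  # assumed set of string types
    base_type = column_type.split("(")[0].strip().upper()
    if base_type in string_types and "range" in field_settings:
        raise ValueError(
            f"integer 'range' is not supported for column type {column_type}"
        )


if __name__ == "__main__":
    rng = random.Random(0)
    # The bounds from the reproduction steps; only one value is possible here.
    print(uniform_string_from_range(4_000_000_000, 4_000_000_001, rng))
```

Either approach keeps memory use independent of the magnitude of the bounds; the first keeps the current configuration working, while the second turns the problem into an early configuration error.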

Labels: adapters (issues related to the adapters crate)

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions