Split/explode a multibyte binary into individual bytes #14891

Closed
@CSharperMantle

Description

Question

Is there an idiomatic way to turn a multibyte binary into a list<binary> or list<int> composed of many single-byte elements? Are there plans for each to support extracting bytes from binary in the future?

Additional context and details

I am working with binary files in Nushell. I want to iterate over the individual bytes to collect some statistics on them, but I have found no easy way to do so.

The each command treats a binary as a single multibyte value rather than as a sequence of bytes:

> open test.bin --raw | into binary
Length: unknown (stream) | printable whitespace ascii_other non_ascii
00000000:   00 01 02 03  04 05 06 07  08 09 aa bb  cc dd ee ff   0••••••••_××××××
> open test.bin --raw | into binary | each { |e| $e | into int | echo }
Error: nu::shell::eval_block_with_input

  × Eval block failed with pipeline input
   ╭─[entry #31:1:1]
 1 │ open test.bin --raw | into binary | each { |e| $e | into int | echo }
   · ──┬─
   ·   ╰── source value
   ╰────

Error: nu::shell::incorrect_value

  × Incorrect value.
   ╭─[entry #31:1:48]
 1 │ open test.bin --raw | into binary | each { |e| $e | into int | echo }
   ·                                                ─┬   ────┬───
   ·                                                 │       ╰── encountered here
   ·                                                 ╰── binary input is too large to convert to int (16 bytes)
   ╰────

> open test.bin --raw | into binary | each { |e| echo $e }
╭───┬──────────────────────────────────────────────────────────────╮
│ 0 │ [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 170, 187, 204, 221, 238, 255] │
╰───┴──────────────────────────────────────────────────────────────╯

bytes collect concatenates fragments of multibyte binaries into a single stream, but no inverse of this command is documented.

I know the effect can be achieved by indexing into a stored binary with bytes at, one byte at a time, as a workaround:

> 0..(($stream | length) - 1) | each { |i| $stream | bytes at $i..$i | into int }
╭────┬─────╮
│  0 │   0 │
│  1 │   1 │
│  2 │   2 │
│  3 │   3 │
│  4 │   4 │
│  5 │   5 │
│  6 │   6 │
│  7 │   7 │
│  8 │   8 │
│  9 │   9 │
│ 10 │ 170 │
│ 11 │ 187 │
│ 12 │ 204 │
│ 13 │ 221 │
│ 14 │ 238 │
│ 15 │ 255 │
╰────┴─────╯

However, it would be better if this could be written as a single pipeline, without the temporary $stream variable. Could each support this in the future?
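
One way to keep the call site as a single pipeline is to wrap the workaround above in a custom command that reads its input from $in. This is only a minimal sketch, assuming the same Nushell version as the transcripts above (in particular, that bytes at treats $i..$i as a one-byte range, as it does in the workaround). The name explode-bytes is hypothetical, not a built-in:

```nu
# Hypothetical helper (not part of Nushell): turn binary pipeline input
# into a list of ints, one per byte, using the same indexing workaround.
def explode-bytes [] {
    # Capture the pipeline input once so it can be indexed repeatedly.
    let data = $in

    # Guard the empty case: 0..-1 would otherwise count downwards.
    if ($data | length) == 0 {
        return []
    }

    0..(($data | length) - 1) | each { |i| $data | bytes at $i..$i | into int }
}

# The call site no longer needs a temporary variable.
open test.bin --raw | into binary | explode-bytes
```

The temporary still exists inside the command body, so this only hides the variable rather than removing the per-byte indexing cost. A built-in that splits a binary directly would still be preferable.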
