
Is it really byte-level? #61

Open
@LuCeHe

Description

From your paper, it seems that byte-level classification decomposes a character, e.g. 'C', into its binary representation, something like 000101110. Your code, however, gives back 68, which I don't think is what you intended, because that is simply a char-level representation.

Am I wrong?

Your dataset would still fulfill its purpose of using very long sequences, but I think it is char-level rather than byte-level.
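
To make the distinction concrete, here is a minimal sketch of the two encodings. It is not taken from the repository: the function names `char_to_int` and `char_to_bits` are hypothetical, and it does not try to reproduce the value 68 reported above, since plain `ord('C')` gives 67 and any offset in the actual code is unknown to me.

```python
from typing import List


def char_to_int(ch: str) -> int:
    """Char-level encoding: one integer per character (its code point)."""
    return ord(ch)


def char_to_bits(ch: str) -> List[int]:
    """Binary decomposition: 8 bits (most significant first) per UTF-8 byte."""
    return [int(bit) for byte in ch.encode("utf-8") for bit in format(byte, "08b")]


print(char_to_int("C"))   # 67
print(char_to_bits("C"))  # [0, 1, 0, 0, 0, 0, 1, 1]
```

With the first encoding a sequence has one token per character; with the second it would be eight times longer for ASCII text, which is the reading I had taken from the paper.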
