codepages.nlp binary in System.Text.Encoding.CodePages #81693

Open
@premun

Description

Context

The key goal of source-build is to satisfy the official packaging rules of commonly used Linux distributions, such as Fedora and Debian. Many Linux distributions have similar rules. These rules tend to have two main principles: consistent reproducibility, and source code for everything.

To support the "source code for everything" requirement, binary files are not allowed in product repositories. Beyond that, binaries that can be produced from source during the build are better left out of the repository, since one of the main goals of git is that humans can review code changes.
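
For illustration, here is a rough sketch of how checked-in binaries could be spotted in a local checkout. It is not source-build's actual tooling. The NUL-byte heuristic, the 8000-byte sniff window, and the names `looks_binary` and `find_binaries` are all assumptions made for this example.

```python
from pathlib import Path


def looks_binary(path, sniff_bytes=8000):
    """Heuristic (assumption): a file is binary if its first bytes contain a NUL."""
    with open(path, "rb") as f:
        return b"\0" in f.read(sniff_bytes)


def find_binaries(root):
    """Yield files under root that look binary, skipping anything inside .git."""
    for p in sorted(Path(root).rglob("*")):
        if p.is_file() and ".git" not in p.parts and looks_binary(p):
            yield p


if __name__ == "__main__":
    # Hypothetical usage: scan the current directory and list suspects.
    for hit in find_binaries("."):
        print(hit)
```

A heuristic like this can misclassify files (for example, UTF-16 text contains NUL bytes), so any hits would still need a manual look.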

Questions

  • What scenario / which RID are these files used for?
  • Are these files necessary for a successful build of the .NET SDK?
  • If they are, can they be removed from the repository and replaced with source and a process that synthesizes them during the build?

Goal

We should comply with the source build requirements and get rid of these binaries. The file in question is https://github.com/dotnet/runtime/blob/main/src/libraries/System.Text.Encoding.CodePages/src/Data/codepages.nlp

Based on the discussion here, it seems possible to synthesize this file from source, but the current tool that does so is written in Perl.
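
If the file does get generated during the build, one sanity check would be to confirm that the generated output is byte-for-byte identical to the currently checked-in `codepages.nlp`. A minimal sketch, assuming both files are available locally; the helper names and the example paths are hypothetical:

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content(checked_in, regenerated):
    """True if the two files have identical contents (compared by digest)."""
    return sha256_of(checked_in) == sha256_of(regenerated)


# Hypothetical usage; both paths are placeholders:
# same_content("src/Data/codepages.nlp", "artifacts/obj/codepages.nlp")
```

A match would show that the generator reproduces the checked-in data; a mismatch would mean the generator or its inputs have drifted from what produced the existing file.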

Possible workarounds

At the moment, we only source-build Linux x64/arm64, so if this file is required only for other RIDs, it can be temporarily excluded from the source build. This applies only if replacing the file with source proves difficult. Other platforms will be supported by source build in the future, though, so the problem will resurface if we work around it this way.
