Doist/doistx-normalize

doistx-normalize

Kotlin Multiplatform (KMP) library that adds support for normalization as described by Unicode Standard Annex #15 - Unicode Normalization Forms, by extending the String class with a normalize(Form) method.

All normalization forms are supported:

  • Form.NFC: Normalization Form C, canonical decomposition followed by canonical composition.
  • Form.NFD: Normalization Form D, canonical decomposition.
  • Form.NFKC: Normalization Form KC, compatibility decomposition followed by canonical composition.
  • Form.NFKD: Normalization Form KD, compatibility decomposition.
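
To make decomposition and composition concrete, the sketch below prints the UTF-16 code units before and after normalization. It is an illustration only: it uses the JDK's java.text.Normalizer as a JVM-only stand-in so that it runs without dependencies, and the helper names (codeUnits, decompose, compose) are hypothetical, not part of this library.

```kotlin
import java.text.Normalizer

// Formats each UTF-16 code unit as U+XXXX so that otherwise invisible
// differences between normalization forms become visible.
fun codeUnits(s: String): String =
    s.map { "U+%04X".format(it.code) }.joinToString(" ")

// JVM-only stand-ins (assumption: used here instead of the library's
// String.normalize(Form) to keep the sketch dependency-free).
fun decompose(s: String): String = Normalizer.normalize(s, Normalizer.Form.NFD)
fun compose(s: String): String = Normalizer.normalize(s, Normalizer.Form.NFC)
```

With these helpers, codeUnits(decompose("\u00C4")) gives "U+0041 U+0308" (the base letter followed by a combining diaeresis), and codeUnits(compose("A\u0308")) gives "U+00C4" (the single precomposed character).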

Usage

"Äffin".normalize(Form.NFC) // => "Äffin"
"Äffin".normalize(Form.NFD) // => "A\u0308ffin"
"Äffin".normalize(Form.NFKC) // => "Äffin"
"Äffin".normalize(Form.NFKD) // => "A\u0308ffin"

"Henry \u2163".normalize(Form.NFC) // => "Henry \u2163"
"Henry \u2163".normalize(Form.NFD) // => "Henry \u2163"
"Henry \u2163".normalize(Form.NFKC) // => "Henry IV"
"Henry \u2163".normalize(Form.NFKD) // => "Henry IV"

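A common reason to normalize is comparing strings that look identical but are encoded differently. The following is a minimal sketch of that idea, not this library's API: it relies on java.text.Normalizer so it can run on a plain JVM, and the function names are hypothetical. In common KMP code, the same comparisons would presumably be written with normalize(Form.NFC) and normalize(Form.NFKC) as shown above.

```kotlin
import java.text.Normalizer

// True when both strings are canonically equivalent, i.e. they are
// identical once both are brought to NFC.
fun canonicallyEqual(a: String, b: String): Boolean =
    Normalizer.normalize(a, Normalizer.Form.NFC) ==
        Normalizer.normalize(b, Normalizer.Form.NFC)

// Builds a looser comparison key with NFKC, which also folds compatibility
// characters such as U+2163 (ROMAN NUMERAL FOUR) into "IV".
fun compatibilityKey(s: String): String =
    Normalizer.normalize(s, Normalizer.Form.NFKC)
```

Here canonicallyEqual("\u00C4ffin", "A\u0308ffin") is true, while canonicallyEqual("Henry \u2163", "Henry IV") is false because the Roman numeral is only a compatibility equivalent; compatibilityKey("Henry \u2163") returns "Henry IV".
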
Setup

repositories {
   mavenCentral()
}

kotlin {
   sourceSets {
      val commonMain by getting {
         dependencies {
            implementation("com.doist.x:normalize:1.1.1")
         }
      }
   }
}

Development

Building this project can be tricky, as cross-compilation in KMP is not widely supported. In this case:

  • macOS and iOS targets must be built on macOS.
  • Windows targets should be built on Windows (or a JDK under Wine).
  • Linux targets must be built on Linux because they depend on libunistring.
  • JVM/Android and JS targets can be cross-compiled.

The defaults can be adjusted using two project properties:

  • targets is a string that selects which targets to build, test, or publish, depending on the task that runs.
    • all (default): All possible targets on the current host.
    • native: Native targets only (e.g., on macOS, that's macOS, iOS, watchOS and tvOS).
    • common: Common targets only (e.g., JVM, JS, Wasm).
    • host: Host OS only.
  • publishRootTarget is a boolean that indicates whether the kotlinMultiplatform root publication is included when publishing enabled targets (the root publication can only be published once).
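
Gradle project properties like these are normally supplied with -P on the command line (for example, -Ptargets=native) or in gradle.properties. As a hedged illustration of how a build script could read them, here is a minimal build.gradle.kts sketch. The variable names and the fallback for publishRootTarget are assumptions; this is not the project's actual build script.

```kotlin
// build.gradle.kts: hypothetical fragment, not this project's actual build script.

// Falls back to "all", the default documented above.
val targets: String = findProperty("targets")?.toString() ?: "all"

// Assumed fallback of false; the README does not state the real default.
val publishRootTarget: Boolean =
    findProperty("publishRootTarget")?.toString()?.toBoolean() ?: false

println("targets=$targets, publishRootTarget=$publishRootTarget")
```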

When targets are built, tested and published in CI/CD, the Apple host handles Apple-specific targets, the Windows host handles Windows, and Linux handles everything else.

Release

To release a new version, ensure CHANGELOG.md is up to date, and push the corresponding tag (e.g., v1.2.3). GitHub Actions handles the rest.

License

Released under the MIT License.

Unicode's normalization test suite is subject to this license.