This repository has been archived by the owner on Nov 27, 2019. It is now read-only.

A utility that scans URL contents and returns a unicode-range value!

malchata/unicode-ranger

unicode-ranger

Get unicode ranges, subset fonts.

This is a Node module that scans one or more URLs and collects the unicode ranges their content uses. As it scans page content, it generates a separate unicode range for each font family, based on the font each element's text is rendered in. Here's the simplest use case, which analyzes the given URLs and returns unicode ranges:

import UnicodeRanger from "unicode-ranger";

// A single URL can be provided as a string:
let urlList = "https://en.wikipedia.org/wiki/Asian_giant_hornet";

// Or multiple URLs can be provided as an array:
urlList = ["https://en.wikipedia.org/wiki/Asian_giant_hornet", "https://en.wikipedia.org/wiki/Sphecius_speciosus", "https://en.wikipedia.org/wiki/Hemipepsis_ustulata"];

// You can also generate a list of URLs from a sitemap.xml file:
urlList = "https://example.com/sitemap.xml";

new UnicodeRanger(urlList).then((unicodeRanges) => {
  // Each key is a font family; each value is the unicode range found for it.
  for (const fontFamily in unicodeRanges) {
    console.log(`${fontFamily}:`);
    console.log(unicodeRanges[fontFamily]);
    console.log("");
  }
}).catch((err) => console.error(err)); // Report failures instead of discarding them.

The output of this might look something like this:

Open Sans:
U+A,U+20,U+2E,U+44,U+45,U+4D,U+54,U+59,U+61-69,U+6B-70,U+72-79
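
To make the notation concrete, here is a minimal sketch that expands a range string like the one above into individual code points. The `expandUnicodeRange` helper is hypothetical and not part of unicode-ranger, and it assumes the string contains only single values and hyphenated ranges (no `?` wildcards).

```javascript
// Hypothetical helper (not part of unicode-ranger): expand a unicode-range
// string such as "U+A,U+20,U+61-69" into an array of numeric code points.
// Assumes only single values and hyphenated ranges, with no "?" wildcards.
function expandUnicodeRange(rangeString) {
  const codePoints = [];
  for (const part of rangeString.split(",")) {
    // "U+61-69" -> start "61", end "69"; "U+20" -> start "20", end defaults to start.
    const [start, end = start] = part.trim().replace(/^U\+/i, "").split("-");
    const first = parseInt(start, 16);
    const last = parseInt(end, 16);
    for (let cp = first; cp <= last; cp++) {
      codePoints.push(cp);
    }
  }
  return codePoints;
}

const sample = "U+44,U+45,U+61-63";
const characters = expandUnicodeRange(sample).map((cp) => String.fromCodePoint(cp));
console.log(characters.join("")); // prints "DEabc"
```

Read this way, the sample output for Open Sans covers a line feed (U+A), a space (U+20), a period (U+2E), and a handful of Latin letters.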

You can use the unicode ranges generated by this utility as values for the unicode-range CSS descriptor in @font-face rules. unicode-ranger will also subset TrueType fonts with pyftsubset (part of the fontTools Python package) if you provide a subset map of font families to source files. Here's an example of that in action:

const options = {
  subsetMap: {
    "Monoton": {
      files: "./monoton.ttf"
    },
    "Fira Sans": {
      files: ["./fira-sans-regular.ttf", "./fira-sans-regular-italic.ttf", "./fira-sans-bold.ttf"]
    },
    "Fredoka One": {
      files: "./fredoka-one.ttf"
    },
  }
};

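new UnicodeRanger(urlList, options).then((unicodeRanges) => {
  // Each key is a font family; each value is the unicode range found for it.
  for (const fontFamily in unicodeRanges) {
    console.log(`${fontFamily}:`);
    console.log(unicodeRanges[fontFamily]);
    console.log("");
  }
}).catch((err) => console.error(err)); // Report failures instead of discarding them.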
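
As a rough illustration of where the returned ranges end up, the sketch below turns a family name and its range into an @font-face rule. The `toFontFaceRule` helper, the font URL, and the sample object are assumptions made for this example; unicode-ranger itself is not documented as generating CSS.

```javascript
// Hypothetical helper (not part of unicode-ranger): build an @font-face rule
// from a font family, a unicode-range string, and the URL of a font file.
function toFontFaceRule(fontFamily, unicodeRange, fontUrl) {
  return [
    "@font-face {",
    `  font-family: "${fontFamily}";`,
    `  src: url("${fontUrl}") format("truetype");`,
    `  unicode-range: ${unicodeRange};`,
    "}",
  ].join("\n");
}

// Stand-in for the object the promise resolves with; the URL is made up.
const sampleRanges = { "Open Sans": "U+A,U+20,U+2E,U+44,U+45" };

for (const [fontFamily, range] of Object.entries(sampleRanges)) {
  console.log(toFontFaceRule(fontFamily, range, "/fonts/open-sans.ttf"));
}
```

With a rule like this, the browser only needs to download the font file when the page contains characters inside the listed range.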

Options

This section is mostly todo, but here are some options currently in use:

  • verbose (default: false)
    unicode-ranger can be a bit quiet. If you want to know more of what it's up to under the hood, flip this to true.
  • subsetMap (default: undefined)
    A mapping of font families to relative locations of TrueType font files. An example of this option in action is shown above.
  • excludeElements (default: ["SCRIPT", "BR", "TRACK", "WBR", "PARAM", "HR", "LINK", "OBJECT", "STYLE", "PICTURE", "IMG", "AUDIO", "VIDEO", "SOURCE", "EMBED", "APPLET", "TITLE", "META", "HEAD"])
    An array of HTML tags to ignore. Stick with the defaults for now. Tags are in uppercase because the DOM API's tagName property returns uppercase names for elements in HTML documents.
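
The sketch below shows why the case of the tag names matters. The `isExcluded` check is a guess at how such a list could be applied, not unicode-ranger's actual code, and plain objects stand in for DOM elements. The options object at the end uses only the option names listed above.

```javascript
// The default exclusion list, copied from the option description above.
const excludeElements = [
  "SCRIPT", "BR", "TRACK", "WBR", "PARAM", "HR", "LINK", "OBJECT", "STYLE",
  "PICTURE", "IMG", "AUDIO", "VIDEO", "SOURCE", "EMBED", "APPLET", "TITLE",
  "META", "HEAD",
];

// Hypothetical check (an assumption, not the library's implementation):
// an element is skipped when its tagName appears in the exclusion list.
function isExcluded(element, excluded = excludeElements) {
  return excluded.includes(element.tagName);
}

// Plain objects stand in for DOM elements here.
console.log(isExcluded({ tagName: "SCRIPT" })); // true
console.log(isExcluded({ tagName: "P" }));      // false
console.log(isExcluded({ tagName: "script" })); // false: the comparison is case-sensitive

// An options object that turns on verbose output and keeps the default list.
const sampleOptions = { verbose: true, excludeElements };
```

Because the comparison is an exact string match in this sketch, a lowercase entry such as "script" would never match the uppercase tagName of an HTML element.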

Other options are defined, but they may not be fully implemented yet. Which brings me to the following section...

This is not production ready. At all.

unicode-ranger is a work in progress, so this documentation may not accurately reflect everything it can do at the time of writing. There's also a CLI for unicode-ranger, but it only supports the v1 release. v2 is a total rewrite with a bunch of new features being added, so the CLI is currently incompatible with v2.

That's it.
