Query the Unicode database from the commandline, with good support for emojis

uni queries the Unicode database from the commandline.

There are four commands: identify to describe the codepoints in a string, search to find codepoints by name, print to list codepoints by class, block, or range, and emoji to find emojis.

It includes full support for Unicode 12.1 (May 2019), including emoji sequences, which a surprisingly large number of emoji pickers don't handle very well.

There are binaries on the releases page; alternatively, compile from source with go get arp242.net/uni, which puts the binary at ~/go/bin/uni.

Integrations

  • dmenu and rofi script at dmenu-uni. See the top of the script for some options you may want to frob with.

  • For a Vim command see uni.vim; just copy/paste it in your vimrc.

Usage

Identify a character:

$ uni identify €
     cpoint  dec    utf-8      html       name
'€'  U+20AC  8364   e2 82 ac    €     EURO SIGN (Currency_Symbol)

Or a string; i is a shortcut for identify:

$ uni i h€ý

     cpoint  dec    utf-8       html       name
'h'  U+0068  104    68          h     LATIN SMALL LETTER H (Lowercase_Letter)
'€'  U+20AC  8364   e2 82 ac    €     EURO SIGN (Currency_Symbol)
'ý'  U+00FD  253    c3 bd       ý   LATIN SMALL LETTER Y WITH ACUTE (Lowercase_Letter)
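
For illustration, the cpoint, dec, and utf-8 columns can be derived with nothing but the Go standard library. This is a minimal sketch, not uni's actual code; describe is a hypothetical helper, and the name column is left out because it needs the Unicode database:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper: it formats the codepoint, decimal
// value, and UTF-8 bytes of r, roughly like one row of the table above.
func describe(r rune) string {
	buf := make([]byte, utf8.UTFMax)
	n := utf8.EncodeRune(buf, r)
	// "% x" prints the bytes in hex, separated by spaces.
	return fmt.Sprintf("'%c'  U+%04X  %d  % x", r, r, r, buf[:n])
}

func main() {
	// Ranging over a string yields runes (codepoints), not bytes.
	for _, r := range "h€ý" {
		fmt.Println(describe(r))
	}
}
```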

It reads from stdin:

$ head -c2 README.markdown | uni i
     cpoint  dec    utf-8       html       name
'['  U+005B  91     5b          [     LEFT SQUARE BRACKET (Open_Punctuation)
'!'  U+0021  33     21          !     EXCLAMATION MARK (Other_Punctuation)

Search by description:

$ uni search euro
     cpoint  dec    utf-8       html       name
'₠'  U+20A0  8352   e2 82 a0    ₠   EURO-CURRENCY SIGN (Currency_Symbol)
'€'  U+20AC  8364   e2 82 ac    €     EURO SIGN (Currency_Symbol)
'𐡷'  U+10877 67703  f0 90 a1 b7 𐡷  PALMYRENE LEFT-POINTING FLEURON (Other_Symbol)
'𐡸'  U+10878 67704  f0 90 a1 b8 𐡸  PALMYRENE RIGHT-POINTING FLEURON (Other_Symbol)
'𐫱'  U+10AF1 68337  f0 90 ab b1 𐫱  MANICHAEAN PUNCTUATION FLEURON (Other_Punctuation)
'🌍' U+1F30D 127757 f0 9f 8c 8d 🌍  EARTH GLOBE EUROPE-AFRICA (Other_Symbol)
'🏤' U+1F3E4 127972 f0 9f 8f a4 🏤  EUROPEAN POST OFFICE (Other_Symbol)
'🏰' U+1F3F0 127984 f0 9f 8f b0 🏰  EUROPEAN CASTLE (Other_Symbol)
'💶' U+1F4B6 128182 f0 9f 92 b6 💶  BANKNOTE WITH EURO SIGN (Other_Symbol)

The s command is a shortcut for search. Multiple words are matched individually:

$ uni s globe earth
     cpoint  dec    utf-8       html       name
'🌍' U+1F30D 127757 f0 9f 8c 8d 🌍  EARTH GLOBE EUROPE-AFRICA (Other_Symbol)
'🌎' U+1F30E 127758 f0 9f 8c 8e 🌎  EARTH GLOBE AMERICAS (Other_Symbol)
'🌏' U+1F30F 127759 f0 9f 8c 8f 🌏  EARTH GLOBE ASIA-AUSTRALIA (Other_Symbol)

Use standard shell quoting for more literal matches:

$ uni s rightwards black arrow
     cpoint  dec    utf-8       html       name
'➡'  U+27A1  10145  e2 9e a1    ➡   BLACK RIGHTWARDS ARROW (Other_Symbol)
'➤'  U+27A4  10148  e2 9e a4    ➤   BLACK RIGHTWARDS ARROWHEAD (Other_Symbol)
[..]

$ uni s 'rightwards black arrow'
     cpoint  dec    utf-8       html       name
'⮕'  U+2B95  11157  e2 ae 95    ⮕   RIGHTWARDS BLACK ARROW (Other_Symbol)
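
The difference between the two invocations comes down to how the shell splits arguments. As a rough sketch, assuming each argument is matched as a case-insensitive substring of the name (which is what the FLEURON results for euro suggest), the matching could look like this; matchAll is a hypothetical name, not part of uni:

```go
package main

import (
	"fmt"
	"strings"
)

// matchAll reports whether name contains every term as a substring,
// ignoring case. Each term corresponds to one shell argument.
func matchAll(name string, terms []string) bool {
	name = strings.ToLower(name)
	for _, t := range terms {
		if !strings.Contains(name, strings.ToLower(t)) {
			return false
		}
	}
	return true
}

func main() {
	names := []string{
		"BLACK RIGHTWARDS ARROW",
		"BLACK RIGHTWARDS ARROWHEAD",
		"RIGHTWARDS BLACK ARROW",
	}
	unquoted := []string{"rightwards", "black", "arrow"} // three arguments
	quoted := []string{"rightwards black arrow"}         // one argument

	for _, n := range names {
		fmt.Printf("%-28s unquoted=%-5v quoted=%v\n",
			n, matchAll(n, unquoted), matchAll(n, quoted))
	}
}
```

Unquoted, the three words can appear in any order; quoted, the whole phrase has to appear as-is.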

The print command (shortcut p) can be used to print specific codepoints or groups of codepoints:

$ uni print U+2042
     cpoint  dec    utf-8       html       name
'⁂'  U+2042  8258   e2 81 82    ⁂   ASTERISM (Other_Punctuation)

Print a custom range; U+2042, U2042, and 2042 are all equivalent:

$ uni print 2042..2044
     cpoint  dec    utf-8       html       name
'⁂'  U+2042  8258   e2 81 82    ⁂   ASTERISM (Other_Punctuation)
'⁃'  U+2043  8259   e2 81 83    ⁃   HYPHEN BULLET (Other_Punctuation)
'⁄'  U+2044  8260   e2 81 84    ⁄    FRACTION SLASH (Math_Symbol)
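
A sketch of how such a range argument could be parsed, accepting all three spellings. The function names are hypothetical and uni's real parser may well differ; strings.Cut needs Go 1.18 or newer:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

// parseCodepoint parses "U+2042", "U2042", or "2042" as a hexadecimal codepoint.
func parseCodepoint(s string) (rune, error) {
	hex := strings.TrimPrefix(strings.ToUpper(s), "U")
	hex = strings.TrimPrefix(hex, "+")
	n, err := strconv.ParseUint(hex, 16, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid codepoint %q: %w", s, err)
	}
	if n > unicode.MaxRune {
		return 0, fmt.Errorf("codepoint %q is out of range", s)
	}
	return rune(n), nil
}

// parseRange parses "start..end" into an inclusive pair of codepoints.
func parseRange(s string) (start, end rune, err error) {
	lo, hi, ok := strings.Cut(s, "..")
	if !ok {
		return 0, 0, fmt.Errorf("not a range: %q", s)
	}
	if start, err = parseCodepoint(lo); err != nil {
		return 0, 0, err
	}
	if end, err = parseCodepoint(hi); err != nil {
		return 0, 0, err
	}
	return start, end, nil
}

func main() {
	start, end, err := parseRange("U+2042..2044")
	if err != nil {
		fmt.Println(err)
		return
	}
	for r := start; r <= end; r++ {
		fmt.Printf("'%c'  U+%04X  %d\n", r, r, r)
	}
}
```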

General category:

$ uni p Po
     cpoint  dec    utf-8       html       name
'!'  U+0021  33     21          !     EXCLAMATION MARK (Other_Punctuation)
'"'  U+0022  34     22          "     QUOTATION MARK (Other_Punctuation)
[..]
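
Go's standard unicode package exposes the same two-letter general categories, so a category lookup can be sketched as below. inCategory is a hypothetical helper, and the tables in the standard library follow the Unicode version of the Go release, which is not necessarily 12.1:

```go
package main

import (
	"fmt"
	"unicode"
)

// inCategory reports whether r belongs to the general category with the
// given abbreviation, e.g. "Po" for Other_Punctuation.
func inCategory(abbrev string, r rune) bool {
	table, ok := unicode.Categories[abbrev]
	return ok && unicode.Is(table, r)
}

func main() {
	// Walk part of the ASCII range and print only the Po codepoints.
	for r := rune(0x21); r <= 0x2F; r++ {
		if inCategory("Po", r) {
			fmt.Printf("'%c'  U+%04X\n", r, r)
		}
	}
}
```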

Blocks:

$ uni p arrows 'box drawing'
     cpoint  dec    utf-8       html       name
'←'  U+2190  8592   e2 86 90    ←     LEFTWARDS ARROW (Math_Symbol)
'↑'  U+2191  8593   e2 86 91    ↑     UPWARDS ARROW (Math_Symbol)
[..]
'─'  U+2500  9472   e2 94 80    ─     BOX DRAWINGS LIGHT HORIZONTAL (Other_Symbol)
'━'  U+2501  9473   e2 94 81    ━   BOX DRAWINGS HEAVY HORIZONTAL (Other_Symbol)
[..]

And finally, there is the emoji command (shortcut: e), which is the real reason I wrote this:

$ uni e cry
😢 crying face         Smileys & Emotion  face-concerned
😭 loudly crying face  Smileys & Emotion  face-concerned
😿 crying cat          Smileys & Emotion  cat-face
🔮 crystal ball        Activities         game

Filter by group:

$ uni e -groups hands
🤲 palms up together  People & Body  hands
🤝 handshake          People & Body  hands
👏 clapping hands     People & Body  hands
🙏 folded hands       People & Body  hands
👐 open hands         People & Body  hands
🙌 raising hands      People & Body  hands

Group and search can be combined:

$ uni e -groups cat-face grin
😺 grinning cat                    Smileys & Emotion  cat-face
😸 grinning cat with smiling eyes  Smileys & Emotion  cat-face

Apply skin tone modifiers with -tone:

$ uni e -tone dark -groups hands
🤲🏿 palms up together  People & Body  hands
🤝 handshake          People & Body  hands    [doesn't support skin tone; it's displayed correctly]
👏🏿 clapping hands     People & Body  hands
🙏🏿 folded hands       People & Body  hands
👐🏿 open hands         People & Body  hands
🙌🏿 raising hands      People & Body  hands
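
Under the hood a skin tone is just an extra codepoint after the base emoji. Here's a minimal sketch of that idea; the emoji struct, its modBase field (standing in for the Emoji_Modifier_Base property), and withTone are all hypothetical and not taken from uni's source:

```go
package main

import "fmt"

// Fitzpatrick modifiers; only the two tones used in this README are listed.
const (
	toneLight rune = 0x1F3FB // EMOJI MODIFIER FITZPATRICK TYPE-1-2
	toneDark  rune = 0x1F3FF // EMOJI MODIFIER FITZPATRICK TYPE-6
)

// emoji is a hypothetical record for a single-codepoint emoji.
type emoji struct {
	cp      rune   // base codepoint
	name    string // short name
	modBase bool   // whether a skin tone modifier may follow the codepoint
}

// withTone appends the modifier to emojis that accept one and returns
// the others unchanged.
func withTone(e emoji, tone rune) string {
	if !e.modBase {
		return string(e.cp)
	}
	return string([]rune{e.cp, tone})
}

func main() {
	hands := []emoji{
		{0x1F44F, "clapping hands", true},
		{0x1F91D, "handshake", false}, // no skin tone support, per the note above
		{0x1F64F, "folded hands", true},
	}
	for _, e := range hands {
		fmt.Printf("%s %s\n", withTone(e, toneDark), e.name)
	}
}
```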

The default is to display all genders ("person", "man", "woman"), but this can be filtered with the -gender option:

$ uni e -gender man -groups person-gesture
🙍‍♂️ man frowning      People & Body  person-gesture
🙎‍♂️ man pouting       People & Body  person-gesture
🙅‍♂️ man gesturing NO  People & Body  person-gesture
🙆‍♂️ man gesturing OK  People & Body  person-gesture
💁‍♂️ man tipping hand  People & Body  person-gesture
🙋‍♂️ man raising hand  People & Body  person-gesture
🧏‍♂️ deaf man          People & Body  person-gesture
🙇‍♂️ man bowing        People & Body  person-gesture
🤦‍♂️ man facepalming   People & Body  person-gesture
🤷‍♂️ man shrugging     People & Body  person-gesture

Both -tone and -gender accept multiple values. -gender woman,man will display both the female and male variants (in that order), and -tone light,dark will display both a light and a dark skin tone; use all to display all skin tones or genders:

$ uni e -tone light,dark -gender f,m shrug
🤷🏻‍♀️ woman shrugging: light skin tone  People & Body  person-gesture
🤷🏻‍♂️ man shrugging: light skin tone    People & Body  person-gesture
🤷🏿‍♀️ woman shrugging: dark skin tone   People & Body  person-gesture
🤷🏿‍♂️ man shrugging: dark skin tone     People & Body  person-gesture
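
The gendered variants above are emoji sequences: the base emoji, an optional skin tone, a zero-width joiner, a gender sign, and a variation selector. A rough sketch of how the four shrug variants can be assembled follows; gendered is a hypothetical helper, and not every gendered emoji is built this way:

```go
package main

import "fmt"

const (
	zwj  = 0x200D // ZERO WIDTH JOINER
	vs16 = 0xFE0F // VARIATION SELECTOR-16 (emoji presentation)
)

// gendered builds base [+ tone] + ZWJ + sign + VS16. Pass tone 0 to
// leave the skin tone modifier out.
func gendered(base, tone, sign rune) string {
	seq := []rune{base}
	if tone != 0 {
		seq = append(seq, tone)
	}
	return string(append(seq, zwj, sign, vs16))
}

func main() {
	const shrug = 0x1F937             // person shrugging
	tones := []rune{0x1F3FB, 0x1F3FF} // light, dark
	signs := []rune{0x2640, 0x2642}   // FEMALE SIGN, MALE SIGN

	// Tones in the outer loop, genders in the inner loop, which gives
	// the same ordering as the output above.
	for _, t := range tones {
		for _, s := range signs {
			e := gendered(shrug, t, s)
			fmt.Printf("%s  %U\n", e, []rune(e))
		}
	}
}
```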

Alternatives

CLI/TUI

  • https://github.com/sindresorhus/emoj

    Doesn't support emoji sequences (e.g. MAN SHRUGGING is PERSON SHRUGGING + MALE SIGN, FIREFIGHTER is PERSON + FIRE ENGINE, etc.); quite slow for a CLI program (emoj smiling takes 1.8s on my system, sometimes a lot longer); search results are pretty bad (shrug returns unamused face, thinking face, eyes, confused face, neutral face, tears of joy, and expressionless face ... but not the shrugging emoji); not a fan of npm (has 1862 dependencies).

  • https://github.com/Fingel/tuimoji

    Grouping could be better, doesn't support emoji sequences, interactive TUI only, feels kinda slow-ish, especially when searching.

GUI

Development

Re-generate the Unicode data with go generate unidata. Files are cached in unidata/.cache, so clear that directory if you want to fetch updated files from the remote.
