Emoji do not render in PDFs #144

Closed
gsurrel opened this issue Mar 22, 2023 · 26 comments · Fixed by #3853
Labels: bug (Something isn't working), good contribution (Good, isolated issue for someone that wants to contribute), pdf (Related to PDF export)

Comments

@gsurrel

gsurrel commented Mar 22, 2023

Hi,

I've just cloned the repo and compiled it. After compiling the following document, I get a blank page (in Evince) or some tofu (in Firefox):

#emoji.face.grin

The generated PDF is 10 MB, with the Noto Color Emoji font embedded. When I change the zoom level, Evince reports a font-related failure on stderr.

System: Ubuntu 22.04, cargo 1.68.0, typst 045a109

@ghost

ghost commented Mar 24, 2023

Looks like a font issue. I get the expected result:

https://github.com/typst/typst/files/11058297/emoji.pdf

The file is 18.2 KB. I am using the Segoe UI Emoji font.

@gsurrel
Author

gsurrel commented Mar 24, 2023

Hum, that's quite interesting:

#set text(font: (
  "Segoe UI Emoji"
))
Segoe Emoji here → #emoji.face.grin #emoji.amphora ← ?

#set text(font: (
  "Noto Color Emoji"
))
Noto Emoji here → #emoji.face.grin #emoji.amphora ← ?

With the following result:
image

Here is the file
test.pdf

@mdmrk

mdmrk commented Mar 25, 2023

System: Arch Linux, cargo 1.68.0, typst a253b47

Testing Apple Color Emoji font

`typst --fonts` output icomoon Apple Color Emoji Arial Bahnschrift C059 Calibri Cambria Cambria Math Candara Cantarell Comic Sans MS Consolas Constantia Corbel Courier New D050000L FontAwesome Franklin Gothic FreeMono FreeSans FreeSerif Gabriola Georgia Impact Ink Free JetBrains Mono JetBrains Mono NL JetBrainsMono Nerd Font JetBrainsMono Nerd Font Mono JetBrainsMonoNL Nerd Font JetBrainsMonoNL Nerd Font Mono Liberation Mono Liberation Sans Liberation Serif Lucida Console Lucida Sans Unicode Marlett Microsoft Sans Serif Nimbus Mono PS Nimbus Roman Nimbus Sans Noto Color Emoji Noto Fangsong KSS Rotated Noto Fangsong KSS Vertical Noto Looped Thai Noto Music Noto Naskh Arabic Noto Naskh Arabic UI Noto Nastaliq Urdu Noto Rashi Hebrew Noto Sans Noto Sans Adlam Noto Sans Adlam Unjoined Noto Sans Anatolian Hieroglyphs Noto Sans Arabic Noto Sans Armenian Noto Sans Avestan Noto Sans Balinese Noto Sans Bamum Noto Sans Bassa Vah Noto Sans Batak Noto Sans Bengali Noto Sans Bengali UI Noto Sans Bhaiksuki Noto Sans Brahmi Noto Sans Buginese Noto Sans Buhid Noto Sans Canadian Aboriginal Noto Sans Carian Noto Sans Caucasian Albanian Noto Sans Chakma Noto Sans Cham Noto Sans Cherokee Noto Sans Chorasmian Noto Sans CJK HK Noto Sans CJK JP Noto Sans CJK KR Noto Sans CJK SC Noto Sans CJK TC Noto Sans Coptic Noto Sans Cuneiform Noto Sans Cypriot Noto Sans Cypro Minoan Noto Sans Deseret Noto Sans Devanagari Noto Sans Devanagari UI Noto Sans Duployan Noto Sans Egyptian Hieroglyphs Noto Sans Elbasan Noto Sans Elymaic Noto Sans Ethiopic Noto Sans Georgian Noto Sans Glagolitic Noto Sans Gothic Noto Sans Grantha Noto Sans Gujarati Noto Sans Gujarati UI Noto Sans Gunjala Gondi Noto Sans Gurmukhi Noto Sans Gurmukhi UI Noto Sans Hanifi Rohingya Noto Sans Hanunoo Noto Sans Hatran Noto Sans Hebrew Noto Sans Imperial Aramaic Noto Sans Indic Siyaq Numbers Noto Sans Inscriptional Pahlavi Noto Sans Inscriptional Parthian Noto Sans Javanese Noto Sans Kaithi Noto Sans Kannada Noto Sans Kannada UI Noto 
Sans Kayah Li Noto Sans Kharoshthi Noto Sans Khmer Noto Sans Khojki Noto Sans Khudawadi Noto Sans Lao Noto Sans Lao Looped Noto Sans Lepcha Noto Sans Limbu Noto Sans Linear A Noto Sans Linear B Noto Sans Lisu Noto Sans Lycian Noto Sans Lydian Noto Sans Mahajani Noto Sans Malayalam Noto Sans Malayalam UI Noto Sans Mandaic Noto Sans Manichaean Noto Sans Marchen Noto Sans Masaram Gondi Noto Sans Math Noto Sans Mayan Numerals Noto Sans Medefaidrin Noto Sans Meetei Mayek Noto Sans Mende Kikakui Noto Sans Meroitic Noto Sans Miao Noto Sans Modi Noto Sans Mongolian Noto Sans Mono Noto Sans Mono CJK HK Noto Sans Mono CJK JP Noto Sans Mono CJK KR Noto Sans Mono CJK SC Noto Sans Mono CJK TC Noto Sans Mro Noto Sans Multani Noto Sans Myanmar Noto Sans Nabataean Noto Sans Nandinagari Noto Sans New Tai Lue Noto Sans Newa Noto Sans NKo Noto Sans Nushu Noto Sans Ogham Noto Sans Ol Chiki Noto Sans Old Noto Sans Old Hungarian Noto Sans Old North Arabian Noto Sans Old Permic Noto Sans Old Persian Noto Sans Old Sogdian Noto Sans Old South Arabian Noto Sans Old Turkic Noto Sans Oriya Noto Sans Osage Noto Sans Osmanya Noto Sans Pahawh Hmong Noto Sans Palmyrene Noto Sans Pau Cin Hau Noto Sans Phags-Pa Noto Sans Phoenician Noto Sans Psalter Pahlavi Noto Sans Rejang Noto Sans Runic Noto Sans Samaritan Noto Sans Saurashtra Noto Sans Sharada Noto Sans Shavian Noto Sans Siddham Noto Sans SignWriting Noto Sans Sinhala Noto Sans Sinhala UI Noto Sans Sogdian Noto Sans Sora Sompeng Noto Sans Soyombo Noto Sans Sundanese Noto Sans Syloti Nagri Noto Sans Symbols Noto Sans Symbols 2 Noto Sans Syriac Noto Sans Syriac Eastern Noto Sans Syriac Western Noto Sans Tagalog Noto Sans Tagbanwa Noto Sans Tai Le Noto Sans Tai Tham Noto Sans Tai Viet Noto Sans Takri Noto Sans Tamil Noto Sans Tamil Supplement Noto Sans Tamil UI Noto Sans Tangsa Noto Sans Telugu Noto Sans Telugu UI Noto Sans Test Noto Sans Thaana Noto Sans Thai Noto Sans Thai Looped Noto Sans Tifinagh Noto Sans Tifinagh Adrar Noto Sans Tifinagh 
Agraw Imazighen Noto Sans Tifinagh Ahaggar Noto Sans Tifinagh Air Noto Sans Tifinagh APT Noto Sans Tifinagh Azawagh Noto Sans Tifinagh Ghat Noto Sans Tifinagh Hawad Noto Sans Tifinagh Rhissa Ixa Noto Sans Tifinagh SIL Noto Sans Tifinagh Tawellemmet Noto Sans Tirhuta Noto Sans Ugaritic Noto Sans Vai Noto Sans Vithkuqi Noto Sans Wancho Noto Sans Warang Citi Noto Sans Yi Noto Sans Zanabazar Square Noto Serif Noto Serif Ahom Noto Serif Armenian Noto Serif Balinese Noto Serif Bengali Noto Serif CJK HK Noto Serif CJK JP Noto Serif CJK KR Noto Serif CJK SC Noto Serif CJK TC Noto Serif Devanagari Noto Serif Display Noto Serif Dives Akuru Noto Serif Dogra Noto Serif Ethiopic Noto Serif Georgian Noto Serif Grantha Noto Serif Gujarati Noto Serif Gurmukhi Noto Serif Hebrew Noto Serif Kannada Noto Serif Khitan Small Script Noto Serif Khmer Noto Serif Khojki Noto Serif Lao Noto Serif Makasar Noto Serif Malayalam Noto Serif Myanmar Noto Serif NP Hmong Noto Serif Old Uyghur Noto Serif Oriya Noto Serif Sinhala Noto Serif Tamil Noto Serif Tangut Noto Serif Telugu Noto Serif Test Noto Serif Thai Noto Serif Tibetan Noto Serif Toto Noto Serif Vithkuqi Noto Serif Yezidi Noto Traditional Nushu octicons P052 Palatino Linotype Pomodoro Segoe Fluent Icons Segoe MDL2 Assets Segoe Print Segoe Script Segoe UI Segoe UI Emoji Segoe UI Historic Segoe UI Symbol Segoe UI Variable Sitka Text Source Code Pro Source Code Variable Standard Symbols PS Sylfaen Symbol Tahoma Times New Roman Trebuchet MS URW Bookman URW Gothic Verdana Webdings Wingdings Z003

input.typ

#set text(font: (
	"Apple Color Emoji"
))
#emoji.turtle
#emoji.apple.red

Result, cropped screenshot:
image

@ghost

ghost commented Mar 25, 2023

@MarioD8 please wrap that code in a details element

https://developer.mozilla.org/docs/Web/HTML/Element/details

you are making it difficult to scroll for anyone that visits this page

@laurmaedje
Member

Emoji fonts aren't correctly exported at the moment. There may also be unrelated font issues at play here.

@jason-s

jason-s commented Mar 25, 2023

Any way for us to provide verbose logs? I'm running into what I think is this issue on my Mac.

test 123 #emoji.warning #emoji.turtle

#set text(font: (
  "Helvetica Neue"

))
test 123 #emoji.warning #emoji.turtle

#set text(font: (
  "Apple Color Emoji"
))
test 123 #emoji.warning #emoji.turtle

emoji-test.pdf

@reknih added the bug (Something isn't working) and pdf (Related to PDF export) labels on Apr 5, 2023
@namespaceYcZ

With the help of the new Bing, I found a relevant answer: https://community.adobe.com/t5/acrobat-discussions/emoji-in-adobe-pdf/m-p/10148090#M121259. According to this answer, the emoji display problem may be caused by the PDF specification not supporting SVG-format fonts. The answer also points out two ways to solve the problem (if that diagnosis is correct):

  1. rasterize or vectorize any glyphs for which the SVG definition varies from the default monochrome CFF or TrueType definition
  2. produce and embed a Type 3 font into the PDF for such characters.

I hope this information is helpful to everyone!

@lvignoli
Contributor

Is there an already planned fix for this, or is the fix still unclear? For me emojis have been "working" minimally with Twitter Color Emoji: only a grayscale outline is exported in PDFs.

@laurmaedje
Member

The planned fix is to export them as XObjects with /ActualText to make them copyable. If that turns out to be problematic, another alternative would be to embed them as Type 3 fonts.
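
To make the XObject idea concrete, here is a minimal sketch of what such a content-stream fragment could look like. It is an illustration only, not Typst's code: the function names and the resource name `Em0` are made up, and it assumes the emoji has already been registered as an XObject under that name. The text is written as a UTF-16BE hex string with a byte-order mark.

```rust
/// Encode text as a PDF hex string in UTF-16BE with a leading byte-order mark.
/// (Hypothetical helper, for illustration only.)
fn pdf_utf16_hex(text: &str) -> String {
    let mut out = String::from("<FEFF");
    for unit in text.encode_utf16() {
        out.push_str(&format!("{:04X}", unit));
    }
    out.push('>');
    out
}

/// Wrap the drawing of an XObject in a marked-content span that carries
/// /ActualText. `xobject_name` is an assumed resource name such as "Em0".
fn emoji_as_xobject(xobject_name: &str, text: &str) -> String {
    format!(
        "/Span << /ActualText {} >> BDC\n/{} Do\nEMC\n",
        pdf_utf16_hex(text),
        xobject_name
    )
}
```

Calling `emoji_as_xobject("Em0", "😀")` yields a `BDC` ... `EMC` span around `/Em0 Do`. Whether a given viewer honors the /ActualText when copying is a separate question.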

@lvignoli
Contributor

lvignoli commented Jun 20, 2023

Glad to hear this, thanks 😃

@Andrew15-5
Contributor

I'm not sure why there is no visual progress on this issue, so I wanted to share something that could help solve the problem quicker. About half a year ago LibreOffice's latest version was 7.4 which didn't support emoji characters in PDF files. But I've read that in v7.5 it should be fixed.

https://wiki.documentfoundation.org/ReleaseNotes/7.5#Filters:

PDF

  • Support embedding color (e.g. Emoji) fonts using color layers (using COLR/CPAL OpenType tables). tdf#104403 (Khaled Hosny)
  • Support embedding color (e.g. Emoji) fonts using color bitmaps (using CBLC/CBDT or sbix OpenType tables). tdf#121327 (Khaled Hosny)

I recently upgraded my LO suite and sure enough, the bug is gone, finally! But I don't really use LO Writer (typesetting systems FTW). Still, if that project resolved the same issue that this project has, then the solution is out there.

Here is a simple example (using v7.6.0.3): emoji.zip

> If that turns out to be problematic, another alternative would be to embed them as Type 3 fonts.

Okular reports the emoji's font type as "Type 3", so Type 3 works. I can't tell whether that approach has downsides, though: the emoji copies to the clipboard fine (and font types are still a new and heavy topic for me).

image

@laurmaedje
Member

> I'm not sure why there is no visual progress on this issue,

Mostly because we just didn't have time to implement this yet.

Thanks for the resources about how LibreOffice does it. This might come in handy.

@khaledhosny

> The planned fix is to export them as XObjects with /ActualText to make them copyable. If that turns out to be problematic, another alternative would be to embed them as Type 3 fonts.

FWIW, I have done both approaches before: XObjects in luaotfload (for luatex) and Type 3 in LibreOffice. XObjects work fine, but text copying was a problem with the CBDT table (old Noto Color Emoji), as these fonts have no outline glyphs and /ActualText does not seem to work if it encloses only images. Type 3 seems to be cleaner.

@laurmaedje
Member

Thanks, that's good information!

@laurmaedje added the help wanted (Extra attention is needed) label on Sep 14, 2023
@polazarus

polazarus commented Sep 18, 2023

As a workaround, I wrote a package, svg-emoji, that replaces emoji with SVG glyphs directly. For now, it only offers Noto support.

@laurmaedje added the good contribution (Good, isolated issue for someone that wants to contribute) label and removed the help wanted (Extra attention is needed) label on Sep 19, 2023
@laurmaedje
Member

laurmaedje commented Dec 8, 2023

As discussed on Discord, here is some background on color fonts and Typst's PDF font handling and then the steps required to fix this issue.

A bit of background on color fonts

OpenType supports multiple formats for encoding emoji fonts. The data for each of these is stored in OpenType tables within the font. We can query this data with ttf-parser. The following color formats exist:

  • sbix: A table that encodes emojis as raster images. Backed by Apple. (Example font: Apple Color Emoji)

  • CBDT: Another table that encodes emojis as raster images, but in a slightly different way. Backed by Google. (Example font: Noto Color Emoji)

  • SVG: Encodes emojis as a subset of SVG. Backed by Adobe and Mozilla. (Example font: Twitter Color Emoji)

  • COLRv0: Encodes emojis with the normal font outlines + color palettes. Backed by Microsoft. (Example font: Segoe UI Emoji)

  • COLRv1: Microsoft noticed that color emojis with just plain colors look a bit boring, so they added a ton of SVG-like features to the COLR table. This format is quite recent and support for it isn't merged into ttf-parser yet. Even with the latest updates, Windows doesn't seem to ship it, so we can skip it for now. (Example font: Recent versions of Noto Color Emoji)

The inner workings of these OpenType tables are mostly abstracted away by ttf-parser, but it's still important to know how they work conceptually.
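
As a rough illustration of "the data lives in tables", the sketch below scans the table directory of a single (non-collection) OpenType file by hand and reports which of the color tables listed above are present. In practice ttf-parser would do this; the enum and function names here are hypothetical and only the standard library is used. COLRv0 and COLRv1 are told apart by the version number at the start of the COLR table.

```rust
/// Color glyph formats, detected purely from which tables are present.
/// (Hypothetical names; this is not ttf-parser's API.)
#[derive(Debug, PartialEq)]
enum ColorFormat {
    Sbix,
    Cbdt,
    Svg,
    ColrV0,
    ColrV1,
}

/// Read a big-endian u16 at `at`, or None if out of bounds.
fn be_u16(data: &[u8], at: usize) -> Option<u16> {
    let b = data.get(at..at.checked_add(2)?)?;
    Some(u16::from_be_bytes([b[0], b[1]]))
}

/// Read a big-endian u32 at `at`, or None if out of bounds.
fn be_u32(data: &[u8], at: usize) -> Option<u32> {
    let b = data.get(at..at.checked_add(4)?)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Scan the table directory: a 12-byte header (table count at offset 4)
/// followed by 16-byte records of tag, checksum, offset, and length.
fn color_formats(font: &[u8]) -> Vec<ColorFormat> {
    let mut found = Vec::new();
    let num_tables = match be_u16(font, 4) {
        Some(n) => n as usize,
        None => return found,
    };
    for i in 0..num_tables {
        let rec = 12 + 16 * i;
        let tag = match font.get(rec..rec + 4) {
            Some(t) => t,
            None => break, // truncated directory
        };
        match tag {
            b"sbix" => found.push(ColorFormat::Sbix),
            b"CBDT" => found.push(ColorFormat::Cbdt),
            b"SVG " => found.push(ColorFormat::Svg),
            b"COLR" => {
                // The first u16 of the COLR table is its version.
                let offset = be_u32(font, rec + 8).map(|o| o as usize);
                match offset.and_then(|o| be_u16(font, o)) {
                    Some(0) => found.push(ColorFormat::ColrV0),
                    Some(_) => found.push(ColorFormat::ColrV1), // any later version
                    None => {}
                }
            }
            _ => {}
        }
    }
    found
}
```

A font can carry more than one of these tables, so the function returns a list rather than a single format.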

How Typst writes text and fonts into PDFs

Within write_text in typst-pdf/src/page.rs, text is written by writing CIDs (character IDs) into the content stream with the TJ operator. In spite of their name, CIDs are not like Rust chars. Instead, they typically map 1-1 to glyph IDs in a font because we configure an Identity CID-to-GID mapping (except in the case of CFF fonts, which work a bit differently).

The CIDs reference a font configured via the Tf operator. While writing the text items, we collect all fonts that are referenced, which we then embed into the PDF at the end. This happens in typst-pdf/src/font.rs. To be able to copy from the PDF, a PDF viewer must map the CIDs back to Unicode text. This is what the /ToUnicode mapping is for, which we write for each font.
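
To give a feel for what a /ToUnicode mapping contains, here is a small sketch that produces only the `bfchar` part of such a CMap from (CID, text) pairs. It is not the code in typst-pdf/src/font.rs; the function name is hypothetical and the surrounding CMap boilerplate (header, codespace range) is left out. Each target is written as UTF-16BE hex, and entries are grouped in blocks of at most 100, as is conventional for CMaps.

```rust
/// Build the `bfchar` sections of a /ToUnicode CMap.
/// Each entry maps a 2-byte CID to the text it stands for.
/// (Hypothetical helper; the rest of the CMap is omitted.)
fn to_unicode_bfchar(mappings: &[(u16, &str)]) -> String {
    let mut out = String::new();
    // CMap sections conventionally hold at most 100 entries each.
    for chunk in mappings.chunks(100) {
        out.push_str(&format!("{} beginbfchar\n", chunk.len()));
        for (cid, text) in chunk {
            // The target text is written as UTF-16BE code units in hex.
            let hex: String = text
                .encode_utf16()
                .map(|unit| format!("{:04X}", unit))
                .collect();
            out.push_str(&format!("<{:04X}> <{}>\n", cid, hex));
        }
        out.push_str("endbfchar\n");
    }
    out
}
```

For example, mapping CID 7 to "😀" gives the line `<0007> <D83DDE00>`, which is what lets a viewer turn the glyph back into copyable text.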

So, this is how it works for normal fonts. The problem now is that PDF viewers completely ignore the color tables in emoji fonts and fall back to normal outlines (if available). To get emojis to show up, we have two different options:

  • Encode them as graphics rather than text. Then, they aren't copy-pastable. While we can, in theory, specify an /ActualText that should be copied, many PDF viewers don't seem to support that.

  • Encode them as Type 3 fonts. A Type 3 font is a special type of PDF font that doesn't embed font data in an external format like TTF or CFF, but rather defines the font's glyphs directly as PDF objects. This way, we can create the emojis as PDF graphics, but display them with the normal text-showing operators.

Based on the conversation above and what other tools do, Type 3 seems like the better approach. Relevant details can be found in the PDF 1.7 specification (ISO 32000-1), section 9.6.5, "Type 3 Fonts".

Implementing it in Typst

Here's a rough outline of the steps involved in implementing emoji handling for PDF in Typst:

  • When writing a glyph in a text run, we need to detect whether an emoji glyph definition exists in any of the formats above. If yes, we need to terminate the text run and switch to a Type 3 font we will generate for it. This should live in typst-pdf/src/page.rs, likely using some helpers defined in typst-pdf/src/font.rs.

  • We need code to convert emoji definitions in any of the formats above into PDF content streams. The PNG exporter has existing code for all the formats except COLR (since it's a recent addition to ttf-parser) whereas the SVG exporter doesn't handle them yet. To share as much code as possible between the exporters, it would probably make sense to convert a color glyph to a Typst Frame rather than directly producing PDF content for it and then reuse this frame across all three exporters. This code could live in typst/src/text/font/color.rs.

  • An unfortunate limitation of Type 3 fonts is that they can encode at most 256 glyphs, so if more than 256 emojis from the same font are used, we need to write multiple Type 3 fonts for that one.

  • We need to actually write the necessary variable number of Type 3 fonts for each font and generate /ToUnicode mappings for them. This should live in typst-pdf/src/font.rs.
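
The run-splitting and 256-glyph bookkeeping from the steps above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the eventual implementation: all names are hypothetical, color glyphs are given as a precomputed set of glyph IDs, and each distinct color glyph simply gets the next free slot, with slot `n` landing in Type 3 font `n / 256` at code `n % 256`.

```rust
use std::collections::{HashMap, HashSet};

/// A piece of a text run after splitting at color glyphs.
#[derive(Debug, PartialEq)]
enum Segment {
    /// Glyph IDs shown with the regular embedded font.
    Normal(Vec<u16>),
    /// Single-byte codes shown with the `font_index`-th generated Type 3 font.
    Type3 { font_index: usize, codes: Vec<u8> },
}

/// Hands out slots for color glyphs across a sequence of Type 3 fonts,
/// 256 slots per font. (Hypothetical helper type.)
#[derive(Default)]
struct Type3Allocator {
    slots: HashMap<u16, usize>,
}

impl Type3Allocator {
    /// Return (font index, code) for a glyph, allocating a slot on first use.
    fn slot(&mut self, glyph: u16) -> (usize, u8) {
        let next = self.slots.len();
        let n = *self.slots.entry(glyph).or_insert(next);
        (n / 256, (n % 256) as u8)
    }

    /// Number of Type 3 fonts that would have to be written.
    fn font_count(&self) -> usize {
        (self.slots.len() + 255) / 256
    }
}

/// Split a run of glyph IDs into normal and Type 3 segments.
fn split_run(glyphs: &[u16], color: &HashSet<u16>, alloc: &mut Type3Allocator) -> Vec<Segment> {
    let mut out: Vec<Segment> = Vec::new();
    for &g in glyphs {
        if color.contains(&g) {
            let (font_index, code) = alloc.slot(g);
            // Extend the previous segment only if it uses the same Type 3 font.
            if let Some(Segment::Type3 { font_index: f, codes }) = out.last_mut() {
                if *f == font_index {
                    codes.push(code);
                    continue;
                }
            }
            out.push(Segment::Type3 { font_index, codes: vec![code] });
        } else {
            if let Some(Segment::Normal(ids)) = out.last_mut() {
                ids.push(g);
                continue;
            }
            out.push(Segment::Normal(vec![g]));
        }
    }
    out
}
```

Each `Segment` boundary corresponds to a font switch in the content stream, and `font_count` tells the font writer how many Type 3 fonts (each with its own /ToUnicode mapping) are needed for one source font.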

@lvignoli
Contributor

lvignoli commented Dec 8, 2023

Damn, all I wanted was a little 🐿️, turns out it's Specs War Infinity Edition
Thanks for planning on supporting this 😀

@lvignoli
Contributor

@elegaanz thanks a lot for this crucial feature!
Typst is now 1.0 for me 🐿️

@davystrong

Not sure if I'm missing something, but this still doesn't work for me. A file containing simply:

This should be a face #emoji.face but it's not

Produces the text with a gap where the emoji should be. Do I need to specify a special font? I'm just using whatever the default is when I run typst compile temp.typ.

Any help would be appreciated!

@Andrew15-5
Contributor

I'm not sure the compiler bundles an emoji font, so if you don't have one installed locally, that is probably the issue. Another possibility is that you aren't running a build of the latest code, as this will only work correctly starting with the next version, 0.12.

@davystrong

davystrong commented Aug 20, 2024

Thanks for that! I had version 0.11.1 installed and upgrading to the git head (brew install --HEAD typst) fixed the problem.

Edit: It appears that the VS Code extension, Typst LSP, uses a bundled version of the Typst compiler, so upgrading the Typst CLI doesn't fix the automatic build on save output.

@mholson

mholson commented Oct 8, 2024

I can confirm that upgrading to the git head (brew install --HEAD typst) fixed the problem for me as well ... hello emoji! Thank you @Andrew15-5 and @davystrong. Perfect timing for a project that I am working on.

@Andrew15-5
Contributor

Note that you can now download v0.12.0-rc1, or wait a bit and use the upcoming v0.12.0.

@krisutofu

The only font that works for me on current Linux (with Typst v0.12.0) in the VSCode Preview is "Noto Emoji". The emojis are black and white and do not look as good as the original emoji images. The color emoji fonts that I tested did not work. Is there any other font confirmed to be working on Linux in the VSCode Preview?

@laurmaedje
Member

@krisutofu This is a problem with the VS Code extension's preview. It uses its own rendering mechanism, which does not seem to support emojis so far. If you export the PDF, it should be all good.

@krisutofu

> @krisutofu This is a problem with the VS Code extension's preview. It uses its own rendering mechanism, which does not seem to support emojis so far. If you export the PDF, it should be all good.

Amazing, it works in the exported PDF. Thank you so much. Typst for the win.
