Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Harfbuzz in Metrics View #5522

Open
wants to merge 62 commits into
base: master
Choose a base branch
from

Conversation

iorsh
Copy link
Contributor

@iorsh iorsh commented Jan 9, 2025

This PR introduces HarfBuzz shaping in Metrics view.

  1. The current shaping functionality is preserved and can be invoked from View menu as "Built-in shaper". It's very complex, and I'm pretty sure I didn't manage to fully support everything it provides, so some people might find it better in some aspects.
  2. The new shaper comes with some generic infrastructure which can be eventually expanded to support more shapers (e.g. Uniscribe under Windows)
  3. The new code is written mostly in C++17. This version seems reasonably conservative to me.
  4. Editing of kerning values, bearings, and width is supported.
  5. The font is generated for HarfBuzz use every time the Metrics View is opened or shaper engine is switched (in the View menu). This could be slow for large fonts.

After:
aldhabi-hb
Before:
aldhabi-old
Menu:
hb_menu

Fixes #3733, fixes #3837, fixes #4158, fixes #4511, fixes #5343, fixes #5513

@iorsh
Copy link
Contributor Author

iorsh commented Jan 9, 2025

CI is broken.

  • Linux: ubuntu-latest moved from 22 to 24
  • Windows - pacman issues again.

I'll take a look.

if (rtl) {
// HarfBuzz reverses the order of an RTL output buffer
std::reverse(glyphs_after_gpos, glyphs_after_gpos + glyph_count);
std::reverse(width_deltas.begin(), width_deltas.end());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use hb_buffer_reverse() earlier instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work for me. The RTL buffer before the reversing contains glyphs tail-to-head, and hb_buffer_get_glyph_positions() returns values with geometrically correct left-to-right advances and offsets.

In the case of mixed base glyphs (positive advance) and marks (positive offset, since the mark comes before the glyph), the function hb_buffer_reverse() correctly reorders the glyphs, but hb_buffer_get_glyph_positions() would just return the same values in reverse order, and this breaks the geometric relation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the current code works for you, then fine. hb_shape() returns the glyphs in visual order (that is, for RTL text the buffer is reversed), so that code that is drawing the text wouldn‘t need to worry about the buffer direction. hb_buffer_reverse() undoes that. Either case the individual glyph advances will be the same since there are no RTL glyph advances in OpenType.

I don’t know what FontForge expects here, but it is probably some FontForge-specific thing, not HarfBuzz that is being unexpected.

// functions.
hb_bool_t found =
hb_font_get_glyph_name(hb_ttf_font, glyph_info.codepoint,
glyph_name, sizeof(glyph_name) - 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that glyph names are not required to be unique, and FontForge does not enforce them being unique either (you can have two glyphs with the same name in FontForge). You probably need to maintain a map between splinechar indices and generated font indices (I think they are usually the same, but FontForge might be re-ordering some glyphs when generating the font).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me that FF does enforce unique names, at least this is what I got when trying to rename a glyph to existing name:
image
Do you have in mind some specific scenario where this doesn't hold?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is an SFD file with duplicate glyph names (two glyphs both named test), FontForge will open it without any complaint and it will also generate a font with the duplicate glyph name: test.sfd.zip

char* utf8_str = u2utf8_copy(u_vec.data());

hb_buffer_t* hb_buffer = hb_buffer_create();
hb_buffer_add_utf8(hb_buffer, utf8_str, -1, 0, -1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use hb_buffer_add_utf32(), instead of converting to UTF-8 as HarfBuzz will internally convert it back to UTF-32 anyway.

hb_buffer_set_direction(hb_buffer, hb_dir);

// Shape the text
hb_shape(hb_ttf_font, hb_buffer, NULL, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Features are never passed to HarfBuzz, so any non-default features will not be applied.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, default features are always applied and selecting/unselecting features on the features list has no effect.

hb_glyph_extents_t extents;
hb_bool_t res = hb_font_get_glyph_extents(
hb_ttf_font, glyph_info.codepoint, &extents);
assert(res);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calculating glyph extents can be slow (especially for CFF fonts), and I don’t see extents used anywhere so it seems wasteful.


// Fill unscaled metrics in font units
hb_position_t h_advance =
hb_font_get_glyph_h_advance(hb_ttf_font, glyph_info.codepoint);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get the original width from splinechar, shaping does not change it.

hb_language_t hb_lang = hb_language_from_string((const char*)lang, -1);
hb_buffer_set_language(hb_buffer, hb_lang);

bool rtl = !u_vec.empty() && isrighttoleft(u_vec[0]);
Copy link
Contributor

@khaledhosny khaledhosny Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking first character is not enough, for example if text starts with punctuation. HarfBuzz has hb_buffer_guess_segment_properties which works slightly better here.

Ideally, there should be a UI to set the direction, since it is possible the text is entirely composed of unencoded glyphs which will be given PUA encoding and all PUA codepoints has strong LTR direction.


_WriteTTFFont(
ttf_file, context_->sf, ff_ttf, NULL, bf_ttf,
ttf_flag_otmode | ttf_flag_oldkernmappedonly | ttf_flag_fake_map,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think ttf_flag_oldkernmappedonly does anything unless ttf_flag_oldkern is also set. But you don’t want old kern table anyway.

_WriteTTFFont(
ttf_file, context_->sf, ff_ttf, NULL, bf_ttf,
ttf_flag_otmode | ttf_flag_oldkernmappedonly | ttf_flag_fake_map,
context_->get_enc_map(context_->sf), ly_fore);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably speed things up by skipping generating tables that HarfBuzz does not use, e.g. glyph outlines (glyf and CFF table) which is usually the larges font tables and the slowest to build.

@khaledhosny
Copy link
Contributor

Selecting builtin shaper always crashes for me.

@iorsh
Copy link
Contributor Author

iorsh commented Jan 18, 2025

Windows CI fixed with jtanx/fontforgebuilds#23

HB_FEATURE_GLOBAL_END};
hb_feature_vec.push_back(hb_feat);
}
return hb_feature_vec;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some work is needed to handle features that are enabled by default. Features like init, medi, etc. are enabled by default and HarfBuzz applies them selectively based on text analysis, but when enabled manually they will be applied unconditionally which will break their intended use.

What should happen is that if a default feature is selected in the UI, it should not be in the features list, but if it is unselected it should be in the features list with value set to 0 (disable).

HarfBuzz does not have an API that tells which features are enabled by default, so you will have to check the HarfBuzz sources. The list of default features depend on the script, I don’t know how FontForge decides which features to enable by default, but I think you will have to sync that with HarfBuzz so that features pre-selected in the UI match what HarfBuzz does.

Some applications use a tri-state checkbox for features, the default is neither selected or unselected what omits the feature from the features list, and selected set the feature to 1 and unselcted sets it to 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HarfBuzz sources are still the best reference, but here's a quick guide based on recent research on which features are considered enabled by default "in general": https://w3c.github.io/IFT/Overview.html#feature-tag-list

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added support for default feature list (copied from https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-subset-input.cc#L82), but couldn't see any differentiation by script. @khaledhosny, is there any additional place for default features per script?

@iorsh
Copy link
Contributor Author

iorsh commented Jan 28, 2025

Selecting builtin shaper always crashes for me.

@khaledhosny, I tested it with a few fonts, but couldn't reproduce. Could you please attach a crash stack, or the font file if possible?

@khaledhosny
Copy link
Contributor

Selecting builtin shaper always crashes for me.

@khaledhosny, I tested it with a few fonts, but couldn't reproduce. Could you please attach a crash stack, or the font file if possible?

It happens with any font, when ff::shapers::IShaper::~IShaper() is called, this warning seems to be relevant:

/fontforge/fontforgeexe/shapers/shaper_shim.cpp:75:5: warning: delete called on 'ff::shapers::IShaper' that is abstract but has non-virtual destructor [-Wdelete-abstract-non-virtual-dtor]
   75 |     delete shaper;

This patch makes the warning and the crash go away:

diff --git a/fontforgeexe/shapers/i_shaper.hpp b/fontforgeexe/shapers/i_shaper.hpp
index 1884b05c1..f44e6f3cb 100644
--- a/fontforgeexe/shapers/i_shaper.hpp
+++ b/fontforgeexe/shapers/i_shaper.hpp
@@ -36,6 +36,8 @@ namespace ff::shapers {

 class IShaper {
  public:
+    virtual ~IShaper() = default;
+
     // The internal shaper name is non-localizable and serves to identify the
     // shaper in the system.
     virtual const char* name() const = 0;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants