Font subsetting and font optimization improvements #362

Merged: gunnsth merged 22 commits into unidoc:development from dev-optimization-improvements on Jun 16, 2020

gunnsth (Contributor) commented on May 25, 2020

This PR improves several optimization methods, mostly font optimization, along with a few other approaches.

  • Add NewFromContents to extractor to enable extracting from generic PDF contents / resources.
  • IdentityEncoder: Add rune tracking for subsetting
  • Image compression: Try DCT and flate combined
  • Content stream optimization: Remove unnecessary operands
  • Font program reduction: Remove redundant tables from TrueType fonts
  • Font program subsetting: Remove redundant glyphs and reduce TrueType fonts
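
To make the rune tracking idea concrete, here is a minimal sketch of an encoder that records every rune it encodes, so that a later subsetting pass can tell which glyphs are actually needed. The type `trackingEncoder` and its methods are hypothetical names for illustration only and are not the library's IdentityEncoder API; the sketch also assumes all runes fit in 2 bytes.

```go
package main

import (
	"fmt"
	"sort"
)

// trackingEncoder is a hypothetical stand-in for an identity text encoder.
// It remembers every rune it has encoded.
type trackingEncoder struct {
	used map[rune]struct{}
}

func newTrackingEncoder() *trackingEncoder {
	return &trackingEncoder{used: make(map[rune]struct{})}
}

// Encode writes each rune as a 2-byte big-endian code (identity mapping)
// and records the rune as used. Assumption: runes are within U+0000..U+FFFF.
func (e *trackingEncoder) Encode(s string) []byte {
	out := make([]byte, 0, 2*len(s))
	for _, r := range s {
		e.used[r] = struct{}{}
		out = append(out, byte(r>>8), byte(r))
	}
	return out
}

// UsedRunes returns the recorded runes in ascending order. A subsetter
// could keep only the glyphs for these runes and drop the rest.
func (e *trackingEncoder) UsedRunes() []rune {
	runes := make([]rune, 0, len(e.used))
	for r := range e.used {
		runes = append(runes, r)
	}
	sort.Slice(runes, func(i, j int) bool { return runes[i] < runes[j] })
	return runes
}

func main() {
	enc := newTrackingEncoder()
	enc.Encode("hello")
	enc.Encode("world")
	fmt.Println(string(enc.UsedRunes())) // prints "dehlorw"
}
```

Tracking at encode time means the set of used runes is available as soon as the content has been written, without re-parsing the content streams.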

Note that some golint fixes are mixed in as well.


codecov bot commented on May 25, 2020

Codecov Report

Merging #362 into development will decrease coverage by 6.23%.
The diff coverage is 18.20%.

@@               Coverage Diff               @@
##           development     #362      +/-   ##
===============================================
- Coverage        62.39%   56.16%   -6.24%     
===============================================
  Files              236      239       +3     
  Lines            45802    46206     +404     
===============================================
- Hits             28580    25951    -2629     
- Misses           16549    16905     +356     
- Partials           673     3350    +2677     
Impacted Files                           Coverage Δ
model/optimize/clean_contentstream.go    0.00% <0.00%> (ø)
model/optimize/clean_fonts.go            0.00% <0.00%> (ø)
model/optimize/optimizer.go              87.14% <0.00%> (-5.29%) ⬇️
model/optimize/utils.go                  0.00% <0.00%> (ø)
extractor/text.go                        63.47% <25.00%> (-8.26%) ⬇️
model/font_composite.go                  58.44% <25.00%> (-11.93%) ⬇️
internal/textencoding/identity.go        40.90% <50.00%> (-2.43%) ⬇️
model/font.go                            57.56% <50.00%> (-7.33%) ⬇️
model/annotations.go                     22.33% <54.54%> (-1.85%) ⬇️
model/page.go                            53.20% <82.35%> (-7.38%) ⬇️
... and 164 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 99ef1b8...c943a1d.

gunnsth changed the title from "WIP Font subsetting and font optimization improvements" to "Font subsetting and font optimization improvements" on Jun 12, 2020
gunnsth requested a review from adrg on Jun 12, 2020, 08:59
for _, obj := range kids.Elements() {
	pobj, ok := core.GetIndirect(obj)
	if !ok {
		break
Collaborator

Should this be continue here instead of break?

Contributor Author

This should not happen, I think. It might make sense to log a debug message. It's part of the optimization, so it's not mission critical.
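
To illustrate the difference being discussed, here is a minimal sketch of the `continue` variant with a debug message. It uses plain Go stand-in types (`object`, `indirectObject`) and the standard `log` package; these are assumptions for illustration and not the library's types or logger, and this is not necessarily what the PR ended up doing.

```go
package main

import (
	"fmt"
	"log"
)

// object and indirectObject are hypothetical stand-ins for PDF object types.
type object interface{}

type indirectObject struct{ num int }

// collectIndirect returns the object numbers of all indirect entries in kids.
// Entries of any other type are logged and skipped, so valid entries that
// come after an unexpected one are still visited.
func collectIndirect(kids []object) []int {
	var nums []int
	for i, obj := range kids {
		ind, ok := obj.(*indirectObject)
		if !ok {
			log.Printf("debug: kids[%d] is not an indirect object (%T), skipping", i, obj)
			continue
		}
		nums = append(nums, ind.num)
	}
	return nums
}

func main() {
	kids := []object{&indirectObject{1}, "stray", &indirectObject{3}}
	fmt.Println(collectIndirect(kids)) // prints "[1 3]"
}
```

With `break` in place of `continue`, the loop would stop at the stray entry and object 3 would never be processed. Whether that matters depends on whether such entries can occur at all in practice.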

adrg (Collaborator) left a comment

Looks great.

gunnsth merged commit 11f692b into unidoc:development on Jun 16, 2020
gunnsth deleted the dev-optimization-improvements branch on Jun 16, 2020, 21:19