Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache results of Cmts.preserve #1766

Merged
merged 1 commit into from
Aug 6, 2021
Merged

Conversation

emillon
Copy link
Collaborator

@emillon emillon commented Aug 5, 2021

The preserve function returns a string with the layout of a part of the AST. It is used in several places to determine if a subpart has a break for example.

This can be a problem when these constructions are nested, since laying out the external construction requires laying out the internal ones first, and this creates quadratic behavior.

This is solved by caching the results of these intermediate layouts. This cache is unbounded so there are bad corner cases. At worst, each cached point is associated to the result string which could mean quadratic memory consumption.

Some ways to fix this:

  • use a LRU cache like the one from core_kernel
  • store only a summary rather than the full string (length, number of lines, max indent)

In addition to #1750, a similar case can be triggered using nested cases. It seems that expression_width would be affected but I have not been able to reproduce that using that code path, so this part stays not cached.

Closes #1750

@emillon
Copy link
Collaborator Author

emillon commented Aug 5, 2021

MaxRSS does not change too much when formatting source.ml so even if this introduces a memory leak, this stays small enough not to matter.

@emillon emillon force-pushed the issue1750 branch 2 times, most recently from aa60006 to c3268d6 Compare August 5, 2021 14:32
Copy link
Collaborator

@gpetiot gpetiot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance improvement is impressive, it looks good to me, thanks!

lib/Fmt_ast.ml Outdated Show resolved Hide resolved
lib/Fmt_ast.ml Outdated
type c =
{ conf: Conf.t
; debug: bool
; source: Source.t
; cmts: Cmts.t
; fmt_code: Conf.t -> string -> (Fmt.t, unit) Result.t }
; fmt_code: Conf.t -> string -> (Fmt.t, unit) Result.t
; layout_cache: Layout_cache.t }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's worth to add this field to the type c, as we don't need to pass it to every function, it's only used by Cmts.preserve, why not add it to Cmts.t and directly modify the implementation of Cmts.preserve?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did so because it's not really related to comment handling, but yes we can do that.

Copy link
Collaborator

@gpetiot gpetiot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

The `preserve` function returns a string with the layout of a part of
the AST. It is used in several places to determine if a subpart has a
break for example.

This can be a problem when these constructions are nested, since laying
out the external construction requires laying out the internal ones
first, and this creates quadratic behavior.

This is solved by caching the results of these intermediate layouts.
This cache is unbounded so there are bad corner cases. At worst, each
cached point is associated to the result string which could mean
quadratic memory consumption.

Some ways to fix this:
- use a LRU cache like the one from `core_kernel`
- store only a summary rather than the full string (length, number of
  lines, max indent)

In addition to ocaml-ppx#1750, a similar case can be triggered using nested
cases. It seems that `expression_width` would be affected but I have not
been able to reproduce that using that code path.

Closes ocaml-ppx#1750
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: quadratic behaviour with nested lists in argument position
3 participants