split AllEquations into pieces for more parallelism #472

digama0 · 2024-10-09T16:40:39Z

While more can be done to improve the performance of the equation command, one thing that helps a lot is to make an embarrassingly-parallel task more parallel in practice by splitting it into separate files. The Equation files have also been moved into a common folder.
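As a rough illustration of the splitting idea (not the code used in this PR), the sketch below partitions a list of equation lines into fixed-size chunks so that each chunk could be written to its own file and built independently. The function name `split_into_chunks` and the chunk size are hypothetical.

```python
from typing import List, Sequence


def split_into_chunks(lines: Sequence[str], chunk_size: int) -> List[List[str]]:
    """Partition `lines` into consecutive chunks of at most `chunk_size` items.

    Hypothetical helper: each chunk stands for the contents of one output
    file, so a build tool that parallelizes across files can process the
    chunks independently.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        list(lines[start:start + chunk_size])
        for start in range(0, len(lines), chunk_size)
    ]
```

The order of the lines is preserved, and the last chunk is simply shorter when the total is not a multiple of the chunk size.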

carlini · 2024-10-09T16:49:15Z

This might break some of the scripts that load equation data from this file. Ideally they would not be doing that in the first place and would instead read from a separate Python-friendly file. If things don't build for this reason, I can try to fix them.

digama0 · 2024-10-09T16:55:10Z

Can you point me to the code that does this? It should not be hard to point them at the new files instead. (EDIT: they should all be fixed now, but I'm not sure I know how to run all these scripts correctly so please double check that they are working as intended.)

carlini · 2024-10-09T19:12:10Z

There are several I'm aware of:

equational_theories/scripts/generate_graphiti_data.rb

Line 16 in c27789e

    
           File.read(File.join(__dir__, '../equational_theories/AllEquations.lean')).split("\n").each { |s|

equational_theories/scripts/explore_magma.py

Line 13 in c27789e

EQUATIONS_FILENAME = "../equational_theories/AllEquations.lean"

equational_theories/scripts/find_dual.py

Line 169 in c27789e

for line in open("../equational_theories/AllEquations.lean"):

equational_theories/scripts/generate_equation_implication_js.py

Line 109 in c27789e

for line in open("equational_theories/AllEquations.lean"):
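A minimal sketch of how scripts like these could read from several split files instead of one, under stated assumptions: the split files live in a single directory, and each equation sits on its own line of the form `equation <number> := <law>`. The helper names `iter_equations` and `load_equations`, the regular expression, and the `*.lean` glob pattern are hypothetical and would need to be checked against the actual file layout.

```python
import re
from pathlib import Path
from typing import Dict, Iterable, Iterator, Tuple

# Assumed line shape: "equation 42 := x = (y ◇ x) ◇ z"
EQUATION_RE = re.compile(r"^equation\s+(\d+)\s*:=\s*(.+?)\s*$")


def iter_equations(paths: Iterable[Path]) -> Iterator[Tuple[int, str]]:
    """Yield (number, law) pairs from every matching line in the given files."""
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                match = EQUATION_RE.match(line)
                if match:
                    yield int(match.group(1)), match.group(2)


def load_equations(directory: Path, pattern: str = "*.lean") -> Dict[int, str]:
    """Collect equations from all files in `directory` that match `pattern`."""
    files = sorted(Path(directory).glob(pattern))
    return dict(iter_equations(files))
```

A script that previously looped over a single file could call `load_equations` with the folder that holds the split files and then iterate over the resulting dictionary. Lines that do not match the assumed shape (imports, comments) are skipped.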

Shreyas4991 · 2024-10-09T23:33:20Z

@digama0: A lot of the automated diagram-generating Python and Ruby scripts assume the existence of AllEquations.lean, because this structure was agreed on near the beginning of the project. They read this file as a text file. This isn't ideal, and maybe we should have written an export script in Lean that exports the equations of "AllEquations.lean" to JSON. But fixing this now will be non-trivial.

teorth · 2024-10-09T23:54:29Z

This is what #142 is supposed to do. Perhaps #427 can be adapted to create such a JSON, which all other tools then draw from? Then if we ever refactor the equation lean files again then one just has to change the one export tool.
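To make the single-export-tool idea concrete, here is a minimal sketch of what a shared JSON file could look like from the Python side. It is not the format produced by #142 or #427. The record layout (`id` and `law` fields) and the function names `export_equations` and `import_equations` are assumptions chosen for illustration.

```python
import json
from pathlib import Path
from typing import Dict


def export_equations(equations: Dict[int, str], out_path: Path) -> None:
    """Write equations as a JSON list of {"id": ..., "law": ...} records.

    Hypothetical schema: a list is used instead of an object because JSON
    object keys must be strings, while equation numbers are integers.
    """
    records = [{"id": number, "law": law} for number, law in sorted(equations.items())]
    with open(out_path, "w", encoding="utf-8") as handle:
        # ensure_ascii=False keeps symbols such as ◇ readable in the file.
        json.dump(records, handle, ensure_ascii=False, indent=2)


def import_equations(in_path: Path) -> Dict[int, str]:
    """Read the JSON file back into a {number: law} dictionary."""
    with open(in_path, encoding="utf-8") as handle:
        return {record["id"]: record["law"] for record in json.load(handle)}
```

With a file like this, each downstream script would only call `import_equations`, and a later refactor of the Lean files would only require updating whatever produces the JSON.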

carlini · 2024-10-10T00:00:49Z

(I'd be happy to help rewrite the scripts to load from an appropriate json data file as appropriate.)

Shreyas4991 · 2024-10-10T00:01:36Z

The python script that generated the equations appears deterministic, so in principle, the same script can be modified to output the json file. However the script writers must then use this json file.

digama0 · 2024-10-10T00:05:25Z

> This is what #142 is supposed to do. Perhaps #427 can be adapted to create such a JSON, which all other tools then draw from? Then if we ever refactor the equation lean files again then one just has to change the one export tool.

Actually, regarding speeding up equation generation in #478, I think Lean would also benefit from reading input via JSON instead of having to parse it out of a Lean file. I think a large factor in the remaining running time is just the top-level command loop, which runs linters and other things on each command in addition to running the command itself.

The drawback is that it would most likely end up less readable...

There are three main improvements in this implementation:

* The command builds terms directly instead of constructing syntax and re-elaborating it through the regular frontend
* The proof of the `models_iff` theorem is streamlined and also generalized to arbitrary arity instead of proving every arity separately
* The slow persistent env extension is replaced with a TagExtension, on the assumption that we only need to ask for the equations in bulk and do not need to maintain a data structure for fast query.

The result is that it now only takes 30 seconds instead of 2.6 minutes to parse the 4600 equations. Including the effect of #472 it only takes 6 seconds.

vlad902 · 2024-10-10T07:36:03Z

I can update the ruby code when I'm back at the computer later today.

digama0 · 2024-10-10T07:53:57Z

Note: the current state is that all scripts have been fixed, although I don't know how to use them all so they are only lightly tested.

vlad902 · 2024-10-10T13:49:49Z

> Note: the current state is that all scripts have been fixed, although I don't know how to use them all so they are only lightly tested.

Ah, apologies for the confusion.

Shreyas4991 · 2024-10-10T13:54:48Z

I will merge this. Fixes to the scripts, if required, can be done in new PRs. This will save CI time.

split AllEquations into pieces for more parallelism

1729ae0

digama0 force-pushed the split_eqns branch from 23dfdbe to 1729ae0 Compare October 9, 2024 16:41

apply rename

ef13c40

digama0 added 5 commits October 9, 2024 19:13

tricky renames

094ab77

more renames

03ba20a

fix

b11acab

Merge branch 'main' into split_eqns

80bdf6d

fix

6301de6

Merge branch 'main' into split_eqns

09ddd2b

digama0 mentioned this pull request Oct 9, 2024

faster equation command #478

Merged

missed a spot

e74527c

Merge branch 'main' into split_eqns

814cad4

goens mentioned this pull request Oct 10, 2024

Equation Explorer shows Equation4270 wrong #480

Closed

digama0 and others added 2 commits October 10, 2024 10:04

renaming uses of Equations.lean

e930ad6

Merge branch 'main' into split_eqns

92a9972

Shreyas4991 merged commit abf2c24 into teorth:main Oct 10, 2024
1 check passed
