[Builtins] [Costing] Make costing more sharing-friendly #4505

effectfully
Contributor

Let's see if that helps. Don't look here yet.

@effectfully added the Builtins, Performance, Costing, and Don't look here yet labels Mar 25, 2022
@effectfully
Contributor Author

/benchmark plutus-benchmark:validation

@iohk-devops

This comment was marked as outdated.

@effectfully
Contributor Author

Well, that's sad.

@effectfully
Contributor Author

Wait, we have -O0 in that module...

@effectfully
Contributor Author

Ok, the Core is completely wrong, looking into it.

@effectfully
Contributor Author

/benchmark plutus-benchmark:validation

@iohk-devops

This comment was marked as outdated.

@effectfully
Contributor Author

-1.74% on average, which is nice, given how minimal the changes are. The generated Core now does look much better. The benchmarking results are kinda shaky though, gonna run nofib.

@effectfully
Contributor Author

/benchmark plutus-benchmark:nofib

@iohk-devops

This comment was marked as outdated.

@effectfully
Contributor Author

/benchmark plutus-benchmark:validation

@iohk-devops

This comment was marked as outdated.

@effectfully
Contributor Author

/benchmark plutus-benchmark:validation

@iohk-devops

This comment was marked as outdated.

@effectfully
Contributor Author

/benchmark plutus-benchmark:nofib

@iohk-devops

This comment was marked as outdated.

@effectfully
Contributor Author

/benchmark plutus-benchmark:validation

@iohk-devops

Comparing benchmark results of 'plutus-benchmark:validation' on '33341b3fb' (base) and 'f02339369' (PR)

Script 33341b3 f023393 Change
auction_1-1 235.9 μs 231.7 μs -1.8%
auction_1-2 870.4 μs 861.8 μs -1.0%
auction_1-3 863.0 μs 853.1 μs -1.1%
auction_1-4 306.0 μs 300.8 μs -1.7%
auction_2-1 235.0 μs 232.6 μs -1.0%
auction_2-2 869.3 μs 861.1 μs -0.9%
auction_2-3 1.105 ms 1.090 ms -1.4%
auction_2-4 863.9 μs 849.9 μs -1.6%
auction_2-5 306.0 μs 300.0 μs -2.0%
crowdfunding-success-1 276.5 μs 270.3 μs -2.2%
crowdfunding-success-2 276.7 μs 271.3 μs -2.0%
crowdfunding-success-3 276.7 μs 271.5 μs -1.9%
currency-1 329.0 μs 324.0 μs -1.5%
escrow-redeem_1-1 473.4 μs 467.1 μs -1.3%
escrow-redeem_1-2 472.9 μs 467.7 μs -1.1%
escrow-redeem_2-1 556.1 μs 549.2 μs -1.2%
escrow-redeem_2-2 556.1 μs 550.0 μs -1.1%
escrow-redeem_2-3 556.9 μs 549.2 μs -1.4%
escrow-refund-1 207.9 μs 203.3 μs -2.2%
future-increase-margin-1 329.1 μs 324.1 μs -1.5%
future-increase-margin-2 735.9 μs 727.1 μs -1.2%
future-increase-margin-3 734.1 μs 727.8 μs -0.9%
future-increase-margin-4 686.9 μs 676.2 μs -1.6%
future-increase-margin-5 1.070 ms 1.051 ms -1.8%
future-pay-out-1 329.6 μs 323.3 μs -1.9%
future-pay-out-2 739.6 μs 724.6 μs -2.0%
future-pay-out-3 737.9 μs 726.6 μs -1.5%
future-pay-out-4 1.066 ms 1.053 ms -1.2%
future-settle-early-1 329.1 μs 323.9 μs -1.6%
future-settle-early-2 735.4 μs 725.9 μs -1.3%
future-settle-early-3 734.1 μs 725.6 μs -1.2%
future-settle-early-4 816.1 μs 806.4 μs -1.2%
game-sm-success_1-1 536.6 μs 526.0 μs -2.0%
game-sm-success_1-2 258.4 μs 253.7 μs -1.8%
game-sm-success_1-3 862.8 μs 850.2 μs -1.5%
game-sm-success_1-4 303.3 μs 295.7 μs -2.5%
game-sm-success_2-1 537.4 μs 527.5 μs -1.8%
game-sm-success_2-2 257.7 μs 256.0 μs -0.7%
game-sm-success_2-3 862.1 μs 850.6 μs -1.3%
game-sm-success_2-4 304.1 μs 294.9 μs -3.0%
game-sm-success_2-5 865.4 μs 850.4 μs -1.7%
game-sm-success_2-6 304.3 μs 294.5 μs -3.2%
multisig-sm-1 548.1 μs 535.7 μs -2.3%
multisig-sm-2 534.5 μs 525.5 μs -1.7%
multisig-sm-3 538.5 μs 532.1 μs -1.2%
multisig-sm-4 542.3 μs 539.4 μs -0.5%
multisig-sm-5 761.7 μs 751.6 μs -1.3%
multisig-sm-6 546.4 μs 536.7 μs -1.8%
multisig-sm-7 534.2 μs 526.5 μs -1.4%
multisig-sm-8 539.7 μs 531.8 μs -1.5%
multisig-sm-9 542.5 μs 538.9 μs -0.7%
multisig-sm-10 762.9 μs 751.6 μs -1.5%
ping-pong-1 450.3 μs 441.3 μs -2.0%
ping-pong-2 452.1 μs 440.0 μs -2.7%
ping-pong_2-1 270.9 μs 265.1 μs -2.1%
prism-1 216.6 μs 211.3 μs -2.4%
prism-2 580.0 μs 571.4 μs -1.5%
prism-3 492.3 μs 486.8 μs -1.1%
pubkey-1 184.5 μs 180.0 μs -2.4%
stablecoin_1-1 1.198 ms 1.182 ms -1.3%
stablecoin_1-2 252.1 μs 248.5 μs -1.4%
stablecoin_1-3 1.374 ms 1.347 ms -2.0%
stablecoin_1-4 269.0 μs 264.3 μs -1.7%
stablecoin_1-5 1.734 ms 1.688 ms -2.7%
stablecoin_1-6 333.0 μs 327.7 μs -1.6%
stablecoin_2-1 1.200 ms 1.177 ms -1.9%
stablecoin_2-2 252.1 μs 247.4 μs -1.9%
stablecoin_2-3 1.375 ms 1.338 ms -2.7%
stablecoin_2-4 269.6 μs 263.8 μs -2.2%
token-account-1 249.6 μs 245.1 μs -1.8%
token-account-2 441.1 μs 435.1 μs -1.4%
uniswap-1 546.5 μs 538.5 μs -1.5%
uniswap-2 296.6 μs 290.7 μs -2.0%
uniswap-3 2.221 ms 2.166 ms -2.5%
uniswap-4 447.8 μs 437.6 μs -2.3%
uniswap-5 1.542 ms 1.494 ms -3.1%
uniswap-6 429.1 μs 420.3 μs -2.1%
vesting-1 469.8 μs 463.7 μs -1.3%
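
The Change column looks like the relative difference between the PR timing and the base timing. A small sketch that recomputes it for the first three rows, assuming that definition and one-decimal rounding (the bot presumably works from unrounded timings, so individual rows could differ by 0.1%); `percentChange` and `round1` are names made up for this illustration:

```haskell
module Main (main) where

-- Relative change of the PR timing against the base timing, in percent.
percentChange :: Double -> Double -> Double
percentChange base pr = (pr - base) / base * 100

-- Round to one decimal place, as in the Change column.
round1 :: Double -> Double
round1 x = fromIntegral (round (x * 10) :: Integer) / 10

main :: IO ()
main = do
    -- (script, base μs, PR μs), copied from the first rows of the table above.
    let rows =
            [ ("auction_1-1", 235.9, 231.7)
            , ("auction_1-2", 870.4, 861.8)
            , ("auction_1-3", 863.0, 853.1)
            ]
    mapM_
        (\(name, b, p) ->
            putStrLn (name ++ ": " ++ show (round1 (percentChange b p)) ++ "%"))
        rows
```

For these three rows the recomputed values agree with the table (-1.8%, -1.0%, -1.1%).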

@effectfully
Contributor Author

/benchmark plutus-benchmark:nofib

@iohk-devops

Comparing benchmark results of 'plutus-benchmark:nofib' on '33341b3fb' (base) and 'a59aba586' (PR)

Script 33341b3 a59aba5 Change
clausify/formula1 22.90 ms 22.88 ms -0.1%
clausify/formula2 28.34 ms 28.29 ms -0.2%
clausify/formula3 77.05 ms 76.90 ms -0.2%
clausify/formula4 128.7 ms 127.6 ms -0.9%
clausify/formula5 489.5 ms 489.9 ms +0.1%
knights/4x4 67.04 ms 66.09 ms -1.4%
knights/6x6 179.4 ms 177.1 ms -1.3%
knights/8x8 296.5 ms 292.5 ms -1.3%
primetest/05digits 42.85 ms 41.72 ms -2.6%
primetest/08digits 79.08 ms 76.84 ms -2.8%
primetest/10digits 111.2 ms 108.4 ms -2.5%
primetest/20digits 223.1 ms 217.1 ms -2.7%
primetest/30digits 322.4 ms 314.2 ms -2.5%
primetest/40digits 434.8 ms 421.8 ms -3.0%
primetest/50digits 428.4 ms 417.4 ms -2.6%
queens4x4/bt 11.00 ms 10.87 ms -1.2%
queens4x4/bm 15.54 ms 15.44 ms -0.6%
queens4x4/bjbt1 13.71 ms 13.58 ms -0.9%
queens4x4/bjbt2 14.59 ms 14.45 ms -1.0%
queens4x4/fc 34.50 ms 34.38 ms -0.3%
queens5x5/bt 147.5 ms 145.9 ms -1.1%
queens5x5/bm 180.6 ms 178.9 ms -0.9%
queens5x5/bjbt1 175.0 ms 173.1 ms -1.1%
queens5x5/bjbt2 185.0 ms 182.8 ms -1.2%
queens5x5/fc 441.1 ms 436.4 ms -1.1%

@effectfully
Contributor Author

/benchmark plutus-benchmark:validation

@iohk-devops

Comparing benchmark results of 'plutus-benchmark:validation' on '33341b3fb' (base) and 'a59aba586' (PR)

Script 33341b3 a59aba5 Change
auction_1-1 237.2 μs 226.4 μs -4.6%
auction_1-2 875.1 μs 855.8 μs -2.2%
auction_1-3 866.5 μs 845.0 μs -2.5%
auction_1-4 306.7 μs 293.8 μs -4.2%
auction_2-1 236.7 μs 227.6 μs -3.8%
auction_2-2 872.8 μs 858.7 μs -1.6%
auction_2-3 1.106 ms 1.088 ms -1.6%
auction_2-4 863.2 μs 848.5 μs -1.7%
auction_2-5 305.8 μs 294.7 μs -3.6%
crowdfunding-success-1 276.6 μs 267.8 μs -3.2%
crowdfunding-success-2 276.0 μs 267.5 μs -3.1%
crowdfunding-success-3 276.5 μs 268.0 μs -3.1%
currency-1 329.0 μs 324.4 μs -1.4%
escrow-redeem_1-1 473.8 μs 466.0 μs -1.6%
escrow-redeem_1-2 473.7 μs 464.2 μs -2.0%
escrow-redeem_2-1 558.7 μs 544.2 μs -2.6%
escrow-redeem_2-2 558.1 μs 543.5 μs -2.6%
escrow-redeem_2-3 558.3 μs 545.4 μs -2.3%
escrow-refund-1 206.8 μs 200.2 μs -3.2%
future-increase-margin-1 329.1 μs 323.0 μs -1.9%
future-increase-margin-2 734.1 μs 721.2 μs -1.8%
future-increase-margin-3 734.7 μs 721.0 μs -1.9%
future-increase-margin-4 689.8 μs 674.6 μs -2.2%
future-increase-margin-5 1.074 ms 1.049 ms -2.3%
future-pay-out-1 330.7 μs 322.0 μs -2.6%
future-pay-out-2 738.4 μs 722.4 μs -2.2%
future-pay-out-3 738.8 μs 722.6 μs -2.2%
future-pay-out-4 1.070 ms 1.057 ms -1.2%
future-settle-early-1 328.7 μs 324.5 μs -1.3%
future-settle-early-2 734.5 μs 724.9 μs -1.3%
future-settle-early-3 735.8 μs 724.0 μs -1.6%
future-settle-early-4 821.7 μs 807.2 μs -1.8%
game-sm-success_1-1 538.6 μs 522.5 μs -3.0%
game-sm-success_1-2 258.8 μs 251.3 μs -2.9%
game-sm-success_1-3 861.8 μs 847.9 μs -1.6%
game-sm-success_1-4 304.0 μs 290.5 μs -4.4%
game-sm-success_2-1 538.6 μs 522.7 μs -3.0%
game-sm-success_2-2 259.3 μs 251.2 μs -3.1%
game-sm-success_2-3 866.6 μs 846.0 μs -2.4%
game-sm-success_2-4 305.2 μs 290.3 μs -4.9%
game-sm-success_2-5 866.0 μs 846.6 μs -2.2%
game-sm-success_2-6 304.1 μs 290.6 μs -4.4%
multisig-sm-1 548.5 μs 536.5 μs -2.2%
multisig-sm-2 534.8 μs 525.9 μs -1.7%
multisig-sm-3 539.4 μs 531.9 μs -1.4%
multisig-sm-4 543.4 μs 537.7 μs -1.0%
multisig-sm-5 765.1 μs 749.7 μs -2.0%
multisig-sm-6 548.2 μs 535.0 μs -2.4%
multisig-sm-7 534.4 μs 525.5 μs -1.7%
multisig-sm-8 540.5 μs 532.6 μs -1.5%
multisig-sm-9 542.2 μs 538.7 μs -0.6%
multisig-sm-10 768.7 μs 753.8 μs -1.9%
ping-pong-1 454.0 μs 441.5 μs -2.8%
ping-pong-2 454.2 μs 439.5 μs -3.2%
ping-pong_2-1 271.5 μs 263.3 μs -3.0%
prism-1 218.4 μs 210.1 μs -3.8%
prism-2 583.8 μs 568.4 μs -2.6%
prism-3 492.4 μs 480.7 μs -2.4%
pubkey-1 184.8 μs 176.7 μs -4.4%
stablecoin_1-1 1.200 ms 1.175 ms -2.1%
stablecoin_1-2 252.9 μs 246.7 μs -2.5%
stablecoin_1-3 1.374 ms 1.344 ms -2.2%
stablecoin_1-4 269.6 μs 260.7 μs -3.3%
stablecoin_1-5 1.733 ms 1.690 ms -2.5%
stablecoin_1-6 334.2 μs 322.3 μs -3.6%
stablecoin_2-1 1.201 ms 1.174 ms -2.2%
stablecoin_2-2 253.1 μs 244.3 μs -3.5%
stablecoin_2-3 1.378 ms 1.341 ms -2.7%
stablecoin_2-4 270.0 μs 261.6 μs -3.1%
token-account-1 249.5 μs 245.3 μs -1.7%
token-account-2 441.4 μs 432.9 μs -1.9%
uniswap-1 547.8 μs 540.3 μs -1.4%
uniswap-2 297.0 μs 288.5 μs -2.9%
uniswap-3 2.224 ms 2.166 ms -2.6%
uniswap-4 446.9 μs 431.2 μs -3.5%
uniswap-5 1.542 ms 1.499 ms -2.8%
uniswap-6 428.7 μs 413.3 μs -3.6%
vesting-1 469.5 μs 460.8 μs -1.9%

Comment on lines +34 to +37
deriving via ModelJSON "costingFun" (CostingFun model)
    instance FromJSON model => FromJSON (CostingFun model)
deriving via ModelJSON "costingFun" (CostingFun model)
    instance ToJSON model => ToJSON (CostingFun model)
Contributor Author

Is there any way to check that I didn't screw up JSON encoding/decoding?

Contributor

Yeah, there's a test.


-- | A separate module for JSON instances, so that we can stick @-O0@ on it and avoid spending
-- a lot of time optimizing loads of Core whose performance doesn't matter.
module PlutusCore.Evaluation.Machine.CostingFun.JSON () where
Contributor Author

This is now in a separate file, so that we can keep -O0 here and use the default optimization settings for the module with runCostingFunOneArgument and similar functions. That helped by ~0.75%, which isn't much, but it also saved my sanity, because I no longer need to wait an eternity to see how the Core changes after a tweak.

Contributor Author

Getting rid of -O0 removed a bunch of matching to extract an Int# from the arguments of ModelLinearSize etc., which is particularly good since that matching wasn't getting cached (likely due to -O0 as well). Now the arguments are unpacked, so we don't need to extract that Int# at all.
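
The split relies on GHC's file-level `OPTIONS_GHC` pragma, which applies flags to one module only. A minimal single-file sketch of the pragma itself, with a made-up `Model` type and derived `Show`/`Read` instances standing in for the JSON instances (the real module derives aeson instances, which this sketch does not attempt to reproduce):

```haskell
{-# OPTIONS_GHC -O0 #-}
-- Only this file is compiled without optimization; other modules in the
-- package keep whatever flags the build passes to them.
module Main (main) where

-- Hypothetical stand-in for a costing model whose serialization instances
-- are not performance-critical.
data Model = Constant Int | Linear Int Int
    deriving (Show, Read, Eq)

main :: IO ()
main = do
    let m = Linear 100 3
    -- Round-trip through the derived instances, analogous to a JSON round-trip test.
    print (read (show m) == m)
```

In the PR the point is the other half of the split: the module with the `run*` functions no longer carries the pragma, so it gets the default optimization level.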

@effectfully
Contributor Author

Ready for review.

@effectfully
Contributor Author

@kwxm I think it would still be great to define the costing functions manually, without matching on models, to check whether we're squeezing enough performance out of what we have right now.

@effectfully
Contributor Author

Example Core output for those who are curious:

-- RHS size: {terms: 59, types: 30, coercions: 13, joins: 0/1}
runOneArgumentModel
  = \ ds_dpuo ->
      case ds_dpuo of {
        ModelOneArgumentConstantCost dt_dpDt ->
          let { c_sqx7 = I# dt_dpDt } in
          lazy ((\ _ -> c_sqx7) `cast` <Co:4>);
        ModelOneArgumentLinearCost ds1_dpvO ->
          case ds1_dpvO of { ModelLinearSize dt_svSs dt1_svSt ->
          lazy
            (\ ds2_dpux ->
               case ds2_dpux `cast` <Co:3> of { I# ww1_iq8j ->
               case $w$c* ww1_iq8j dt1_svSt of ww4_iq8o { __DEFAULT ->
               case addIntC# ww4_iq8o dt_svSs of { (# r#_iq89, ds3_iq8a #) ->
               case ds3_iq8a of {
                 __DEFAULT ->
                   case andI# (># ww4_iq8o 0#) (># dt_svSs 0#) of {
                     __DEFAULT ->
                       case andI# (<# ww4_iq8o 0#) (<# dt_svSs 0#) of {
                         __DEFAULT -> case overflowError of wild3_00 { };
                         1# -> lvl31_rwWK `cast` <Co:2>
                       };
                     1# -> lvl32_rwWL `cast` <Co:2>
                   };
                 0# -> (I# r#_iq89) `cast` <Co:2>
               }
               }
               }
               })
          }
      }

-- RHS size: {terms: 25, types: 17, coercions: 2, joins: 0/0}
runCostingFunOneArgument
  = \ ds_dpvP ->
      case ds_dpvP of { CostingFun cpu_ahk7 mem_ahk8 ->
      case runOneArgumentModel mem_ahk8 of runMem_ahka { __DEFAULT ->
      case runOneArgumentModel cpu_ahk7 of runCpu_ahk9 { __DEFAULT ->
      lazy
        (\ mem1_ahkb ->
           case (runCpu_ahk9 mem1_ahkb) `cast` <Co:1> of { I# dt1_inIU ->
           case (runMem_ahka mem1_ahkb) `cast` <Co:1> of { I# dt3_inIW ->
           ExBudget dt1_inIU dt3_inIW
           }
           })
      }
      }
      }

Seems pretty nice. It kinda sucks that runCpu_ahk9 and runMem_ahka do not return an Int#, but that's probably because they're let-bound and we can't do better here (I have no idea what I'm talking about though).
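
For readers who'd rather not reverse-engineer the Core, here is a minimal source-level sketch of the shape that could produce something like it. Everything simplified here is an assumption: plain `Int` replaces the real costing integer (the Core has overflow checks that this sketch omits), the field order of `ModelLinearSize` is read off the Core as intercept then slope, and the `NOINLINE` pragma is only inferred from the review discussion.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import GHC.Exts (lazy)

-- Simplified stand-ins for the real types; the field layout is an assumption.
data ModelLinearSize = ModelLinearSize
    {-# UNPACK #-} !Int  -- intercept
    {-# UNPACK #-} !Int  -- slope

data ModelOneArgument
    = ModelOneArgumentConstantCost {-# UNPACK #-} !Int
    | ModelOneArgumentLinearCost !ModelLinearSize

-- CPU model first, memory model second.
data CostingFun model = CostingFun model model

data ExBudget = ExBudget {-# UNPACK #-} !Int {-# UNPACK #-} !Int
    deriving (Show, Eq)

-- Match on the model once, then return a closure that only does arithmetic.
runOneArgumentModel :: ModelOneArgument -> Int -> Int
runOneArgumentModel (ModelOneArgumentConstantCost c) = lazy $ \_ -> c
runOneArgumentModel (ModelOneArgumentLinearCost (ModelLinearSize intercept slope)) =
    lazy $ \size -> size * slope + intercept
{-# NOINLINE runOneArgumentModel #-}

-- Force both model runners before returning the lambda, so that a partial
-- application to the CostingFun can be shared across calls.
runCostingFunOneArgument :: CostingFun ModelOneArgument -> Int -> ExBudget
runCostingFunOneArgument (CostingFun cpu mem) =
    let !runCpu = runOneArgumentModel cpu
        !runMem = runOneArgumentModel mem
    in lazy $ \size -> ExBudget (runCpu size) (runMem size)

main :: IO ()
main = do
    let cf = CostingFun
                (ModelOneArgumentLinearCost (ModelLinearSize 100 3))
                (ModelOneArgumentConstantCost 10)
        cost = runCostingFunOneArgument cf  -- partial application, reused below
    print (map cost [1, 2, 3])
```

Whether GHC actually keeps the model matching outside the lambda depends on the compiler version and flags, so the generated Core is still the thing to check.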

@effectfully
Contributor Author

The speedup is 2.51% on average.

@effectfully marked this pull request as draft March 28, 2022 10:56
@effectfully
Contributor Author

/benchmark plutus-benchmark:validation

@michaelpj (Contributor) left a comment

Looks plausible, I don't have any obvious suggestions for making it nicer.

I think the issue with runCpu/runMem is that you would need the run*Model functions to get hit by worker-wrapper, I think. But I think the NOINLINE probably stops that?

runCostingFunTwoArguments (CostingFun cpu mem) =
    let !runCpu = runTwoArgumentModel cpu
        !runMem = runTwoArgumentModel mem
    in lazy $ \mem1 mem2 ->
Contributor

Does using a local definition stop the floating?

runCostingFunTwoArguments (CostingFun cpu mem) =
  let !runCpu = ...
      !runMem = ...
      go = \mem1 mem2 -> ...
  in go

Contributor Author

Used case instead of let (suggested by Heinrich Apfelmus) and we don't need that lazy nonsense in there now. We still need some elsewhere though.
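
A minimal sketch of the `case`-instead-of-`let` shape, under stated assumptions: `runModel` and `runBoth` are hypothetical stand-ins (not the PR's `runTwoArgumentModel` / `runCostingFunTwoArguments`), and the "expensive" model-dependent work is just a `sum` over a list.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Hypothetical stand-in for a two-argument model runner: the part that
-- depends only on the model (here a sum of coefficients) is computed
-- before the lambda is returned.
runModel :: [Int] -> Int -> Int -> Int
runModel coeffs =
    case sum coeffs of
        !k -> \x y -> k * (x + y)
{-# NOINLINE runModel #-}

-- Scrutinise both runners with 'case' and bang patterns, then return the
-- lambda; no call to 'lazy' is used here.
runBoth :: ([Int], [Int]) -> Int -> Int -> (Int, Int)
runBoth (cpu, mem) =
    case runModel cpu of
        !runCpu -> case runModel mem of
            !runMem -> \x y -> (runCpu x y, runMem x y)

main :: IO ()
main = do
    let cost = runBoth ([1, 2, 3], [4, 5])  -- partial application, reused below
    print (cost 1 2)
    print (cost 10 20)
```

The sketch only shows the source shape. Whether the work stays shared is up to GHC's optimizer, which is why the comment above says `lazy` is still needed elsewhere.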

they are partially computed and cached, which results in them being called only once per builtin
2. the resulting lambda is wrapped with a call to 'lazy', so that GHC doesn't float the let-bound
functions inside the lambda
3. the whole definition is marked with @INLINE@, because it gets worker-wrapper transformed and we
Contributor

I guess specifically we can't benefit from worker-wrapper, because at the call sites we don't have a statically known usage of the ExBudget return value, so we can't unbox it nicely.

@michaelpj
Contributor

There's an interesting little puzzle here, you could ask in #ghc. The combination of wanting to get a partially-applied function returned (so you share the work from the first argument), combined with worker-wrapper-ing the second part...

@iohk-devops

Comparing benchmark results of 'plutus-benchmark:validation' on '33341b3fb' (base) and '2b6f1f522' (PR)

Script 33341b3 2b6f1f5 Change
auction_1-1 235.6 μs 223.9 μs -5.0%
auction_1-2 870.2 μs 856.5 μs -1.6%
auction_1-3 860.7 μs 843.4 μs -2.0%
auction_1-4 304.7 μs 290.0 μs -4.8%
auction_2-1 237.0 μs 226.2 μs -4.6%
auction_2-2 873.0 μs 860.8 μs -1.4%
auction_2-3 1.108 ms 1.089 ms -1.7%
auction_2-4 863.6 μs 849.3 μs -1.7%
auction_2-5 305.4 μs 291.3 μs -4.6%
crowdfunding-success-1 275.9 μs 263.8 μs -4.4%
crowdfunding-success-2 275.5 μs 264.0 μs -4.2%
crowdfunding-success-3 277.2 μs 264.1 μs -4.7%
currency-1 330.4 μs 320.8 μs -2.9%
escrow-redeem_1-1 475.9 μs 461.5 μs -3.0%
escrow-redeem_1-2 476.6 μs 462.7 μs -2.9%
escrow-redeem_2-1 559.2 μs 541.3 μs -3.2%
escrow-redeem_2-2 559.3 μs 540.2 μs -3.4%
escrow-redeem_2-3 559.5 μs 540.0 μs -3.5%
escrow-refund-1 206.6 μs 197.8 μs -4.3%
future-increase-margin-1 329.5 μs 320.2 μs -2.8%
future-increase-margin-2 737.2 μs 718.9 μs -2.5%
future-increase-margin-3 735.1 μs 718.3 μs -2.3%
future-increase-margin-4 686.5 μs 672.4 μs -2.1%
future-increase-margin-5 1.068 ms 1.046 ms -2.1%
future-pay-out-1 329.2 μs 319.0 μs -3.1%
future-pay-out-2 737.0 μs 720.2 μs -2.3%
future-pay-out-3 736.3 μs 721.3 μs -2.0%
future-pay-out-4 1.067 ms 1.049 ms -1.7%
future-settle-early-1 329.1 μs 322.3 μs -2.1%
future-settle-early-2 737.8 μs 722.4 μs -2.1%
future-settle-early-3 739.8 μs 722.0 μs -2.4%
future-settle-early-4 823.5 μs 802.8 μs -2.5%
game-sm-success_1-1 537.6 μs 519.7 μs -3.3%
game-sm-success_1-2 258.4 μs 245.9 μs -4.8%
game-sm-success_1-3 865.6 μs 841.1 μs -2.8%
game-sm-success_1-4 303.6 μs 286.6 μs -5.6%
game-sm-success_2-1 535.8 μs 518.8 μs -3.2%
game-sm-success_2-2 257.6 μs 246.3 μs -4.4%
game-sm-success_2-3 860.8 μs 842.6 μs -2.1%
game-sm-success_2-4 303.9 μs 286.5 μs -5.7%
game-sm-success_2-5 864.3 μs 842.1 μs -2.6%
game-sm-success_2-6 304.2 μs 287.9 μs -5.4%
multisig-sm-1 547.6 μs 534.8 μs -2.3%
multisig-sm-2 534.7 μs 523.8 μs -2.0%
multisig-sm-3 540.6 μs 530.0 μs -2.0%
multisig-sm-4 546.5 μs 533.8 μs -2.3%
multisig-sm-5 767.6 μs 749.3 μs -2.4%
multisig-sm-6 547.6 μs 532.7 μs -2.7%
multisig-sm-7 535.4 μs 522.1 μs -2.5%
multisig-sm-8 538.9 μs 528.5 μs -1.9%
multisig-sm-9 541.0 μs 532.2 μs -1.6%
multisig-sm-10 762.7 μs 747.2 μs -2.0%
ping-pong-1 451.4 μs 437.3 μs -3.1%
ping-pong-2 451.2 μs 436.9 μs -3.2%
ping-pong_2-1 269.7 μs 261.5 μs -3.0%
prism-1 216.6 μs 207.1 μs -4.4%
prism-2 579.7 μs 563.0 μs -2.9%
prism-3 490.6 μs 475.7 μs -3.0%
pubkey-1 184.5 μs 174.0 μs -5.7%
stablecoin_1-1 1.200 ms 1.170 ms -2.5%
stablecoin_1-2 252.1 μs 240.4 μs -4.6%
stablecoin_1-3 1.373 ms 1.333 ms -2.9%
stablecoin_1-4 270.2 μs 256.1 μs -5.2%
stablecoin_1-5 1.741 ms 1.679 ms -3.6%
stablecoin_1-6 335.8 μs 318.8 μs -5.1%
stablecoin_2-1 1.206 ms 1.174 ms -2.7%
stablecoin_2-2 254.2 μs 242.1 μs -4.8%
stablecoin_2-3 1.380 ms 1.345 ms -2.5%
stablecoin_2-4 270.1 μs 257.3 μs -4.7%
token-account-1 250.4 μs 242.4 μs -3.2%
token-account-2 440.5 μs 428.6 μs -2.7%
uniswap-1 545.5 μs 538.1 μs -1.4%
uniswap-2 295.2 μs 284.8 μs -3.5%
uniswap-3 2.222 ms 2.158 ms -2.9%
uniswap-4 447.7 μs 423.4 μs -5.4%
uniswap-5 1.541 ms 1.494 ms -3.0%
uniswap-6 429.3 μs 406.3 μs -5.4%
vesting-1 470.8 μs 458.3 μs -2.7%

@effectfully marked this pull request as ready for review March 28, 2022 13:08
@effectfully
Contributor Author

-3.2% on average with the slightly changed approach. Docs updated. Gonna merge this once CI is green.

@effectfully merged commit e59ade1 into master Mar 28, 2022
@effectfully deleted the effectfully/builtins/costing/make-costing-more-sharing-friendly branch March 28, 2022 15:46
@effectfully
Contributor Author

I think the issue with runCpu/runMem is that you would need the run*Model functions to get hit by worker-wrapper, I think. But I think the NOINLINE probably stops that?

Dunno. Without runCpu/runMem there's less boxing-unboxing, that's for sure.

There's an interesting little puzzle here, you could ask in #ghc. The combination of wanting to get a partially-applied function returned (so you share the work from the first argument), combined with worker-wrapper-ing the second part...

Where's that #ghc?

I merged the PR for now, but I'll probably investigate further. It's a funny puzzle indeed.

@michaelpj
Contributor

#ghc being a (pretty active) IRC channel with the GHC devs in it.
