Skip to content

3902 use smt solver for equations #3904

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Jul 16, 2024
Merged

Conversation

jberthold
Copy link
Member

This PR adds the SMT solver to the equation application function.

Unclear side conditions for equations (those that could not be simplified to values in isolation) are checked (two-sided) using the SMT solver in context of the known path condition.
This should reduce the amount of unclear conditions, and thereby enable more simplificaitons in boooster.

Fixes #3902

@jberthold

This comment was marked as outdated.

jberthold and others added 4 commits May 30, 2024 12:31
…-for-equations

- adapted code to changes in `Booster.SMT.checkPredicates`
- will require another adaptation if/when follow-up PR is merged
@jberthold
Copy link
Member Author

Test 3902-use-smt-solver-for-equations time master-0932411b9 time (3902-use-smt-solver-for-equations/master-0932411b9) time
benchmarks/ecrecoverloop02-sigs-valid-spec.k 164.52 284.02 0.5792549820435181
benchmarks/ecrecoverloop02-sig1-invalid-spec.k 161.72 271.73 0.5951495970264601
benchmarks/ecrecoverloop02-sig0-invalid-spec.k 97.41 150.97 0.6452275286480758
mcd/flopper-dent-guy-diff-tic-not-0-pass-rough-spec.k 487.55 573.19 0.850590554615398
erc20/ds/approve-success-spec.k 40.2 45.53 0.8829343290138371
erc20/hkg/approve-spec.k 38.0 41.42 0.9174311926605504
erc20/hkg/transferFrom-success-2-spec.k 43.47 47.36 0.9178631756756757
erc20/hkg/transferFrom-success-1-spec.k 44.26 47.68 0.9282718120805369
erc20/ds/transfer-success-2-spec.k 45.55 48.45 0.9401444788441692
erc20/ds/transfer-success-1-spec.k 46.67 48.97 0.9530324688584848
mcd/flipper-addu48u48-fail-rough-spec.k 39.32 40.77 0.9644346333088054
erc20/ds/transfer-failure-1-a-spec.k 49.28 47.56 1.0361648444070648
mcd/end-pack-pass-rough-spec.k 106.83 102.77 1.0395056923226622
examples/erc721-spec.md 117.13 112.0 1.0458035714285714
benchmarks/ecrecoverloop00-sig1-invalid-spec.k 69.47 65.97 1.0530544186751554
mcd/pot-join-pass-rough-spec.k 150.26 141.38 1.0628094497100014
mcd/flopper-cage-pass-spec.k 41.02 37.87 1.0831792975970427
mcd/vow-fess-fail-rough-spec.k 115.29 87.57 1.3165467625899283
TOTAL 1857.95 2195.21 0.8463654957840023

@jberthold
Copy link
Member Author

jberthold commented Jul 11, 2024

kontrol tests had a few failures (seemingly unrelated because they were also happening on the master run).

Test 3902-use-smt-solver-for-equations time master-0932411b9 time (3902-use-smt-solver-for-equations/master-0932411b9) time
StoreTest.testStoreLoadNonExistent() 12.19 15.73 0.7749523204068658
FreshBytesTest.test_symbolic_bytes_1 74.72 92.59 0.8069985959606869
StoreTest.testGasLoadWarmUp() 15.54 18.76 0.8283582089552237
PlainPrankTest.test_startPrank_zeroAddress_true() 17.49 20.72 0.8441119691119691
ExpectCallTest.testExpectRegularCall() 16.7 18.86 0.8854718981972428
],uint256)) 150.6 169.31 0.8894926466245349
ExpectRevertTest.testFail_expectRevert_false() 17.77 19.76 0.8992914979757084
MockCallTestFoundry.testMockCallEmptyAccount() 32.81 36.16 0.9073561946902656
StartPrankTestMsgSender.test_startprank_msgsender_setup() 29.04 31.34 0.9266113592852584
PrankTestOrigin.test_origin_setup() 28.74 30.76 0.9343302990897269
StoreTest.testStoreLoad() 13.36 13.93 0.95908111988514
AddrTest.test_addr_true()-trace_options1 18.99 19.73 0.9624936644703497
AddrTest.test_notBuiltinAddress_concrete() 12.07 11.59 1.0414150129421915
SymbolicStorageTest.testFail_SymbolicStorage1(uint256) 43.89 41.52 1.0570809248554913
SetUpDeployTest.test_extcodesize() 47.12 44.48 1.0593525179856116
LoopsTest.sum_N(uint256) 6873.12 6335.25 1.0849011483366875
LoopsTest.test_sum_10() 19.7 18.11 1.0877967973495306
StartPrankTestOrigin.test_startprank_origin_setup() 32.85 29.18 1.1257710760795065
ExpectRevertTest.testFail_expectRevert_multipleReverts() 18.81 16.69 1.127022168963451
ExpectCallTest.testExpectStaticCall() 18.45 16.33 1.1298224127372933
]) 39.41 33.55 1.1746646795827123
PlainPrankTest.test_startPrank_true() 21.22 17.26 1.2294322132097333
MockCallTestFoundry.testMockCalldata() 50.15 38.64 1.2978778467908902
StoreTest.testLoadNonExistent() 15.31 11.71 1.3074295473953885
MockCallTestFoundry.testRevertMock() 37.59 23.05 1.6308026030368765
TOTAL 7657.64 7125.01 1.0747549828000242

Especially:

FAILED src/tests/integration/test_foundry_prove.py::test_foundry_prove[LoopsTest.sum_N(uint256)]

@jberthold jberthold marked this pull request as ready for review July 11, 2024 08:52
@geo2a
Copy link
Contributor

geo2a commented Jul 11, 2024

Regarding LoopsTest.sum_N(uint256), there's a thread on Slack where the Kontrol team seems to have identified and solved the problem with it. So we can ignore it for now.

@jberthold
Copy link
Member Author

Something mysterious is happening in the evm-semantics test vow-fess-fail-rough with this branch. The test is failing when run with the branch, and a bug reports shows that it sends more requests to the server.
However, when I compare those requests against a tarball produced with booster's master, I find that

  • the first difference is in an implies request, a different order of the Exists clauses in the consequent (I checked that all variables are in fact present)
  • subsequent implies requests have different ordering of Exists clauses again, and there are more of them
  • the only apparent differences in execute responses are that booster from the branch elminates a chop call
  • The proof using booster from the branch runs about 12 more requests. I have no idea why that is happening.

@PetarMax
Copy link
Contributor

It would appear that the booster is inferring a constraint that is required for applying a simplification using Z3. This is sound and (for me) welcome, but in this particular case, it introduces over-approximation.

In my opinion, the solution is to add all constraints inferred using Z3 explicitly to the path constraint. This would:

  1. Minimise over-approximation
  2. Precisely pinpoint where we need to use Z3, which we can then leverage to write the appropriate lemmas and avoid that. The lemmas in this case, I believe, are:
rule X:Int <=Int chop ( X +Int Y:Int ) => X +Int Y <Int pow256
  requires #rangeUInt(256, X) andBool #rangeUInt(256, Y)
  [simplification]

and symmetrically for Y:Int +Int X.

@geo2a
Copy link
Contributor

geo2a commented Jul 15, 2024

It seems to me like we are leaning towards getting this merged and then adding lemmas to KEVM to make the proof pass. Perhaps it would make sense to open the PR to KEVM first to make, just to make sure that simply adding these lemmas does not break anything?

@jberthold
Copy link
Member Author

It seems to me like we are leaning towards getting this merged and then adding lemmas to KEVM to make the proof pass. Perhaps it would make sense to open the PR to KEVM first to make, just to make sure that simply adding these lemmas does not break anything?

I agree this would be the best way forward at the moment.
Will prepare the respective KEVM PR and re-test with that PR.

@jberthold
Copy link
Member Author

KEVM test using branch with new chop lemma

Test 3902-use-smt-solver-for-equations time master-c71e5da10 time (3902-use-smt-solver-for-equations/master-c71e5da10) time
benchmarks/address00-spec.k 34.91 96.03 0.3635322295116109
kontrol/test-allowchangestest-testfailallowcallstoaddress-0-spec.k 37.51 99.84 0.37570112179487175
benchmarks/ecrecoverloop02-sigs-valid-spec.k 165.23 286.21 0.5773033786380629
benchmarks/ecrecoverloop02-sig1-invalid-spec.k 161.79 270.45 0.5982251802551304
erc20/hkg/transferFrom-success-2-spec.k 43.16 68.12 0.6335877862595419
kontrol/test-allowchangestest-testallow_fail-0-spec.k 37.66 59.36 0.6344339622641509
benchmarks/ecrecoverloop02-sig0-invalid-spec.k 98.22 151.47 0.6484452366805308
mcd/flopper-tick-pass-rough-spec.k 133.35 192.52 0.6926553085393725
mcd/pot-join-pass-rough-spec.k 148.33 212.49 0.6980563791237235
mcd/flopper-dent-guy-same-pass-rough-spec.k 452.84 600.05 0.7546704441296559
mcd/flopper-file-pass-rough-spec.k 290.11 353.17 0.8214457626638729
mcd/flopper-dent-guy-diff-tic-not-0-pass-rough-spec.k 581.19 704.08 0.8254601749801159
erc20/ds/approve-success-spec.k 40.13 46.95 0.8547390841320553
benchmarks/bytes00-spec.k 39.48 44.92 0.8788958147818343
erc20/hkg/transferFrom-success-1-spec.k 44.5 48.27 0.9218976590014502
erc20/hkg/approve-spec.k 38.14 40.99 0.9304708465479384
erc20/ds/transfer-success-2-spec.k 45.78 48.78 0.93849938499385
erc20/ds/transfer-success-1-spec.k 46.87 49.64 0.9441982272360998
mcd/flipper-addu48u48-fail-rough-spec.k 39.36 41.51 0.9482052517465671
mcd/functional-spec.k 380.08 397.53 0.9561039418408673
kontrol/test-allowchangestest-testallow-0-spec.k 37.36 38.9 0.9604113110539846
erc20/ds/transferFrom-failure-1-d-spec.k 39.76 38.33 1.0373075919645187
mcd/vow-flog-fail-rough-spec.k 91.41 87.87 1.0402867872994195
mcd/end-pack-pass-rough-spec.k 107.08 102.87 1.040925439875571
examples/erc721-spec.md 117.09 112.07 1.0447934326760062
benchmarks/structarg01-spec.k 44.27 42.36 1.0450897072710106
mcd/vow-fess-fail-rough-spec.k 85.51 81.78 1.045610173636586
benchmarks/ecrecoverloop00-sig1-invalid-spec.k 69.36 66.06 1.0499545867393278
mcd/flopper-cage-pass-spec.k 40.95 38.19 1.072270227808327
TOTAL 3491.43 4420.81 0.7897715577009645

@jberthold jberthold merged commit abf81e9 into master Jul 16, 2024
6 checks passed
@jberthold jberthold deleted the 3902-use-smt-solver-for-equations branch July 16, 2024 06:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use the SMT solver when checking equations' side condtions in Booster
4 participants