3902 use smt solver for equations #3904

jberthold · 2024-05-28T06:52:22Z

This PR adds the SMT solver to the equation application function.

Unclear side conditions for equations (those that could not be simplified to values in isolation) are checked (two-sided) using the SMT solver in context of the known path condition.
This should reduce the amount of unclear conditions, and thereby enable more simplificaitons in boooster.

Fixes #3902

…-for-equations - adapted code to changes in `Booster.SMT.checkPredicates` - will require another adaptation if/when follow-up PR is merged

…-for-equations

jberthold · 2024-07-11T04:20:29Z

Test	3902-use-smt-solver-for-equations time	master-0932411b9 time	(3902-use-smt-solver-for-equations/master-0932411b9) time
benchmarks/ecrecoverloop02-sigs-valid-spec.k	164.52	284.02	0.5792549820435181
benchmarks/ecrecoverloop02-sig1-invalid-spec.k	161.72	271.73	0.5951495970264601
benchmarks/ecrecoverloop02-sig0-invalid-spec.k	97.41	150.97	0.6452275286480758
mcd/flopper-dent-guy-diff-tic-not-0-pass-rough-spec.k	487.55	573.19	0.850590554615398
erc20/ds/approve-success-spec.k	40.2	45.53	0.8829343290138371
erc20/hkg/approve-spec.k	38.0	41.42	0.9174311926605504
erc20/hkg/transferFrom-success-2-spec.k	43.47	47.36	0.9178631756756757
erc20/hkg/transferFrom-success-1-spec.k	44.26	47.68	0.9282718120805369
erc20/ds/transfer-success-2-spec.k	45.55	48.45	0.9401444788441692
erc20/ds/transfer-success-1-spec.k	46.67	48.97	0.9530324688584848
mcd/flipper-addu48u48-fail-rough-spec.k	39.32	40.77	0.9644346333088054
erc20/ds/transfer-failure-1-a-spec.k	49.28	47.56	1.0361648444070648
mcd/end-pack-pass-rough-spec.k	106.83	102.77	1.0395056923226622
examples/erc721-spec.md	117.13	112.0	1.0458035714285714
benchmarks/ecrecoverloop00-sig1-invalid-spec.k	69.47	65.97	1.0530544186751554
mcd/pot-join-pass-rough-spec.k	150.26	141.38	1.0628094497100014
mcd/flopper-cage-pass-spec.k	41.02	37.87	1.0831792975970427
mcd/vow-fess-fail-rough-spec.k	115.29	87.57	1.3165467625899283
TOTAL	1857.95	2195.21	0.8463654957840023

jberthold · 2024-07-11T08:48:58Z

kontrol tests had a few failures (seemingly unrelated because they were also happening on the master run).

Test	3902-use-smt-solver-for-equations time	master-0932411b9 time	(3902-use-smt-solver-for-equations/master-0932411b9) time
StoreTest.testStoreLoadNonExistent()	12.19	15.73	0.7749523204068658
FreshBytesTest.test_symbolic_bytes_1	74.72	92.59	0.8069985959606869
StoreTest.testGasLoadWarmUp()	15.54	18.76	0.8283582089552237
PlainPrankTest.test_startPrank_zeroAddress_true()	17.49	20.72	0.8441119691119691
ExpectCallTest.testExpectRegularCall()	16.7	18.86	0.8854718981972428
],uint256))	150.6	169.31	0.8894926466245349
ExpectRevertTest.testFail_expectRevert_false()	17.77	19.76	0.8992914979757084
MockCallTestFoundry.testMockCallEmptyAccount()	32.81	36.16	0.9073561946902656
StartPrankTestMsgSender.test_startprank_msgsender_setup()	29.04	31.34	0.9266113592852584
PrankTestOrigin.test_origin_setup()	28.74	30.76	0.9343302990897269
StoreTest.testStoreLoad()	13.36	13.93	0.95908111988514
AddrTest.test_addr_true()-trace_options1	18.99	19.73	0.9624936644703497
AddrTest.test_notBuiltinAddress_concrete()	12.07	11.59	1.0414150129421915
SymbolicStorageTest.testFail_SymbolicStorage1(uint256)	43.89	41.52	1.0570809248554913
SetUpDeployTest.test_extcodesize()	47.12	44.48	1.0593525179856116
LoopsTest.sum_N(uint256)	6873.12	6335.25	1.0849011483366875
LoopsTest.test_sum_10()	19.7	18.11	1.0877967973495306
StartPrankTestOrigin.test_startprank_origin_setup()	32.85	29.18	1.1257710760795065
ExpectRevertTest.testFail_expectRevert_multipleReverts()	18.81	16.69	1.127022168963451
ExpectCallTest.testExpectStaticCall()	18.45	16.33	1.1298224127372933
])	39.41	33.55	1.1746646795827123
PlainPrankTest.test_startPrank_true()	21.22	17.26	1.2294322132097333
MockCallTestFoundry.testMockCalldata()	50.15	38.64	1.2978778467908902
StoreTest.testLoadNonExistent()	15.31	11.71	1.3074295473953885
MockCallTestFoundry.testRevertMock()	37.59	23.05	1.6308026030368765
TOTAL	7657.64	7125.01	1.0747549828000242

Especially:

FAILED src/tests/integration/test_foundry_prove.py::test_foundry_prove[LoopsTest.sum_N(uint256)]

booster/library/Booster/Pattern/ApplyEquations.hs

booster/test/rpc-integration/resources/simplify-smt.kore

booster/test/rpc-integration/resources/use-path-condition-in-equations.k

geo2a · 2024-07-11T09:08:17Z

Regarding LoopsTest.sum_N(uint256), there's a thread on Slack where the Kontrol team seems to have identified and solved the problem with it. So we can ignore it for now.

jberthold · 2024-07-12T05:32:23Z

Something mysterious is happening in the evm-semantics test vow-fess-fail-rough with this branch. The test is failing when run with the branch, and a bug reports shows that it sends more requests to the server.
However, when I compare those requests against a tarball produced with booster's master, I find that

the first difference is in an implies request, a different order of the Exists clauses in the consequent (I checked that all variables are in fact present)
subsequent implies requests have different ordering of Exists clauses again, and there are more of them
the only apparent differences in execute responses are that booster from the branch elminates a chop call
The proof using booster from the branch runs about 12 more requests. I have no idea why that is happening.

PetarMax · 2024-07-15T19:42:40Z

It would appear that the booster is inferring a constraint that is required for applying a simplification using Z3. This is sound and (for me) welcome, but in this particular case, it introduces over-approximation.

In my opinion, the solution is to add all constraints inferred using Z3 explicitly to the path constraint. This would:

Minimise over-approximation
Precisely pinpoint where we need to use Z3, which we can then leverage to write the appropriate lemmas and avoid that. The lemmas in this case, I believe, are:

rule X:Int <=Int chop ( X +Int Y:Int ) => X +Int Y <Int pow256
  requires #rangeUInt(256, X) andBool #rangeUInt(256, Y)
  [simplification]

and symmetrically for Y:Int +Int X.

geo2a · 2024-07-15T20:01:32Z

It seems to me like we are leaning towards getting this merged and then adding lemmas to KEVM to make the proof pass. Perhaps it would make sense to open the PR to KEVM first to make, just to make sure that simply adding these lemmas does not break anything?

jberthold · 2024-07-15T23:36:39Z

It seems to me like we are leaning towards getting this merged and then adding lemmas to KEVM to make the proof pass. Perhaps it would make sense to open the PR to KEVM first to make, just to make sure that simply adding these lemmas does not break anything?

I agree this would be the best way forward at the moment.
Will prepare the respective KEVM PR and re-test with that PR.

…-for-equations

jberthold · 2024-07-16T04:42:49Z

KEVM test using branch with new `chop` lemma

Test	3902-use-smt-solver-for-equations time	master-c71e5da10 time	(3902-use-smt-solver-for-equations/master-c71e5da10) time
benchmarks/address00-spec.k	34.91	96.03	0.3635322295116109
kontrol/test-allowchangestest-testfailallowcallstoaddress-0-spec.k	37.51	99.84	0.37570112179487175
benchmarks/ecrecoverloop02-sigs-valid-spec.k	165.23	286.21	0.5773033786380629
benchmarks/ecrecoverloop02-sig1-invalid-spec.k	161.79	270.45	0.5982251802551304
erc20/hkg/transferFrom-success-2-spec.k	43.16	68.12	0.6335877862595419
kontrol/test-allowchangestest-testallow_fail-0-spec.k	37.66	59.36	0.6344339622641509
benchmarks/ecrecoverloop02-sig0-invalid-spec.k	98.22	151.47	0.6484452366805308
mcd/flopper-tick-pass-rough-spec.k	133.35	192.52	0.6926553085393725
mcd/pot-join-pass-rough-spec.k	148.33	212.49	0.6980563791237235
mcd/flopper-dent-guy-same-pass-rough-spec.k	452.84	600.05	0.7546704441296559
mcd/flopper-file-pass-rough-spec.k	290.11	353.17	0.8214457626638729
mcd/flopper-dent-guy-diff-tic-not-0-pass-rough-spec.k	581.19	704.08	0.8254601749801159
erc20/ds/approve-success-spec.k	40.13	46.95	0.8547390841320553
benchmarks/bytes00-spec.k	39.48	44.92	0.8788958147818343
erc20/hkg/transferFrom-success-1-spec.k	44.5	48.27	0.9218976590014502
erc20/hkg/approve-spec.k	38.14	40.99	0.9304708465479384
erc20/ds/transfer-success-2-spec.k	45.78	48.78	0.93849938499385
erc20/ds/transfer-success-1-spec.k	46.87	49.64	0.9441982272360998
mcd/flipper-addu48u48-fail-rough-spec.k	39.36	41.51	0.9482052517465671
mcd/functional-spec.k	380.08	397.53	0.9561039418408673
kontrol/test-allowchangestest-testallow-0-spec.k	37.36	38.9	0.9604113110539846
erc20/ds/transferFrom-failure-1-d-spec.k	39.76	38.33	1.0373075919645187
mcd/vow-flog-fail-rough-spec.k	91.41	87.87	1.0402867872994195
mcd/end-pack-pass-rough-spec.k	107.08	102.87	1.040925439875571
examples/erc721-spec.md	117.09	112.07	1.0447934326760062
benchmarks/structarg01-spec.k	44.27	42.36	1.0450897072710106
mcd/vow-fess-fail-rough-spec.k	85.51	81.78	1.045610173636586
benchmarks/ecrecoverloop00-sig1-invalid-spec.k	69.36	66.06	1.0499545867393278
mcd/flopper-cage-pass-spec.k	40.95	38.19	1.072270227808327
TOTAL	3491.43	4420.81	0.7897715577009645

jberthold added 6 commits May 28, 2024 13:23

use SMT solver for term evaluation/simplification

be5cc33

add a small integration test for smt-enabled simplification

314c2d8

persist the unexpected kore-rpc-dev (and kore-rpc-booster) result

701f978

Rename SMTLib symbols in case of collisions

279bfb7

render zero-argument functions as atoms not lists

83f61fc

Merge branch 'master' into 3902-use-smt-solver-for-equations

d119ba5

This comment was marked as outdated.

Sign in to view

jberthold mentioned this pull request May 30, 2024

Abort on "Unknown" side conditions, don't crash #3907

Merged

jberthold and others added 4 commits May 30, 2024 12:31

Merge remote-tracking branch 'origin/master' into 3902-use-smt-solver…

125fc1e

…-for-equations - adapted code to changes in `Booster.SMT.checkPredicates` - will require another adaptation if/when follow-up PR is merged

Adjust expected output for vacuous-branch test

6fd1ff7

Format with fourmolu

332b49c

small cleanup

a7ae495

geo2a mentioned this pull request May 31, 2024

Make SMT timeout increase strategy configurable #3910

Open

jberthold and others added 3 commits July 11, 2024 11:25

Merge remote-tracking branch 'origin/master' into 3902-use-smt-solver…

ca66a4b

…-for-equations

adapt test expectation for use-path-condition test

76dcdbf

Format with fourmolu

4da490f

work around an issue with the kontrol performance tests

5051ef9

jberthold marked this pull request as ready for review July 11, 2024 08:52

jberthold requested review from goodlyrottenapple and geo2a July 11, 2024 08:52

goodlyrottenapple reviewed Jul 11, 2024

View reviewed changes

booster/library/Booster/Pattern/ApplyEquations.hs Outdated Show resolved Hide resolved

geo2a reviewed Jul 11, 2024

View reviewed changes

booster/library/Booster/Pattern/ApplyEquations.hs Outdated Show resolved Hide resolved

geo2a reviewed Jul 11, 2024

View reviewed changes

booster/test/rpc-integration/resources/simplify-smt.kore Outdated Show resolved Hide resolved

geo2a reviewed Jul 11, 2024

View reviewed changes

booster/test/rpc-integration/resources/use-path-condition-in-equations.k Show resolved Hide resolved

throw SMT errors instead of swallowing them

fa722bc

jberthold mentioned this pull request Jul 16, 2024

Add a new chop simplification lemma. runtimeverification/evm-semantics#2527

Merged

Merge remote-tracking branch 'origin/master' into 3902-use-smt-solver…

6749010

…-for-equations

jberthold requested review from geo2a and goodlyrottenapple July 16, 2024 05:03

geo2a approved these changes Jul 16, 2024

View reviewed changes

jberthold merged commit abf81e9 into master Jul 16, 2024
6 checks passed

jberthold deleted the 3902-use-smt-solver-for-equations branch July 16, 2024 06:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

3902 use smt solver for equations #3904

3902 use smt solver for equations #3904

Uh oh!

jberthold commented May 28, 2024

Uh oh!

This comment was marked as outdated.

jberthold commented Jul 11, 2024

Uh oh!

jberthold commented Jul 11, 2024 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

geo2a commented Jul 11, 2024

Uh oh!

jberthold commented Jul 12, 2024

Uh oh!

PetarMax commented Jul 15, 2024

Uh oh!

geo2a commented Jul 15, 2024

Uh oh!

jberthold commented Jul 15, 2024

Uh oh!

jberthold commented Jul 16, 2024

Uh oh!

Uh oh!

Uh oh!

3902 use smt solver for equations #3904

3902 use smt solver for equations #3904

Uh oh!

Conversation

jberthold commented May 28, 2024

Uh oh!

This comment was marked as outdated.

jberthold commented Jul 11, 2024

Uh oh!

jberthold commented Jul 11, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

geo2a commented Jul 11, 2024

Uh oh!

jberthold commented Jul 12, 2024

Uh oh!

PetarMax commented Jul 15, 2024

Uh oh!

geo2a commented Jul 15, 2024

Uh oh!

jberthold commented Jul 15, 2024

Uh oh!

jberthold commented Jul 16, 2024

KEVM test using branch with new chop lemma

Uh oh!

Uh oh!

Uh oh!

jberthold commented Jul 11, 2024 •

edited

Loading

KEVM test using branch with new `chop` lemma