Skip to content

Pre-kompile K definitions and specs for integration tests #2417

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 31 commits into from
May 10, 2024

Conversation

geo2a
Copy link
Contributor

@geo2a geo2a commented May 8, 2024

This PR modifies the RPC prover test harness in test_prove.py, to allow in-advance kompilation of the K definitions. This greatly improves the reliability and precision of the Haskell Backend performance measurement in scripts when we run several proofs in parallel. By separating the kompilation and proving stages, we reduce the noise caused by the K frontend.

An overview of changes:

  • add a new custom --kompiled-targets-dir command line option to pytest, allowing to set persistent cache directory for the kompiled definitions used by the testsl;
  • add the kompiled_target_cache fixture to test_prove.py that will read the persitent cache from disk if --kompiled-targets-dir is provided;
  • add the test_kompile_targets test to test_prove.py that forces the kompilation of all targets. If --kompiled-targets-dir is provided, the targets' kompiled directories will be stored there and could be re-used by the following pytest sessions that are given the same --kompiled-targets-dir. This test will be skipped if --kompiled-targets-dir is not given, making sure the CI configuration does not have to be changed to ignore it explicitly.

For example, after executing

pytest src/tests/integration/test_prove.py::test_kompile_targets --kompiled-targets-dir ./prekompiled

the ./prekompiled will be populated with the following subdirectories:

benchmarks-verification
bihu-functional-spec
bihu-verification
erc20-verification
examples-erc20-spec
examples-solidity-code-spec
examples-storage-spec
functional-abi-spec
functional-evm-int-simplifications-spec
kontrol-verification
mcd-functional-spec
mcd-verification

The invocation

pytest src/tests/integration -k "test_prove_pyk" --kompiled-targets-dir ./prekompiled

will now be able to re-use the kompilation artefacts and would not have to make expensive and scheduling-disrupting calls to kompile.

@geo2a geo2a changed the title Georgy/test prove kompile in advance Pre-kompile K definitions and specs for integration tests May 8, 2024
@geo2a geo2a self-assigned this May 8, 2024
@geo2a geo2a force-pushed the georgy/test-prove-kompile-in-advance branch 2 times, most recently from f2359e8 to bc1f3ea Compare May 8, 2024 16:18
@geo2a geo2a force-pushed the georgy/test-prove-kompile-in-advance branch from bc1f3ea to ef47d11 Compare May 8, 2024 16:18
cache[file.stem] = file / 'kompiled'
else:
cache_dir.mkdir(parents=True)
return cache_dir, cache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider abstracting this in a class.

output_dir = cache_dir / f'{target.main_file.stem}-{len(cache)}'
output_dir.mkdir()
cache[target] = target(output_dir)
if target.name not in cache:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Target already uniquely identifies a kompilation target, it's probably not necessary to change the key to a str derived from it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not possible to construct a Target when initializing the cache from a directory in kompiled_target_cache, as I have nothing to construct the main_file field from.

cache[target] = target(output_dir)
if target.name not in cache:
output_dir = cache_dir / target.name
output_dir.mkdir(exist_ok=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this to cover the case when the user points to an already existing cache to rekompile it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, exist_ok=True is indeed unnecessary as the directory should never exist if the target is not already cached.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, turns out it is actually easier if we keep exist_ok=True, and let the tests wait on the lock to populate/read it, rather then fail to create it if it exists.

for file in cache_dir.iterdir():
if file.is_dir():
# the cache key is the name of the target, which is the filename by-construction.
cache[file.stem] = file / 'kompiled'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initializing a cache might be unnecessary, you can just do the directory lookup when you request the kompiled directory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean: could you elaborate?

My goals are:

  • keep using cache in so that the old behavior of the kompiled_target_for fixture is retained,
  • make sure that cache is never actually updated when --kompiled-targets-dir is passed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I see it, tests basically require an interface that takes a Target and returns a Path to the kompiled target (in the actual tests there might be another indirection from spec files to targets, but I'll not consider this for simplicity).

So you just have to ensure that Target -> Path is reproducible between runs, then accommodating both use cases (prekompiled vs on-the-fly) is straightforward.

  • If the Path does not exist, lock and kompile
  • Otherwise, return the Path.

For prekompiled mode, you pass in a non-empty directory as base. For on-the-file mode, you pass in an empty temporary directory.

cache, etc. can be dropped, the actual cache is the file system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that sound good, I understand what you mean. However, I'd rather get my incremental change merged quickly and now spend too much time refactoring the test suite. Would that be OK if I let you refactor it as you see fit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can live with that, but please open an issue for the refactoring and assign it to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, will do!

@rv-jenkins rv-jenkins merged commit 1683cd4 into master May 10, 2024
@rv-jenkins rv-jenkins deleted the georgy/test-prove-kompile-in-advance branch May 10, 2024 08:25
rv-jenkins pushed a commit to runtimeverification/haskell-backend that referenced this pull request May 10, 2024
~Needs runtimeverification/evm-semantics#2417

Allows compiling the K definitions needs for proofs in advance, with the
hope of reducing measurement noise. In the spirit of
#3845, but
now for KEVM rather than Kontrol.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants