Add ffmpeg as potential external project #188

mstorsjo · 2024-12-11T10:22:58Z

To add the external source, run
"git clone -b release/7.1 https://git.ffmpeg.org/ffmpeg.git" to clone the ffmpeg sources in the
llvm-test-suite/test-suite-externals directory, or configure with TEST_SUITE_FFMPEG_ROOT pointing at a directory containing this source.

(As this points to a branch, not a tag, it may be required to check out the known matching commit 10aaf84f855dbcedb8ee2e3fce307e9b98320946.)

This builds two targets; the ffmpeg command line executable (which isn't executed as a test) and the ffmpeg checkasm test executable (which runs compiler generated functions and compares them with handwritten assembly versions of them). The checkasm execuable can also be run manually to microbenchmark functions, e.g. "External/ffmpeg/ffmpeg_checkasm --test=vp9dsp --bench=vp9_avg_8tap_smooth* 0".

It is not very meaningful to benchmark the execution of the whole checkasm executable, as it runs a different numbers of functions depending on the number of SIMD extensions available on the target CPU.

(Benchmarking on aarch64 on Linux currently requires direct access to the pmccntr_el0 register. To currently use a different timer register, edit ffmpeg/libavutil/aarch64/timer.h and change it to use cntvct_el0 instead of pmccntr_el0 on Linux too.)

This uses a static configuration of the project (when building the upstream project with their own build system, there's a number of build options that can be configured). Assembly is hooked up and enabled on i386, x86_64, arm and aarch64.

For architectures other than those, the checkasm test won't have any reference for detecting e.g. miscompilations of functions, but building it can still be meaningful (for testing compilation, or benchmarking the execution of the C version of functions).

Instead of having the CMake configuration check for a large number of things and generate a config.h file at configure time, things are either hardcoded, or set up with preprocessor conditions within the hand-tweaked config.h file.

mstorsjo · 2024-12-11T10:32:14Z

This is modelled and set up in exactly the same way as for dav1d. Most of it is the same - the main key differences are:

ffmpeg is a much much larger codebase, with around 2200 files to be built, instead of a bit over 100. This provides much more surface for hitting miscompilations that e.g. trigger asserts in LLVM
While it contains more code, the fraction of the code that has assembly variants obviously is smaller, but it still has a fair amount of assembly code, and corresponding checkasm tests. Unfortunately, not all assembly code has got corresponding checkasm tests (so not everything that should be possible to microbenchmark is set up for that), but it still has a quite large amount of functions hooked up.
There are minor details that differ between the implementations of checkasm, in how to pick a subset of functions to benchmark (--bench --function=<pattern> in dav1d, --bench=<pattern> in ffmpeg), how to reduce the number of iterations to benchmark (--runs=8 in ffmpeg, edit checkasm.h in dav1d), and which file to edit to opt out of pmccntr_el0 on Linux. Most of it is very similar though.

I've kept the README.md separate from the dav1d one but rewritten the bits that differ - including finding other examples for good/bad autovectorization, and a similar small guide for mapping a checkasm benchmark back to the C source; I think it's more user friendly to have it that way than to have one readme refer to the other, even if there's some duplication in the bulk text in it.

There's no tag that contains exactly what we need here, but it is usable on a recent release branch, so the git clone instructions check out that branch (and directions to a recent known-good commit on it). I.e., if there'd be larger changes on the branch, we'd need to update the CMakeLists.txt accordingly (adding/removing source files). Once there's a new tag on that branch (potentially in a month or two), I'll update the instructions to clone and point to that instead.

sjoerdmeijer · 2024-12-11T12:42:16Z

This looks a very good idea to me: it's completely optional, and a very lightweight way to optionally pull in a relatively big and real life workload that can increase our test coverage.

My very first question about how this is different from dav1d is mostly answered here I think:

ffmpeg is a much much larger codebase, with around 2200 files to be built, instead of a bit over 100. This provides much more surface for hitting miscompilations that e.g. trigger asserts in LLVM

This is a good thing, but just out of curiousity I wanted to ask if you can tell if there are other differences? For example, are these codes very similar, or are they different in any way? More testing is good enough on itself, but just trying to get a sense of the added value.

I will give this a try to see how this works and will report back soon.

mstorsjo · 2024-12-11T13:12:10Z

My very first question about how this is different from dav1d is mostly answered here I think:

ffmpeg is a much much larger codebase, with around 2200 files to be built, instead of a bit over 100. This provides much more surface for hitting miscompilations that e.g. trigger asserts in LLVM

This is a good thing, but just out of curiousity I wanted to ask if you can tell if there are other differences? For example, are these codes very similar, or are they different in any way? More testing is good enough on itself, but just trying to get a sense of the added value.

That's probably the main difference indeed.

dav1d is mostly developed by longtime ffmpeg contributors, so the code style and structure, etc, are very similar. dav1d was developed externally, outside of ffmpeg for license/process reasons as a clean-slate project - dav1d is simplified BSD, while ffmpeg is mostly LGPL (with some minor bits being GPL and BSD or similar). When dav1d was started, some minor changes were made, like using meson as the upstream build system, rather than a custom configure shell script+gnu make, but from the point of view of llvm-test-suite it's quite much the same.

From the point of view of higher level benchmarking, ffmpeg is much more complex though :-)

Outside of microbenchmarking with checkasm, it's kinda easy to benchmark decoding with dav1d; use one single input video clip that exercises most video format features, decode it and measure the time. It's very easy to get a number of e.g. the effect on real-world decoding performance from autovectorization, like measured in https://discourse.llvm.org/t/autovectorization-measurements-from-dav1d-and-ffmpeg/82623. E.g. built with clang without vectorization, decodes at 118 fps, vectorization takes it to 177 fps. Built with GCC without vectorization gives 128 fps, with vectorization 185 fps. And with all handwritten assembly, it's 450 fps.

With ffmpeg supporting hundreds of different input formats, and also supporting encoding many dozens of formats, it'd require a quite significant test matrix, where benchmarking usually wouldn't see much differences, unless you targetedly know that changes that were made specifically affect one specific input/output format. One can of course benchmark the performance of one or a couple specific known important cases though, e.g. decoding H264 or HEVC or similar.

fhahn · 2024-12-17T09:00:34Z

Just had a chance to give this a try, but it looks like it traps in memcpy_chk on macOS. macOS uses _chk variants of memory function by default while most other platforms don't, so maybe the issue doesn't show up on other platforms?

frame #2: ffmpeg_checkasm`check_loopfilter_16y + 560
frame #1: libsystem_c.dylib`__memcpy_chk + 40
frame #0: libsystem_c.dylib`__chk_fail_overflow + 24
libsystem_c.dylib`__chk_fail_overflow:
->  0x1817cbd70 <+24>: brk    #0x1

mstorsjo · 2024-12-17T09:38:45Z

Just had a chance to give this a try, but it looks like it traps in memcpy_chk on macOS. macOS uses _chk variants of memory function by default while most other platforms don't, so maybe the issue doesn't show up on other platforms?
frame #2: ffmpeg_checkasm`check_loopfilter_16y + 560
frame #1: libsystem_c.dylib`__memcpy_chk + 40
frame #0: libsystem_c.dylib`__chk_fail_overflow + 24
libsystem_c.dylib`__chk_fail_overflow:
->  0x1817cbd70 <+24>: brk    #0x1

That's odd - this code does run and gets tested very regularly on macOS.

I just retested this on macOS, built via the llvm-test-suite wrapping, and ran ffmpeg_checkasm without hitting this. I built with stock Clang from Xcode 16.2, but I don't think that bit should matter. I ran it on macOS 14.6.1; I haven't yet upgraded to Sequoia though - that could possibly make for some differences, but it also sounds kinda implausible (any overflows or similar should be kinda consistent across OSes)?

Does this happen for any run, or only with certain input seeds?

If the issue seems to be within check_loopfilter_16y, that's in tests/checkasm/vp8dsp.c; there is one memcpy there, but as far as I can see, it should be entirely within its boundaries. Also, all the ways I try to compile this file, the one memcpy in check_loopfilter_16y just gets expanded to inline ldp/stp instructions as well, no actual function call that can get handled by __memcpy_chk.

I also tested building this with address sanitizer, and it ran checkasm fine without complaining about any of these memcpies.

sjoerdmeijer · 2024-12-17T09:51:12Z

Forgot to post my results, but I've tried it too and found that it all works OK on our AArch64 linux platforms.

External/ffmpeg/README.md

To add the external source, run "git clone -b release/7.1 https://git.ffmpeg.org/ffmpeg.git" to clone the ffmpeg sources in the llvm-test-suite/test-suite-externals directory, or configure with TEST_SUITE_FFMPEG_ROOT pointing at a directory containing this source. (As this points to a branch, not a tag, it may be required to check out the known matching commit 10aaf84f855dbcedb8ee2e3fce307e9b98320946.) This builds two targets; the ffmpeg command line executable (which isn't executed as a test) and the ffmpeg checkasm test executable (which runs compiler generated functions and compares them with handwritten assembly versions of them). The checkasm execuable can also be run manually to microbenchmark functions, e.g. "External/ffmpeg/ffmpeg_checkasm --test=vp9dsp --bench=vp9_avg_8tap_smooth* 0". It is not very meaningful to benchmark the execution of the whole checkasm executable, as it runs a different numbers of functions depending on the number of SIMD extensions available on the target CPU. (Benchmarking on aarch64 on Linux currently requires direct access to the pmccntr_el0 register. To currently use a different timer register, edit ffmpeg/libavutil/aarch64/timer.h and change it to use cntvct_el0 instead of pmccntr_el0 on Linux too.) This uses a static configuration of the project (when building the upstream project with their own build system, there's a number of build options that can be configured). Assembly is hooked up and enabled on i386, x86_64, arm and aarch64. For architectures other than those, the checkasm test won't have any reference for detecting e.g. miscompilations of functions, but building it can still be meaningful (for testing compilation, or benchmarking the execution of the C version of functions). Instead of having the CMake configuration check for a large number of things and generate a config.h file at configure time, things are either hardcoded, or set up with preprocessor conditions within the hand-tweaked config.h file.

mstorsjo · 2024-12-17T10:03:44Z

Forgot to post my results, but I've tried it too and found that it all works OK on our AArch64 linux platforms.

Thanks for checking!

BTW, I noticed that I had forgot to actually commit the final amendments, so the cmakelists and readme pointed to the wrong commit to check out (which lacked a recent commit to make sure it prints benchmarks for all functions, also functions without corresponding assembly implementations) - if you checked out a specific commit manually, check that you're on 10aaf84f855dbcedb8ee2e3fce307e9b98320946 or later. I included that in the update to the PR here as well now.

mstorsjo · 2024-12-17T16:45:07Z

Just had a chance to give this a try, but it looks like it traps in memcpy_chk on macOS. macOS uses _chk variants of memory function by default while most other platforms don't, so maybe the issue doesn't show up on other platforms?
frame #2: ffmpeg_checkasm`check_loopfilter_16y + 560
frame #1: libsystem_c.dylib`__memcpy_chk + 40
frame #0: libsystem_c.dylib`__chk_fail_overflow + 24
libsystem_c.dylib`__chk_fail_overflow:
->  0x1817cbd70 <+24>: brk    #0x1

Hmm, now I rebuilt with a fresh upstream llvm.org Clang, and I did manage to hit this issue; will look into it!

fhahn · 2024-12-17T16:46:19Z

Just had a chance to give this a try, but it looks like it traps in memcpy_chk on macOS. macOS uses _chk variants of memory function by default while most other platforms don't, so maybe the issue doesn't show up on other platforms?
frame #2: ffmpeg_checkasm`check_loopfilter_16y + 560
frame #1: libsystem_c.dylib`__memcpy_chk + 40
frame #0: libsystem_c.dylib`__chk_fail_overflow + 24
libsystem_c.dylib`__chk_fail_overflow:
->  0x1817cbd70 <+24>: brk    #0x1
Hmm, now I rebuilt with a fresh upstream llvm.org Clang, and I did manage to hit this issue; will look into it!

Yep I am also hitting it with a Clang built from main. Might be that the object size computation changed somehow and is now incorrect?

mstorsjo · 2024-12-17T16:51:35Z

Hmm, now I rebuilt with a fresh upstream llvm.org Clang, and I did manage to hit this issue; will look into it!

Yep I am also hitting it with a Clang built from main. Might be that the object size computation changed somehow and is now incorrect?

Possibly, maybe... But when building ffmpeg's checkasm with the original upstream ffmpeg build system, with the same version of Clang, I'm not hitting this issue, so it's also possible that it's caused by something in the fixed static configuration used here in the llvm-test-suite integration (which I use rather than running one million configure checks for various OS/toolchain features which we usually don't care about) - or if it is an upstream Clang bug which only happens to get uncovered in specific circumstances.

I'm not hitting the same issue when building with a Clang 19.1.x release, so it seems to be related to something that has happened on git main since the last few months, so I can bisect that as well.

mstorsjo · 2024-12-18T10:18:32Z

Yep I am also hitting it with a Clang built from main. Might be that the object size computation changed somehow and is now incorrect?

Yes, you were right on the money here. I bisected the regression to llvm/llvm-project@f8c1a22 / llvm/llvm-project#117849.

But when building ffmpeg's checkasm with the original upstream ffmpeg build system, with the same version of Clang, I'm not hitting this issue, so it's also possible that it's caused by something in the fixed static configuration used here in the llvm-test-suite integration

Yes, this also was what was happening. Upstream ffmpeg only assumes __attribute__((align())) to work for local variables in some architecture/compiler combinations, and it turns out that nobody has enabled that (to trust it) for aarch64 targets. In the config.h I'm adding here, I have that option hardcoded to #define HAVE_LOCAL_ALIGNED 1; as we're supposed to test the compiler within llvm-test-suite, I assumed we can always trust the compiler to be able to manage it. (Without that, the codebase overallocates local arrays and manually aligns the pointer within them; this throws off the object size calculation.) If I add a suitable enable local_aligned in ffmpeg/configure for aarch64, I also reproduce the same issue with the upstream build system. I can also reproduce the same LLVM bug for x86_64 Linux, if I add -D_FORTIFY_SOURCE=2 while building ffmpeg there.

So, apparently a legitimate compiler bug, caught by the integration of ffmpeg into llvm-test-suite, even before it's merged :-)

With llvm/llvm-project@f8c1a22 reverted, it builds and runs fine on macOS for me. Alternatively, you can also edit the hardcoded External/ffmpeg/config.h included here in llvm-test-suite, and change HAVE_LOCAL_ALIGNED to 0 - that also confuses the object size calculation enough to not make any false assumptions :-)

sjoerdmeijer

I am happy with this, thanks for doing this.

…generating empty location Fix the regression detected by llvm/llvm-test-suite#188

mstorsjo · 2024-12-19T12:28:32Z

Seems like this already was merged in 1cea1a9 :-) Closing the PR then.

That commit does seem to lack a couple of the later updates to this PR though (a typo fix in the readme, and fixes for the pinned ffmpeg commit), I'll push fixup commits for that.

1cea1a9 was committed without the last updates from the originating PR, #188 - fix the references to the known-good git commits (which were correct in the commit message, but had been missed to be updated in the files) and fix a typo.

Since llvm/llvm-test-suite#182 and llvm/llvm-test-suite#188, these projects can now be added as external projects within llvm-test-suite.

…ated mode instead of … (#120424) …generating empty location Fix the regression detected by llvm/llvm-test-suite#188

Since llvm/llvm-test-suite#182 and llvm/llvm-test-suite#188, these projects can now be added as external projects within llvm-test-suite.

mstorsjo · 2024-12-20T12:40:29Z

For reference here, the fix for the crash in checkasm on macos was fixed in llvm/llvm-project#120424 / llvm/llvm-project@e4db3f0 - and now the ffmpeg checkasm test runs fine on macOS when built through llvm-test-suite.

fhahn · 2024-12-20T12:52:40Z

Great news, thanks!

…in approximated mode instead of … (#120424) …generating empty location Fix the regression detected by llvm/llvm-test-suite#188

Since llvm/llvm-test-suite#182 and llvm/llvm-test-suite#188, these projects can now be added as external projects within llvm-test-suite.

mstorsjo requested review from fhahn and sjoerdmeijer December 11, 2024 10:22

sjoerdmeijer reviewed Dec 17, 2024

View reviewed changes

External/ffmpeg/README.md Outdated Show resolved Hide resolved

mstorsjo force-pushed the ffmpeg branch from 0240ba2 to 958a70c Compare December 17, 2024 10:00

mstorsjo mentioned this pull request Dec 18, 2024

[llvm] Improve llvm.objectsize computation by computing GEP, alloca a… llvm/llvm-project#117849

Merged

sjoerdmeijer approved these changes Dec 18, 2024

View reviewed changes

serge-sans-paille mentioned this pull request Dec 18, 2024

[llvm] Bail out when meeting pointer with negative offset instead of … llvm/llvm-project#120424

Merged

serge-sans-paille added a commit to serge-sans-paille/llvm-project that referenced this pull request Dec 18, 2024

[llvm] Bail out when meeting pointer with negative offset instead of …

ab73245

…generating empty location Fix the regression detected by llvm/llvm-test-suite#188

mstorsjo closed this Dec 19, 2024

mstorsjo mentioned this pull request Dec 19, 2024

[docs] Mention ffmpeg and dav1d in llvm-test-suite llvm/llvm-project#120570

Merged

mstorsjo added a commit to mstorsjo/llvm-project that referenced this pull request Dec 19, 2024

[docs] Mention ffmpeg and dav1d in llvm-test-suite

e1500bf

Since llvm/llvm-test-suite#182 and llvm/llvm-test-suite#188, these projects can now be added as external projects within llvm-test-suite.

mstorsjo added a commit to mstorsjo/llvm-project that referenced this pull request Dec 19, 2024

[docs] Mention ffmpeg and dav1d in llvm-test-suite

6a159a0

Since llvm/llvm-test-suite#182 and llvm/llvm-test-suite#188, these projects can now be added as external projects within llvm-test-suite.

mstorsjo deleted the ffmpeg branch December 20, 2024 12:39

Add ffmpeg as potential external project #188

Add ffmpeg as potential external project #188

Uh oh!

Conversation

mstorsjo commented Dec 11, 2024

Uh oh!

mstorsjo commented Dec 11, 2024

Uh oh!

sjoerdmeijer commented Dec 11, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mstorsjo commented Dec 11, 2024

Uh oh!

fhahn commented Dec 17, 2024

Uh oh!

mstorsjo commented Dec 17, 2024

Uh oh!

sjoerdmeijer commented Dec 17, 2024

Uh oh!

Uh oh!

mstorsjo commented Dec 17, 2024

Uh oh!

mstorsjo commented Dec 17, 2024

Uh oh!

fhahn commented Dec 17, 2024

Uh oh!

mstorsjo commented Dec 17, 2024

Uh oh!

mstorsjo commented Dec 18, 2024

Uh oh!

sjoerdmeijer left a comment

Choose a reason for hiding this comment

Uh oh!

mstorsjo commented Dec 19, 2024

Uh oh!

mstorsjo commented Dec 20, 2024

Uh oh!

fhahn commented Dec 20, 2024

Uh oh!

Uh oh!

sjoerdmeijer commented Dec 11, 2024 •

edited

Loading