-
Notifications
You must be signed in to change notification settings - Fork 708
Change the fb buffer as 16 bytes aligned #419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change the fb buffer as 16 bytes aligned #419
Conversation
Hello, may I ask if there is test data, how much |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR @jason-mao!
Have left one note about the usage of heap_caps_aligned_alloc in older releases.
Also would you mind sharing, what is the observed performance improvement?
Perhaps you could add a test case or some other way to benchmark the improvement you have observed?
@jason-mao Hello,How can I test this feature to take effect? I use esp32s3 to run a demo of pic_server, call the frame2jpg function 30 times and calculate the time. By comparison, I found that the method of modifying |
This modification just to make the 128bit SIMD instructions to load data at theoretical performance, the unaligned data has extra time. This is only the first step. Most of the real improvement time is in code assembly optimization. I think I can add some code on esp-adf to test it. |
I have update on #419 (comment) |
Sorry, maybe I caused some misunderstanding. I explanation here is clearly #419 (comment) |
846a3f4
to
8b7416a
Compare
…rformance with 128 bit SIMD instructions
8b7416a
to
03b1eab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest version LGTM.
Given the typical framebuffer sizes, loosing a few bytes of heap due to alignment doesn't seem to be a big issue. So I'm okay with this change even without adding SIMD optimizations to the rest of this project.
@me-no-dev PTAL as well.
Thanks @jason-mao |
can provide test code? |
|
The 128 bit aligned buffer have good performance for esp_jpeg encoding. The aligned data to improve the memory load time, combined with esp32-S3 128bit SIMD instructions optimization, we got about 10 fps on OV3660 camera with YUV422 and VGA.