Skip to content

Allow boards to enable the micropython.native decorator #2271

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 8, 2019

Conversation

theacodes
Copy link

@theacodes theacodes commented Nov 5, 2019

Adds CIRCUITPY_ENABLE_MPY_NATIVE for mpconfigboard.mk that allows boards to enable the micropython.native decorator.

I am 100% open to suggestions on this PR. I wanted to open a PR that more or less does the minimal amount of work to enable this. As such, I had to disable a few warnings to get this to build. The upstream code for this has deviated a bit, so if it's preferable to backport that code instead I'm up for it.

Related to #1248. I would enable it and test and mark as closes, but I don't have an nRF board to test with. It works fine for my SAMD51 boards (Feather M4 Express & Winterbloom Sol).

Adds the `CIRCUITPY_ENABLE_MPY_NATIVE` for `mpconfigboard.mk` that enables
the `micropython.native` decorator.
@dhalbert
Copy link
Collaborator

dhalbert commented Nov 5, 2019

I think this is laudable, because it's a port-agnostic way of getting speedups (at the expense of RAM). At the same time, there is further work that could be done to speed up CircuitPython in general, by probably about a factor of 2: see #2142.

I do worry about fixes to MicroPython related to this that haven't been merged. It's been a long time since we merged from upstream, so I think it would be worth looking at that. You could do a trial merge into a junk branch and see what's affected, or just browse the MicroPython "blame" info for the relevant code sections.

@theacodes
Copy link
Author

Yeah, I'm happy to take the time and do this as "correctly" as possible and see if the latest code can be pulled in. I just didn't know if we've done any sort of piecemeal merging from upstream before.

#2142 is great because it improves the speed of all ports without any need to opt-in (unlike this). This is a great parallel effort, though.

@ladyada
Copy link
Member

ladyada commented Nov 6, 2019

(psst, im into this too, there's a lot of times we could have some optimized code for drivers!)

@theacodes
Copy link
Author

After some initial exploration it looks like the changes to emitnative.c and friends are pretty extensive. I could try to merge these in, but it seems like it might be likely to also effect the bytecode stuff as well (several of the changes involved touch the bytecode stuff).

Not sure how we feel about it right now. We have three options:

  1. Merge this as-is for now, leave it as "experimental". Update this code along with the rest of the micropython code during the next merge from upstream.
  2. Don't merge this yet. Wait until the next merge from upstream to try to enable this.
  3. Attempt to merge just the emitnative.c changes (and friends) without affecting bytecode compatibility.

@dhalbert
Copy link
Collaborator

dhalbert commented Nov 6, 2019

I'm happy to have us try this, since it's all optional anyway. Yes, there may be bugs, but that's true of anything. We have no short-term plans for a merge from upstream: certainly not for 5.0.0

Copy link
Collaborator

@dhalbert dhalbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question: it looks great!

py/objfun.c Outdated
@@ -546,7 +546,9 @@ STATIC mp_uint_t convert_obj_for_inline_asm(mp_obj_t obj) {
STATIC mp_obj_t fun_asm_call(mp_obj_t self_in, size_t n_args, size_t n_kw, const mp_obj_t *args) {
mp_obj_fun_asm_t *self = self_in;

#pragma GCC diagnostic ignored "-Wint-conversion"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tracked down why you need to do this? Is this older code that could use a change in parameter types, or something like that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 3439c36, this one turned out to be pretty easy to fix once I looked at what the code was trying to do.

@theacodes
Copy link
Author

Fwiw I spent about an hour trying option (3) and got reasonably far, but there's some significant changes in compile.c that I'm not confident enough to try to graft on to our fork. Maybe down the road.

@theacodes
Copy link
Author

In case anyone is curious about my work to update all of the native emitters there's a branch here. Perhaps after supercon I'll have some time to get that into a mergeable state.

@dhalbert
Copy link
Collaborator

dhalbert commented Nov 7, 2019

@theacodes Are you satisfied with the PR now, or are you still exploring? Happy to merge at this point.

@theacodes
Copy link
Author

theacodes commented Nov 7, 2019 via email

@dhalbert
Copy link
Collaborator

dhalbert commented Nov 7, 2019

Sounds good. I'll wait for you to test on nRF and then merge if that goes fine.

@theacodes
Copy link
Author

Tested on nRF (Circuit Playground Bluefruit alpha)

Build size without CIRCUITPY_ENABLE_MPY_NATIVE:

766616 bytes free in flash out of 1048576 bytes ( 1024.0 kb ).
232344 bytes free in ram for stack out of 245760 bytes ( 240.0 kb ).

Build size with it:

766596 bytes free in flash out of 1048576 bytes ( 1024.0 kb ).
232344 bytes free in ram for stack out of 245760 bytes ( 240.0 kb ).

So 20 more bytes?

Test code:

import time
import micropython


def math_py():
    x = 5
    for n in range(100000):
        x += 1
        x -= 1
        x *= 10
        x /= 10
        

@micropython.native
def math_native():
    x = 5
    for n in range(100000):
        x += 1
        x -= 1
        x *= 10
        x /= 10
        

now = time.monotonic()
math_py()
print("Python elapsed: ", time.monotonic() - now)

now = time.monotonic()
math_native()
print("Native elapsed: ", time.monotonic() - now)

Result:

Python elapsed:  3.592
Native elapsed:  2.125

🎉

Copy link
Collaborator

@dhalbert dhalbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, great! On to real-world testing and use!

@gpshead
Copy link

gpshead commented Nov 8, 2019

Thanks for this! I've been manually tweaking my repos when I wanted these for the past year or two.

Oddities if you're used to CPython: @micropython.asm_thumb et. al. aren't actual callable decorators. On builds without these enabled, the file fails to parse. Minor bummer, but that's micro-life. Not a big deal. What I did in this scenario is just keep different implementations in different files, only loading one of them:

try:
    # Optimized version - requires a custom CircuitPython build.
    from asm_thumb import fractal
    from asm_thumb import monobitmap
    HAVE_ASM = True
except (ImportError, SyntaxError):
    import fractal
    import monobitmap
    HAVE_ASM = False

As seen in https://github.com/gpshead/epaper-circuitpython/blob/master/main.py

Other things... mpycross doesn't handle @micropython.asm_thumb files (I haven't tried with native or viper). Meaning a lot more ram is required just to load code using it. It seems plausible to implement but I'm not sure if it ever was. I haven't checked upstream micropython or tried to dive into that rabbit hole.

@theacodes
Copy link
Author

theacodes commented Nov 8, 2019 via email

@tannewt
Copy link
Member

tannewt commented Nov 10, 2019

@gpshead We should add noop decorators to the micropython module in Blinka: https://github.com/adafruit/Adafruit_Blinka/blob/master/src/micropython.py

@rdagger
Copy link

rdagger commented May 11, 2020

I tried the sample code above on an nRF52840 Feather running CircuitPython 5.3.0-184-g90bd93180 on 2020-05-09. The native runs slightly slower:

Python elapsed: 4.591
Native elapsed: 4.59601

Python elapsed: 4.59094
Native elapsed: 4.59497

Python elapsed: 4.59009
Native elapsed: 4.59399

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants