-
Notifications
You must be signed in to change notification settings - Fork 3k
BLE API devirtualization #9727
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BLE API devirtualization #9727
Conversation
@mattbrown015 You might be interested by this. |
@pan-, thank you for your changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Boy, that was an easy tools review!
Be sure to add this to the list of supported file types in the docs (you can follow my similar pr here: ARMmbed/mbed-os-5-docs#952).
@@ -413,6 +413,7 @@ def add_directory( | |||
".h": FileType.HEADER, | |||
".hh": FileType.HEADER, | |||
".hpp": FileType.HEADER, | |||
".tpp": FileType.HEADER, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember a day when header files ended with a .h
, and we liked it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you ? Headers of the C++ library, a 35 years old language, don't have extensions. In C, X macros and friends have been traditionally put in files that don't end with .h
too.
If the .h
extension as not been standardised at the language level it is because other extensions predate standardisation and these use cases are legitimate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Counter-pedant alert, and primed for counter-counter-pedantry.
Technically, the C++ standard doesn't say anything about file extensions. The standard headers are included via #include <atomic>
or whatever in the C++ source, but they could perfectly well be stored on disc as atomic.hh
or iso_atomic.hpp
or just built in to the compiler. Mechanism could be totally separate to whatever the #include "xxx"
does for user headers. (Similarly #include "myheader.h"
might not actually find the file myheader.h
. On RISC OS it would find h.myheader
, with .
being the directory separator).
I actually find it a bit annoying that vendors chose to drop suffixes from the C++ std
filenames. Having no extension at all is quite unhelpful for most other tools.
Anyway, no strong feelings about the separate prefix here. I also prefer to have special prefixes for things that you only want #include
d with proper care, but it's up to tools folks to object. This one's a bit borderline, because nothing would actually break if randomly included, right? Would just slow down the build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pan- There's PR to add .inc
files, wouldnt this extension cover the use case for .tpp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one's a bit borderline, because nothing would actually break if randomly included, right?
@kjbracey-arm There's no header guards .tpp
files at the moment. With header guards nothing would break.
@0xc0170 I choose tpp
as an extension because it is more descriptive than h
or hpp
(and avoid clashes with actual header files). Any extension can be use. What is expected in inc
files ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we documented this - only extensions and purpose, without "what should be there".
inc
are used currently for some autogenerated headers in TFM code.
It might be better if this extension addition is sent via separate PR where we can discuss this and alos update docs (see #9715)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a tools perspective, I have no preference on the extensions. We can adapt to what you need.
The .inc
extension is being added because the TF-M work is importing an external project into Mbed OS and the external project use .inc
files. So we're simplifying the import process for that team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does *.tpp work with other builders ie: MDK, DS-5, IAR, etc... or will this break our exporters? How is wrapping this in pragma once or ifdef... not good enough to keep the code base consistent? Apologies if this was already considered, the thread is quite long and I didn't read it all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tried for all exporters (but I can!). In this case, *.tpp
files embed template class implementations and these are instantiated only once in a cpp file dedicated to the instantiation of the implementation type that contains all the required implementations.
These files can't be named .cpp
as these are not compilation unit and cannot used the sole suffix .h
as there already is a header containing the declaration so we would end up with two headers with the same name.
Suffix for template implementation really is just a convention as long as it respects the two requirements mentioned above. I started out with the suffix .impl.h
and am happy to change the suffix to whatever is decided.
Note that it would be possible to include everything in header files (I tried) at the expense of weird include patterns, longer compilation time and random breakage requiring reordering if things get moved around. So instead I tried to keep template implementation files consistent with the way we write source implementation files (.cpp) beside the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typedef generic::GenericSecurityManager< | ||
pal::vendor::cordio::CordioSecurityManager, | ||
vendor::cordio::SigningEventMonitor | ||
> GenericSecurityManagerImpl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why both namespace "impl" and "Impl" suffix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some information in the description; basically there is implementation classes and implementation interfaces. In that case GenericSecurityManagerImpl
is the implementation class. The interface imported in the global namespace will be ble::interface::SecurityManager<GenericSecurityManagerImpl>
. We don't want to import GenericSecurityManagerImpl
as it requires visibility of GenericSecurityManagerImpl
and may export thing differently than ble::interface::SecurityManager
(visibility of function can change as an example).
I can tell that the next request will be for |
SecurityManager &sm = createBLEInstance()->getSecurityManager(); | ||
ble_error_t status = sm.getLinkEncryption(connection_handle, &encryption); | ||
if (status == BLE_ERROR_NONE && | ||
(encryption == link_encryption_t::ENCRYPTED || | ||
encryption == link_encryption_t::ENCRYPTED_WITH_MITM || | ||
encryption == link_encryption_t::ENCRYPTED_WITH_SC_AND_MITM) | ||
) { | ||
cmd = GattClient::GATT_OP_WRITE_CMD; | ||
cmd = Base::GATT_OP_WRITE_CMD; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
someone still hasn't configured their new IDE ;)
bool bondable, | ||
bool mitm, | ||
SecurityIOCapabilities_t iocaps, | ||
const Passkey_t passkey, | ||
const uint8_t* passkey, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve of the general concept. Makes sense as an optimisation on the assumption that you want to optimise for the "1 BLE implementation in the system" case.
Would presumably be a pessimisation with two implementations, but anyone putting two implementations in one image in these sorts of systems probably needs to rethink their ROM-saving approach. (Or am I wrong?)
Here are some numbers for the BLE layer on NRF52840 compiled with GCC in release mode.
This is a great chance to do some performance comparison - I'd love to see the same numbers for all the other toolchains. I believe GCC (and ARMC6?) stand to gain the most by this, as they can't eliminate unused virtuals. Presumably IAR and ARMC5 gain much less, as they should be eliminating unused virtuals.
I believe the same CRTP approach may be worthwhile on a smaller scale for a bunch of our HAL driver classes, with their virtual void lock()
functions. Probably not many images are trying to simultaneously use SPI
and UnlockedSPI : public SPI
, and they certainly don't need run-time polymorphism.
(Although some key things do need it, like FileHandle
or Socket
, so we should still be looking at how to get compilers to optimise virtuals when actually required - eg spotting that UARTSerial
is the only FileHandle
in the build as LTO?).
If the two ports are not using the generic layer (and pal layer) then there is virtually no pessimisation as BLE interfaces are (almost) pure interfaces. If the two ports use the pal layer then it is possible to reuse the generic layer with a virtual pal without changing existing code: // note: simplified for demonstration purposes
template<class Pal>
struct GenericGattClient : ble::interface::GattClient<GenericGattClient<Pal> > {
GenericGattClient(Pal* pal) { }
};
template<class Impl>
struct PalGattClient {
void foo() {
static_cast<Impl*>(this)->foo_();
}
};
struct PalGattClientFoo : PalGattClient<PalGattClientFoo> {
void foo_() { }
};
struct PalGattClientBar : PalGattClient<PalGattClientBar> {
void foo_() { }
};
struct RuntimePalGattClient : PalGattClient<RuntimePalGattClient> {
virtual void foo_() = 0;
};
template<class Impl>
struct RuntimePalClientBridge : RuntimePalGattClient {
RuntimePalClientBridge(Impl* impl) : _impl(impl) { }
virtual void foo_() {
_impl->foo_();
}
private:
Impl* _impl;
};
namespace ble {
namespace impl {
typedef GenericGattClient<RuntimePalGattClient> GenericGattClientImpl;
typedef ble::interface::GattClient<GenericGattClientImpl> GattClient;
}
}
typedef ble::impl::GattClient GattClient;
class BleInstanceBaseFoo {
GattClient &client() {
static PalGattClientFoo pal;
static RuntimePalClientBridge<PalGattClientFoo> bridge(&pal);
static ble::impl::GenericGattClientImpl generic(&bridge);
return generic;
}
};
class BleInstanceBaseBar {
GattClient &client() {
static PalGattClientBar pal;
static RuntimePalClientBridge<PalGattClientBar> bridge(&pal);
static ble::impl::GenericGattClientImpl generic(&bridge);
return generic;
}
}; Note that in practice, two BLE radio are faaaar from being common (Unlike WiFi, I never seen any in real world product for BLE) and if you can afford two identical radio you can usually afford a device with a beefier flash.
I haven't tried yet (but I will shortly 😄 ), however in my first try the implementation wasn't instantiated separately and the average gain from full inlining was ~4K. Effect of LTO where also reduced to noise (~1% gain). |
@kjbracey-arm So I made some benchmark with the various compiler we use. Note that unlike the previous numbers exposed I removed the cordio stack as it is a fixed cost and unrelated to this PR.
There's several things to notice in this number:
|
@pan- This PR is targeting 5.12? |
@0xc0170 I think that would be the best. |
This needs a rebase now |
@0xc0170 I believe it will get (partially) squashed as well. |
@0xc0170 I made a rebase and cleaned up the history as well. |
This PR is at risk of missing 5.12 release as it's marked as "needs: work". Code freeze is coming! On Friday 1st. Please made necessary updates ASAP and make sure the reviewers are aligned for prompt code inspection. |
Not until IAR8 is in. |
CI started |
Test run: SUCCESSSummary: 12 of 12 test jobs passed |
CI passed. Reviewers, please review: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks really good - very excited about seeing this one coming in. I think we should discuss how we could apply the same solution to other areas of the OS.
CI started |
Test run: FAILEDSummary: 1 of 13 test jobs failed Failed test jobs:
|
Ci restarted (iar8 exporter issue we will resolve separately). |
Description
This submission replace runtime polymorphism by compile time polymorphism in BLE main APIs, generic and pal layer.
Removing virtual functions allows the linker to remove functions that are not used by application code. Here are some numbers for the BLE layer on NRF52840 compiled with GCC in release mode.
Even in our test application that uses all functions defined by the BLE API, we save around 7% of code.
More importantly this construct is the foundation for efficient compile time configuration (With both optimisation, the beacon example shrink to ~30K of code).
Principle
Compile time polymorphism is achieved with the Curiously Recurring Template Pattern (CRTP). In a nutshell, every interface definition is now a class template that accepts its implementation as parameter. Interface APIs call the (hidden) implementation defined in the derived class. If the function is not defined in the derived class then it fallbacks to a stub defined in the interface.
Backward compatibility
User facing API prototypes are now templates defined in the namespace
ble::interface
:::Gap
->ble::interface::LegacyGap
::ble::Gap
->ble::interface::Gap
::GattClient
->ble::interface::GattClient
::GattServer
->ble::interface::GattServer
::SecurityManager
->ble::interface::SecurityManager
It is expected that an implementation of ble forward the final implementation types in a header named
BleImplementationForward.h
and in the namespaceble::impl
:ble::impl::LegacyGap
->ble::interface::LegacyGap<LegacyGapImpl>
ble::impl::Gap
->ble::interface::Gap<GapImpl>
ble::impl::GattClient
->ble::interface::GattClient<GattClientImpl>
ble::impl::GattServer
->ble::interface::GattServer<GattServerImpl>
ble::impl::SecurityManager
->ble::interface::SecurityManager<SecurityManagerImpl>
These final types are then imported into the global namespace to expose types similar to the one we've been using for years. Note that to not leak information from the implementation class it is important to export the interface and not the implementation:
Instantiation
To prevent expensive instantiation of template types (and horrible inclusion pattern for the generic layer ... ), function body of the template interfaces are not defined in headers files. Instead these are defined in
.tpp
files. It is expected from an implementation to instantiate the types it export in a cpp file:Out of scope
The following items would improve this submission but are out of scope for now:
TODO
Pull request type
Reviewers
@kjbracey-arm @paul-szczepanek-arm @donatieng