[WIP] DFUService Implementation #13

AGlass0fMilk · 2020-11-16T21:36:47Z

BLE Standard Service:

Yes
No

Proposed UUIDs:

Service Name: DFUService
UUID: 53880000-65fd-4651-ba8e-91527f06c887

Characteristic Name: Update Slot Characteristic
UUID: 53880001-65fd-4651-ba8e-91527f06c887
Mandatory?

Yes
No
This characteristic may be omitted if the application only supports a single update candidate slot.
Mandatory Properties: Read, Write w/ response
Optional Properties: None
Security Requirements: Determined by application

Characteristic Name: Offset Pointer Characteristic
UUID: 53880002-65fd-4651-ba8e-91527f06c887
Mandatory?

Yes
No
Mandatory Properties: Read, Write w/ response
Optional Properties: None
Security Requirements: Determined by application

Characteristic Name: Binary Data Stream Characteristic
UUID: 53880003-65fd-4651-ba8e-91527f06c887
Mandatory?

Yes
No
Mandatory Properties: Write w/ response (TBD: for writes that are synchronized with underlying flash programming?)
Optional Properties: Write w/o response (for fast, buffered operation if device has enough resources/RAM to support)
Security Requirements: Determined by application

Characteristic Name: DFU Control Characteristic
UUID: 53880004-65fd-4651-ba8e-91527f06c887
Mandatory?

Yes
No
Mandatory Properties: Read, Write w/ response, Notify, Indicate
Optional Properties: None
Security Requirements: Determined by application

Characteristic Name: DFU Status Characteristic
UUID: 53880005-65fd-4651-ba8e-91527f06c887
Mandatory?

Yes
No
Mandatory Properties: Read, Notify, Indicate
Optional Properties: None
Security Requirements: Determined by application

Other proposed constants:

Application Layer ATT Error Response Codes:

mbed-os-experimental-ble-services/services/inc/DFUService.h

Lines 128 to 140 in f794a51

    
               /** 
        
                * As per Bluetooth Core specification V5.2, Vol 3, Part F, Table 3.4 (Error Codes) 
        
                * ATT Error Codes between 0x80 and 0x9F are reserved for use by the application 
        
                * 
        
                * These error codes are valid for the DFUService application layer in addition to those 
        
                * defined in the GattAuthCallbackReply_t enum. 
        
                */ 
        
               enum ApplicationError_t { 
        
                   AUTH_CALLBACK_REPLY_ATTERR_APP_BUSY             = 0x019E, /** DFUService is busy (eg: flush in progress) */ 
        
                   AUTH_CALLBACK_REPLY_ATTERR_APP_INVALID_SLOT_NUM = 0x019F, /** Client requested invalid slot index */ 
        
               };

Overview:

The DFU Service implements a flexible API enabling in-field updates to be transferred to a GattServer over BLE.

The primary goal of the DFU service is to enable fast, error-free binary transfers. The processes to perform validation and installation of the update binaries are not specified by the DFUService. However, the DFUService does implement features that facilitate communication of update candidate validation and installation results. This allows the DFUService to be used in a wider variety of DFU applications, both secure and non-secure.

TODO - more details... I'm getting tired of typing at the moment...

Detailed Description

Update Slot Characteristic

The Update Slot Characteristic is an optional characteristic of the DFUService. It allows the peer to select from an array of valid update slots offered by the GattServer.

An update slot is a contiguous memory address space where an update candidate binary can be written for validation and installation. The method of validation and installation is outside the scope of this service specification.

Typically, an update slot will be a contiguous region of non-volatile memory. This is not a requirement of the service, but the underlying memory must be mapped such that the update slot address range is contiguous (eg: using a ChainingBlockDevice to join disjoint regions of underlying flash memory together).

The number of slots supported is determined by the application but is limited to between 1 and 255 by the width of the slot characteristic data (uint8_t). A write request attempting to write an invalid slot index (ie: one that does not have a registered/valid backing BlockDevice) will be rejected with the application-defined response code: AUTH_CALLBACK_REPLY_ATTERR_APP_INVALID_SLOT_NUM

The Update Slot Characteristic may be omitted from the DFUService if the application only supports one update slot.

Attempting to write the Update Slot Characteristic while the application is busy processing buffered binary stream data may result in the write being rejected with the application-defined response code: AUTH_CALLBACK_REPLY_ATTERR_APP_BUSY

Implementation Considerations:

Upon a successful write to the Update Slot Characteristic, the currently-selected slot BlockDevice is deinitialized and the newly-selected slot BlockDevice is initialized.
A write to this characteristic while the binary stream buffer is not empty will flag the buffer to be flushed to the underlying BlockDevice. Writes to this characteristic will be rejected until the buffer is emptied and the flush flag is reset.
In delta mode, selecting a new slot WILL NOT result in the remaining data in the slot being written with copied application data. To accomplish this, the peer should write the offset characteristic to the point where data should be copied before changing slots.
Valid slots are intended to be empirically determined by the peer (as necessary) through write attempts. In most cases, valid slots are predetermined and known by both the client and server.

Offset Pointer Characteristic

The Offset Pointer Characteristic exposes the write offset pointer of the DFUService. The write offset pointer is the offset address of the underlying slot memory that serial binary data will be sequentially written to. It is incremented automatically when data is written to the Binary Stream Characteristic. The offset may be read by the client to validate the expected number of bytes have been processed by the DFUService. The offset may be written by the client to skip a region of memory or revisit a previously-written region of memory.

Similar to the Update Slot Characteristic, attempting to write the Offset Pointer Characteristic while the application is busy processing buffered binary stream data may result in the write being rejected with the application-defined response code: AUTH_CALLBACK_REPLY_ATTERR_APP_BUSY

Implementation Considerations:

A rejected write will initiate flushing the buffer to the selected slot block device.
Subsequent writes will be rejected until flushing is complete.
Any writes to the binary data stream characteristic while the buffer is being flushed will be ignored
If the delta bit is enabled, any memory sections skipped will be written with bytes copied from the primary application. This is not applicable for offset writes that result in the offset pointer moving backwards.

Binary Stream Characteristic

The Binary Stream Characteristic is the characteristic the peer will write to transfer update candidate binary data to the DFUService. Data written to the Binary Stream Characteristic will be subsequently written sequentially to the underlying slot BlockDevice. For each byte written to the Binary Stream Characteristic, the Offset Pointer Characteristic will be incremented.

If the peer has previously initiated a flush of the binary stream buffer, writes to this characteristic will be ignored until the flushing operation is complete. This is to prevent unintended writing after the peer has requested an offset change of slot change.

TODO - When are flush operations initiated? Offset write w/ buffered data, slot write w/ buffered data, DFU commit bit set in DFU ctrl.

Similarly, writes will be ignored while the flow control bit is set (indicating data stream should pause) in the DFU Control Characteristic.

Implementation Considerations:

The Binary Stream Characteristic uses write w/o response to improve speed of data transfer. Should we consider using write w/ response to synchronize with latency incurred with flash writes? Perhaps write w/o response could be optional depending on if the underlying DFUService buffers binary stream data or not.

Control Characteristic

The Control Characteristic is a set of bitflags, each with a specified meaning:

Bit 0: DFU_ENABLE_BIT. Setting this bit to 1 enables DFU mode. The application may reject a write attempting to set this bit if a DFU operation is currently disallowed. Upon a successful write, the application may use the change in value of this bit to trigger application-specific actions. Eg: perform various preparations such as erasing a flash region or indicate DFU mode to the device user in some way. Manually clearing the DFU_ENABLE_BIT may be interpreted by the application as a "DFU abort" event.

BIT 1: DFU_COMMIT_BIT. Setting this bit to 1 triggers a commit of the DFU operation. The actions taken by the application upon a DFU_COMMIT event are application-defined but may include (but not be limited to): Validating the update candidate binary, installing the update candidate binary, issuing a device reset to validate and install the update candidate. Upon writing the DFU_COMMIT_BIT to 1, the both the DFU_COMMIT and DFU_ENABLE bits will be cleared. This effectively ends the DFU session.

Bit 2: DELTA_MODE_EN_BIT. Setting this bit to 1 enables the delta mode feature of the DFU service. In some cases, an update candidate binary may have regions where the binary data is the same as the currently installed application. In this case, the duplicate program data does not need to be transferred to the DFUService. If this bit is set when a forward-moving offset write occurs, program data between the old offset and the new offset will be copied to the selected update candidate slot. The use of this feature requires prior knowledge of the installed application binary and the update candidate to be installed. Therefore, the GattClient application bears the burden of using the delta mode feature properly.

Bit 7: FLOW_CONTROL_PAUSE_BIT. This bit is read-only. If set, the client should pause writes to the Binary Stream Characteristic. Any writes to the Binary Stream Characteristic while this bit is set will be ignored.

TBD: other assigned bits. Perhaps this should be a 32-bit field so future additions are less limited?

The Control Characteristic may be written by the client or the server. The client is encouraged to subscribe to notifications/indications from this characteristic if write w/o response (ie: buffered mode) will be used. This is so the client can be notified when the flow control pause bit is set or cleared by the GattServer.

The application can reject requests to write this control characteristic as needed (eg: if DFU mode is not allowed at this time, the write to set DFU_ENABLE_BIT to 1 will be rejected). A write attempting to change any read only bits will be rejected by the DFUService automatically.

TODO define read only bits, define application ATT error for read only bits and DFU unavailable errors.

TBD use std::bitset?

Status Characteristic

The Status Characteristic is intended to report the status of the DFU operation. The DFUService does not require the DFU operation to be completed during the BLE connection facilitating the DFU transfer. ie: Validating and/or installing the transferred update candidate binary may or may not be performed during the server's current power cycle. It is application-defined if the update candidate binary is validated immediately after a DFU commit has been performed or if validation and installation is deferred to another software component (eg: secure bootloader).

In the latter case (deferred validation/installation), the success of the DFU operation may be obtained by reading the Status Characteristic during the next connection after the update has been applied. It may also be inferred if the GattServer implements the DeviceInformationService and exposes the current installed application version(s).

TBD: Status codes

Reviewers

@pan- @paul-szczepanek-arm

This is a WIP PR to open up the DFUService design process for comment. My goal is to have a somewhat generic update service that leaves most decisions up to the application.
Decisions such as:

Security requirements (ability to enable encryption, etc on characteristics to prevent interception of application binary over the air)
Update validation methods, if any
Update installation methods (delta updates, etc)

This is PR constitutes a draft of my first cut at designing such a DFU API.

I would appreciate your comments and feedback!

AGlass0fMilk · 2020-11-17T04:51:55Z

Need to define what happens when client attempts to write beyond the size of the underlying slot.

Perhaps this can be communicated using the status field?

In reality, this should never occur... the tools used to generate the binary will probably give errors long before any attempt to use the binary as an update.

Exceeding the address space of the slot can be considered a critical failure and cause the DFU procedure to be aborted. A new binary will have to be written in this case anyways.

AGlass0fMilk · 2020-11-17T14:23:52Z

A buffer flushing operation is an internal implementation detail. To prevent the client from writing any binary stream data during a buffer flush operation, the server will set the flow control bit in the control characteristic.

pan- · 2020-11-17T16:47:26Z

@AGlass0fMilk That is excellent, thank you for starting the discussion on this service 👍 .

Version information

From a very high level view I was seeing the DFU process as follow:

It mostly matches what you are proposing at one exception, it is not possible to get information about the installed firmware. These informations are used by the client to decide if the device requires an update or not.
One option could be to use the firmware version from the Device Information Service but it doesn't cover all the cases: For example, in some application it is desirable to update the firmware of a peripheral. A clean solution (in my opinion) could be to have several DFU services one for the peripheral firmware and another for the device itself. The client should be able to access information about the firmware; this can be an optional characteristic, if not available, the client should read the current firmware version in the DIS.

Binary transfer

This could be a service of its own 😁 . I believe the size of the Binary Stream Characteristic should be equal to the size of the MTU to maximize speed. A flow control bit raised when the local device is busy or not available seems appropriate. Why did you choose to add it into the control characteristic rather than the status ?

How does the presented system deals with loss of write w/o response ? It seems that the Offset Pointer Characteristic can be used to that effect but how does it works in practice ? Is it up to the application ?

Control Characteristic

What is the benefit of having a bit set rather than specific commands such as:

Enable the update
Apply the image
Copy memory region (and move write pointer)

Ideally this should be extendable to support application specific control commands, selecting an update slot might be considered as such (but it is so common that it could be defined directly).

Implementation

I don't think the DFU service should be tied to Block device. I'd rather see the DFU service as forwarding events to a companion class that actually does the update. In events, I can think of:

Start the update requested
Data received
Cancel update requested
Select slot
Copy existing image region

Of course the class realizing the DFU should report things back into the DFU service for example report failures, report state changes, block transfer until buffer are free, etc...

AGlass0fMilk · 2020-11-17T19:12:49Z

It mostly matches what you are proposing at one exception, it is not possible to get information about the installed firmware. These informations are used by the client to decide if the device requires an update or not.
One option could be to use the firmware version from the Device Information Service but it doesn't cover all the cases: For example, in some application it is desirable to update the firmware of a peripheral. A clean solution (in my opinion) could be to have several DFU services one for the peripheral firmware and another for the device itself. The client should be able to access information about the firmware; this can be an optional characteristic, if not available, the client should read the current firmware version in the DIS.

I see what you're saying. That could work. We could have an optional Firmware Revision String (0x2A26) characteristic in the DFU service. It is required if the information is not available through an existing Device Information Service elsewhere in the GATT Profile.

One unknown now:
How does the GATT Client identify which DFUService goes to which firmware? Should we have some sort of optional Firmware Description characteristic that is required if the above Firmware Revision String characteristic is available in the DFUService? What kind of value should this description store? Should it be a string? Or should we define an enumeration of firmware types (eg: "printer firmware", "mouse firmware", "accelerometer firmware", ...). I think I like the string idea better as it is more readable and flexible.

Binary transfer

This could be a service of its own 😁 .

I suppose it could be, how would you split this off from DFU? We would then be kind of specifying a partial "DFU Profile". We could potentially make the Binary Transfer Characteristic a reusable class in some way. I have a few other ideas along this line. The underlying code is very similar to my UARTService implementation and I feel there is enough commonality that the implementations could be refined into a single reusable class.

I believe the size of the Binary Stream Characteristic should be equal to the size of the MTU to maximize speed.

Agreed. That is currently what the service defaults to. It allows space for the maximum configured MTU:

mbed-os-experimental-ble-services/services/inc/DFUService.h

Lines 35 to 45 in f794a51

    
           /** 
        
            * Maximum length of data (in bytes) that the DFU service 
        
            * can receive at one time. 
        
            * 
        
            * Typically MTU - 3 bytes for overhead 
        
            */ 
        
           #ifndef MBED_CONF_CORDIO_DESIRED_ATT_MTU 
        
           #define BLE_DFU_SERVICE_MAX_DATA_LEN MBED_CONF_BLE_DFU_SERVICE_BUFFER_SIZE 
        
           #else 
        
           #define BLE_DFU_SERVICE_MAX_DATA_LEN (MBED_CONF_CORDIO_DESIRED_ATT_MTU - 3) 
        
           #endif

A flow control bit raised when the local device is busy or not available seems appropriate. Why did you choose to add it into the control characteristic rather than the status ?

So the control characteristic I was thinking of more like a register.

Originally, the flow control bit was its own characteristic. However, I like to try and keep the number of separate characteristics as low as reasonably possible. I've encountered issues in the past with having too many custom UUID characteristics (Old Nordic softdevice only allowed a certain number) and there is a noticeable delay incurred during service discovery as the number of characteristics goes up... of course this can be solved by using a faster connection interval.

My thoughts were:

The status characteristic is used for major events like a failure or success. There will usually only be one status update per DFU session (at the end). My thought was that this characteristic could be read by the client after a reboot to check the result of the update.

Somewhat of a tangent:

In my specific use case, I am using mcuboot to validate and install firmware updates. This allows me to reduce code size in my application by not adding the various crypto functions to perform digital signature verification and whatever else on the firmware update -- the bootloader takes care of it. This makes the status reporting more complicated because the BLE must be disconnected before the status will be available. I tried to capture this use case as well as the use case where the validation is performed by the application itself in this specification.

A failed update can be detected by checking the firmware version has been updated. If the update failed, the status characteristic will contain the most recent reason why.

How does the presented system deals with loss of write w/o response ? It seems that the Offset Pointer Characteristic can be used to that effect but how does it works in practice ? Is it up to the application ?

In my experience, write w/o response has significantly less latency when compared to a regular write (w/ response). This is mostly true for BLE stacks operating in a higher-level OS (Linux, Android, Mac, Windows, etc).

My thought is that for quick firmware updates, write w/o response can be sufficient as long as the application is okay with the occasional failed update. The application may or may not check the Offset Pointer Characteristic to ensure the data has been received and written properly.

I guess with a large enough MTU, using write w/ response can be fast as well. So perhaps we just get rid of the write w/o response option? What is your experience with this @pan- ?

Control Characteristic

What is the benefit of having a bit set rather than specific commands such as:
* Enable the update

* Apply the image

* Copy memory region (and move write pointer)
Ideally this should be extendable to support application specific control commands, selecting an update slot might be considered as such (but it is so common that it could be defined directly).

I like the idea of application-specific control commands.

Implementation

I don't think the DFU service should be tied to Block device.

Why not? I think it's a pretty beautiful and clean architecture. I'm a huge fan of the BlockDevice API.

I'd rather see the DFU service as forwarding events to a companion class that actually does the update. In events, I can think of:
* Start the update requested

* Data received

* Cancel update requested

* Select slot

* Copy existing image region
Of course the class realizing the DFU should report things back into the DFU service for example report failures, report state changes, block transfer until buffer are free, etc...

If you can expand on this use case it might help convince me not to tie this to BlockDevice 😄

AGlass0fMilk · 2020-11-17T20:28:31Z

Regarding the block device conversation above: I thought about it a bit and I see why you would want to have some sort of DFU abstraction between the service and the memory which is operated on. It would be useful in the case of updating firmware on a remote device.

However, I also think this:

The BlockDevice API is well established and fairly clean. Perhaps it would be satisfactory to tie the DFUService implementation to BlockDevice. In the case where a DFU targets a remote device in the system, the developer must implement a 'RemoteBlockDevice' that forwards the calls to the remote device as if it were a local BlockDevice.

This prevents us from having to create another API for some sort of "DFU Manager" class.

Thoughts, @pan- ? Can you see a case where this is still insufficient?

pan- · 2020-11-17T21:05:04Z

Thanks for sharing your thoughts, that's it is useful for me to understand the reasoning behind design decisions!

How does the GATT Client identify which DFUService goes to which firmware?
Characteristic User Descriptors can be used to identify the firmware endpoint (local device, peripheral, ...). A string could be fine and we can let it to be defined by the application.

Here is how its done for Battery services (multiple battery service can be exposed by a single server):

When a device has more than one instance of the Battery service, each Battery Level characteristic shall include a Characteristic Presentation Format descriptor that has a namespace/description value that is unique for that instance of the Battery service.

There is not format for the namespace/description, it is up to the application. By default if a single service is on the server, the client should not care about it. (Not that even if I brought the subject, I don't think we should think too much about it as long as an option is available for improvement).

I guess with a large enough MTU, using write w/ response can be fast as well. So perhaps we just get rid of the write w/o response option? What is your experience with this @pan- ?

Regular write is not an option, the power of notification and write w/o response is that you can stack severals in a single connection event. In my latest experience, we used a rolling counter in the first byte to identify the fragment id. The server was sending a notification containing the expected fragment ID whenever an an unexpected fragment ID was received so the client can retry the transfer at the fragment indicated by the server. It worked well and the latency to detect an error was good (at most two connection events).

Why not? I think it's a pretty beautiful and clean architecture. I'm a huge fan of the BlockDevice API.

I'm thinking of testing and extendibility but at the same time, a block device implementation would solve 95% of the use cases out there 😅 . I'm probably overthinking it at this stage. If there's a need it should be easy to split BLE handling from the BD logic. There will probably be some DFU Manager like class from users if they integrate things such as image verification in the application but I really like the idea of the default use case where all of these operations are made in the bootloader and therefore not exposed in the base implementation of the DFU service!

AGlass0fMilk · 2020-11-18T18:02:32Z

Characteristic User Descriptors can be used to identify the firmware endpoint (local device, peripheral, ...). A string could be fine and we can let it to be defined by the application.

Here is how its done for Battery services (multiple battery service can be exposed by a single server):
When a device has more than one instance of the Battery service, each Battery Level characteristic shall include a Characteristic Presentation Format descriptor that has a namespace/description value that is unique for that instance of the Battery service.
There is not format for the namespace/description, it is up to the application. By default if a single service is on the server, the client should not care about it. (Not that even if I brought the subject, I don't think we should think too much about it as long as an option is available for improvement).

This seems like a good solution to me. I will add this to the final specification and add a string name parameter that must be provided when enabling the optional firmware version characteristic. I think having a unique Characteristic User Description Descriptor associated with the firmware version characteristic of a DFUService instance is sufficient. We do not need to complicate things by adding arbitrary namespace/description enums... unless you think this would be useful in some way.

I'm thinking of testing and extendibility but at the same time, a block device implementation would solve 95% of the use cases out there. I'm probably overthinking it at this stage. If there's a need it should be easy to split BLE handling from the BD logic. There will probably be some DFU Manager like class from users if they integrate things such as image verification in the application but I really like the idea of the default use case where all of these operations are made in the bootloader and therefore not exposed in the base implementation of the DFU service!

My goal of the DFUService is to simply act as a transport that covers the most use cases. Things like validation and installation can be triggered and monitored using the DFUService but need not be handled directly by the DFUService itself. After all, BLE is simply a transport for firmware updates. Adding anything more makes the service less reusable/extensible.

I think BlockDevice provides a suitable API for DFU. Most, if not all, DFU implementations aim to accomplish writing a new binary to memory as fast as possible. With the BlockDevice API, there are many possible configurations just with what Mbed OS provides, eg: writing to a RAM-based BlockDevice, joining multiple memory regions together, support for a number of underlying memory devices. If anything more advanced is needed, the user simply needs to implement the BlockDevice API for whatever device/transport they are using.

I think it's a nice abstraction for writing to memory -- which I think all DFU use cases require. @pan- I appreciate your feedback as I may not be able to see a use case where this implementation is not ideal (please let me know if you can think of them)!

If someone wants to implement a DFUService subclass that provides on-the-fly image verification, we could make the data handling function virtual so they can add whatever logic they want before calling the base class implementation (or not). eg: Override the original data handling function, add the incoming bytes to an ongoing SHA hash, then call the base class implementation to write the bytes to memory. Of course this may not be a great idea if packets are lost... SHA hashing should probably be done once the transfer is complete!

I have an immediate need for such a DFUService so I will be working on a first cut of this service as we develop the final specification for V1.

Let me know if you have any other thoughts!

AGlass0fMilk · 2020-11-18T18:14:22Z

Forgot to respond to this above:

Regular write is not an option,

I figured it would be painfully slow.

the power of notification and write w/o response is that you can stack severals in a single connection event. In my latest experience, we used a rolling counter in the first byte to identify the fragment id. The server was sending a notification containing the expected fragment ID whenever an an unexpected fragment ID was received so the client can retry the transfer at the fragment indicated by the server. It worked well and the latency to detect an error was good (at most two connection events).

This is a clever idea. Maybe it would be possible to implement a similar "retry" mechanism using the offset pointer characteristic:

Perhaps we could make the offset pointer characteristic notifiable so that as the client writes data, it can be notified whenever data is successfully written. If they get out of sync, the client can return the pointer to a position that was known to be in-sync and continue writing from there... One caveat I see is that the write size of the underlying block device may not match (or be a multiple of) the MTU and so the client may not see 1-to-1 offset pointer advances each time it writes the binary stream characteristic...

I think for now I will implement your fragment ID method since it is simpler.

Side conversation @pan- :
I was going to make a separate issue for this but I figured I would just ask it here:

In my speed testing, I have been unable to successfully configure the MTU above 256 bytes. The build succeeds but all the characteristics become blank (corruption of the BLE memory?)

There is a comment in the cordio mbed_lib.json here:

        "rx-acl-buffer-size": {
            "help": "Size of the buffer holding the reassembled complete ACL packet. This will limit the effective ATT_MTU (to its value minus 4 bytes for the header). The size of the buffer must be small enough to be allocated from the existing cordio pool. If this value is increased you may need to adjust the memory pool.",
            "value": 70
        },

Highlighting this part due to extreme scrolling requirement: If this value is increased you may need to adjust the memory pool.

How does one go about adjusting the memory pool of the Cordio stack so the rx-acl-buffer-size can be increased to support an MTU of 512? This should be added to the configuration help string.

AGlass0fMilk · 2020-11-18T19:13:31Z

So for our implementation here, I think we can use the existing status characteristic to accomplish the fragment ID error correction scheme you described.

If the highest bit of the 8-bit status characteristic value is set, the 7LSB will be the expected sequence ID that did not match what the client sent. The client should then restart transmission from that sequence ID. It rolls over at 127, which should be more than enough to detect errors and return the to appropriate spot in the binary.

@pan- What do you think?

pan- · 2020-11-19T16:12:27Z

Perhaps we could make the offset pointer characteristic notifiable so that as the client writes data, it can be notified whenever data is successfully written. If they get out of sync, the client can return the pointer to a position that was known to be in-sync and continue writing from there... One caveat I see is that the write size of the underlying block device may not match (or be a multiple of) the MTU and so the client may not see 1-to-1 offset pointer advances each time it writes the binary stream characteristic...

That's would be hard to get it right because the client does multiple write w/o request in a connection event and won't get an answer before the next connection event. It is much simpler to signal failures than acknowledge success.

If the highest bit of the 8-bit status characteristic value is set, the 7LSB will be the expected sequence ID that did not match what the client sent. The client should then restart transmission from that sequence ID. It rolls over at 127, which should be more than enough to detect errors and return the to appropriate spot in the binary.

Seems good.

How does one go about adjusting the memory pool of the Cordio stack so the rx-acl-buffer-size can be increased to support an MTU of 512? This should be added to the configuration help string.

Your right, the documentation should be amended, the adjustment depends on whether these buffers are rx or tx and how many do you want. The computation is here. Adjustment is (ACL config - default ACL config) * (rx count + tx count).

AGlass0fMilk · 2020-11-19T16:39:02Z

I suggest we make it mandatory to have a Firmware Revision String Characteristic somewhere on the Gatt Server:

For the DFU Service, it is optional to have the FW Rev Characteristic if and only if the following are all true:

There is only one instance of the DFU Service on the Gatt server and
The appropriate Firmware Revision String Characteristic is available within an existing Device Information Service instance on the Gatt Server

If there is only one Firmware Revision String Characteristic available on the server, the Characteristic User Description descriptor becomes optional.

pan- · 2020-11-24T17:00:39Z

@AGlass0fMilk I must find time to review this service in depth but it looks very good. What I want to personally try is transfer performances and write on the internal flash to see how it affect BLE.

What do you think of the service controlling the connection parameters ? Reducing the connection interval should increase transfer speed while increasing slave latency could work around the problem that arise when the internal flash is written (block the MCU, a connection event might be missed).

The service could also negotiate the 2MPhy to be used as well as the MTU size. I see these QoL improvements being guard behind defines so it doesn't increase the footprint if it's not required.

What would be the starting point if we want to help you with this ? How do you test the service today ?

Side note: A service that allows a client to make the server send a connection parameter update request would be a useful addition, mobile platforms generally accepts new connection parameters but do not have options to tune it.

AGlass0fMilk · 2020-11-24T17:10:43Z

@pan- Writing this as you commented 😁

I have created a very rough test script and Mbed application to test throughput. You can see the working repo here: https://github.com/AGlass0fMilk/mbed-ble-dfu-testing

Currently, the throughput isn't super great but it is acceptable given the poor BLE support of most desktop operating systems.

My laptop's chipset supports BLE 5 but the software apparently doesn't. So my testing is done with the default 1M Phy.

I am on Ubuntu 20.04 and using the python library bleak for cross-platform BLE. The BLE backend of bleak on Linux is through dbus, which I think is the bottleneck at this point. Inter-write delays have to be carefully tuned to avoid having dbus silently drop packets.

With my current setup, transferring 384 bytes at a time (MTU is set to 512 but I didn't feel like tuning the delays to work properly), I was able to reliably transfer a 512kB random binary (created with: dd if=/dev/urandom of=update.bin count=1024) to the QSPI on board an nRF52840DK in about 116 seconds. That's about 4.5 kB/s, which is pretty slow. Definitely can improve this number using a different client OS. Thankfully, 512kB is a reasonable update binary size for many targets.

It might even be worth it to make an Mbed DFU Client that simply copies a binary from on-board flash...

Ultimately it would be nice to make open source clients for the major platforms (Mac/Windows/Linux, iOS, Android)

Some timely reading that was just published by Punch Through: https://punchthrough.com/ble-throughput-part-4/

I read part 3 and was wondering how throughput could be improved by the 2M phy!

AGlass0fMilk · 2020-11-24T17:29:17Z

@pan- Regarding how your team can help:

The DFUService code itself is okay, I haven't put much thought into mutexes and how to offload flash writes from the BLE process. Ideally it would be compatible with baremetal but it may complicate things if writing to flash can block BLE event processing for a long time.

That being said, I think the next steps would be:
1.) Create a test suite for the DFUService to test corner cases, out-of-order sequence numbers, flow control, etc.
2.) Create throughput testing
3.) Create clients for various operating systems (we need to prioritize which)

I think as part of the automated testing development (1) we can add a test case for throughput to accomplish (2) with minimal extra code.

(3) would come after the DFUService has been fully tested. The bleak client I have should work on Windows and Mac (probably will need a couple modifications) but I have not tested this.

The way I have been testing with the client and server applications mentioned above is:

Build and flash the mbed-ble-dfu-testing app to an nRF52840DK
Generate a random update binary of desired size: dd if=/dev/urandom of=update.bin count=1024 (size = 512 * count)
Run the ble_dfu.py script
1. I have been developing the script with python 3.9, so ideally you should install that and create a virtualenv
2. Install bleak: pip install bleak
3. Execute the script: python ble_dfu.py TestDFU update.bin -v
Wait for binary to be transferred completely. Script will disconnect at the end of the process.
Compare the transferred binary on the QSPI to the original
- For this, I use nrfjprog to copy the QSPI to an intel hex file... unfortunately it copies the whole 8MB so it takes a while.
- The entire command I use is: nrfjprog -f nrf52 --readqspi qspi2.hex && arm-none-eabi-objcopy -I ihex qspi2.hex -O binary qspi2.bin && dd if=qspi2.bin of=qspi2-512kb.bin count=1024 && cmp update.bin qspi2-512kb.bin
- The above command copies the QSPI to an intel hex file, then converts it to a binary, then copies the first 512kB, then uses cmp to check it matches the original update.bin byte-for-byte.

@pan- Let me know what your thoughts are or if you need any more information about my testing process.

I have some projects that need BLE DFU now so I will be working with this "beta" version for now. I can help with the automated/throughput testing as well.

pan- · 2020-11-24T17:59:15Z

Thanks for sharing how you test the DFU service, that is really helpful for anyone that wants to be involved!

The BLE backend of bleak on Linux is through dbus, which I think is the bottleneck at this point. Inter-write delays have to be carefully tuned to avoid having dbus silently drop packets.

Performances also depends on the controller used, in our testing some had terrible performances outside of DBUS latency. In the end we were transferring block of data that to not roll over the fragment counter. Blocks were acknowledged with an indication upon full reception (there was some mechanism to set up the size of the transfer too).

I have some projects that need BLE DFU now so I will be working with this "beta" version for now. I can help with the automated/throughput testing as well.

I think it is the best to go with something and iterate but it would be better if services and characteristics are settled before you go in production 😅 .

1.) Create a test suite for the DFUService to test corner cases, out-of-order sequence numbers, flow control, etc.

Unit test usually shine in these area and it is easy to setup. Eventually a compilation of both unit test and integration test would be the best.

I will try for myself the code and test this week to provide more feedback.

Thanks George.

I realized that there is no way in mbed-os to enable the data length extension from the server. There is no calls to DmConnSetDataLen anywhere in the C++ API. That will be addressed by the connectivity team.

AGlass0fMilk · 2020-11-25T19:08:00Z

My below response is related to your previous comments:

What do you think of the service controlling the connection parameters ? Reducing the connection interval should increase transfer speed while increasing slave latency could work around the problem that arise when the internal flash is written (block the MCU, a connection event might be missed).

The service could also negotiate the 2MPhy to be used as well as the MTU size. I see these QoL improvements being guard behind defines so it doesn't increase the footprint if it's not required.

I'm not sure if the service should control these things as it starts introducing many dependencies and centralizing a lot of functionality all in one place. Perhaps a better way to handle this would be through documentation and guides that detail best practices for DFU throughput and reliability. There must be an example application for this service once it is done and we should document any considerations when setting up connection parameters, etc.

Perhaps as an extension this could be added.

Side note: A service that allows a client to make the server send a connection parameter update request would be a useful addition, mobile platforms generally accepts new connection parameters but do not have options to tune it.

I agree, see below comments.

Performances also depends on the controller used, in our testing some had terrible performances outside of DBUS latency. In the end we were transferring block of data that to not roll over the fragment counter. Blocks were acknowledged with an indication upon full reception (there was some mechanism to set up the size of the transfer too).

I like to take advantage of Bluetooth's existing control mechanisms as much as possible. I think negotiating the MTU is acceptable for this.

However, there seem to be many customizations that are useful in some cases. I would like to design an extensible way to add these features in the future (or have users add their own) while still having some base compatibility. This would let us develop basic, maybe even universal, DFU clients. A client can try to enable an extension and the server can reject the write request if it does not support it.

@pan- What do you think the best way to design this into the DFUService would be? We could extend the control flags characteristic to 32-bits (or more) and reserve some for future use by the DFUService spec and some for use by application.

Incrementally we could add features like delta mode, fragment size negotiation, initiating a peripheral-negotiated connection parameter update requests, etc.

Perhaps we could instead have optional characteristics that provide these extensions and define these as we revise the DFUService specification? This would be the most "Bluetoothy" (if you will) way to do it in my opinion.

I think it is the best to go with something and iterate but it would be better if services and characteristics are settled before you go in production sweat_smile .

No problem, the projects I am talking about are prototypes/proof-of-concepts at this stage.

1.) Create a test suite for the DFUService to test corner cases, out-of-order sequence numbers, flow control, etc.

Unit test usually shine in these area and it is easy to setup. Eventually a compilation of both unit test and integration test would be the best.

@pan- Could you reply with an initial test plan that you're thinking of? I'd like some input and then I can add my thoughts and start the implementation.

I'm thinking we will go with what you suggested in another issue I raised -- use the BLE test suite by itself without using greentea and all that... ultimately simpler I suppose.

If you're feeling like it, you can start a boilerplate test app 😄

I will probably start working on this after the long holiday weekend here in the US. 🦃

I realized that there is no way in mbed-os to enable the data length extension from the server. There is no calls to DmConnSetDataLen anywhere in the C++ API. That will be addressed by the connectivity team.

I'm sure we will uncover more functions that are currently unavailable in the C++ API while going through this! It will be a good way to test (and subsequently expand) the capabilities of the current Cordio BLE API 😄

AGlass0fMilk · 2020-12-02T22:31:20Z

I have been working with this a bit more this week and have performed updates using it with mcuboot. A few thoughts:

Application extensions are not specified -- the user may create a non-standard DFU Service implementation by adding additional custom characteristics to the DFU Service.
Delta mode should be removed from the standard control bits.
Instead, we can implement our own "official" application extension characteristic. This can be used to specify Mbed-OS-specific DFU features such as delta mode, requesting a connection reconfiguration for throughput optimization, etc.
@pan-, I see what you mean now about de-coupling the BlockDevice from this implementation. This would eliminate the problem of separating BLE event processing from processing queued binary data. Perhaps we can create some sort of DFUHandler class? This would let us use some design patterns (such as a decorator maybe) to encapsulate DFU extensions such as delta mode, etc. What do you think?
I'm thinking we rename this to FOTAService to avoid confusion with Nordic's implementation 😄 (plus it's more fun to say)

AGlass0fMilk · 2020-12-02T23:02:13Z

I also think we should introduce a read only bit in the control characteristic that tells the client whether a reboot (and thus a disconnection) is required to perform a DFU commit (ie: must reboot so bootloader can install/validate the update candidate).

paul-szczepanek-arm · 2020-12-11T13:05:46Z

services/inc/DFUService.h

+ * Maximum number of slots available
+ */
+#ifndef MBED_CONF_BLE_DFU_SERVICE_MAX_SLOTS
+#define MBED_CONF_BLE_DFU_SERVICE_MAX_SLOTS 3


config like that could be part of the mbed_lib.json which would provide the default values

paul-szczepanek-arm · 2020-12-11T13:35:35Z

services/inc/DFUService.h

+
+    protected:
+
+        DFUService& _dfu_svc;


doesn't appear that storing this is needed for anything

paul-szczepanek-arm · 2020-12-11T13:41:12Z

services/inc/DFUService.h

+     * @note If the application does not explicitly reject the control request,
+     * the request will be accepted by default.
+     */
+    void on_dfu_control_request(mbed::Callback<GattAuthCallbackReply_t(ControlChange&)> cb) {


We're trying to standardise on one callback model in BLE - the EventHandler model, I think it would work just as well in this case and have the benefit of uniformity with others.

paul-szczepanek-arm · 2020-12-11T13:42:03Z

services/inc/DFUService.h

+    /**
+     * Schedule a serialized call to process_buffer on the event queue.
+     *
+     * @note this function has no effect if a write has already been scheduled


maybe it should return an error then, otherwise it's a programming error and should be at least an assert

paul-szczepanek-arm · 2020-12-11T14:11:09Z