-
Notifications
You must be signed in to change notification settings - Fork 178
Crash reporting documentation #831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crash reporting documentation #831
Conversation
4fd3b9b
to
6715706
Compare
@kegilbert @AnotherButler - Please review. |
Small nit that's not part of the included changes, but the very first line of Error.md uses the word 'error' 5 times in one sentence.
|
docs/api/platform/Error.md
Outdated
@@ -130,6 +130,37 @@ Mbed OS application and system developers may need to define error codes specifi | |||
|
|||
Some applications may want to do custom error handling when an error is reported using `MBED_ERROR()` or `MBED_WARNING()`. Applications can accomplish this by registering an error hook function with the Mbed OS error handling system using the **mbed_set_error_hook()** API. This function is called with error context information whenever the system handles an **MBED_ERROR()** or **MBED_WARNING()** invocation. This function should be implemented for re-entrancy because multiple threads may invoke `MBED_ERROR()` or `MBED_WARNING()`, which may cause the error hook to be called in parallel. | |||
|
|||
### Crash reporting and auto-reboot | |||
Whenever a fatal error happens in the system, MbedOS error handling system collects key information such as error code, error location, register context(in the case of fault exceptions) etc. and stores them in a special area in RAM region called Crash-data-RAM. The error information stored in Crash-data-RAM is in binary format and follows the `mbed_error_ctx` structure defined in `mbed_error.h`. The system then triggers a warm-reset without losing the RAM contents where we have the error information collected. After the system reboots, during MbedOS initialization the Crash-data-RAM region is checked to find if there is valid error information captured. This is done by using a CRC value calculated over the stored error information and is appended as part of information stored in Crash-data-RAM. If the system detects that the reboot was triggered by a fatal error, it will invoke a callback function with a pointer to the error context structure stored in Crash-data-RAM. The default callback function is defined with `WEAK` attribute, which can be overridden by the application if required. Below is the signature for the callback: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nit:
Move the following to the second line maybe? Sounds a little cleaner to me.
"...and stores them in a special area in RAM region called Crash-data-RAM." ->
"...and stores them in a reserved RAM region called Crash-data-RAM
"
docs/api/platform/Error.md
Outdated
|
||
#### Adding Crash-data-RAM region for crash reporting | ||
As mentioned above, the crash reporting feature requires a special memory region, called Crash-data-RAM to work. This region is 256 bytes in size and is allocated using linker scripts for the target for each toolchain. Although all platforms support crash reporting feature, not all targets are currently modified to allocate this Crash-data-RAM region. | ||
See `mbed_lib.json` in platform directory to see which targets currently enabled with crash reporting. In order to enable crash reporting in other targets, you must modify the linker scripts for those targets to allocate the Crash-data-RAM region. You may refer the linker scripts for one of the targets already enabled with crash reporting to understand how the Crash-data-RAM region is allocated. Below are some guidelines to make the linker script changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See `mbed_lib.json` in platform directory to see which targets currently enabled with crash reporting. In order to enable crash reporting in other targets, you must modify the linker scripts for those targets to allocate the Crash-data-RAM region. You may refer the linker scripts for one of the targets already enabled with crash reporting to understand how the Crash-data-RAM region is allocated. Below are some guidelines to make the linker script changes. | |
See `mbed_lib.json` in the platform directory to see which targets currently enable crash reporting. In order to enable crash reporting for other targets, you must modify the linker scripts for those targets to allocate the Crash-data-RAM region. You may refer to the linker scripts for one of the targets already enabled with crash reporting to understand how the Crash-data-RAM region is allocated. Below are some guidelines to make the linker script changes. |
docs/api/platform/Error.md
Outdated
See `mbed_lib.json` in platform directory to see which targets currently enabled with crash reporting. In order to enable crash reporting in other targets, you must modify the linker scripts for those targets to allocate the Crash-data-RAM region. You may refer the linker scripts for one of the targets already enabled with crash reporting to understand how the Crash-data-RAM region is allocated. Below are some guidelines to make the linker script changes. | ||
|
||
* The region size should be 256 bytes and aligned at 8-byte offset. | ||
* If you are enabling the Crash-data-RAM for *ARM compiler*, linker script must export the following symbols: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* If you are enabling the Crash-data-RAM for *ARM compiler*, linker script must export the following symbols: | |
* If you are enabling the Crash-data-RAM for the *ARM compiler*, linker script must export the following symbols: |
docs/api/platform/Error.md
Outdated
See `mbed_lib.json` in platform directory to see which targets currently enabled with crash reporting. In order to enable crash reporting in other targets, you must modify the linker scripts for those targets to allocate the Crash-data-RAM region. You may refer the linker scripts for one of the targets already enabled with crash reporting to understand how the Crash-data-RAM region is allocated. Below are some guidelines to make the linker script changes. | ||
|
||
* The region size should be 256 bytes and aligned at 8-byte offset. | ||
* If you are enabling the Crash-data-RAM for *ARM compiler*, linker script must export the following symbols: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* If you are enabling the Crash-data-RAM for *ARM compiler*, linker script must export the following symbols: | |
* If you are enabling the Crash-data-RAM for *ARM compiler*, linker scripts must export the following symbols: |
docs/api/platform/Error.md
Outdated
__\_\_CRASH_DATA_RAM_START\_\___ - Indicates start address of Crash-data-RAM region. | ||
__\_\_CRASH_DATA_RAM_END\_\___ - Indicates end address of Crash-data-RAM region. | ||
|
||
It's important that these regions should be marked with appropriate attributes(based on toolchain) to mark them as uninitialized region. For example, for ARM Compiler Crash-data-RAM can be marked with attribute *EMPTY*. There is no hard requirement about the placement of this region. The only requirement is that it should be placed such that no other entity is overwriting this region when rebooted or at runtime. But in order to avoid fragmentation its best placed just after the vector table region, or if there is no vector table region, it can be placed at the bottom of RAM(lowest address). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's important that these regions should be marked with appropriate attributes(based on toolchain) to mark them as uninitialized region. For example, for ARM Compiler Crash-data-RAM can be marked with attribute *EMPTY*. There is no hard requirement about the placement of this region. The only requirement is that it should be placed such that no other entity is overwriting this region when rebooted or at runtime. But in order to avoid fragmentation its best placed just after the vector table region, or if there is no vector table region, it can be placed at the bottom of RAM(lowest address). | |
It's important that these regions should be marked with appropriate attributes(based on toolchain) to mark them as uninitialized regions. For example, the ARM Compiler Crash-data-RAM region can be marked with attribute *EMPTY*. There is no hard requirement about the placement of this region. The only requirement is that it should be placed such that no other entity is overwriting this region when rebooted or at runtime. But in order to avoid fragmentation its best placed just after the vector table region, or if there is no vector table region, it can be placed at the bottom of RAM(lowest address). |
docs/api/platform/Error.md
Outdated
|
||
#### Configuring crash reporting and auto-reboot | ||
MbedOS crash reporting implementation provides many options to configure the crash reporting behavior. | ||
Below is the list of new configuration options available to configure crash reporting functionality. These configuration options are defined in `mbed_lib.json` under platform directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below is the list of new configuration options available to configure crash reporting functionality. These configuration options are defined in `mbed_lib.json` under platform directory. | |
Below is the list of new configuration options available to configure crash reporting functionality. These configuration options are defined in `mbed_lib.json` under the platform directory. |
docs/api/platform/Error.md
Outdated
MbedOS crash reporting implementation provides many options to configure the crash reporting behavior. | ||
Below is the list of new configuration options available to configure crash reporting functionality. These configuration options are defined in `mbed_lib.json` under platform directory. | ||
|
||
`crash-capture-enabled` - Enables crash context capture when the system enters a fatal error/crash. When this is disabled it also disables other dependent options mentioned below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move the configuration options laid out here into a list (Github proposed changes beta feature doesn't support multiline changes yet):
"Below is the list....."
crash-capture-enabled
- Enables crash context capture when the system enters a fatal error/crash. When this is disabled it also disables other dependent options mentioned below.fatal-error-auto-reboot-enabled
- Setting this to true enables auto-reboot on fatal errors.reboot-crash-report-enabled
- Enables crash report printing over terminal when the system reboots after a fatal error. The format of this report is identical to error reporting structure mentioned in error report.error-reboot-max
- Maximum number of auto-reboots permitted on fatal errors. The system will stop auto-rebooting once the maximum limit is reached. Setting this to value 0 will disable auto-reboot.
docs/api/platform/Error.md
Outdated
@@ -298,12 +329,99 @@ void save_all_errors() { | |||
mbed_clear_all_errors(); | |||
} | |||
``` | |||
|
|||
#### Using `mbed_get_reboot_error_info()` to retrieve the reboot error info | |||
The error context captured by the error handling system can be retrieved using mbed_get_reboot_error_info() API. See the below code for example usage of that API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Open to alternative phrasing here, mainly trying to remove the API term if this sentence is only referencing a single function call (could have the usage of API in a sentence wrong as well, let me know if this is fine as is).
The error context captured by the error handling system can be retrieved using mbed_get_reboot_error_info() API. See the below code for example usage of that API. | |
The error context captured by the error handling system can be retrieved using `mbed_get_reboot_error_info()`. See the below code for an example using that function. |
docs/api/platform/Error.md
Outdated
``` | ||
|
||
#### Using `mbed_get_reboot_fault_context()` to retrieve the fault context info | ||
The fault context captured can be retrieved using mbed_get_reboot_fault_context() API. See the below code for example usage of that API. The example code below checks for error_status using the error context and then retrieves the fault context using mbed_get_reboot_fault_context() API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, open to alternative phrasing here, mainly trying to remove the API term if this sentence is only referencing a single function call (could have the usage of API in a sentence wrong as well, let me know if this is fine as is).
The fault context captured can be retrieved using mbed_get_reboot_fault_context() API. See the below code for example usage of that API. The example code below checks for error_status using the error context and then retrieves the fault context using mbed_get_reboot_fault_context() API. | |
The fault context captured can be retrieved using `mbed_get_reboot_fault_context()`. See the below code for an example using that function. The example code below checks for error_status using the error context and then retrieves the fault context using `mbed_get_reboot_fault_context()`. |
The error context captured by the error handling system can be retrieved using mbed_get_reboot_error_info() API. See the below code for example usage of that API. | ||
In the example below, a status variable reboot_error_detected has been used to track the presence of error context capture. | ||
|
||
```CPP TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are we missing for the snipping compilation script to run this so we can remove the TODO?
#### Using `mbed_get_reboot_fault_context()` to retrieve the fault context info | ||
The fault context captured can be retrieved using mbed_get_reboot_fault_context() API. See the below code for example usage of that API. The example code below checks for error_status using the error context and then retrieves the fault context using mbed_get_reboot_fault_context() API. | ||
|
||
```CPP TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are we missing for the snipping compilation script to run this so we can remove the TODO?
docs/api/platform/Error.md
Outdated
``` | ||
|
||
#### Using `mbed_reset_reboot_count()` to reset the reboot count | ||
`mbed_reset_reboot_error_info()` API can be used to specifically reset the reboot count stored as part error information stored in Crash-data-RAM. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, open to alternative phrasing here, mainly trying to remove the API term if this sentence is only referencing a single function call (could have the usage of API in a sentence wrong as well, let me know if this is fine as is).
`mbed_reset_reboot_error_info()` API can be used to specifically reset the reboot count stored as part error information stored in Crash-data-RAM. | |
`mbed_reset_reboot_error_info()` can be used to specifically reset the reboot count stored as part of the error information stored in Crash-data-RAM. |
#### Using `mbed_reset_reboot_count()` to reset the reboot count | ||
`mbed_reset_reboot_error_info()` API can be used to specifically reset the reboot count stored as part error information stored in Crash-data-RAM. | ||
Calling this function will set the reboot count to 0. | ||
```CPP TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are we missing for the snipping compilation script to run this so we can remove the TODO?
@SenRamakri Overall looks great! Just minor nits and a small question above. Question: I feel like the API examples are a bit far from the section explaining the crash reboot feature (and same of the error API sections). Is it worth adding some jump here for examples/API reference! links at the end of those sections? |
@kegilbert - I fixed the review comments. Please take a look, and I have removed TODOs but lets see if it fails and we have to fix them accordingly. |
Hey! @geky the snipping tool caught something :D This won't build properly until the reboot PR is merged in retrospect, but did catch a missing |
{ | ||
if (reboot_error_detected == 1) { | ||
if (MBED_SUCCESS == mbed_get_reboot_error_info(&error_ctx)) { | ||
printf("\nSuccessfully read error context\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need a closing }
docs/api/platform/Error.md
Outdated
``` | ||
|
||
#### Using `mbed_reset_reboot_error_info()` to clear the reboot error info | ||
`mbed_reset_reboot_error_info()` API can be used to clear the reboot error info if required by the application. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing as before in reference to the API reference. In my mind
... using the Mbed OS Crash Reboot API
is correct,
...using the mbed_reset_reboot_error_info() API
should be rephrased from API to function as a member of the API.
Thoughts? I could be off the mark on this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, Im removing the API working.
458cf8e
to
815cbc9
Compare
@kegilbert - I have fixed the bracket issue, but still need the TODO as you mentioned. |
Yeah we can put the TODOs back until ARMmbed/mbed-os#8702 is merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Do not merge this until ARMmbed/mbed-os#8702 is merged. |
Please review my edits to make sure I didn't accidentally change the meaning of anything. |
adf71a8
to
2c648c7
Compare
This PR is only waiting on the code dependency. |
@AnotherButler - The code has been merged to mbed-os, this docs changes can be merged now. |
Fix capitalization and punctuation.
Make an initial basic copy edit.
…from implementation
2c648c7
to
6a0530a
Compare
Documentation for crash reporting feature in Mbed OS. This is targeted for 5.11 release.