Skip to content

TDBStore: Perform garbage collection on failed writes #9200

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 8, 2019

Conversation

davidsaada
Copy link
Contributor

Description

In TDBStore, partial writes may turn the storage unusable, as records are written sequentially. This change adds garbage collection following each write failure, in order to prevent such a scenario.

Pull request type

[x] Fix
[ ] Refactor
[ ] Target update
[ ] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change

@ciarmcom ciarmcom requested review from a team December 26, 2018 16:00
@ciarmcom
Copy link
Member

@davidsaada, thank you for your changes.
@ARMmbed/mbed-os-storage @ARMmbed/mbed-os-maintainers please review.

@dannybenor
Copy link

@davidsaada Very good solution. We need to enhance the triggering of GC to the case a failure was not detected by the software and you will not get a return error. I think we should check the checksum at the end of set_finalize

@davidsaada davidsaada force-pushed the david_tdbstore_gc_if_corrupt branch from aea5c7f to 37f10ab Compare December 27, 2018 10:26
@davidsaada
Copy link
Contributor Author

Added read verification following @dannybenor's comment.

@cmonr
Copy link
Contributor

cmonr commented Jan 2, 2019

CI started

@mbed-ci
Copy link

mbed-ci commented Jan 3, 2019

Test run: FAILED

Summary: 1 of 11 test jobs failed
Build number : 1
Build artifacts

Failed test jobs:

  • jenkins-ci/mbed-os-ci_greentea-test

@0xc0170
Copy link
Contributor

0xc0170 commented Jan 3, 2019

CI restarted

@cmonr
Copy link
Contributor

cmonr commented Jan 3, 2019

@davidsaada Please take a look at the greentea test failures: http://mbed-os-ci.s3-website-eu-west-1.amazonaws.com/?prefix=jenkins-ci/artifacts/9200/1/greentea-test/685/FAIL/

The K64F looks related to the PR, don't think the NRF52 is.

Partial writes may turn storage unusable. GC clears this scenario.
@davidsaada davidsaada force-pushed the david_tdbstore_gc_if_corrupt branch from 37f10ab to 72f6f6c Compare January 6, 2019 13:03
@davidsaada
Copy link
Contributor Author

davidsaada commented Jan 6, 2019

@davidsaada Please take a look at the greentea test failures: http://mbed-os-ci.s3-website-eu-west-1.amazonaws.com/?prefix=jenkins-ci/artifacts/9200/1/greentea-test/685/FAIL/

The K64F looks related to the PR, don't think the NRF52 is.

The NRF52 failure is indeed not related to the PR.
As for the K64F one: The test sets a value, gets it and then compares the read data to the written one - with the failed test producing different results. Now, this test was missing the return code check of the get API, meaning that the call could have failed (maybe due to media error), but we didn't know it. Test was fixed now to include the error code checking (along with a few others with the same problem). Regardless, tried to reproduce the failures on several K64F boards (local & remote), but without success.
Bottom line - please rerun the test. We may either see a real problem or a media (SD) one, according to the results.

@0xc0170
Copy link
Contributor

0xc0170 commented Jan 7, 2019

CI restarated

@mbed-ci
Copy link

mbed-ci commented Jan 7, 2019

Test run: FAILED

Summary: 1 of 11 test jobs failed
Build number : 2
Build artifacts

Failed test jobs:

  • jenkins-ci/mbed-os-ci_exporter

@0xc0170
Copy link
Contributor

0xc0170 commented Jan 7, 2019

Exporters restarted (pipeline closed error)

@0xc0170
Copy link
Contributor

0xc0170 commented Jan 7, 2019

Bottom line - please rerun the test. We may either see a real problem or a media (SD) one, according to the results.

Tests passed, what shall we do now?

@davidsaada
Copy link
Contributor Author

Tests passed, what shall we do now?

I think this should merge now. Only thing I can think for that failure was a one time glitch in the storage, which was covered by the missing error code checking.

@dannybenor
Copy link

I think this PR should be merged

@0xc0170 0xc0170 merged commit 5a5ad8d into ARMmbed:master Jan 8, 2019
@davidsaada davidsaada deleted the david_tdbstore_gc_if_corrupt branch January 14, 2019 09:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants