## The design of the little filesystem

- The littlefs is a little fail-safe filesystem designed for embedded systems.
+ A little fail-safe filesystem designed for embedded systems.

```
| | | .---._____
@@ -16,9 +16,9 @@ more about filesystem design by tackling the relatively unsolved problem of
managing a robust filesystem resilient to power loss on devices
with limited RAM and ROM.

- The embedded systems the littlefs is targeting are usually 32bit
- microcontrollers with around 32Kbytes of RAM and 512Kbytes of ROM. These are
- often paired with SPI NOR flash chips with about 4Mbytes of flash storage.
+ The embedded systems the littlefs is targeting are usually 32 bit
+ microcontrollers with around 32KB of RAM and 512KB of ROM. These are
+ often paired with SPI NOR flash chips with about 4MB of flash storage.

Flash itself is a very interesting piece of technology with quite a bit of
nuance. Unlike most other forms of storage, writing to flash requires two
@@ -32,17 +32,17 @@ has more information if you are interested in how this works.
This leaves us with an interesting set of limitations that can be simplified
to three strong requirements:

- 1. **Fail-safe** - This is actually the main goal of the littlefs and the focus
- of this project. Embedded systems are usually designed without a shutdown
- routine and a notable lack of user interface for recovery, so filesystems
- targeting embedded systems should be prepared to lose power at any given
- time.
+ 1. **Power-loss resilient** - This is the main goal of the littlefs and the
+ focus of this project. Embedded systems are usually designed without a
+ shutdown routine and a notable lack of user interface for recovery, so
+ filesystems targeting embedded systems must be prepared to lose power at
+ any given time.

Despite this state of things, there are very few embedded filesystems that
- handle power loss in a reasonable manner, and can become corrupted if the
- user is unlucky enough.
+ handle power loss in a reasonable manner, and most can become corrupted if
+ the user is unlucky enough.

- 2. **Wear awareness** - Due to the destructive nature of flash, most flash
+ 2. **Wear leveling** - Due to the destructive nature of flash, most flash
chips have a limited number of erase cycles, usually in the order of around
100,000 erases per block for NOR flash. Filesystems that don't take wear
into account can easily burn through blocks used to store frequently updated
@@ -78,9 +78,9 @@ summary of the general ideas behind some of them.
Most of the existing filesystems fall into the one big category of filesystems
designed in the early days of spinny magnet disks. While there is a vast amount
of interesting technology and ideas in this area, the nature of spinny magnet
- disks encourage properties such as grouping writes near each other, that don't
+ disks encourage properties, such as grouping writes near each other, that don't
make as much sense on recent storage types. For instance, on flash, write
- locality is not as important and can actually increase wear destructively.
+ locality is not important and can actually increase wear destructively.

One of the most popular designs for flash filesystems is called the
[logging filesystem](https://en.wikipedia.org/wiki/Log-structured_file_system).
@@ -97,8 +97,7 @@ scaling as the size of storage increases. And most filesystems compensate by
caching large parts of the filesystem in RAM, a strategy that is unavailable
for embedded systems.

- Another interesting filesystem design technique that the littlefs borrows the
- most from, is the [copy-on-write (COW)](https://en.wikipedia.org/wiki/Copy-on-write).
+ Another interesting filesystem design technique is that of [copy-on-write (COW)](https://en.wikipedia.org/wiki/Copy-on-write).
A good example of this is the [btrfs](https://en.wikipedia.org/wiki/Btrfs)
filesystem. COW filesystems can easily recover from corrupted blocks and have
natural protection against power loss. However, if they are not designed with
@@ -150,12 +149,12 @@ check our checksum we notice that block 1 was corrupted. So we fall back to
block 2 and use the value 9.

Using this concept, the littlefs is able to update metadata blocks atomically.
- There are a few other tweaks, such as using a 32bit crc and using sequence
+ There are a few other tweaks, such as using a 32 bit crc and using sequence
arithmetic to handle revision count overflow, but the basic concept
is the same. These metadata pairs define the backbone of the littlefs, and the
rest of the filesystem is built on top of these atomic updates.
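To make the fallback concrete, here is a minimal sketch of fetching a metadata pair; the struct layout, crc32 helper, and field names are illustrative assumptions rather than littlefs's actual on-disk format:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

// Assumed helper: a 32 bit crc over a buffer.
uint32_t crc32(const void *buffer, size_t size);

// Illustrative layout only: a revision count, the metadata itself, and a crc
// covering everything before it.
struct metadata_block {
    uint32_t rev;       // incremented on every update
    uint8_t data[500];  // metadata entries
    uint32_t crc;       // 32 bit crc over rev + data
};

// Sequence arithmetic: compare revision counts through their signed
// difference so the comparison keeps working after the counter overflows.
static bool seq_newer(uint32_t a, uint32_t b) {
    return (int32_t)(a - b) > 0;
}

// Pick which block of the pair to trust: discard any block whose crc doesn't
// match, and of the remaining blocks prefer the newer revision.
static const struct metadata_block *metadata_fetch(
        const struct metadata_block *b0, const struct metadata_block *b1) {
    bool v0 = crc32(b0, offsetof(struct metadata_block, crc)) == b0->crc;
    bool v1 = crc32(b1, offsetof(struct metadata_block, crc)) == b1->crc;

    if (v0 && v1) {
        return seq_newer(b1->rev, b0->rev) ? b1 : b0;
    }
    return v0 ? b0 : v1 ? b1 : NULL; // NULL means both copies are corrupted
}
```

Updates always rewrite the block that loses this comparison, so an interrupted write can only ever clobber the older of the two copies.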

- ## Files
+ ## Non-meta data

Now, the metadata pairs do come with some drawbacks. Most notably, each pair
requires two blocks for each block of data. I'm sure users would be very
@@ -224,12 +223,12 @@ Exhibit A: A linked-list

To get around this, the littlefs, at its heart, stores files backwards. Each
block points to its predecessor, with the first block containing no pointers.
- If you think about this, it makes a bit of sense. Appending blocks just point
- to their predecessor and no other blocks need to be updated. If we update
- a block in the middle, we will need to copy out the blocks that follow,
- but can reuse the blocks before the modified block. Since most file operations
- either reset the file each write or append to files, this design avoids
- copying the file in the most common cases.
+ If you think about it for a while, it starts to make a bit of sense. Appending
+ blocks just point to their predecessor and no other blocks need to be updated.
+ If we update a block in the middle, we will need to copy out the blocks that
+ follow, but can reuse the blocks before the modified block. Since most file
+ operations either reset the file each write or append to files, this design
+ avoids copying the file in the most common cases.
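As a quick sketch of what reading such a backwards linked-list involves (the read_block helper and its pointer-in-the-first-word layout are hypothetical here):

```c
#include <stdint.h>

// Assumed helper: reads `block` into `buffer` and returns the address of its
// predecessor, stored in this sketch as the first word of the block.
uint32_t read_block(uint32_t block, void *buffer);

// Walk from the head of the list (the *last* block of the file) back to the
// block holding `target_index`. Block 0 has no predecessor to follow.
static uint32_t find_block(uint32_t head, uint32_t head_index,
                           uint32_t target_index, void *buffer) {
    uint32_t block = head;
    for (uint32_t i = head_index; i > target_index; i--) {
        // one read per hop, so seeking to the front of a large file is O(n)
        block = read_block(block, buffer);
    }
    return block;
}
```

Appending only needs the address of the current last block, which is why appends stay cheap; the O(n) cost of seeking backwards is what the extra ctz(i)+1 pointers in the equations below are there to fix.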

```
Exhibit B: A backwards linked-list
@@ -351,7 +350,7 @@ file size doesn't have an obvious implementation.

We can start by just writing down an equation. The first idea that comes to
mind is to just use a for loop to sum together blocks until we reach our
- file size. We can write equation equation as a summation:
+ file size. We can write this equation as a summation:

![summation1](https://latex.codecogs.com/svg.latex?N%20%3D%20%5Csum_i%5En%5Cleft%5BB-%5Cfrac%7Bw%7D%7B8%7D%5Cleft%28%5Ctext%7Bctz%7D%28i%29&plus;1%5Cright%29%5Cright%5D)
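That summation transliterates almost directly into a loop; this sketch assumes blocks are indexed from 1 and uses GCC's __builtin_ctz for ctz:

```c
#include <stdint.h>

// N = sum over i of [B - (w/8)*(ctz(i)+1)]
// B = block size in bytes, w = word width in bits,
// n = index of the last block counted (indices assumed to start at 1)
static uint32_t size_up_to_block(uint32_t n, uint32_t B, uint32_t w) {
    uint32_t N = 0;
    for (uint32_t i = 1; i <= n; i++) {
        // block i spends ctz(i)+1 words on pointers, the rest holds file data
        N += B - (w/8)*(__builtin_ctz(i) + 1);
    }
    return N;
}
```

A loop like this works, but it costs O(n) just to answer a size query, which is what pushes us towards a closed form.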
@@ -374,7 +373,7 @@ The [On-Line Encyclopedia of Integer Sequences (OEIS)](https://oeis.org/).
If we work out the first couple of values in our summation, we find that CTZ
maps to [A001511](https://oeis.org/A001511), and its partial summation maps
to [A005187](https://oeis.org/A005187), and surprisingly, both of these
- sequences have relatively trivial equations! This leads us to the completely
+ sequences have relatively trivial equations! This leads us to a rather
unintuitive property:

![mindblown](https://latex.codecogs.com/svg.latex?%5Csum_i%5En%5Cleft%28%5Ctext%7Bctz%7D%28i%29&plus;1%5Cright%29%20%3D%202n-%5Ctext%7Bpopcount%7D%28n%29)
@@ -383,7 +382,7 @@ where:
ctz(i) = the number of trailing bits that are 0 in i
popcount(i) = the number of bits that are 1 in i
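The identity is easy to brute-force check; a small sketch using GCC's __builtin_ctz and __builtin_popcount:

```c
#include <assert.h>
#include <stdint.h>

// check that sum from i=1 to n of (ctz(i)+1) equals 2n - popcount(n)
int main(void) {
    uint32_t sum = 0;
    for (uint32_t n = 1; n < 10000; n++) {
        sum += __builtin_ctz(n) + 1;   // running value of the left-hand side
        assert(sum == 2*n - __builtin_popcount(n));
    }
    return 0;
}
```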

- I find it bewildering that these two seemingly unrelated bitwise instructions
+ It's a bit bewildering that these two seemingly unrelated bitwise instructions
are related by this property. But if we start to dissect this equation we can
see that it does hold. As n approaches infinity, we do end up with an average
overhead of 2 pointers as we found earlier. And popcount seems to handle the
@@ -1154,21 +1153,26 @@ develops errors and needs to be moved.

The second concern for the littlefs is that blocks in the filesystem may wear
unevenly. In this situation, a filesystem may meet an early demise where
- there are no more non-corrupted blocks that aren't in use. It may be entirely
- possible that files were written once and left unmodified, wasting the
- potential erase cycles of the blocks it sits on.
+ there are no more non-corrupted blocks that aren't in use. It's common to
+ have files that were written once and left unmodified, wasting the potential
+ erase cycles of the blocks they sit on.

Wear leveling is a term that describes distributing block writes evenly to
avoid the early termination of a flash part. There are typically two levels
of wear leveling:
- 1. Dynamic wear leveling - Blocks are distributed evenly during block writes.
- Note that the issue with write-once files still exists in this case.
- 2. Static wear leveling - Unmodified blocks are evicted for new block writes.
- This provides the longest lifetime for a flash device.
-
- Now, it's possible to use the revision count on metadata pairs to approximate
- the wear of a metadata block. And combined with the COW nature of files, the
- littlefs could provide a form of dynamic wear leveling.
+ 1. Dynamic wear leveling - Wear is distributed evenly across all **dynamic**
+ blocks. Usually this is accomplished by simply choosing the unused block
+ with the lowest amount of wear. Note this does not solve the problem of
+ static data.
+ 2. Static wear leveling - Wear is distributed evenly across all **dynamic**
+ and **static** blocks. Unmodified blocks may be evicted for new block
+ writes. This does handle the problem of static data but may lead to
+ wear amplification.
+
+ In littlefs's case, it's possible to use the revision count on metadata pairs
+ to approximate the wear of a metadata block. And combined with the COW nature
+ of files, littlefs could provide your usual implementation of dynamic wear
+ leveling.
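For reference, the usual dynamic wear-leveling policy from the list above amounts to something like the following sketch; the erase-count array and free-block bitmap are hypothetical bookkeeping, and, as the next paragraph explains, not something littlefs actually implements:

```c
#include <stdint.h>
#include <stdbool.h>

// Dynamic wear leveling: of the blocks that are currently free, allocate the
// one that has been erased the fewest times. Static (in-use) data is never
// moved, which is exactly the limitation noted above.
static int32_t alloc_least_worn(const uint32_t *erase_count,
                                const bool *is_free,
                                uint32_t block_count) {
    int32_t best = -1;
    for (uint32_t b = 0; b < block_count; b++) {
        if (is_free[b] && (best < 0 || erase_count[b] < erase_count[best])) {
            best = (int32_t)b;
        }
    }
    return best; // -1 if no free block is available
}
```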

However, the littlefs does not. This is for a few reasons. Most notably, even
if the littlefs did implement dynamic wear leveling, this would still not
@@ -1179,19 +1183,20 @@ As a flash device reaches the end of its life, the metadata blocks will
naturally be the first to go since they are updated most often. In this
situation, the littlefs is designed to simply move on to another set of
metadata blocks. This travelling means that at the end of a flash device's
- life, the filesystem will have worn the device down as evenly as a dynamic
- wear leveling filesystem could anyways. Simply put, if the lifetime of flash
- is a serious concern, static wear leveling is the only valid solution.
+ life, the filesystem will have worn the device down nearly as evenly as the
+ usual dynamic wear leveling could. More aggressive wear leveling would come
+ with a code-size cost for marginal benefit.
+

- This is a very important takeaway to note. If your storage stack uses highly
- sensitive storage such as NAND flash. In most cases you are going to be better
- off just using a [flash translation layer (FTL)](https://en.wikipedia.org/wiki/Flash_translation_layer).
+ One important takeaway to note: if your storage stack uses highly sensitive
+ storage such as NAND flash, static wear leveling is the only valid solution.
+ In most cases you are going to be better off using a full [flash translation layer (FTL)](https://en.wikipedia.org/wiki/Flash_translation_layer).
NAND flash already has many limitations that make it poorly suited for an
embedded system: low erase cycles, very large blocks, errors that can develop
even during reads, errors that can develop during writes of neighboring blocks.
Managing sensitive storage such as NAND flash is out of scope for the littlefs.
The littlefs does have some properties that may be beneficial on top of an FTL,
- such as limiting the number of writes where possible. But if you have the
+ such as limiting the number of writes where possible, but if you have the
storage requirements that necessitate NAND flash, you should have
the RAM to match and just use an FTL or flash filesystem.