New feature: mbed cache #627

Merged 12 commits on Mar 13, 2018

Conversation

screamerbg
Contributor

This feature aims to minimize traffic and reduce import times, by making Mbed CLI cache repositories as a default behavior. Caching is done via storing repository indexes under the Mbed CLI user config folder - typically ~/.mbed/mbed-cache/ on UNIX systems, or %userprofile%/.mbed/mbed-cache/ on Windows systems.

Compared to a fully checked out repository, indexes are significantly smaller in size and number of files, and contain the whole revision history of that repository. This allows Mbed CLI to quickly create copies of previously downloaded repository indexes and pull/fetch only the latest changes from the remote repositories, therefore dramatically reducing network traffic and download times, especially for big repositories like mbed-os.

Workflow
No impact to existing workflows.

This PR introduces caching as default behavior and also a new mbed cache sub-command for cache management:

mbed cache [on|off|dir <path>|ls|purge|-h|--help]
  • on - Turn repository caching on. Mbed CLI uses the user-specified cache directory, or the default one if none has been set.
  • off - Turn repository caching off. Note that this doesn't purge cached repositories. See "purge".
  • dir - Set cache directory. Set to "default" to let Mbed CLI determine the cache directory location. Typically this is ~/.mbed/mbed-cache/ on UNIX systems, or %userprofile%/.mbed/mbed-cache/ on Windows systems.
  • ls - List cached repositories and their size.
  • purge - Purge cached repositories. Note that this doesn't turn caching off.
  • -h or --help - Print cache command options.

If no sub-command is given to mbed cache, Mbed CLI prints the current cache setting (ENABLED or DISABLED) and the path to the local cache directory.

For safety reasons, Mbed CLI always uses an mbed-cache subfolder inside the user-specified location. This ensures that no user files are deleted during purge, even if the user has specified a root/system folder as the cache location (e.g. mbed cache dir / or mbed cache dir C:\).
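
To illustrate why the extra subfolder makes purge safe, here is a minimal Python sketch. It is not the code from this PR: `effective_cache_dir` and `purge_cache` are hypothetical helper names, and only the `mbed-cache` folder name comes from the description above.

```python
import os
import shutil

CACHE_SUBDIR = "mbed-cache"  # subfolder name taken from the PR description


def effective_cache_dir(user_dir):
    """Return the folder actually used for caching (hypothetical helper)."""
    return os.path.join(os.path.abspath(user_dir), CACHE_SUBDIR)


def purge_cache(user_dir):
    """Delete only the mbed-cache subfolder, never user_dir itself."""
    target = effective_cache_dir(user_dir)
    if os.path.isdir(target):
        shutil.rmtree(target)
    return target
```

Because the delete is always scoped to the subfolder, pointing the cache at `/` or `C:\` would only ever remove `/mbed-cache` or `C:\mbed-cache`.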

Security notice: It's generally recommended to use a cache location inside your own home directory. If you use a cache location outside your user home/profile, other system users might be able to access the repository cache and therefore the data of the cached repositories.

How this works

Behind the scenes during mbed import or mbed add, Mbed CLI will check whether repository caching is enabled and whether a cache folder exists for that repository in the correct location. Location is determined based on URL, e.g.
https://github.com/ARMmbed/mbed-os-example-client will be cached in `~/.mbed/mbed-cache/github.com/ARMmbed/mbed-os-example-client` (the actual files of the repository are not checked out, just the index).
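
The URL-to-folder mapping can be sketched in a few lines of Python. This is an illustration under assumptions, not the PR's implementation: `cache_path_for_url` is a hypothetical name, and only plain `http(s)` URLs are handled.

```python
import os
from urllib.parse import urlparse


def cache_path_for_url(url, cache_base):
    """Map a repository URL to a folder under cache_base (hypothetical helper)."""
    parsed = urlparse(url)
    # Host name first, then each non-empty path segment,
    # e.g. github.com / ARMmbed / mbed-os-example-client.
    # Using hostname (not netloc) keeps credentials and ports out of the path.
    parts = [parsed.hostname or ""] + [p for p in parsed.path.split("/") if p]
    return os.path.join(cache_base, *parts)
```

Calling it with the example URL above and the cache base `~/.mbed/mbed-cache` (expanded) yields a path ending in `github.com/ARMmbed/mbed-os-example-client` on UNIX systems.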

If the repo index exists, Mbed CLI makes a carbon copy in the destination folder and then tries a fetch for Git, or a pull for Mercurial.

  • If fetch/pull succeeds, then Mbed CLI checks out the repository normally.
  • If fetch/pull fails, then it's highly likely that the remote repository history has been rewritten (bad bad repo admin!). In that case Mbed CLI ignores the cached repo, wipes the copy, does a normal clone of the repository, and lastly "feeds" the new/fresh index to the cache.
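
The decision flow described above can be sketched as follows. This is a simplified illustration, not the code in `mbed.py`: every callable passed in (`copy_index`, `update`, `checkout`, `clone`, `store_index`) is a hypothetical stand-in for the real Git/Mercurial operations, and treating a missing cache folder as a normal clone is an assumption.

```python
import os
import shutil


def import_with_cache(cache_dir, dest, copy_index, update, checkout, clone, store_index):
    """Cache-first import; returns "cache" or "clone" depending on the path taken.

    copy_index(cache_dir, dest): copy the cached index into dest
    update(dest): fetch (Git) or pull (Mercurial), raising on failure
    checkout(dest): check out the working files
    clone(dest): normal clone from the remote
    store_index(dest, cache_dir): feed the fresh index back to the cache
    """
    if os.path.isdir(cache_dir):
        try:
            copy_index(cache_dir, dest)
            update(dest)
        except Exception:
            # Cached index is unusable (e.g. rewritten history): wipe the copy.
            shutil.rmtree(dest, ignore_errors=True)
        else:
            checkout(dest)
            return "cache"
    # No usable cache: clone normally, then refresh the cache.
    clone(dest)
    store_index(dest, cache_dir)
    return "clone"
```

Passing the operations in as callables keeps the control flow visible and lets the fallback be exercised without touching a real repository.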

Listing of cached repositories is done via mbed cache ls, e.g.

$ mbed cache ls

[mbed] Listing cached repositories in "/Users/mihsto01/.mbed"
* https://github.com/ARMmbed/atmel-rf-driver                            240.4KB
* https://github.com/ARMmbed/easy-connect                               112.3KB
* https://github.com/ARMmbed/esp8266-driver                             161.6KB
* https://github.com/ARMmbed/mbed-client                                  4.3MB
* https://github.com/ARMmbed/mbed-client-c                                5.0MB
* https://github.com/ARMmbed/mbed-client-classic                        436.7KB
* https://github.com/ARMmbed/mbed-client-mbed-tls                       296.8KB
* https://github.com/ARMmbed/mbed-os                                    196.6MB
* https://github.com/ARMmbed/mbed-os-example-client                       8.1MB
* https://github.com/ARMmbed/mcr20a-rf-driver                            93.3KB
* https://github.com/ARMmbed/pal                                          5.6MB
* https://github.com/ARMmbed/stm-spirit1-rf-driver                      384.8KB
* https://github.com/ARMmbed/wifi-x-nucleo-idw01m1                      285.1KB
* https://github.com/ARMmbed/wizfi310-driver                            119.9KB
-------------------------------------------------------------------------------
Total size:                                                             221.8MB
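
For readers curious how such a listing could be produced, here is a rough Python sketch that sums the size of a cached folder and formats it in the style shown above. The helper names are hypothetical, and the use of 1024-byte units is an assumption; the PR does not state which base the real output uses.

```python
import os


def dir_size(path):
    """Total size in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def format_size(num_bytes):
    """Format a byte count like '240.4KB' or '196.6MB' (1024-based, assumed)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB"):
        if size < 1024.0:
            return "%.1f%s" % (size, unit)
        size /= 1024.0
    return "%.1fGB" % size
```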

Documentation
Documentation is included with this PR. There's a new section called "Repository caching".

@AnotherButler please review

Tests
Not yet

CC @sg- @theotherjimmy @janjongboom
Note that due to the additionally stored files, there might be an impact on CI systems @JanneKiiskila @studavekar @0xc0170 @adbridge.

@screamerbg
Contributor Author

screamerbg commented Feb 18, 2018

Comparison between the first import and subsequent imports with caching enabled. The first import is heavily dependent on internet speed; in the case below, mbed-os was downloaded at 1.7 megabytes per second.

Prep

$ mbed cache purge

First import

$ time mbed import mbed-os-example-client
[mbed] Importing program "mbed-os-example-client" from "https://github.com/ARMmbed/mbed-os-example-client" at latest revision in the current branch
[mbed] Adding library "easy-connect" from "https://github.com/ARMmbed/easy-connect" at rev #06594ba91bc1
[mbed] Adding library "easy-connect/atmel-rf-driver" from "https://github.com/ARMmbed/atmel-rf-driver" at rev #ca9782e68f5f
[mbed] Adding library "easy-connect/esp8266-driver" from "https://github.com/ARMmbed/esp8266-driver" at rev #b0d79dad507d
[mbed] Adding library "easy-connect/mcr20a-rf-driver" from "https://github.com/ARMmbed/mcr20a-rf-driver" at rev #93661a696735
[mbed] Adding library "easy-connect/stm-spirit1-rf-driver" from "https://github.com/ARMmbed/stm-spirit1-rf-driver" at rev #ce9e2f81f95f
[mbed] Adding library "easy-connect/wifi-x-nucleo-idw01m1" from "https://github.com/ARMmbed/wifi-x-nucleo-idw01m1" at rev #257d0878561b
[mbed] Adding library "easy-connect/wizfi310-driver" from "https://github.com/ARMmbed/wizfi310-driver" at rev #e78ec79cf496
[mbed] Adding library "mbed-client" from "https://github.com/ARMmbed/mbed-client" at rev #ea04c5de7822
[mbed] Adding library "mbed-client/mbed-client-c" from "https://github.com/ARMmbed/mbed-client-c" at rev #ecfa619e42b2
[mbed] Adding library "mbed-client/mbed-client-classic" from "https://github.com/ARMmbed/mbed-client-classic" at rev #4e66929607c3
[mbed] Adding library "mbed-client/mbed-client-mbed-tls" from "https://github.com/ARMmbed/mbed-client-mbed-tls" at rev #7e1b6d815038
[mbed] Adding library "mbed-os" from "https://github.com/ARMmbed/mbed-os" at rev #569159b784f7
[mbed] Adding library "pal" from "https://github.com/ARMmbed/pal" at rev #60ce64d5ec35
                                                                                
real	2m15.357s
user	0m18.448s
sys	0m10.882s

Downloaded ~221MB with average speed 1.7MB/s

Second import

$ rm -fr mbed-os-example-client

$ time mbed import mbed-os-example-client
[mbed] Importing program "mbed-os-example-client" from "https://github.com/ARMmbed/mbed-os-example-client" at latest revision in the current branch
[mbed] Adding library "easy-connect" from "https://github.com/ARMmbed/easy-connect" at rev #06594ba91bc1
[mbed] Adding library "easy-connect/atmel-rf-driver" from "https://github.com/ARMmbed/atmel-rf-driver" at rev #ca9782e68f5f
[mbed] Adding library "easy-connect/esp8266-driver" from "https://github.com/ARMmbed/esp8266-driver" at rev #b0d79dad507d
[mbed] Adding library "easy-connect/mcr20a-rf-driver" from "https://github.com/ARMmbed/mcr20a-rf-driver" at rev #93661a696735
[mbed] Adding library "easy-connect/stm-spirit1-rf-driver" from "https://github.com/ARMmbed/stm-spirit1-rf-driver" at rev #ce9e2f81f95f
[mbed] Adding library "easy-connect/wifi-x-nucleo-idw01m1" from "https://github.com/ARMmbed/wifi-x-nucleo-idw01m1" at rev #257d0878561b
[mbed] Adding library "easy-connect/wizfi310-driver" from "https://github.com/ARMmbed/wizfi310-driver" at rev #e78ec79cf496
[mbed] Adding library "mbed-client" from "https://github.com/ARMmbed/mbed-client" at rev #ea04c5de7822
[mbed] Adding library "mbed-client/mbed-client-c" from "https://github.com/ARMmbed/mbed-client-c" at rev #ecfa619e42b2
[mbed] Adding library "mbed-client/mbed-client-classic" from "https://github.com/ARMmbed/mbed-client-classic" at rev #4e66929607c3
[mbed] Adding library "mbed-client/mbed-client-mbed-tls" from "https://github.com/ARMmbed/mbed-client-mbed-tls" at rev #7e1b6d815038
[mbed] Adding library "mbed-os" from "https://github.com/ARMmbed/mbed-os" at rev #569159b784f7
[mbed] Adding library "pal" from "https://github.com/ARMmbed/pal" at rev #60ce64d5ec35

real	0m20.927s
user	0m4.017s
sys	0m4.932s

Downloaded ~1.2MB
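
As a quick sanity check on the numbers above, the short Python calculation below derives the speedup and traffic reduction from the reported timings and download sizes. It uses only figures quoted in this comment; the variable names are illustrative.

```python
first_s = 2 * 60 + 15.357           # "real" time of the first import, in seconds
second_s = 20.927                   # "real" time of the cached import, in seconds
first_mb, second_mb = 221.0, 1.2    # approximate download sizes reported above

speedup = first_s / second_s
traffic_ratio = first_mb / second_mb
est_download_s = first_mb / 1.7     # time to fetch ~221MB at 1.7MB/s

print("speedup: %.1fx" % speedup)
print("traffic reduction: %.0fx" % traffic_ratio)
print("estimated first-import download time: %.0fs" % est_download_s)
```

The cached import is roughly 6.5 times faster and transfers about 184 times less data. The estimated 130 seconds of pure download time accounts for most of the 2m15s first import, which is consistent with the first import being network-bound.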

Contributor

@janjongboom janjongboom left a comment

I can't comment on the code, but I applaud the idea.

Contributor

@sg- sg- left a comment

👍

@JanneKiiskila
Contributor

Haven't had time to review the code, but the idea is superb.

mbed/mbed.py Outdated
@subcommand('cache',
dict(name='on', nargs='?', help='Turn repository caching on. Will use either the default or the user specified cache directory.'),
dict(name='off', nargs='?', help='Turn repository caching off. Note that this doesn\'t purge cached repositories. See "purge".'),
dict(name='dir', nargs='?', help='Set cache directory. Set to "default" to let mbed CLI determine the cache directory location. Typically this is "~/.mbed/mbed-cache/" on UNIX, or "%%userprofile%%/.mbed/mbed-cache/" on Windows.'),
Contributor

Is it possible to actually print out the folder it WOULD really use rather than guess? Do we know at this point or can we use just %s and substitute it in?

Contributor Author

Is this what you mean?

$ mbed cache
[mbed] Repository cache is ENABLED.
[mbed] Cache location "/Users/mihsto01/.mbed"

Note that the actual folder is /Users/mihsto01/.mbed/mbed-cache

Contributor

I think Janne is asking about the help, so that mbed cache -h prints something OS- and user-specific.

Contributor Author

Pushed minor patch to support this.
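
For illustration only (this is not the pushed patch), here is one way a help string could show the resolved folder instead of a per-OS guess. `default_cache_dir` and `build_parser` are hypothetical names. Since argparse applies %-formatting to help strings, a literal `%` has to be written as `%%`, which is why the reviewed snippet contains `%%userprofile%%`.

```python
import argparse
import os


def default_cache_dir():
    # Assumption: mirrors the "~/.mbed/mbed-cache" default described in this PR.
    return os.path.join(os.path.expanduser("~"), ".mbed", "mbed-cache")


def build_parser(cache_dir):
    parser = argparse.ArgumentParser(prog="mbed cache")
    parser.add_argument(
        "dir", nargs="?",
        # Escape '%' because argparse %-formats help strings.
        help='Set cache directory. Currently resolves to "%s".'
             % cache_dir.replace("%", "%%"))
    return parser


if __name__ == "__main__":
    build_parser(default_cache_dir()).print_help()
```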

@studavekar
Contributor

Looks good 👍

README.md Outdated
@@ -851,6 +851,30 @@ You can combine the options of the Mbed update command for the following scenari

Use these with caution because your uncommitted changes and unpublished libraries cannot be restored.

## Repository caching

To minimize traffic and reduce import times, by default Mbed CLI would cache repositories by storing their indexes under the Mbed CLI user config folder - typically `~/.mbed/mbed-cache/` on UNIX systems, or `%userprofile%/.mbed/mbed-cache/` on Windows systems. Compared to a fully checked out repository, indexes are significantly smaller in size and number of files, and contain the whole revision history of that repository. This allows Mbed CLI to quickly create copies of previously downloaded repository indexes and pull/fetch only the latest changes from the remote repositories, therefore dramatically reducing network traffic and download times, especially for big repositories like `mbed-os`.
Contributor

Tense is weird here. Could you make it all active?

would cache -> caches

README.md Outdated
mbed cache [on|off|dir <path>|ls|purge|-h|--help]
```

* `on` - Turn repository caching on. Will use either the default or the user specified cache directory.
Contributor

The second sentence needs a subject.

"The cache will either be in the user specified location or the default location if the user has not specified a location"

README.md Outdated
* `purge` - Purge cached repositories. Note that this doesn't turn caching off.
* `-h` or `--help` - Print cache command options.

If no sub-command is specified to `mbed cache`, then mbed CLI would print the current cache setting (ENABLED or DISABLED) and the path to the local cache directory.
Contributor

tense: would print -> prints

Contributor Author

How is that different compared to https://github.com/ARMmbed/mbed-cli/blob/master/README.md#compiler-detection-through-the-path? E.g. "If none of the above are configured, the mbed compile command will fall back to checking your PATH for an...". Are you suggesting that the tense should be corrected in that part of the documentation as well? Note that you can find other occurrences of this all over the mbed CLI docs.

Contributor

I think that we should probably correct the tense throughout the document. That should come as another PR then.

README.md Outdated

If no sub-command is specified to `mbed cache`, then mbed CLI would print the current cache setting (ENABLED or DISABLED) and the path to the local cache directory.

For safety reasons, Mbed CLI will always use `mbed-cache` subfolder to a user specified location. This ensure that no user files will deleted during `purge` even if the user has specified root/system folder as a cache location (e.g. `mbed cache dir /` or `mbed cache dir C:\`).
Contributor

I think that the first sentence is hard to understand.
Proposal:
"For safety reasons, Mbed CLI always uses a sub-directory, mbed-cache, within the user specified cache location."

README.md Outdated

For safety reasons, Mbed CLI will always use `mbed-cache` subfolder to a user specified location. This ensure that no user files will deleted during `purge` even if the user has specified root/system folder as a cache location (e.g. `mbed cache dir /` or `mbed cache dir C:\`).

**Security notice**: It's generally recommended to use cache location inside your profile home directory. If you use cache location outside your user home/profile, then other system users might be able to access the repository cache and therefore the data of the cached repositories
Contributor

There's no subject on the first sentence. Who is recommending the cache location? Probably the Mbed CLI developers or something like that.

Contributor

@theotherjimmy theotherjimmy left a comment

Looks good. The cache command function could be broken up into sub commands as functions, and I would find that more readable.

The reported cache location will now contain "mbed-cache" additional sub-folder.
Contributor

@adbridge adbridge left a comment

LGTM

Copy edit for minor grammar nits.
@screamerbg
Contributor Author

All stakeholders approved this PR. Will be merged soon (just fixing CI issues).

@screamerbg screamerbg merged commit 3a979d9 into ARMmbed:master Mar 13, 2018
@screamerbg
Contributor Author

@JanneKiiskila @studavekar Note that this feature turns caching on by default. On a CI system you might want to turn caching off, as it will otherwise cache every PR repository. To do that, simply run mbed cache off while logged in as the same user as the CI process.
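
If the CI setup is scripted in Python, the call could be wrapped as in the sketch below. The `mbed cache off` command is the one described in this PR. The wrapper name `disable_mbed_cache` and its injectable `run` parameter are illustrative assumptions, and the script has to run as the same user as the CI process.

```python
import subprocess


def disable_mbed_cache(run=subprocess.check_call):
    """Run `mbed cache off`; `run` is injectable so the call can be tested."""
    cmd = ["mbed", "cache", "off"]
    run(cmd)
    return cmd


if __name__ == "__main__":
    disable_mbed_cache()
```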


8 participants