| 1 | +## NodeJS - NPM Lambda Builder |
| 2 | + |
| 3 | +### Scope |
| 4 | + |
| 5 | +This package is an effort to port the Claudia.JS packager to a library that can |
| 6 | +be used to handle the dependency resolution portion of packaging NodeJS code |
| 7 | +for use in AWS Lambda. The scope for this builder is to take an existing |
| 8 | +directory containing customer code, including a valid `package.json` manifest |
| 9 | +specifying third-party dependencies. The builder will use NPM to include |
| 10 | +production dependencies and exclude test resources in a way that makes them |
| 11 | +deployable to AWS Lambda. |
| 12 | + |
| 13 | +### Challenges |
| 14 | + |
| 15 | +NPM normally stores all dependencies in a `node_modules` subdirectory. It |
| 16 | +supports several dependency categories, such as development dependencies |
| 17 | +(usually third-party build utilities and test resources), optional dependencies |
| 18 | +(usually required for local execution but already available on the production |
| 19 | +environment, or peer-dependencies for optional third-party packages) and |
| 20 | +production dependencies (normally the minimum required for correct execution). |
| 21 | +All these dependency types are mixed in the same directory. |
| 22 | + |
| 23 | +To speed up Lambda startup time and optimise usage costs, the correct thing to |
| 24 | +do in most cases is to package only production dependencies. During development |
| 25 | +work we can expect that the local `node_modules` directory contains all the |
| 26 | +various dependency types, and NPM does not provide a way to directly identify |
| 27 | +just the ones relevant for production. To identify production dependencies, |
| 28 | +this packager needs to copy the source to a clean temporary directory and re-run |
| 29 | +dependency installation there. |
| 30 | + |
| 31 | +It's reasonable to expect that some developers will not carefully separate |
| 32 | +production dependencies from test resources, so this packager will need to |
| 33 | +support overriding the categories of dependencies to include. |
| 34 | + |
| 35 | +NPM also provides support for running user-defined scripts as part of the build |
| 36 | +process, so this packager needs to support standard NPM script execution. |
| 37 | + |
| 38 | +NPM, since version 5, uses symbolic links to optimise disk space usage, so |
| 39 | +cross-project dependencies will just be linked to elsewhere on the local disk |
| 40 | +instead of being included in the `node_modules` directory. This means that just copying |
| 41 | +the `node_modules` directory (even if symlinks were resolved to actual paths) is |
| 42 | +far from optimal for creating a stand-alone module. Copying would lead to significantly |
| 43 | +larger packages than necessary, as sub-modules might still have test resources, and |
| 44 | +common references from multiple projects would be duplicated. |
| 45 | + |
| 46 | +NPM also uses a locking mechanism (`package-lock.json`) that's in many ways more |
| 47 | +broken than functional, as it in some cases hard-codes locks to local disk |
| 48 | +paths, and gets confused when the same package is included as a dependency |
| 49 | +throughout the project tree in different dependency categories |
| 50 | +(development/optional/production). Although the official tool recommends |
| 51 | +including this file in version control as a way to pin down dependency |
| 52 | +versions, using it on several machines with different project layouts can |
| 53 | +lead to uninstallable dependencies. |
| 54 | + |
| 55 | +NPM dependencies are usually plain JavaScript libraries, but they may include |
| 56 | +native binaries precompiled for a particular platform, or require some system |
| 57 | +libraries to be installed. A notable example is `sharp`, a popular image |
| 58 | +manipulation library that uses symbolic links to system libraries. Another |
| 59 | +notable example is `puppeteer`, a library to control a headless Chrome browser, |
| 60 | +which downloads a Chromium binary for the target platform during installation. |
| 61 | + |
| 62 | +To fully deal with those cases, this packager may need to execute the |
| 63 | +dependency installation step on a Docker image compatible with the target |
| 64 | +Lambda environment. |
| 65 | + |
| 66 | +### Implementation |
| 67 | + |
| 68 | +The general algorithm for preparing a node package for use on AWS Lambda |
| 69 | +is as follows. |
| 70 | + |
| 71 | +#### Step 1: Prepare a clean copy of the project source files |
| 72 | + |
| 73 | +Execute `npm pack` to perform project-specific packaging using the supplied |
| 74 | +`package.json` manifest, which will automatically exclude temporary files, |
| 75 | +test resources and other source files unnecessary for running in a production |
| 76 | +environment. |
| 77 | + |
| 78 | +This will produce a `tar` archive that needs to be unpacked into the artefacts directory. |
| 79 | +Note that the archive will actually contain a `package` subdirectory containing the files, |
| 80 | +so it's not enough to just directly unpack files. |
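
A minimal sketch of this step in Python, assuming `npm` is available on the `PATH`. The helper names `pack_project` and `unpack_npm_tarball` are illustrative and not part of any existing API. The second helper strips the leading `package/` directory while unpacking, so the files land directly in the artefacts directory.

```python
import os
import subprocess
import tarfile


def pack_project(source_dir, scratch_dir):
    """Run `npm pack` on source_dir and return the path of the tarball it writes."""
    result = subprocess.run(
        ["npm", "pack", os.path.abspath(source_dir)],
        cwd=scratch_dir,  # npm writes the tarball into the working directory
        check=True,
        capture_output=True,
        text=True,
    )
    # npm prints the tarball file name as the last line of standard output.
    tarball_name = result.stdout.strip().splitlines()[-1]
    return os.path.join(scratch_dir, tarball_name)


def unpack_npm_tarball(tarball_path, artifacts_dir, prefix="package/"):
    """Extract regular files below `prefix` into artifacts_dir, dropping the prefix."""
    extracted = []
    with tarfile.open(tarball_path, "r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile() or not member.name.startswith(prefix):
                continue
            relative = os.path.normpath(member.name[len(prefix):])
            # Refuse entries that would escape the artefacts directory.
            if relative.startswith("..") or os.path.isabs(relative):
                continue
            target = os.path.join(artifacts_dir, relative)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.extractfile(member) as source, open(target, "wb") as out:
                out.write(source.read())
            extracted.append(relative)
    return sorted(extracted)
```

A caller would typically run `pack_project` in a scratch directory and pass the returned path to `unpack_npm_tarball`. This sketch writes file contents only and does not restore file permissions.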
| 81 | + |
| 82 | +#### Step 2: Rewrite local dependencies |
| 83 | + |
| 84 | +_(out of scope for the current version)_ |
| 85 | + |
| 86 | +To optimise disk space and avoid including development dependencies from other |
| 87 | +locally linked packages, inspect the `package.json` manifest looking for dependencies |
| 88 | +referring to local file paths (identifiable because they start with `.` or `file:`), |
| 89 | +then recursively execute the packaging process for each such dependency. |
| 90 | + |
| 91 | +Local dependencies may themselves include other local dependencies; this is a very |
| 92 | +common way of sharing configuration or development utilities such as linting or testing |
| 93 | +tools. This means that for each packaged local dependency this packager needs to |
| 94 | +recursively apply the packaging process. It also means that the packager needs to |
| 95 | +track local paths and avoid re-packaging directories it already visited. |
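
The detection and traversal described above could be sketched as follows. This is only an illustration: the function names are hypothetical, and limiting the scan to the `dependencies` and `optionalDependencies` sections is an assumption about which categories matter for production. A `visited` set of resolved paths prevents re-packaging a directory that was already seen, including in the case of circular references.

```python
import json
import os

# Assumption: only these manifest sections are relevant for a production build.
DEPENDENCY_SECTIONS = ("dependencies", "optionalDependencies")


def local_dependency_paths(manifest, project_dir):
    """Return {name: absolute path} for dependencies that point at local paths."""
    found = {}
    for section in DEPENDENCY_SECTIONS:
        for name, spec in manifest.get(section, {}).items():
            if spec.startswith("file:"):
                spec = spec[len("file:"):]
            elif not spec.startswith("."):
                continue  # registry version, git URL, and so on
            found[name] = os.path.realpath(os.path.join(project_dir, spec))
    return found


def collect_local_packages(project_dir, visited=None):
    """Depth-first walk returning package directories, dependencies first."""
    visited = set() if visited is None else visited
    project_dir = os.path.realpath(project_dir)
    if project_dir in visited:
        return []  # already handled, or part of a dependency cycle
    visited.add(project_dir)
    with open(os.path.join(project_dir, "package.json")) as handle:
        manifest = json.load(handle)
    ordered = []
    for path in local_dependency_paths(manifest, project_dir).values():
        if os.path.isdir(path):  # local tarballs need no further packaging
            ordered.extend(collect_local_packages(path, visited))
    ordered.append(project_dir)
    return ordered
```

Returning dependencies before the packages that use them means each directory can be packed in list order, with the archives of its own local dependencies already available.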
| 96 | + |
| 97 | +NPM produces a `tar` archive while packaging that can be directly included as a |
| 98 | +dependency. This will make NPM unpack and install a copy correctly. Once the |
| 99 | +packager produces all `tar` archives required by local dependencies, rewrite |
| 100 | +the manifest to point to `tar` files instead of the original location. |
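
A small sketch of the manifest rewrite, assuming a hypothetical `tarballs` mapping from dependency name to the archive produced for it. The function returns a modified copy and leaves the original manifest untouched.

```python
import copy


def rewrite_local_dependencies(manifest, tarballs):
    """Return a copy of manifest whose local dependencies point at tar archives.

    `tarballs` maps a dependency name to the path of its packed archive.
    """
    rewritten = copy.deepcopy(manifest)
    for section in ("dependencies", "optionalDependencies"):
        for name in rewritten.get(section, {}):
            if name in tarballs:
                rewritten[section][name] = "file:" + tarballs[name]
    return rewritten
```

Dependencies that have no entry in `tarballs`, such as registry packages, keep their original version specification.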
| 101 | + |
| 102 | +#### Step 3: Install dependencies |
| 103 | + |
| 104 | +The packager should then run `npm install` to download and expand all dependencies to |
| 105 | +the local `node_modules` subdirectory. This has to be executed in the directory with |
| 106 | +a clean copy of the source files. |
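
As a rough illustration, the installation could be driven from Python like this. The helper name and the injectable `runner` parameter are assumptions made to keep the sketch testable. The `--production` flag restricts the install to production dependencies; newer npm releases spell this `--omit=dev`.

```python
import subprocess


def install_dependencies(artifacts_dir, options=("--production",), runner=subprocess.run):
    """Run `npm install` inside the clean copy of the project sources."""
    command = ["npm", "install", *options]
    # cwd ensures node_modules is created next to the unpacked sources.
    return runner(command, cwd=artifacts_dir, check=True)
```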
| 107 | + |
| 108 | +Note that NPM can be configured to use proxies or local company repositories using |
| 109 | +a local file, `.npmrc`. The packaging process from step 1 normally excludes this file, so it may |
| 110 | +need to be copied additionally before dependency installation, and then removed. |
| 111 | +_(out of scope for the current version)_ |
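
One possible shape for the copy-then-remove handling of `.npmrc` is a context manager, sketched below. The name `temporary_npmrc` is hypothetical. It only removes the file if it was the one that copied it.

```python
import contextlib
import os
import shutil


@contextlib.contextmanager
def temporary_npmrc(source_dir, artifacts_dir):
    """Copy .npmrc next to the clean sources for the duration of the block."""
    source = os.path.join(source_dir, ".npmrc")
    target = os.path.join(artifacts_dir, ".npmrc")
    copied = os.path.isfile(source) and not os.path.exists(target)
    if copied:
        shutil.copyfile(source, target)
    try:
        yield copied
    finally:
        # Remove the file again so registry settings are not deployed.
        if copied and os.path.exists(target):
            os.remove(target)
```

The dependency installation would then run inside a `with temporary_npmrc(source_dir, artifacts_dir):` block.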
| 112 | + |
| 113 | +Some users may want to exclude optional dependencies, or even include development dependencies. |
| 114 | +To avoid incompatible flags in the `sam` CLI, the packager should allow users to specify |
| 115 | +options for the `npm install` command using an environment variable. |
| 116 | +_(out of scope for the current version)_ |
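
A sketch of how such an override might be read. The variable name `NPM_INSTALL_OPTIONS` is purely hypothetical, as this document does not name one. When the variable is unset the packager falls back to production-only installation; when it is set, its value replaces the defaults entirely.

```python
import os
import shlex

# Hypothetical variable name, chosen for illustration only.
INSTALL_OPTIONS_VAR = "NPM_INSTALL_OPTIONS"
DEFAULT_INSTALL_OPTIONS = ["--production"]


def install_options(environ=None):
    """Return the list of extra arguments to pass to `npm install`."""
    environ = os.environ if environ is None else environ
    raw = environ.get(INSTALL_OPTIONS_VAR)
    if raw is None:
        return list(DEFAULT_INSTALL_OPTIONS)
    # shlex.split honours shell-style quoting inside the variable.
    return shlex.split(raw)
```

Setting the variable to `--production --no-optional`, for example, would also exclude optional dependencies, while an empty value would drop the production restriction and install development dependencies as well.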
| 117 | + |
| 118 | +To fully support dependencies that download or compile binaries for a target platform, this step |
| 119 | +needs to be executed inside a Docker image compatible with AWS Lambda. |
| 120 | +_(out of scope for the current version)_ |
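
For illustration, the command line for a containerised install might be assembled as below. The helper is hypothetical, and the image name is deliberately left to the caller because this document does not specify one. The sketch assumes the image provides `npm` and runs the given command directly. `/var/task` is used as the mount point because it is where Lambda places function code.

```python
import os


def docker_install_command(artifacts_dir, image, npm_args=("--production",)):
    """Build the argument list for running `npm install` inside a container."""
    mount = "{}:/var/task".format(os.path.abspath(artifacts_dir))
    return [
        "docker", "run", "--rm",  # remove the container when it exits
        "-v", mount,              # expose the clean sources to the container
        "-w", "/var/task",        # run npm in the mounted directory
        image,
        "npm", "install", *npm_args,
    ]
```

The resulting list can be passed to `subprocess.run`, so that native binaries are downloaded or compiled for the container's platform instead of the developer's machine.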
| 121 | + |