Commit 09d4190

added design document, as per #44 (comment)
1 parent fcd47a9 commit 09d4190

1 file changed

+121
-0
lines changed
  • aws_lambda_builders/workflows/nodejs_npm

## NodeJS - NPM Lambda Builder

### Scope

This package is an effort to port the Claudia.JS packager to a library that can
be used to handle the dependency resolution portion of packaging NodeJS code
for use in AWS Lambda. The scope for this builder is to take an existing
directory containing customer code, including a valid `package.json` manifest
specifying third-party dependencies. The builder will use NPM to include
production dependencies and exclude test resources in a way that makes them
deployable to AWS Lambda.

### Challenges

NPM normally stores all dependencies in a `node_modules` subdirectory. It
supports several dependency categories, such as development dependencies
(usually third-party build utilities and test resources), optional dependencies
(usually required for local execution but already available on the production
environment, or peer-dependencies for optional third-party packages) and
production dependencies (normally the minimum required for correct execution).
All these dependency types are mixed in the same directory.

To speed up Lambda startup time and optimise usage costs, the correct thing to
do in most cases is to package only production dependencies. During development
work we can expect that the local `node_modules` directory contains all the
various dependency types, and NPM does not provide a way to directly identify
just the ones relevant for production. To identify production dependencies,
this packager needs to copy the source to a clean temporary directory and re-run
dependency installation there.

It's reasonable to expect that some developers will not carefully separate
production dependencies from test resources, so this packager will need to
support overriding the categories of dependencies to include.

NPM also provides support for running user-defined scripts as part of the build
process, so this packager needs to support standard NPM script execution.

NPM, since version 5, uses symbolic links to optimise disk space usage, so
cross-project dependencies will just be linked to elsewhere on the local disk
instead of included in the `node_modules` directory. This means that just copying
the `node_modules` directory (even if symlinks would be resolved to actual paths)
is far from an optimal way to create a stand-alone module. Copying would lead to
significantly larger packages than necessary, as sub-modules might still have test
resources, and common references from multiple projects would be duplicated.

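To make the symlink problem concrete, the following sketch lists which entries of a `node_modules` directory are symbolic links and where they point. The function name `linked_modules` is hypothetical and not part of the builder; it only inspects the top level and one level below `@scope` directories.

```python
from pathlib import Path


def linked_modules(node_modules):
    """Map package name -> resolved target for symlinked entries in node_modules."""
    root = Path(node_modules)
    links = {}
    for entry in sorted(root.iterdir()):
        if entry.name.startswith("@") and entry.is_dir() and not entry.is_symlink():
            # Scoped packages (@scope/name) live one level further down.
            candidates = sorted(entry.iterdir())
        else:
            candidates = [entry]
        for candidate in candidates:
            if candidate.is_symlink():
                name = candidate.relative_to(root).as_posix()
                links[name] = candidate.resolve()
    return links
```

Any name reported here refers to a directory outside the project, which is why a plain copy of `node_modules` is not a reliable way to build a stand-alone package.
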
NPM also uses a locking mechanism (`package-lock.json`) that's in many ways more
broken than functional, as it in some cases hard-codes locks to local disk
paths, and gets confused by including the same package as a dependency
throughout the project tree in different dependency categories
(development/optional/production). Although the official tool recommends
including this file in version control, as a way to pin down dependency
versions, when used on several machines with different project layouts it can
lead to uninstallable dependencies.

NPM dependencies are usually plain JavaScript libraries, but they may include
native binaries precompiled for a particular platform, or require some system
libraries to be installed. A notable example is `sharp`, a popular image
manipulation library, that uses symbolic links to system libraries. Another
notable example is `puppeteer`, a library to control a headless Chrome browser,
that downloads a Chromium binary for the target platform during installation.

To fully deal with those cases, this packager may need to execute the
dependency installation step on a Docker image compatible with the target
Lambda environment.

### Implementation

The general algorithm for preparing a node package for use on AWS Lambda
is as follows.

#### Step 1: Prepare a clean copy of the project source files

Execute `npm pack` to perform project-specific packaging using the supplied
`package.json` manifest, which will automatically exclude temporary files,
test resources and other source files unnecessary for running in a production
environment.

This will produce a `tar` archive that needs to be unpacked into the artefacts directory.
Note that the archive will actually contain a `package` subdirectory containing the files,
so it's not enough to just directly unpack files.

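A minimal sketch of this step, assuming `npm` is on the `PATH` and prints the archive file name as the last line of its standard output. The function names `run_npm_pack` and `extract_npm_package` are hypothetical, not the builder's actual API. The extraction strips the leading `package/` directory so that files land directly in the artefacts directory.

```python
import subprocess
import tarfile
from pathlib import Path


def run_npm_pack(source_dir, scratch_dir):
    """Run `npm pack` for source_dir and return the path of the archive it wrote.

    Assumption: npm prints the archive file name as the last line of stdout.
    """
    result = subprocess.run(
        ["npm", "pack", str(Path(source_dir).resolve())],
        cwd=str(scratch_dir),
        check=True,
        capture_output=True,
        text=True,
    )
    archive_name = result.stdout.strip().splitlines()[-1]
    return Path(scratch_dir) / archive_name


def extract_npm_package(tarball, artifacts_dir, prefix="package/"):
    """Unpack an `npm pack` archive so files land directly in artifacts_dir.

    Members outside the `package/` prefix are skipped. File modes are not
    preserved in this sketch.
    """
    target = Path(artifacts_dir).resolve()
    extracted = []
    with tarfile.open(tarball, mode="r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile() or not member.name.startswith(prefix):
                continue
            relative = member.name[len(prefix):]
            destination = (target / relative).resolve()
            # Refuse paths that would escape the artefacts directory.
            if target not in destination.parents:
                continue
            destination.parent.mkdir(parents=True, exist_ok=True)
            with archive.extractfile(member) as source:
                destination.write_bytes(source.read())
            extracted.append(relative)
    return sorted(extracted)
```

A caller would typically chain the two: `extract_npm_package(run_npm_pack(source_dir, scratch_dir), artifacts_dir)`.
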
#### Step 2: Rewrite local dependencies

_(out of scope for the current version)_

To optimise disk space and avoid including development dependencies from other
locally linked packages, inspect the `package.json` manifest looking for dependencies
referring to local file paths (can be identified as they start with `.` or `file:`),
then for each dependency recursively execute the packaging process.

Local dependencies may themselves include other local dependencies. This is a very
common way of sharing configuration or development utilities such as linting or testing
tools. This means that for each packaged local dependency this packager needs to
recursively apply the packaging process. It also means that the packager needs to
track local paths and avoid re-packaging directories it already visited.

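The traversal described above could look roughly like the sketch below. All function names are hypothetical. It assumes only the `dependencies` section of each manifest is inspected, and it returns every local package directory once, with dependencies listed before the packages that need them, so that a cycle between local packages does not cause endless recursion.

```python
import json
from pathlib import Path


def is_local_reference(spec):
    """True when a dependency spec points at a local path (`.` or `file:` prefix)."""
    return spec.startswith(".") or spec.startswith("file:")


def local_dependencies(manifest):
    """Map dependency name -> path string for local entries in `dependencies`."""
    found = {}
    for name, spec in manifest.get("dependencies", {}).items():
        if is_local_reference(spec):
            found[name] = spec[len("file:"):] if spec.startswith("file:") else spec
    return found


def collect_local_packages(project_dir, visited=None):
    """Depth-first walk over local dependencies, visiting each directory once."""
    visited = set() if visited is None else visited
    root = Path(project_dir).resolve()
    if root in visited:
        return []  # already packaged, or part of a cycle
    visited.add(root)
    manifest = json.loads((root / "package.json").read_text(encoding="utf-8"))
    order = []
    for _name, relative in sorted(local_dependencies(manifest).items()):
        order.extend(collect_local_packages(root / relative, visited))
    order.append(root)
    return order
```
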
NPM produces a `tar` archive while packaging that can be directly included as a
dependency. This will make NPM unpack and install a copy correctly. Once the
packager produces all `tar` archives required by local dependencies, rewrite
the manifest to point to `tar` files instead of the original location.

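As an illustration of the manifest rewrite, the sketch below replaces each local dependency with a `file:` reference to its archive. The function name and the `tarballs` mapping (dependency name to archive path) are assumptions made for this example. The original manifest is left untouched.

```python
import copy


def rewrite_local_dependencies(manifest, tarballs):
    """Return a copy of the manifest with local dependencies pointing at tar archives.

    `tarballs` maps dependency name -> path of the archive produced for it.
    """
    rewritten = copy.deepcopy(manifest)
    dependencies = rewritten.get("dependencies", {})
    for name, spec in list(dependencies.items()):
        is_local = spec.startswith(".") or spec.startswith("file:")
        if not is_local:
            continue  # registry versions, git URLs and so on stay as they are
        if name not in tarballs:
            raise KeyError("no archive was produced for local dependency " + name)
        dependencies[name] = "file:" + str(tarballs[name])
    return rewritten
```

Raising on a missing archive is a deliberate choice in this sketch: silently keeping the original path would reintroduce the linked directory that this step is meant to remove.
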
#### Step 3: Install dependencies

The packager should then run `npm install` to download and expand all dependencies to
the local `node_modules` subdirectory. This has to be executed in the directory with
a clean copy of the source files.

Note that NPM can be configured to use proxies or local company repositories using
a local file, `.npmrc`. The packaging process from step 1 normally excludes this file, so it may
need to be copied additionally before dependency installation, and then removed.
_(out of scope for the current version)_

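One possible shape for the copy-then-remove handling of `.npmrc` is a context manager, sketched below. The name `temporary_npmrc` is hypothetical. The sketch never overwrites an `.npmrc` that already exists in the build directory, and only removes the file if it created it.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_npmrc(source_dir, build_dir):
    """Copy `.npmrc` into build_dir for the duration of the block, then remove it."""
    source = Path(source_dir) / ".npmrc"
    target = Path(build_dir) / ".npmrc"
    copied = False
    if source.is_file() and not target.exists():
        shutil.copyfile(source, target)
        copied = True
    try:
        yield copied
    finally:
        # Clean up even if dependency installation failed.
        if copied and target.exists():
            target.unlink()
```

The dependency installation would run inside the block, for example `with temporary_npmrc(source_dir, build_dir): subprocess.run(["npm", "install"], cwd=build_dir, check=True)`.
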
Some users may want to exclude optional dependencies, or even include development dependencies.
To avoid incompatible flags in the `sam` CLI, the packager should allow users to specify
options for the `npm install` command using an environment variable.
_(out of scope for the current version)_

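A sketch of how such an override could be read. The variable name `NPM_INSTALL_OPTIONS` is purely an assumption for illustration, since the design does not name one. The value is split with shell-style quoting rules so that options containing spaces survive.

```python
import os
import shlex


def npm_install_command(environ=None, variable="NPM_INSTALL_OPTIONS"):
    """Build the `npm install` argument list, appending user-supplied options.

    `variable` is a hypothetical environment variable name.
    """
    env = os.environ if environ is None else environ
    extra = shlex.split(env.get(variable, ""))
    return ["npm", "install"] + extra
```

Returning an argument list rather than a single string lets the caller pass it to `subprocess.run` without going through a shell.
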
To fully support dependencies that download or compile binaries for a target platform, this step
needs to be executed inside a Docker image compatible with AWS Lambda.
_(out of scope for the current version)_
