Skip to content

Derive nix profile from the generated flake #1692

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from
Closed

Conversation

savil
Copy link
Collaborator

@savil savil commented Dec 22, 2023

Summary

This PR seeks to unify our nix print-dev-env and nix profile steps. Other benefits:

  1. It will enable us to remove some hacks (example, this one introduced for glibc-patch)
  2. Re-enable some testscripts that were disabled (example, this one for php extension removal)

Previously, these would be two separate processes: we'd manually do nix profile install.
Now, we get the buildInputPaths from nix print-dev-env and install those, so the nix profile is derived from the flake.

Some highlights of code movements/changes:

  1. Adds devbox.syncNixProfile. Initially copied from [wip/poc] impl: use flake for profile packages #1588, and then modified quite a bit.
  2. Move devpkg.storePathParts to nix.StorePathParts so it can be called from syncFlakeToProfile
  3. Needed to modify nixprofile.ProfileInstall to work on Installables rather than package-names, so it can install /nix/store/<hash>-<pkg> buildPaths from the flake.
  4. Moved validatePackages from nixprofile.ProfileInstall to ensureStateIsUpToDate, since we must run it prior to computeEnv that evaluates nix print-dev-env.
  5. Added devbox.installNixPackagesToStore so that packages are locally installed in nix store when adding or removing or updating, and not in devbox-environment.

Behavior change:
This PR leads to more computeEnv calls that do nix print-dev-env in our Examples tests. The reason is that devbox add/rm/update will always invalidate the local.lock cache, leading to subsequent operations like devbox run/shell/shellenv to invoke computeEnv. Do note that this is more correct than before where the nix print-dev-env cache would be stale for devbox add/rm/update when not in a devbox environment.

It led to some Examples testscripts failing for nix 2.17.0 due to errors like https://gist.github.com/savil/03ae7f4e55c0796a60bdf0cfae588fac. It appears this was an issue with nix that was fixed, affecting nix operations that happen in parallel, which affects our cicd tests.

The issue was fixed in August 2023, so it doesn't fail for nix 2.19. It also doesn't fail for nix 2.12, so perhaps it was an issue introduced in between.

For now, I've dropped 2.17 from the nix versions that we run the cli-tests on. I think this is okay, since we still test with nix 2.12 and 2.19.

How was it tested?

  1. Automated tests should pass.
  2. Did devbox add and devbox rm with hello and cowsay and also with the process-compose flake.
  3. devbox add glibcLocales prints the correct error message and suggestion to use --exclude-platform. And then verified that devbox add glibcLocales --exclude-platform x86_64-darwin works as expected.

is there some edge case I may be missing?

Copy link
Collaborator Author

savil commented Dec 22, 2023

Current dependencies on/for this PR:

This stack of pull requests is managed by Graphite.

@savil
Copy link
Collaborator Author

savil commented Dec 22, 2023

fails with:

--- FAIL: TestScripts (0.05s)
    --- FAIL: TestScripts/php.test (319.64s)
        testscript.go:534: > exec devbox run php index.php
            [stderr]
            Ensuring packages are installed.
            Error: error running script "php" in Devbox: Command: /nix/var/nix/profiles/default/bin/nix print-dev-env $WORK/.devbox/gen/flake --extra-experimental-features ca-derivations --option experimental-features nix-command flakes fetch-closure --json: exit status 1

            Error: There was an internal error. Run with DEVBOX_DEBUG=1 for a detailed error message, and consider reporting it at https://github.com/jetpack-io/devbox/issues
            program not built with -cover
            [exit status 1]
            FAIL: languages/php.test.txt:1: unexpected command failure

    --- FAIL: TestScripts/unfree.test (353.82s)
        testscript.go:534: # Test ensures that we can add and remove "unfree" nix packages (1.499s)
            # we could test with slack and/or vscode. Using slack since it is lighter. (352.309s)
            > exec devbox add slack
            [stdout]
            Running /nix/var/nix/profiles/default/bin/nix eval --json $WORK/.devbox/gen/flake#devShells.x86_64-darwin.default.buildInputs --extra-experimental-features ca-derivations --option experimental-features nix-command flakes fetch-closure
            [stderr]
            Ensuring nixpkgs registry is downloaded.
            Downloaded 'github:NixOS/nixpkgs/fd04bea4cbf76f86f244b9e2549fca066db8ddff' to '/nix/store/pfixw35k0jwhz6mrhd6jw0vlppjjkamg-source' (hash 'sha256-imn+/Rj+bqagOSm7GmRDbrqkxtc7QnjY3Cu85gv46BU=').
            Ensuring nixpkgs registry is downloaded: Success
            Info: Adding package "slack@latest" to devbox.json
            program not built with -cover
            > stderr 'Installing package: slack.'
            FAIL: packages/unfree.test.txt:7: no match for `Installing package: slack.` found in stderr

FAIL
coverage: [no statements]
FAIL	go.jetpack.io/devbox/testscripts	438.322s
FAIL
Error: error running script "test" in Devbox: exit status 1

@savil savil force-pushed the savil/profile-from-flake branch 4 times, most recently from 4ee4487 to 9122399 Compare December 23, 2023 01:23
@savil
Copy link
Collaborator Author

savil commented Dec 23, 2023

Made some progress.

The unfree testscript needed a update for which output to look for. So fixed.

The php-testscript was failing due to nix profile install adding multiple buildInputs for the same package (in this case, composer is added by both the php plugin and as a devbox package). Made a change to reuse our current approach of setting --priority in the nix profile install.

So, this is very much usable. However, I feel there are some UX fixes to be made:

  1. We'll need to prefetch nixpkgs for when we are going to download them. This allows us to show a nicer message than when the flake.nix downloads them as a side-effect of being evaluated.
  2. devbox rm <pkg> needs UX improvement to print some nice messages. The current messages have gotten dropped.
  3. Adding a package to the nix profile currently displays the /nix/store path as in:
[1/1] /nix/store/c9mw21jbrpprfd8vh735zvangxmh7j5a-cowsay-3.7.0
[1/1] /nix/store/c9mw21jbrpprfd8vh735zvangxmh7j5a-cowsay-3.7.0: Success

We should parse the package name and version from this path, and display just that.

But this seems pretty doable and should simplify our logic quite a bit.

@savil savil force-pushed the savil/profile-from-flake branch 3 times, most recently from ee96124 to 391754f Compare January 3, 2024 00:29
@savil savil changed the title nix profile from flake Derive nix profile from the generated flake Jan 3, 2024
@savil savil force-pushed the savil/profile-from-flake branch 2 times, most recently from 3bcac9e to 391754f Compare January 3, 2024 01:28
@savil
Copy link
Collaborator Author

savil commented Jan 3, 2024

Currently, running into this CICD error:

       2024/01/03 01:37:52 running command: /nix/var/nix/profiles/default/bin/nix profile install --profile $WORK/.devbox/nix/profile/default --impure --priority 8 /nix/store/cyqwmgbir56zsn600z0sgnvdribnfgpj-process-compose-0.40.2 --extra-experimental-features ca-derivations --option experimental-features nix-command flakes fetch-closure
        don't know how to build these paths:
          /nix/store/cyqwmgbir56zsn600z0sgnvdribnfgpj-process-compose-0.40.2
        error: path '/nix/store/cyqwmgbir56zsn600z0sgnvdribnfgpj-process-compose-0.40.2' is required, but there is no substituter that can build it

@savil savil force-pushed the savil/profile-from-flake branch 15 times, most recently from 5fb249f to 25c6575 Compare January 5, 2024 01:06
@savil savil requested a review from mikeland73 January 5, 2024 01:23
@savil savil requested review from mikeland73 and gcurtis January 11, 2024 17:03
@savil
Copy link
Collaborator Author

savil commented Jan 11, 2024

@mikeland73 @gcurtis PTAL. I think i've addressed the perf concern.

For the scenario we were concerned about (devbox add/rm/update not in a devbox environment) I now skip computeEnv and syncNixProfile entirely, and install into nix store via nix build. We then do the rest of the operations and delete the local.lock file (simplest way to "mark it dirty"), so that the next time we do ensureStateIsUpToDate(mode=ensure) it will recompute the env and syncNixProfile.

This PR is getting rather large. Let me know if you'd like me to split it up.

@savil savil marked this pull request as ready for review January 11, 2024 18:22
// then we skip some operations below for speed.
// Remove the local.lock file so that we re-compute the project state when
// we are in the devbox environment again.
defer func() { _ = d.lockfile.RemoveLocal() }()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this should get moved to the else clause below (where we skip computeEnv and syncNixProfile)

@@ -248,10 +248,20 @@ const (
// The `mode` is used for:
// 1. Skipping certain operations that may not apply.
// 2. User messaging to explain what operations are happening, because this function may take time to execute.
func (d *Devbox) ensureStateIsUpToDate(ctx context.Context, mode installMode) error {
//
// linter:revive is turned off due to complaining about function complexity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really prefer not adding these. The complexity rule forces us to structure our code instead of just growing a function indefinitely.

I already once spent some time cleaning this function up to remove this, so unless we decide to turn this off, it's just shifting the work to a future developer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, some context:

  1. I wanted to see if the code would pass all cicd tests, and didn't want to do big refactoring before getting that information.
  2. I like this linter too, but its a bit vague about how its rules work. How does it decide what is complex? How do i refactor my code?
  3. I then didn't invest time in removing this, since I was awaiting feedback on the general approach from you and Greg.

Comment on lines +257 to +263
if mode != ensure && !d.IsEnvEnabled() {
// if mode is install/uninstall/update and we are not in a devbox environment,
// then we skip some operations below for speed.
// Remove the local.lock file so that we re-compute the project state when
// we are in the devbox environment again.
defer func() { _ = d.lockfile.RemoveLocal() }()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defer logic for non clean up tasks makes code much harder to debug. Instead I would suggest just leaving local cache dirty. I'll add more details in other comments.

Comment on lines +328 to +331
// Else: if we are not in a devbox environment, and we are installing or updating
// then we must ensure the new nix packages are in the nix store. This way, the
// next time we enter a devbox environment, we will have the packages available locally.
if err := d.installNixPackagesToStore(ctx); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to validate and also install nix packages to store? Can't we just install nix packages above? (where you area validating?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we may be installing plugins, I thought it should come after the plugin manager create, and remove plugin invalid symlinks.

Will see if I can combine validatePackagesToBeInstalled with nix build and move that below the pluginmanager setup functions.

Comment on lines +310 to +325
// the correct set of packages are reflected in the nix-profile below.
env, err := d.computeEnv(ctx, false /*usePrintDevEnvCache*/)
if err != nil {
return err
}

// Ensure the nix profile has the packages from the flake.
buildInputs := []string{}
if env["buildInputs"] != "" {
// env["buildInputs"] can be empty string if there are no packages in the project
// if buildInputs is empty, then we don't want wantStorePaths to be an array with a single "" entry
buildInputs = strings.Split(env["buildInputs"], " ")
}
if err := d.syncNixProfile(ctx, buildInputs); err != nil {
return err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like everything here should be part of syncNixProfile

Comment on lines +61 to +63
if args.Offline {
cmd.Args = append(cmd.Args, "--offline")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are substituters? and are we ok disabling them?

Comment on lines +280 to +282
// Validate packages. Must be run up-front and definitely prior to computeEnv
// and syncNixProfile below that will evaluate the flake and may give
// inscrutable errors if the package is uninstallable.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This make sense for ensure cases, but not for add/remove cases. In that case we just run nix build which will presumably fail if a package is not valid.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will see if that may work

Comment on lines +87 to +89
func (i *NixProfileListItem) StorePaths() []string {
return i.nixStorePaths
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just make this exported instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since everything in this struct is private and controlled, will prefer leaving it this way

Lockfile *lock.File
Package string
Installable string
Offline bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the offline option if we use nix build instead of profile install for individual packages? In that case I think offline is always true.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nix build has its own --offline flag.

We do need the --offline flag for the reasons I've explained elsewhere

func (d *Devbox) ensureStateIsUpToDate(ctx context.Context, mode installMode) error {
//
// linter:revive is turned off due to complaining about function complexity
func (d *Devbox) ensureStateIsUpToDate(ctx context.Context, mode installMode) error { //nolint:revive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the logic an flow of this function can be improved (and maybe that can help split PRs) to make this lower risk. Proposal:

  • Split lockfile.Save() into lockfile.Save() and lock.UpdateAndSaveStateHash(). This should actually live in different packages, but baby steps. The first function saves the lockfile, the other only deals with the hash file.
  • define mode == ensure || d.IsEnvEnabled() as recomputeLocalState
  • if !recomputeLocalState then we only add and build packages, or remove.
  • if recomputeLocalState we syncFlake.
  • if recomputeLocalState we call lock.UpdateAndSaveStateHash()
  • we always save lockfile at the end.

In terms of splitting PRs we could split out the lockfile functions and do that first. Followed by most of the rest

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the lockfile.save function is:

func (f *File) Save() error {
	if err := cuecfg.WriteFile(lockFilePath(f.devboxProject), f); err != nil {
		return err
	}
	return updateLocal(f.devboxProject)
}

splitting this is a pretty small change. Here I'll do it in this comment itself:

func (f *File) Save() error {
	return cuecfg.WriteFile(lockFilePath(f.devboxProject), f)
}
func (f *File) UpdateAndSaveStateHash() error {
	return updateLocal(f.devboxProject)
}

It feels a bit unnecessary to make a separate PR just for this...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like the overall breakdown of the logic you present. Let me refactor into that.

@savil savil force-pushed the savil/profile-from-flake branch from 8b7d5a8 to 8521fa4 Compare January 12, 2024 00:04
@savil savil changed the base branch from main to savil/nix-storepathparts January 12, 2024 00:04
Base automatically changed from savil/nix-storepathparts to main January 12, 2024 00:18
savil added a commit that referenced this pull request Jan 12, 2024
## Summary

This should be in the `nix` package

Seeking to reduce the code in #1692.

## How was it tested?

compiles, and tests should pass
@savil
Copy link
Collaborator Author

savil commented Jan 13, 2024

re-written in #1724 and the PR stack below that.

@savil savil closed this Jan 13, 2024
savil added a commit that referenced this pull request Jan 16, 2024
…ompute state (#1723)

## Summary

Refactor the implementation of `ensureStateIsUpToDate` as per the
suggestion in this comment:
#1692 (comment)

1. Split `lockfile.Save` into `lockfile.Save` and
`lockfile.UpdateAndSaveStateHash`
2. For now, regardless of `recomputeState` (boolean variable), we always
call `recomputeState` (the function). In the next PR, this will change.

NOTE:
In `pluginManager.create`, I only do `lockfile.Save` and not
`lockfile.UpdateAndSaveLocal` (as before), because:
When I do `devbox add php81Extensions.ds` then the `lockfile.Save` in
`pluginManager.Create` here is resulting in the `local.lock` being
up-to-date so that the next `devbox run` is thinking everything is up to
date and early returning in `ensureStateIsUpToDate`.

## How was it tested?

```
devbox shell
devbox add hello
devbox rm hello
```

tests should pass.
savil added a commit that referenced this pull request Jan 17, 2024
## Summary

This PR is a re-write of #1692. See motivation and discussion there.

**Motivation**
This PR seeks to unify our nix print-dev-env and nix profile steps.
Other benefits:
- It will enable us to remove some hacks (example, this one [introduced
for
glibc-patch](https://github.com/jetpack-io/devbox/blob/d66f6ea5388adb3b90008c64a1b9b9abdc4550cd/internal/devbox/devbox.go#L938-L944))
- Re-enable some testscripts that were disabled (example, this [one for
php extension
removal](https://github.com/jetpack-io/devbox/blob/main/testscripts/languages/php.test.txt#L4-L15))

Previously, these would be two separate processes: we'd manually do nix
profile install.
Now, we get the `buildInputPaths` from `nix print-dev-env` and install
those, so the nix profile is derived from the flake.

**Implementation notes**
1. In `ensureStateIsUpToDate`, I had to modify the logic in #1724: call
`d.installPackages` that creates plugin files, installs runx and nix
packages. Also, move all lockfile operations to the end for _all_ modes,
instead of just for `recomputeState == true`.
2. The `nix build` operation now happens with the `[1/3] [email protected]:
Success` output that previously would happen with `nix profile install`.
3. Instead of `nixprofiles.NixProfileInstall`, we first install via `nix
build` to ensure the package is installed in the local nix store. We
then call `devbox.syncNixProfile` to get the `buildInputs` from
`devbox.computeEnv` and `nix profile install $buildInputs`.



## How was it tested?

- CICD tests pass
- Did basic sanity checking with `devbox add hello cowsay` and `devbox
rm hello cowsay`.
- Used the Devbox binary for development for a while.



To verify that we can install packages in `/nix/store` during `devbox
add`, and then use that during offline in `devbox shell`:
1. Do `devbox add [email protected]`
2. Do `devbox shell` and exit, or just `devbox install`.
3. Turn Wifi off, and then open `devbox shell` and verify that the newly
installed version of `python` is used.
BEFORE: we could skip step 2, because we directly installed packages in
the nix profile
AFTER: step 2 is required because we re-evaluate the flake's environment
and derive nix profile from it.
savil added a commit that referenced this pull request Jan 17, 2024
…1721)

## Summary

This PR reads the `outputs` field from the Config for a package.

For packages with non-default outputs in the config, we specify:
1. They are not to be taken from the binary cache
2. They are exempt from the `flakeInput.buildInputs()` that are
specified in the generated flake's devshell.buildInputs

Instead, we generate a [`symlinkJoin`
derivation](https://nixos.org/manual/nixpkgs/stable/#trivial-builder-symlinkJoin)
of the form:
```
            (pkgs.symlinkJoin {
              name = "prometheus-combined";
              paths = [
                # each output is printed here
                nixpkgs-fd04be-pkgs.prometheus.cli
                nixpkgs-fd04be-pkgs.prometheus.out
              ];
            })
```

This `prometheus-combined` derivation is useful so that when we derive
the nix-profile from the flake's buildInputs as in #1692, it is treated
as a single package. Without this, we could inline each output as a
distinct buildInput in the flake's devshell, but then the nix profile
would treat each output as its own package.

## How was it tested?


Added testscript unit-test. Also, ran the same steps inside the
testscript manually.

setup:
```
# add package with multiple outputs
devbox add prometheus --outputs cli,out

# add package for a specific nixpkgs commit hash (as a control)
devbox add github:nixos/nixpkgs/fd04bea4cbf76f86f244b9e2549fca066db8ddff#hello
```
then the generated flake is:
```
❯ cat .devbox/gen/flake/flake.nix
{
   description = "A devbox shell";

   inputs = {
     nixpkgs.url = "github:NixOS/nixpkgs/fd04bea4cbf76f86f244b9e2549fca066db8ddff";
     nixpkgs-fd04be.url = "github:NixOS/nixpkgs/fd04bea4cbf76f86f244b9e2549fca066db8ddff";
     gh-nixos-nixpkgs-fd04bea4cbf76f86f244b9e2549fca066db8ddff.url = "github:NixOS/nixpkgs/fd04bea4cbf76f86f244b9e2549fca066db8ddff";
   };

   outputs = {
     self,
     nixpkgs,
     nixpkgs-fd04be,
     gh-nixos-nixpkgs-fd04bea4cbf76f86f244b9e2549fca066db8ddff,
   }:
      let
        pkgs = nixpkgs.legacyPackages.x86_64-darwin;
        nixpkgs-fd04be-pkgs = (import nixpkgs-fd04be {
          system = "x86_64-darwin";
          config.allowUnfree = true;
          config.permittedInsecurePackages = [
          ];
        });
        gh-nixos-nixpkgs-fd04bea4cbf76f86f244b9e2549fca066db8ddff-pkgs = (import gh-nixos-nixpkgs-fd04bea4cbf76f86f244b9e2549fca066db8ddff {
          system = "x86_64-darwin";
          config.allowUnfree = true;
          config.permittedInsecurePackages = [
          ];
        });
      in
      {
        devShells.x86_64-darwin.default = pkgs.mkShell {
          buildInputs = [
            (builtins.fetchClosure {
              fromStore = "https://cache.nixos.org";
              fromPath = "/nix/store/0djljz0g4s6f55xcnw7fpzcy7af7rxid-go-1.21.4";
              inputAddressed = true;
            })




            (pkgs.symlinkJoin {
              name = "prometheus-combined";
              paths = [
                nixpkgs-fd04be-pkgs.prometheus.cli
                nixpkgs-fd04be-pkgs.prometheus.out
              ];
            })
            gh-nixos-nixpkgs-fd04bea4cbf76f86f244b9e2549fca066db8ddff-pkgs.hello
          ];
        };
      };
 }
```

this leads to `$buildInputs` in a `devbox shell` to be:
```
❯ echo $buildInputs
/nix/store/0djljz0g4s6f55xcnw7fpzcy7af7rxid-go-1.21.4 /nix/store/9dn3rzv04x63n0kz2jwpgz82rdlsa56h-prometheus-combined /nix/store/4xy6iv0ph2v6w7n06cw5ra7cmyvignkm-hello-2.12.1
```

where the `prometheus-combined` derivation has the `cli` and `out`
outputs combined:
```
❯ ls -al /nix/store/9dn3rzv04x63n0kz2jwpgz82rdlsa56h-prometheus-combined/bin/
total 0
dr-xr-xr-x 4 root wheel 128 Dec 31  1969 .
dr-xr-xr-x 4 root wheel 128 Dec 31  1969 ..
lrwxr-xr-x 1 root wheel  76 Dec 31  1969 prometheus -> /nix/store/hizpsf2f5gc7l810328382xicjb9gc73-prometheus-2.48.1/bin/prometheus
lrwxr-xr-x 1 root wheel  78 Dec 31  1969 promtool -> /nix/store/8x6psdqn3945pbs0ww5wanmfhvvz2iyl-prometheus-2.48.1-cli/bin/promtool
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants