Skip to content

[nar info cache] Only fillNarInfoCache in perf-sensitive code path #1468

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 12, 2023
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion internal/boxcli/featureflag/remove_nixpkgs.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,4 +4,4 @@ package featureflag
// It leverages the search index to directly map <package>@<version> to
// the /nix/store/<hash>-<package>-<version> that can be fetched from
// cache.nixpkgs.org.
var RemoveNixpkgs = disable("REMOVE_NIXPKGS")
var RemoveNixpkgs = enable("REMOVE_NIXPKGS")
64 changes: 41 additions & 23 deletions internal/devpkg/narinfo_cache.go
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,8 @@ var isNarInfoInCache = struct {
}

// IsInBinaryCache returns true if the package is in the binary cache.
// ALERT: Callers must call FillNarInfoCache before calling this function.
// ALERT: Callers in a perf-sensitive code path should call FillNarInfoCache
// before calling this function.
func (p *Package) IsInBinaryCache() (bool, error) {

if eligible, err := p.isEligibleForBinaryCache(); err != nil {
Expand All @@ -47,41 +48,61 @@ func (p *Package) IsInBinaryCache() (bool, error) {

// Check if the narinfo is present in the binary cache
isNarInfoInCache.lock.RLock()
exists, ok := isNarInfoInCache.status[p.Raw]
status, statusExists := isNarInfoInCache.status[p.Raw]
isNarInfoInCache.lock.RUnlock()
if !ok {
return false, errors.Errorf(
"narInfo cache miss: %v. Call FillNarInfoCache before invoking IsInBinaryCache",
p.Raw,
)
if !statusExists {
// Fallback to synchronously filling the nar info cache
if err := p.fillNarInfoCache(); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For potentially simpler code. By definition, fillNarInfoCache() only returns a nil error if it succeeded. So you could do:

err := p.fillNarInfoCache()
return err == nil, err

Additionally, if fillNarInfoCache does

if isNarInfoInCache.status[p.Raw] {
  return nil
}

(no need to lock because we never remove from cache)

Then this entire function could be:

func IsInBinaryCache()
  if eligible, err := p.isEligibleForBinaryCache(); err != nil {
    return false, err
  } else if !eligible {
    return false, nil
  }
  err := p.fillNarInfoCache()
  return err == nil, err
}

(maybe rename fillNarInfoCache to fillNarInfoCacheIfNeeded for clarity)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, this doesn't quite work for two reasons:

  1. fillNarInfoCache stores the boolean of whether the value is stored or not. A false value for a key in isNarInfoInCache does not indicate an error.

  2. Golang maps are not safe for concurrent reads and writes, so we will need a lock.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@savil you are correct on (1) but not on (2).

For (2), reading a true is only unsafe if we remove and/or change values. Since we should never remove/change values within a single command, I think this is safe and improves code readability quite a bit.

For (1), we can modify fillNarInfoCache() to return (status, error). The early return continues to be safe because if something is set to true it never changes.

An additional benefit of this approach is that we limit cache access to a single function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clarified why (2) is correct in #1473 (comment)

return false, err
}

// Check again
isNarInfoInCache.lock.RLock()
status, statusExists = isNarInfoInCache.status[p.Raw]
isNarInfoInCache.lock.RUnlock()
if !statusExists {
return false, errors.Errorf(
"narInfo cache miss: %v. Should be filled by now",
p.Raw,
)
}
}
return exists, nil
return status, nil
}

// FillNarInfoCache checks the remote binary cache for the narinfo of each
// package in the list, and caches the result.
// Callers of IsInBinaryCache must call this function first.
// Callers of IsInBinaryCache may call this function first as a perf-optimization.
func FillNarInfoCache(ctx context.Context, packages ...*Package) error {
if !featureflag.RemoveNixpkgs.Enabled() {
return nil
}

eligiblePackages := []*Package{}
for _, p := range packages {
// NOTE: isEligibleForBinaryCache also ensures the package is
// resolved in the lockfile, which must be done before the concurrent
// section in this function below.
isEligible, err := p.isEligibleForBinaryCache()
// If the package is not eligible or there is an error in determining that, then skip it.
if isEligible && err == nil {
eligiblePackages = append(eligiblePackages, p)
}
}
if len(eligiblePackages) == 0 {
return nil
}

// Pre-compute values read in fillNarInfoCache
// so they can be read from multiple go-routines without locks
_, err := nix.Version()
if err != nil {
return err
}
_ = nix.System()
for _, p := range packages {
_, err := p.lockfile.Resolve(p.Raw)
if err != nil {
return err
}
}

group, _ := errgroup.WithContext(ctx)
for _, p := range packages {
for _, p := range eligiblePackages {
// If the package's NarInfo status is already known, skip it
isNarInfoInCache.lock.RLock()
_, ok := isNarInfoInCache.status[p.Raw]
Expand All @@ -105,15 +126,11 @@ func FillNarInfoCache(ctx context.Context, packages ...*Package) error {
}

// fillNarInfoCache fills the cache value for the narinfo of this package,
// if it is eligible for the binary cache.
// assuming it is eligible for the binary cache. Callers are responsible
// for checking isEligibleForBinaryCache before calling this function.
//
// NOTE: this must be concurrency safe.
func (p *Package) fillNarInfoCache() error {
if eligible, err := p.isEligibleForBinaryCache(); err != nil {
return err
} else if !eligible {
return nil
}

sysInfo, err := p.sysInfoIfExists()
if err != nil {
return err
Expand Down Expand Up @@ -146,6 +163,7 @@ func (p *Package) fillNarInfoCache() error {
return nil
}

// isEligibleForBinaryCache returns true if the package is eligible for the binary cache.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol, I'll improve this to be less self-referential.

func (p *Package) isEligibleForBinaryCache() (bool, error) {
sysInfo, err := p.sysInfoIfExists()
if err != nil {
Expand Down
3 changes: 0 additions & 3 deletions internal/devpkg/package.go
Original file line number Diff line number Diff line change
Expand Up @@ -427,9 +427,6 @@ func (p *Package) LegacyToVersioned() string {
// EnsureNixpkgsPrefetched will prefetch flake for the nixpkgs registry for the package.
// This is an internal method, and should not be called directly.
func EnsureNixpkgsPrefetched(ctx context.Context, w io.Writer, pkgs []*Package) error {
if err := FillNarInfoCache(ctx, pkgs...); err != nil {
return err
}
for _, input := range pkgs {
if err := input.ensureNixpkgsPrefetched(w); err != nil {
return err
Expand Down
34 changes: 3 additions & 31 deletions internal/impl/packages.go
Original file line number Diff line number Diff line change
Expand Up @@ -42,39 +42,19 @@ func (d *Devbox) Add(ctx context.Context, platforms, excludePlatforms []string,
// replace it.
pkgs := devpkg.PackageFromStrings(lo.Uniq(pkgsNames), d.lockfile)

// Fill in narinfo cache for all packages, even if the package-names are bogus
// (we'll just not use the result later)
if err := devpkg.FillNarInfoCache(ctx, pkgs...); err != nil {
return err
}

// addedPackageNames keeps track of the possibly transformed (versioned)
// names of added packages (even if they are already in config). We use this
// to know the exact name to mark as allowed insecure later on.
addedPackageNames := []string{}
existingPackageNames := d.PackageNames()
newPackages := []*devpkg.Package{}
for _, pkg := range pkgs {
// If exact versioned package is already in the config, we can skip the
// next loop that only deals with newPackages.
if slices.Contains(existingPackageNames, pkg.Versioned()) {
// But we still need to add to addedPackageNames. See its comment.
addedPackageNames = append(addedPackageNames, pkg.Versioned())
} else {
newPackages = append(newPackages, pkg)
continue
}
}

// Fill in the narinfo cache for versioned newPackages as well
versionedPackages := map[string]*devpkg.Package{}
for _, pkg := range newPackages {
versionedPackages[pkg.Versioned()] = devpkg.PackageFromString(pkg.Versioned(), d.lockfile)
}
if err := devpkg.FillNarInfoCache(ctx, lo.Values(versionedPackages)...); err != nil {
return err
}

for _, pkg := range newPackages {

// On the other hand, if there's a package with same canonical name, replace
// it. Ignore error (which is either missing or more than one). We search by
Expand All @@ -89,7 +69,7 @@ func (d *Devbox) Add(ctx context.Context, platforms, excludePlatforms []string,

// validate that the versioned package exists in the search endpoint.
// if not, fallback to legacy vanilla nix.
versionedPkg := versionedPackages[pkg.Versioned()]
versionedPkg := devpkg.PackageFromString(pkg.Versioned(), d.lockfile)

packageNameForConfig := pkg.Raw
ok, err := versionedPkg.ValidateExists()
Expand Down Expand Up @@ -363,12 +343,7 @@ func (d *Devbox) removePackagesFromProfile(ctx context.Context, pkgs []string) e
return err
}

packages := devpkg.PackageFromStrings(pkgs, d.lockfile)
if err := devpkg.FillNarInfoCache(ctx, packages...); err != nil {
return err
}

for _, pkg := range packages {
for _, pkg := range devpkg.PackageFromStrings(pkgs, d.lockfile) {
index, err := nixprofile.ProfileListIndex(&nixprofile.ProfileListIndexArgs{
Lockfile: d.lockfile,
Writer: d.writer,
Expand Down Expand Up @@ -438,9 +413,6 @@ func (d *Devbox) pendingPackagesForInstallation(ctx context.Context) ([]*devpkg.
if err != nil {
return nil, err
}
if err := devpkg.FillNarInfoCache(ctx, packages...); err != nil {
return nil, err
}
for _, pkg := range packages {
_, err := nixprofile.ProfileListIndex(&nixprofile.ProfileListIndexArgs{
Items: items,
Expand Down
4 changes: 0 additions & 4 deletions internal/impl/update.go
Original file line number Diff line number Diff line change
Expand Up @@ -49,10 +49,6 @@ func (d *Devbox) Update(ctx context.Context, pkgs ...string) error {
}
}

if err := devpkg.FillNarInfoCache(ctx, pendingPackagesToUpdate...); err != nil {
return err
}

for _, pkg := range pendingPackagesToUpdate {
if _, _, isVersioned := searcher.ParseVersionedPackage(pkg.Raw); !isVersioned {
if err = d.attemptToUpgradeFlake(pkg); err != nil {
Expand Down
6 changes: 0 additions & 6 deletions internal/nix/nixprofile/profile.go
Original file line number Diff line number Diff line change
Expand Up @@ -201,12 +201,6 @@ type ProfileInstallArgs struct {
func ProfileInstall(ctx context.Context, args *ProfileInstallArgs) error {
input := devpkg.PackageFromString(args.Package, args.Lockfile)

// Fill in the narinfo cache for the input package. It's okay to call this for a single package
// because installing is a slow operation anyway.
if err := devpkg.FillNarInfoCache(ctx, input); err != nil {
return err
}

inCache, err := input.IsInBinaryCache()
if err != nil {
return err
Expand Down
9 changes: 2 additions & 7 deletions internal/shellgen/flake_input.go
Original file line number Diff line number Diff line change
Expand Up @@ -77,18 +77,13 @@ func (f *flakeInput) BuildInputs() ([]string, error) {
// i.e. have a commit hash and always resolve to the same package/version.
// Note: inputs returned by this function include plugin packages. (php only for now)
// It's not entirely clear we always want to add plugin packages to the top level
func flakeInputs(ctx context.Context, packages []*devpkg.Package) ([]*flakeInput, error) {
func flakeInputs(ctx context.Context, packages []*devpkg.Package) []*flakeInput {
defer trace.StartRegion(ctx, "flakeInputs").End()

// Use the verbose name flakeInputs to distinguish from `inputs`
// which refer to `nix.Input` in most of the codebase.
flakeInputs := map[string]*flakeInput{}

// Fill the NarInfo Cache so we can check IsInBinaryCache() for each package, below.
if err := devpkg.FillNarInfoCache(ctx, packages...); err != nil {
return nil, err
}

packages = lo.Filter(packages, func(item *devpkg.Package, _ int) bool {
// Include packages (like local or remote flakes) that cannot be
// fetched from a Binary Cache Store.
Expand Down Expand Up @@ -120,5 +115,5 @@ func flakeInputs(ctx context.Context, packages []*devpkg.Package) ([]*flakeInput
}
}

return goutil.PickByKeysSorted(flakeInputs, order), nil
return goutil.PickByKeysSorted(flakeInputs, order)
}
8 changes: 6 additions & 2 deletions internal/shellgen/flake_plan.go
Original file line number Diff line number Diff line change
Expand Up @@ -38,10 +38,14 @@ func newFlakePlan(ctx context.Context, devbox devboxer) (*flakePlan, error) {
if err != nil {
return nil, err
}
flakeInputs, err := flakeInputs(ctx, packages)
if err != nil {

// Fill the NarInfo Cache concurrently as a perf-optimization, prior to invoking
// IsInBinaryCache in flakeInputs() and in the flake.nix template.
if err := devpkg.FillNarInfoCache(ctx, packages...); err != nil {
return nil, err
}

flakeInputs := flakeInputs(ctx, packages)
nixpkgsInfo := getNixpkgsInfo(devbox.Config().NixPkgsCommitHash())

// This is an optimization. Try to reuse the nixpkgs info from the flake
Expand Down
25 changes: 25 additions & 0 deletions testscripts/add/add.test.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,14 @@

exec devbox init

# Add a package that is not part of the Devbox Search index.
# This exercises the fallback codepath for adding packages.
exec devbox add stdenv.cc.cc.lib
json.superset devbox.json expected_devbox1.json

# Add regular packages. Even though this is the more common scenario,
# we test this later, because the source.path below removes "devbox"
# from the PATH.
! exec rg --version
! exec vim --version
exec devbox add ripgrep vim
Expand All @@ -10,9 +18,26 @@ exec devbox shellenv
source.path
exec rg --version
exec vim --version
json.superset devbox.json expected_devbox2.json

-- devbox.json --
{
"packages": [
]
}

-- expected_devbox1.json --
{
"packages": [
"stdenv.cc.cc.lib"
]
}

-- expected_devbox2.json --
{
"packages": [
"ripgrep@latest",
"vim@latest",
"stdenv.cc.cc.lib"
]
}