Skip to content

Bundle list routing + other bugfixes #34

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Mar 15, 2023
Merged

Bundle list routing + other bugfixes #34

merged 13 commits into from
Mar 15, 2023

Conversation

vdye
Copy link
Collaborator

@vdye vdye commented Mar 13, 2023

Fixes #32

While end-to-end testing, I ran into #32; while investigating that bug, I found a couple others. This PR fixes all of the ones I've found, and adds some unit test coverage for good measure.

  • Commit 1 stops the trace2 TraceLogger from suppressing the message intended to be logged to the console in Fatal() & Fatalf().
  • Commit 2 fixes a condition I had inverted; it was intended to prevent the trace2 logger from logging an "error" event in every logger.Error*() call up the call stack (and only log the first it sees), but instead it logged every error except the first one. 🤦‍♀️
  • Commits 3-4 do some refactoring to prepare for unit testing GitHelper
  • Commit 5 fixes an issue in GitHelper.CreateIncrementalBundle(). Specifically, the function was intended to return no error if the Git error message indicated that any new incremental bundle would be empty; that message wasn't captured, so CreateIncrementalBundle() would never see it and always return an error.
  • Commits 6-9 refactor BundleProvider.WriteBundleList to write two bundle lists: one with bundle paths relative to a request URI with a trailing slash (with relative paths like bundle-123456.bundle), and one (new) for URIs with no trailing slash (relative path like myrepo/bundle-123456.bundle). This is the first step to fixing how we serve bundle list requests with no trailing slash.
  • Commits 10-11 refactor bundleServer.serve() to serve the relative-to-org (rather than relative-to-repo) bundle file if there is no trailing slash on the request URI.
  • Commit 12 adds a step to first fetch from the remote when creating an incremental bundle in git-bundle-server update. Without that, no new incremental bundles will ever be generated.
  • Commit 13 adds the bundle.heuristic config to the bundle lists served by the bundle server, ensuring incremental bundles will be applied on fetches after an initial clone.

Other approaches to #32

I also considered generating the bundle list config contents on the fly (by reading the bundle-list.json and formatting the output), but the cost tradeoff of computation for every request vs. a bit of extra disk space seemed to favor the latter.

vdye added 8 commits March 10, 2023 16:25
The 'Trace2.Fatal*()' functions log the specified error to the Trace2
output, but not to the console (unlike 'log.Fatal()'). Replace the call to
'Trace2.Exit()' with 'Trace2.logExit()' + 'log.Fatal*()' to ensure the error
is shown properly.

Signed-off-by: Victoria Dye <[email protected]>
Change the condition determining whether to log the error from 'if isLogged'
to 'if !isLogged', since we want to log the error if it has *not* already
been logged.

Signed-off-by: Victoria Dye <[email protected]>
Make the 'settingType' values used to configure behavior in
'CommandExecutor.Run()' public so that they can be used outside the 'git'
package (namely, in unit tests added in a later patch).

Signed-off-by: Victoria Dye <[email protected]>
Add a 'Reduce()' function to collapse a given slice into a single
accumulator. 'Reduce()' loops over the slice in order and applies the
specified reduce function to the slice element and accumulator. Add unit
tests applying 'Reduce()' to various slice and accumulator types.

Signed-off-by: Victoria Dye <[email protected]>
In a609c0b (git: refactor into interface, 2023-02-24), the function
'GitCommandWithStdin()' was replaced with 'gitCommandWithStdin()'. In that
update, the STDERR log that was previously written to the error message was
removed. Downstream functions (namely 'CreateIncrementalBundle()') relied on
the content of that STDERR message to determine behavior. Add it back to the
error message to correct the behavor of 'CreateIncrementalBundle()'.

Additionally, add unit tests covering 'CreateIncrementalBundle()'.

Signed-off-by: Victoria Dye <[email protected]>
Extract leading directory creation into its own private function
('createLeadingDirs()') in 'filesystem.go'. Although 'WriteFile()' is still
the only usage of this functionality, it could prove useful for functions
added to 'fileSystem' in the future.

Signed-off-by: Victoria Dye <[email protected]>
Add 'WriteLockFileFunc()' to provide a mechanism for updating files
semi-atomically. The function takes a filename and a function (the logic
perform the write to a given file pointer, executes the write function on
'<filename>.lock', then returns a 'LockFile' object. That returned
'LockFile' can then either invoke 'Commit()' (to rename '<filename>.lock' to
'<filename>') or 'Rollback()' (to delete '<filename>.lock'). If
'WriteLockFileFunc()' encounters any error in its operation, it tries to
rollback the lockfile before returning the error.

This function is mainly useful for writing files that may be read
concurrently with the write operation. Writing the file to a lockfile, then
atomically renaming the file avoids avoids the possibility of another
process reading a half-written update to the file.

Example usage:

---
lockFile, err := fileSystem.WriteLockFileFunc("myfile.txt", func(f io.Writer) error {
    f.Write([]byte("my text"))
})
if err != nil {
    return err
}
lockFile.Commit()
---

Signed-off-by: Victoria Dye <[email protected]>
Create a 'NewBundle()' function, which creates a new 'Bundle' instance from
a given repository and timestamp. Replace all manual creation of 'Bundle{}'.

Note that this includes a minor functional change - previously, the base
bundle created by 'CollapseList()' was named 'base-<timestamp>.bundle'; now,
that bundle will be named 'bundle-<timestamp>.bundle', consistent with all
other instances of bundle creation. The name doesn't have any bearing on
bundle server functionality, so the naming convention can be updated without
breaking any existing bundle servers.

Signed-off-by: Victoria Dye <[email protected]>
@vdye vdye added this to the v1.0 milestone Mar 13, 2023
@vdye vdye requested a review from derrickstolee as a code owner March 13, 2023 22:43
@vdye vdye self-assigned this Mar 13, 2023
@vdye vdye requested a review from ldennington as a code owner March 13, 2023 22:43
@vdye
Copy link
Collaborator Author

vdye commented Mar 15, 2023

Upon doing more testing, I've found more bugs (e.g., git-bundle-server update never calls git fetch, so no new bundles will ever be created). I don't think that affects the other bugfixes (those can be reviewed as-is), but I will be pushing more things to this branch in the near future.

vdye added 4 commits March 14, 2023 18:42
The bundle list for a given 'git-bundle-web-server' is intended to be served
from the route 'http[s]://<host>:<port>/<org>/<repo>', with or without a
trailing slash. The bundles listed are provided as paths relative to
'/<org>/<repo>'. However, these relative paths only work when there's a
trailing slash after '<repo>' in the URI; otherwise '<repo>' is interpreted
as a file in directory '<org>'. For example:

- URI: https://localhost:8080/git/git/, bundle relpath: bundle-1.bundle
  -> https://localhost:8080/git/git/bundle-1.bundle
- URI: https://localhost:8080/git/git, bundle relpath: bundle-2.bundle
  -> https://localhost:8080/git/bundle-2.bundle

This behavior is baked into Git, and is consistent with RFC 1808 [1]. We
still want to be able to serve the bundle list when no trailing slash is
given, though, so write out a second bundle list that prepends '<repo>' to
the bundle filename. In a later commit, we will update the web server to
serve this file's contents when appropriate, fixing the routing bug.

As part of this refactor of 'BundleProvider.WriteBundleList()', switch to
using 'FileSystem.WriteLockFileFunc()' to write lock files. Doing so allows
us to mock filesystem writes and unit test 'WriteBundleList()', which is
done here as well.

[1] https://www.rfc-editor.org/rfc/rfc1808#section-4

Signed-off-by: Victoria Dye <[email protected]>
Use 'OpenFile()' + 'ServeContent()', rather than 'ReadFile()' + 'Write()' to
serve bundle and bundle list content. There are a few (fairly minor)
benefits to 'ServeContent()': it does not buffer the full file contents
before writing to the response writer, and it automatically derives the
'Content-Type' and other headers from the file content.

Signed-off-by: Victoria Dye <[email protected]>
Update 'bundleServer.serve()' to serve the correct bundle list file for the
request URI. If the URI has no trailing slash (e.g.
'https://<host>:<port>/<org>/<repo>'), serve the 'repo-bundle-list' file
contents, which prepends '<repo>' to the relative bundle paths. If the URI
has a trailing slash, serve 'bundle-list' as normal.

Also, prevent users from directly reading 'bundle-list' and
'repo-bundle-list' based on the URI; the latter's relative paths would be
incorrect, and in general these files are an internal implementation detail
that should not be visible to users. If a user requests one of these files,
return a 404.

Signed-off-by: Victoria Dye <[email protected]>
Add an invocation of 'git fetch' to 'bundles.CreateIncrementalBundle()'
before running 'git bundle' to ensure that the latest updates to the
repository are captured in the bundle. If this is not done, the repository
will never see any updates, and will never create an incremental bundle.

Signed-off-by: Victoria Dye <[email protected]>
@vdye vdye force-pushed the vdye/misc-bugfixes branch from f7b9aab to 62f35ce Compare March 15, 2023 01:53
Include the 'bundle.heuristic' config (value "creationToken" to denote use
of the creation token heuristic [1]) in the bundle list. Without this, Git
will not use the creation token heuristic to apply incremental bundle
updates on fetches after an initial clone.

Signed-off-by: Victoria Dye <[email protected]>
@vdye vdye force-pushed the vdye/misc-bugfixes branch from adcb3d1 to e53eca7 Compare March 15, 2023 05:01
Copy link
Collaborator

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially had some thoughts about the LockFile interface, but seeing its use (and your rollbackAll()) it makes more sense to me.

The commits are well organized and motivated. Looking forward to the tests that revealed all these bugs!

"github.com/stretchr/testify/mock"
)

var createIncrementalBundleTests = []struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice tests!

return nil
}

func (f *fileSystem) WriteLockFileFunc(filename string, writeFunc func(io.Writer) error) (LockFile, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this both creates the lock file and writes to it, should it also commit the lock file? That would avoid needing to have LockFile be public.

If instead we want to create LockFile in this method, then write to it separately, followed by a Commit() or Rollback() called by the consumer, then that would not need writeFunc.

The main reason to have LockFile public is so we can "take the lock" before spending effort constructing the file contents. We mostly need "WriteAtomicallyToFile()" for our current uses.

I think for our simple cases, the writeFunc approach is correct, but it should write the entire contents and handle the rename.

Comment on lines +141 to +151
var listLockFile, repoListLockFile, jsonLockFile common.LockFile
rollbackAll := func() {
if listLockFile != nil {
listLockFile.Rollback()
}
if repoListLockFile != nil {
repoListLockFile.Rollback()
}
if jsonLockFile != nil {
jsonLockFile.Rollback()
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see here, you want to roll all of these back, if necessary. An interesting use of the LockFile interface. This does change my recommendation from my earlier comment.

` version = 1`,
` mode = all`,
``,
`[bundle "1"]`,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not seen this syntax before. TIL.

Comment on lines +363 to +364
// Fetch latest updates to repo
err := b.gitHelper.UpdateBareRepo(ctx, repo.RepoDir)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh! Don't know how I missed this before. Thanks for catching it.

@vdye vdye merged commit 6c259ad into main Mar 15, 2023
@vdye vdye deleted the vdye/misc-bugfixes branch March 15, 2023 20:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Relative bundle paths do not resolve properly when --bundle-uri has no trailing slash
2 participants