Skip to content
This repository was archived by the owner on Mar 27, 2024. It is now read-only.

Try to use local rpm binary to query rpmdb #244

Merged
merged 2 commits into from
Jun 26, 2018

Conversation

davidcassany
Copy link
Contributor

With this PR the RPMAnalyzer first tries to execute local rpm, if any, to parse the image rpm database. In case of failure it fallbacks to running the image in a container to execute the rpm shipped within the image itself.

If running container-diff on a host with an rpm based distro the diff is going to be faster and without the need of a running docker daemon.

Signed-off-by: David Cassany [email protected]

With this commit the rpm differ first tries to execute local rpm
to parse the image rpm database. In case of failure it fallbacks to
running the image in a container to execute the rpm shipped within
the image itself.

Signed-off-by: David Cassany <[email protected]>
Copy link
Contributor

@nkubala nkubala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidcassany Thanks for the contribution! A couple of comments, but overall this looks good. Out of curiosity, did you do any benchmarking for the performance increase when the host has an RPM binary installed on it?

return rpmDataFromContainer(image)
packages, err := rpmDataFromImageFS(image)
if err != nil {
logrus.Warn("Trying to run the RPM binary of the image in a container")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's log this at Info level

return rpmDataFromContainer(image)
packages, err := rpmDataFromImageFS(image)
if err != nil {
logrus.Warn("Trying to run the RPM binary of the image in a container")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "Running RPM binary from image in a container"

// rpmDBPath tries to get the RPM database path from the /usr/lib/rpm/macros
// file in the image rootfs.
func rpmDBPath(rootFSPath string) (string, error) {
rpmMacros, err := os.Open(filepath.Join(rootFSPath, "usr/lib/rpm/macros"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this path a constant

// rpmDBPath tries to get the RPM database path from the /usr/lib/rpm/macros
// file in the image rootfs.
func rpmDBPath(rootFSPath string) (string, error) {
rpmMacros, err := os.Open(filepath.Join(rootFSPath, "usr/lib/rpm/macros"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible that the RPM database could have been put in a different location by the creator of the image? are there other common locations that we might want to check here as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we are looking for the value of %_dbpath macro which sets the path of the database.

is it possible that the RPM database could have been put in a different location by the creator of the image?

Yes, it could be the case that the user sets its own %_dbpath in a custom user defined macro. As far as I saw in fedora, opensuse and centos, the %_dbpath defined in /usr/lib/rpm/macros is not overwritten in any of them. According to the man pages RPM looks for macros in:

The default FILELIST is /usr/lib/rpm/macros:/usr/lib/rpm/macros.d/macros.*:/usr/lib/rpm/platform/%{_target}/macros:/usr/lib/rpm/fileattrs/*.attr:/usr/lib/rpm/redhat/macros:/etc/rpm/macros.*:/etc/rpm/macros:/etc/rpm/%{_target}/macros:~/.rpmmacros

The macros from the image could be loaded by using the --macros rpm flag and passing the appropriate paths list. However I encountered that this flag expects a colon separated list of paths and I could not manage to escape colons that they are part of the path we want to provide (i.e providing --macros opensuse@sha256:8e70aa9ec2/usr/lib/rpm/macros is understood as two separate paths 🤦‍♂️ and I could not manage to escape the colons).

That's the reason to manually parse the /usr/lib/rpm/macros file to figure out the rpmdb path. Note that then is verified the path exists and if the command fails it falls back to running the container. I tested this approach of parsing the main macros file in opensuse, centos and fedora with success.

line := strings.TrimSpace(scanner.Text())
if strings.HasPrefix(line, "%_dbpath") {
fields := strings.Fields(line)
if len(fields) < 2 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add a comment with an example of what you're trying to match for here? maybe a regex would be more clear?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a comment with the example, I think this is the clearest.

@davidcassany
Copy link
Contributor Author

did you do any benchmarking for the performance increase when the host has an RPM binary installed on it?

I just did some tests with time container-diff analyze daemon://opensuse/leap --type=rpm resulted in

real    0m11.208s
user    0m10.651s
sys     0m1.626s

without the patch. Once the patch is applied it resulted in

real    0m4.457s
user    0m4.181s
sys     0m0.513s

Got similar improvements using fedora and centos images.

@davidcassany
Copy link
Contributor Author

@nkubala thanks much for the review! I think I addressed your concerns

Copy link
Contributor

@nkubala nkubala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!

@nkubala nkubala merged commit dfb276a into GoogleContainerTools:master Jun 26, 2018
@davidcassany davidcassany deleted the rpm_differ_update branch June 27, 2018 07:15
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants