-
Notifications
You must be signed in to change notification settings - Fork 237
feat: support emerge packages analyzer #337
feat: support emerge packages analyzer #337
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with What to do if you already signed the CLAIndividual signers
Corporate signers
ℹ️ Googlers: Go here for more info. |
@googlebot I signed it! |
CLAs look good, thanks! ℹ️ Googlers: Go here for more info. |
4be029c
to
d886a6c
Compare
Hi, this PR has been widely tested and used in our production cluster. Hope someone could review this PR and merge it. If any concern, please let me know, thanks : ) |
} | ||
|
||
func getPkgSize(pkgPath string) int64 { | ||
var sizeFile *os.File |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I am missing something, but can this just use os.Stat to get the file size? https://gist.github.com/miguelmota/08bca5febf2ce460a289ef14c7e9af4f
If not, maybe a comment that says
// getPkgSize does X because it's special
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this SIZE file and created by emerge, since emerge will count the total size of a package and store it as a SIZE file in packages metadata directory.
So getPkgSize
reads the file to get the size of a packge.
And it's quite complex using os.Stat
to get the file size of a package, because all dependency files are scattered. For example, if installing dev-python/mysqlclient-3.0.0
, it will generate a link file /usr/lib/_mysql.so
, client codes in /usr/lib64/python3.7/site-packages/mysqlclient/
and python egg file in /usr/lib64/python3.7/
. Obviously, it's not easy to get size of all these files.
differs/emerge_diff.go
Outdated
fileBody, err := ioutil.ReadAll(sizeFile) | ||
if err != nil { | ||
logrus.Debugf("unable to read SIZE for pkg %s", pkgPath) | ||
return 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Masking the error seems weird here. I think err should be returned rather than an inaccurate size given.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense, I will fix this.
8a3c8f1
to
b41bcb2
Compare
Signed-off-by: Lexus Lee <[email protected]>
b41bcb2
to
676578a
Compare
@tstromberg Updated, any other comments on this PR? It would be appreciated if you could take a look, thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey @thehackercat, i'm almost embarrassed to be getting back to this PR so late 😬 our team doesn't work much on this project anymore so it's sometimes hard to find time to review PRs, but we really appreciate the contribution!
i've left several comments on here mostly around readability of the code, but the logic of this looks good. i'm going to merge this and then address a few of the changes in a follow up PR myself, and then i'll cut a v1.16.0 release so this will get shipped and you can use it in an official release!
thanks again for the contribution, and again very sorry we left you hanging for so long. we rely on contributors like you to keep this project going, so i just want to personally say thank you and i really appreciate it 🙏
@@ -0,0 +1,123 @@ | |||
/* | |||
Copyright 2018 Google, Inc. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should be 2020. i'll merge this and fix myself, so we don't have to go back and forth.
type EmergeAnalyzer struct { | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
type EmergeAnalyzer struct { | |
} | |
type EmergeAnalyzer struct {} |
"github.com/sirupsen/logrus" | ||
) | ||
|
||
//Emerge package database location |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
//Emerge package database location | |
// Emerge package database location |
var path string | ||
path = image.FSPath |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: in go, it's more idiomatic to say:
var path string | |
path = image.FSPath | |
path := image.FSPath |
switch path { | ||
case "": | ||
path = emergePkgFile | ||
default: | ||
path = filepath.Join(path, emergePkgFile) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this switch statement is really a two-case if statement in disguise, so it decreases the readability. i think it's easier to read if you just say:
switch path { | |
case "": | |
path = emergePkgFile | |
default: | |
path = filepath.Join(path, emergePkgFile) | |
} | |
if path == "" { | |
path = emergePkgFile | |
} else { | |
path = filepath.Join(path, emergePkgFile) | |
} |
return packages, err | ||
} | ||
|
||
for i := 0; i < len(contents); i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in go, you can iterate directly over the items in a slice:
for i := 0; i < len(contents); i++ { | |
for _, content := range contents { |
Hi, we are using container-diff to diff our production images in every sdk release. Thanks that it helps a lot to us making our sdk image CHANGELOG.
But our images base on Gentoo/Linux, so most of packages are managed by
emerge
. According to Porage Uncyclo, packages installed byemerge
all locate at/var/db/pkg/
with same directory foramt{package-group}/{package-name}-{package-version}
, such as/var/db/pkg/app-crypt/gnupg-2.2.20
It isn't hard to get metadata of packages in
/var/db/pkg/
, like revision, size, build time etc. So following this make-your-own-differ suggestion, I add a emerge-differ.It works properly on gentoo-based images. Here is one of outputs of our image diffed by
emerge-differ
@nkubala hope this will help, any comments?