Fix #77565: Incorrect locator detection in ZIP-based phars #6507
We must not assume that the first end of central directory signature in a ZIP archive actually designates the end of central directory record, since the data in the archive may contain arbitrary byte patterns. Thus, we should search from the end of the data, which is also slightly more efficient.
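To illustrate the idea of searching from the end, here is a minimal sketch in C. It is not the code from this PR: `find_last_eocd_sig` is a hypothetical helper that returns the last occurrence of the 4-byte signature `PK\5\6` in a buffer, which is the candidate a backwards search would see first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of php-src): scan backwards through
 * s[0..n) and return the last occurrence of the end of central
 * directory signature "PK\5\6", or NULL if there is none. */
static const char *find_last_eocd_sig(const char *s, size_t n)
{
    static const char sig[4] = {'P', 'K', 5, 6};

    assert(s != NULL || n == 0);
    if (n < sizeof(sig)) {
        return NULL;
    }
    /* i runs from n - 4 down to 0; the post-decrement in the
     * condition avoids unsigned wrap-around. */
    for (size_t i = n - sizeof(sig) + 1; i-- > 0; ) {
        if (memcmp(s + i, sig, sizeof(sig)) == 0) {
            return s + i;
        }
    }
    return NULL;
}
```

For a buffer that contains the signature twice, this returns the later match, whereas a forward search would stop at the earlier one.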
@@ -161,6 +161,12 @@ static void phar_zip_u2d_time(time_t time, char *dtime, char *ddate) /* {{{ */
}
/* }}} */

static char *phar_find_eocd(const char *s, size_t n)
{
	const char *end = s + n + sizeof("PK\5\6") - 1 - sizeof(phar_zip_dir_end);
Why is `end` not just `s + n`?
Because that marker string might (theoretically) be part of the directory record. This code makes sure that we really get the start of the end of central directory record.
I don't think this really guarantees it either ... say it's at the start of a 255 byte trailing comment. The `-sizeof(phar_zip_dir_end)` won't skip over that. Or am I misunderstanding what you mean here?
Ugh, you're right! It seems to me that the only way to reliably detect the end of central directory header would be to read through all headers and data from the beginning of the file. Anyhow, I'm going to commit a mitigation for the current approach; maybe this is reasonably sufficient? With that change, two tests fail due to different errors; these would need to be fixed if we're going that route.
This approach looks okay to me. Personally I'd start at `end = s + n` and then check `eocd_start + sizeof(phar_zip_dir_end) <= p + n` before accessing `comment_len` ... your current code is safe, but it took me a moment to understand that this is guaranteed due to the used start position.
Yeah, I see that might be confusing; I added a corresponding assertion, and also adapted the tests.
There is no way to detect the end of central directory signature by searching from the end of the ZIP archive with absolute certainty, since the signature could be part of the trailing comment. To mitigate this, we check that the comment length is consistent with the found position, but that might still not be the correct position in rare cases.
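The mitigation described above could look roughly like the following sketch. It is not the code committed in this PR: `find_eocd_checked` and the two macros are hypothetical names, and the sketch assumes that "consistent" means the fixed 22-byte record plus its comment ends exactly at the end of the data. The 22-byte record size and the little-endian comment length at offset 20 come from the ZIP format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EOCD_FIXED_SIZE      22 /* fixed part of the EOCD record */
#define EOCD_COMMENT_LEN_OFF 20 /* 16-bit little-endian comment length */

/* Hypothetical helper (not part of php-src): search backwards for an
 * EOCD signature whose comment length places the end of the record
 * exactly at the end of the buffer. Returns NULL if none is found. */
static const char *find_eocd_checked(const char *s, size_t n)
{
    static const char sig[4] = {'P', 'K', 5, 6};

    if (n < EOCD_FIXED_SIZE) {
        return NULL;
    }
    /* Start at the last offset where a complete record still fits. */
    for (size_t i = n - EOCD_FIXED_SIZE + 1; i-- > 0; ) {
        const unsigned char *p = (const unsigned char *)s + i;
        size_t comment_len;

        if (memcmp(p, sig, sizeof(sig)) != 0) {
            continue;
        }
        /* Guaranteed by the start position; stated explicitly. */
        assert(i + EOCD_FIXED_SIZE <= n);
        comment_len = (size_t)p[EOCD_COMMENT_LEN_OFF]
                    | ((size_t)p[EOCD_COMMENT_LEN_OFF + 1] << 8);
        if (i + EOCD_FIXED_SIZE + comment_len == n) {
            return s + i; /* plausible, but not certain */
        }
    }
    return NULL;
}
```

A signature inside the trailing comment is skipped here unless the two bytes at its comment length offset happen to match the remaining data size, which is the rare case in which the result is still wrong.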