-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[lldb] Add Checksum to FileSpec #71457
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-lldb Author: Jonas Devlieghere (JDevlieghere) ChangesStore a Checksum in FileSpec. Its purpose is to store the MD5 hash added Full diff: https://github.com/llvm/llvm-project/pull/71457.diff 8 Files Affected:
diff --git a/lldb/include/lldb/Utility/Checksum.h b/lldb/include/lldb/Utility/Checksum.h
new file mode 100644
index 000000000000000..90a579b247636ac
--- /dev/null
+++ b/lldb/include/lldb/Utility/Checksum.h
@@ -0,0 +1,36 @@
+//===-- Checksum.h ----------------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLDB_UTILITY_CHECKSUM_H
+#define LLDB_UTILITY_CHECKSUM_H
+
+#include "llvm/Support/MD5.h"
+
+namespace lldb_private {
+class Checksum {
+public:
+ static llvm::MD5::MD5Result sentinel;
+
+ Checksum(llvm::MD5::MD5Result md5 = sentinel);
+ Checksum(const Checksum &checksum);
+ Checksum &operator=(const Checksum &checksum);
+
+ explicit operator bool() const;
+ bool operator==(const Checksum &checksum) const;
+ bool operator!=(const Checksum &checksum) const;
+
+ std::string digest() const;
+
+private:
+ void SetMD5(llvm::MD5::MD5Result);
+
+ llvm::MD5::MD5Result m_checksum;
+};
+} // namespace lldb_private
+
+#endif // LLDB_UTILITY_CHECKSUM_H
diff --git a/lldb/include/lldb/Utility/FileSpec.h b/lldb/include/lldb/Utility/FileSpec.h
index f06a8543a056e87..29506b01c56d40b 100644
--- a/lldb/include/lldb/Utility/FileSpec.h
+++ b/lldb/include/lldb/Utility/FileSpec.h
@@ -13,15 +13,18 @@
#include <optional>
#include <string>
+#include "lldb/Utility/Checksum.h"
#include "lldb/Utility/ConstString.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/FormatVariadic.h"
+#include "llvm/Support/MD5.h"
#include "llvm/Support/Path.h"
#include <cstddef>
#include <cstdint>
+#include <optional>
namespace lldb_private {
class Stream;
@@ -72,7 +75,8 @@ class FileSpec {
/// The style of the path
///
/// \see FileSpec::SetFile (const char *path)
- explicit FileSpec(llvm::StringRef path, Style style = Style::native);
+ explicit FileSpec(llvm::StringRef path, Style style = Style::native,
+ const Checksum &checksum = {});
explicit FileSpec(llvm::StringRef path, const llvm::Triple &triple);
@@ -362,7 +366,11 @@ class FileSpec {
///
/// \param[in] style
/// The style for the given path.
- void SetFile(llvm::StringRef path, Style style);
+ ///
+ /// \param[in] checksum
+ /// The checksum for the given path.
+ void SetFile(llvm::StringRef path, Style style,
+ const Checksum &checksum = {});
/// Change the file specified with a new path.
///
@@ -420,6 +428,9 @@ class FileSpec {
/// The lifetime of the StringRefs is tied to the lifetime of the FileSpec.
std::vector<llvm::StringRef> GetComponents() const;
+ /// Return the checksum for this FileSpec or all zeros if there is none.
+ const Checksum &GetChecksum() const { return m_checksum; };
+
protected:
// Convenience method for setting the file without changing the style.
void SetFile(llvm::StringRef path);
@@ -427,6 +438,7 @@ class FileSpec {
/// Called anytime m_directory or m_filename is changed to clear any cached
/// state in this object.
void PathWasModified() {
+ m_checksum = Checksum();
m_is_resolved = false;
m_absolute = Absolute::Calculate;
}
@@ -443,6 +455,9 @@ class FileSpec {
/// The unique'd filename path.
ConstString m_filename;
+ /// The optional MD5 checksum of the file.
+ Checksum m_checksum;
+
/// True if this path has been resolved.
mutable bool m_is_resolved = false;
diff --git a/lldb/source/Utility/CMakeLists.txt b/lldb/source/Utility/CMakeLists.txt
index 16afab1113a970c..a3b0a405b4133f6 100644
--- a/lldb/source/Utility/CMakeLists.txt
+++ b/lldb/source/Utility/CMakeLists.txt
@@ -29,6 +29,7 @@ add_lldb_library(lldbUtility NO_INTERNAL_DEPENDENCIES
Args.cpp
Baton.cpp
Broadcaster.cpp
+ Checksum.cpp
CompletionRequest.cpp
Connection.cpp
ConstString.cpp
diff --git a/lldb/source/Utility/Checksum.cpp b/lldb/source/Utility/Checksum.cpp
new file mode 100644
index 000000000000000..70167e497a526c4
--- /dev/null
+++ b/lldb/source/Utility/Checksum.cpp
@@ -0,0 +1,46 @@
+//===-- Checksum.cpp ------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "lldb/Utility/Checksum.h"
+#include "llvm/ADT/SmallString.h"
+
+using namespace lldb_private;
+
+Checksum::Checksum(llvm::MD5::MD5Result md5) { SetMD5(md5); }
+
+Checksum::Checksum(const Checksum &checksum) { SetMD5(checksum.m_checksum); }
+
+Checksum &Checksum::operator=(const Checksum &checksum) {
+ SetMD5(checksum.m_checksum);
+ return *this;
+}
+
+void Checksum::SetMD5(llvm::MD5::MD5Result md5) {
+ std::uninitialized_copy_n(md5.begin(), 16, m_checksum.begin());
+}
+
+Checksum::operator bool() const {
+ return !std::equal(m_checksum.begin(), m_checksum.end(), sentinel.begin());
+}
+
+bool Checksum::operator==(const Checksum &checksum) const {
+ return std::equal(m_checksum.begin(), m_checksum.end(),
+ checksum.m_checksum.begin());
+}
+
+bool Checksum::operator!=(const Checksum &checksum) const {
+ return !std::equal(m_checksum.begin(), m_checksum.end(),
+ checksum.m_checksum.begin());
+}
+
+std::string Checksum::digest() const {
+ return std::string(m_checksum.digest().str());
+}
+
+llvm::MD5::MD5Result Checksum::sentinel = {0, 0, 0, 0, 0, 0, 0, 0,
+ 0, 0, 0, 0, 0, 0, 0, 0};
diff --git a/lldb/source/Utility/FileSpec.cpp b/lldb/source/Utility/FileSpec.cpp
index eb34ef97cea0b2f..4bbfbb7c1fab5e6 100644
--- a/lldb/source/Utility/FileSpec.cpp
+++ b/lldb/source/Utility/FileSpec.cpp
@@ -68,8 +68,9 @@ void Denormalize(llvm::SmallVectorImpl<char> &path, FileSpec::Style style) {
FileSpec::FileSpec() : m_style(GetNativeStyle()) {}
// Default constructor that can take an optional full path to a file on disk.
-FileSpec::FileSpec(llvm::StringRef path, Style style) : m_style(style) {
- SetFile(path, style);
+FileSpec::FileSpec(llvm::StringRef path, Style style, const Checksum &checksum)
+ : m_checksum(checksum), m_style(style) {
+ SetFile(path, style, checksum);
}
FileSpec::FileSpec(llvm::StringRef path, const llvm::Triple &triple)
@@ -171,9 +172,11 @@ void FileSpec::SetFile(llvm::StringRef pathname) { SetFile(pathname, m_style); }
// Update the contents of this object with a new path. The path will be split
// up into a directory and filename and stored as uniqued string values for
// quick comparison and efficient memory usage.
-void FileSpec::SetFile(llvm::StringRef pathname, Style style) {
+void FileSpec::SetFile(llvm::StringRef pathname, Style style,
+ const Checksum &checksum) {
Clear();
m_style = (style == Style::native) ? GetNativeStyle() : style;
+ m_checksum = checksum;
if (pathname.empty())
return;
diff --git a/lldb/unittests/Utility/CMakeLists.txt b/lldb/unittests/Utility/CMakeLists.txt
index 5c7003a156813dc..097dae860b15911 100644
--- a/lldb/unittests/Utility/CMakeLists.txt
+++ b/lldb/unittests/Utility/CMakeLists.txt
@@ -4,6 +4,7 @@ add_lldb_unittest(UtilityTests
OptionsWithRawTest.cpp
ArchSpecTest.cpp
BroadcasterTest.cpp
+ ChecksumTest.cpp
ConstStringTest.cpp
CompletionRequestTest.cpp
DataBufferTest.cpp
diff --git a/lldb/unittests/Utility/ChecksumTest.cpp b/lldb/unittests/Utility/ChecksumTest.cpp
new file mode 100644
index 000000000000000..7537d30b5ff5b84
--- /dev/null
+++ b/lldb/unittests/Utility/ChecksumTest.cpp
@@ -0,0 +1,57 @@
+//===-- ChecksumTest.cpp --------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "gtest/gtest.h"
+
+#include "lldb/Utility/Checksum.h"
+
+using namespace lldb_private;
+
+static llvm::MD5::MD5Result hash1 = {0, 1, 2, 3, 4, 5, 6, 7,
+ 8, 9, 10, 11, 12, 13, 14, 15};
+
+static llvm::MD5::MD5Result hash2 = {0, 1, 2, 3, 4, 5, 6, 7,
+ 8, 9, 10, 11, 12, 13, 14, 15};
+
+static llvm::MD5::MD5Result hash3 = {8, 9, 10, 11, 12, 13, 14, 15,
+ 0, 1, 2, 3, 4, 5, 6, 7};
+
+TEST(ChecksumTest, TestConstructor) {
+ Checksum checksum1;
+ EXPECT_FALSE(static_cast<bool>(checksum1));
+ EXPECT_EQ(checksum1, Checksum());
+
+ Checksum checksum2 = Checksum(hash1);
+ EXPECT_EQ(checksum2, Checksum(hash1));
+
+ Checksum checksum3(checksum2);
+ EXPECT_EQ(checksum3, Checksum(hash1));
+}
+
+TEST(ChecksumTest, TestCopyConstructor) {
+ Checksum checksum1;
+ EXPECT_FALSE(static_cast<bool>(checksum1));
+ EXPECT_EQ(checksum1, Checksum());
+
+ Checksum checksum2 = checksum1;
+ EXPECT_EQ(checksum2, checksum1);
+
+ Checksum checksum3(checksum1);
+ EXPECT_EQ(checksum3, checksum1);
+}
+
+TEST(ChecksumTest, TestMD5) {
+ Checksum checksum1(hash1);
+ EXPECT_TRUE(static_cast<bool>(checksum1));
+
+ // Make sure two checksums with the same underlying hashes are the same.
+ EXPECT_EQ(Checksum(hash1), Checksum(hash2));
+
+ // Make sure two checksums with different underlying hashes are different.
+ EXPECT_NE(Checksum(hash1), Checksum(hash3));
+}
diff --git a/lldb/unittests/Utility/FileSpecTest.cpp b/lldb/unittests/Utility/FileSpecTest.cpp
index 2a62f6b1e76120f..ebe7bde7d9c2169 100644
--- a/lldb/unittests/Utility/FileSpecTest.cpp
+++ b/lldb/unittests/Utility/FileSpecTest.cpp
@@ -28,8 +28,8 @@ TEST(FileSpecTest, FileAndDirectoryComponents) {
FileSpec fs_windows("F:\\bar", FileSpec::Style::windows);
EXPECT_STREQ("F:\\bar", fs_windows.GetPath().c_str());
- // EXPECT_STREQ("F:\\", fs_windows.GetDirectory().GetPath().c_str()); // It returns
- // "F:/"
+ // EXPECT_STREQ("F:\\", fs_windows.GetDirectory().GetPath().c_str()); // It
+ // returns "F:/"
EXPECT_STREQ("bar", fs_windows.GetFilename().GetCString());
FileSpec fs_posix_root("/", FileSpec::Style::posix);
@@ -297,45 +297,18 @@ TEST(FileSpecTest, FormatFileSpec) {
TEST(FileSpecTest, IsRelative) {
llvm::StringRef not_relative[] = {
- "/",
- "/a",
- "/a/",
- "/a/b",
- "/a/b/",
- "//",
- "//a/",
- "//a/b",
- "//a/b/",
- "~",
- "~/",
- "~/a",
- "~/a/",
- "~/a/b",
- "~/a/b/",
- "/foo/.",
- "/foo/..",
- "/foo/../",
- "/foo/../.",
+ "/", "/a", "/a/", "/a/b", "/a/b/", "//", "//a/",
+ "//a/b", "//a/b/", "~", "~/", "~/a", "~/a/", "~/a/b",
+ "~/a/b/", "/foo/.", "/foo/..", "/foo/../", "/foo/../.",
};
- for (const auto &path: not_relative) {
+ for (const auto &path : not_relative) {
SCOPED_TRACE(path);
EXPECT_FALSE(PosixSpec(path).IsRelative());
}
llvm::StringRef is_relative[] = {
- ".",
- "./",
- ".///",
- "a",
- "./a",
- "./a/",
- "./a/",
- "./a/b",
- "./a/b/",
- "../foo",
- "foo/bar.c",
- "./foo/bar.c"
- };
- for (const auto &path: is_relative) {
+ ".", "./", ".///", "a", "./a", "./a/",
+ "./a/", "./a/b", "./a/b/", "../foo", "foo/bar.c", "./foo/bar.c"};
+ for (const auto &path : is_relative) {
SCOPED_TRACE(path);
EXPECT_TRUE(PosixSpec(path).IsRelative());
}
@@ -421,7 +394,6 @@ TEST(FileSpecTest, Match) {
EXPECT_TRUE(Match("", "/foo/bar"));
EXPECT_TRUE(Match("", ""));
-
}
TEST(FileSpecTest, TestAbsoluteCaching) {
@@ -534,3 +506,15 @@ TEST(FileSpecTest, TestGetComponents) {
EXPECT_EQ(file_spec.GetComponents(), pair.second);
}
}
+
+TEST(FileSpecTest, TestChecksum) {
+ Checksum checksum({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15});
+ FileSpec file_spec("/foo/bar", FileSpec::Style::posix, checksum);
+ EXPECT_TRUE(static_cast<bool>(file_spec.GetChecksum()));
+ EXPECT_EQ(file_spec.GetChecksum(), checksum);
+
+ FileSpec copy = file_spec;
+
+ EXPECT_TRUE(static_cast<bool>(copy.GetChecksum()));
+ EXPECT_EQ(file_spec.GetChecksum(), copy.GetChecksum());
+}
|
Depends on #71456. I considered an alternative where we don't store the A checksum is 16 bytes. Please let me know if you think that the overhead of that is too much and outweighs the churn and code duplication to adopt a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You pushed a commit from a different PR btw
explicit FileSpec(llvm::StringRef path, Style style = Style::native, | ||
const Checksum &checksum = {}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the doxygen
I personally like the current implementation, I think it makes sense to have the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with comments.
lldb/include/lldb/Utility/FileSpec.h
Outdated
#include "lldb/Utility/ConstString.h" | ||
|
||
#include "llvm/ADT/StringRef.h" | ||
#include "llvm/Support/FileSystem.h" | ||
#include "llvm/Support/FormatVariadic.h" | ||
#include "llvm/Support/MD5.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you shouldn't need this since it's already included in Checksum.h
protected: | ||
// Convenience method for setting the file without changing the style. | ||
void SetFile(llvm::StringRef path); | ||
|
||
/// Called anytime m_directory or m_filename is changed to clear any cached | ||
/// state in this object. | ||
void PathWasModified() { | ||
m_checksum = Checksum(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to recompute the checksum if the path was changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the FileSpec is a system independent path description (i.e. the file doesn't need to exist on the host) there's no way to recompute it automatically, although the user can specify a new checksum if the path changes.
So I debugged lldb with lldb and adding a counter into the FileSpec constructor and came up a max of 721833 files after trying to set a breakpoint in Driver.cpp, so that storage is about 16.5MB which isn't too bad. This goes up to 27.5MB with the FileSpec being 40 bytes, so not too bad. Hopefully this won't overwhelm some larger targets in LLDB memory wise |
@clayborg I think it would be worth having that stat for FileSpecs in-tree so everyone can have access to it using |
I agree. Easy to add a static variable into FileSpec to do the counting and then hookup to stats. Might be nice to make a mix in class that any internal objects can inherit from that wouldn't change the size of any objects, but on construction would add to a count, and on destruction would decrement. It would be nice to track FileSpec, Symbol and other common objects in a debugger process. |
Store a Checksum in FileSpec. Its purpose is to store the MD5 hash that was added to the DWARF 5 line table. This increases the size of a FileSpec from 24 to 40 bytes. The alternative is to introduce a new SupportFile abstraction for a FileSpec + Checksum but that would require a corresponding SupportFileList class. During review we decided that wasn't worth it, but that's something we can revisit in the future.
1882662
to
0a4b349
Compare
Store a Checksum in FileSpec. Its purpose is to store the MD5 hash added
to the DWARF 5 line table.