Skip to content

Support hermetic index store data via path remappings #4207

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion clang/include/clang/Index/IndexDataStore.h
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
#include "llvm/ADT/STLExtras.h"
#include "llvm/Support/Chrono.h"
#include <functional>
#include <map>
#include <memory>
#include <string>

Expand All @@ -26,9 +27,14 @@ class IndexDataStore {
~IndexDataStore();

static std::unique_ptr<IndexDataStore>
create(StringRef IndexStorePath, std::string &Error);
create(StringRef IndexStorePath,
std::map<std::string, std::string, std::greater<std::string>>
PrefixMap,
Copy link

@bnbarham bnbarham Apr 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be nice to extract this out into a separate data structure. IIUC there's no need for this to be a map other than that it's convenient (though not really) to pass around. So if we had a separate class for this it could internally just store a SmallVector of <struct with original and remap path>.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, and as an input to the API I think conceptually it should be an array of string pairs. Whether then the underlying implementation uses a std::map or a llvm::StringMap for the lookups, it would be an implementation detail.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I don't think a "lookup" is ever actually performed, we only ever iterate over them all and check whether there's a prefix match (hence my suggestion of a SmallVector internally).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally those wouldn't be maps either 😅. So these are checked in alphabetical order rather than the order they were given on the CLI?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's my understanding. I think it would make sense to clean up, but not sure it's in scope for here. Thoughts?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say it's worth making whatever you add usable in those cases, but we'd want to fix that upstream anyway (so no, not in scope).

std::string &Error);

StringRef getFilePath() const;
std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>>
getPrefixMap() const;
bool foreachUnitName(bool sorted,
llvm::function_ref<bool(StringRef unitName)> receiver);

Expand Down
6 changes: 5 additions & 1 deletion clang/include/clang/Index/IndexUnitReader.h
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/Support/Chrono.h"
#include <map>

namespace clang {
namespace index {
Expand All @@ -29,9 +30,12 @@ class IndexUnitReader {

static std::unique_ptr<IndexUnitReader>
createWithUnitFilename(StringRef UnitFilename, StringRef StorePath,
std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>> PrefixMap,
std::string &Error);
static std::unique_ptr<IndexUnitReader>
createWithFilePath(StringRef FilePath, std::string &Error);
createWithFilePath(StringRef FilePath,
std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>> PrefixMap,
std::string &Error);

static Optional<llvm::sys::TimePoint<>>
getModificationTimeForUnit(StringRef UnitFilename, StringRef StorePath,
Expand Down
10 changes: 9 additions & 1 deletion clang/include/clang/Index/IndexUnitWriter.h
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
#include "clang/Basic/LLVM.h"
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/SmallString.h"
#include <map>
#include <string>
#include <vector>

Expand Down Expand Up @@ -57,6 +58,8 @@ class IndexUnitWriter {
std::string TargetTriple;
std::string WorkDir;
std::string SysrootPath;
std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>>
PrefixMap;
std::function<writer::ModuleInfo(writer::OpaqueModule,
SmallVectorImpl<char> &Scratch)> GetInfoForModuleFn;
struct FileInclude {
Expand Down Expand Up @@ -87,6 +90,8 @@ class IndexUnitWriter {
/// \param MainFile the main file for a compiled source file. This should be
/// null for PCH and module units.
/// \param IsSystem true for system module units, false otherwise.
/// \param PrefixMap Mapping to use to standardize file paths to make them
/// hermetic/reproducible. This applies to all paths emitted in the unit file.
IndexUnitWriter(FileManager &FileMgr,
StringRef StorePath,
StringRef ProviderIdentifier, StringRef ProviderVersion,
Expand All @@ -98,6 +103,8 @@ class IndexUnitWriter {
bool IsDebugCompilation,
StringRef TargetTriple,
StringRef SysrootPath,
std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>>
PrefixMap,
writer::ModuleInfoWriterCallback GetInfoForModule);
~IndexUnitWriter();

Expand All @@ -120,7 +127,8 @@ class IndexUnitWriter {
Optional<bool> isUnitUpToDateForOutputFile(StringRef FilePath,
Optional<StringRef> TimeCompareFilePath,
std::string &Error);
static void getUnitNameForAbsoluteOutputFile(StringRef FilePath, SmallVectorImpl<char> &Str);
static void getUnitNameForAbsoluteOutputFile(StringRef FilePath, SmallVectorImpl<char> &Str,
std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>> &PrefixMap);
static bool initIndexDirectory(StringRef StorePath, std::string &Error);

private:
Expand Down
20 changes: 16 additions & 4 deletions clang/include/indexstore/IndexStoreCXX.h
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
#include "llvm/ADT/Optional.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallString.h"
#include <map>

namespace indexstore {
using llvm::ArrayRef;
Expand Down Expand Up @@ -101,10 +102,19 @@ class IndexStore {
friend class IndexUnitReader;

public:
IndexStore(StringRef path, std::string &error) {
IndexStore(StringRef path,
std::map<std::string, std::string, std::greater<std::string>> prefix_map,
std::string &error) {
llvm::SmallString<64> buf = path;
indexstore_error_t c_err = nullptr;
obj = indexstore_store_create(buf.c_str(), &c_err);
std::vector<std::string> mapping_storage;
std::vector<const char *> c_strs;
for (const auto &Mapping : prefix_map)
mapping_storage.push_back(Mapping.first + "=" + Mapping.second);
for (const auto &Str : mapping_storage)
c_strs.push_back(Str.c_str());
obj = indexstore_store_create_with_prefix_mapping(buf.c_str(), c_strs.data(),
c_strs.size(), &c_err);
if (c_err) {
error = indexstore_error_get_description(c_err);
indexstore_error_dispose(c_err);
Expand All @@ -119,8 +129,10 @@ class IndexStore {
indexstore_store_dispose(obj);
}

static IndexStoreRef create(StringRef path, std::string &error) {
auto storeRef = std::make_shared<IndexStore>(path, error);
static IndexStoreRef create(StringRef path,
std::map<std::string, std::string, std::greater<std::string>> prefix_map,
std::string &error) {
auto storeRef = std::make_shared<IndexStore>(path, prefix_map, error);
if (storeRef->isInvalid())
return nullptr;
return storeRef;
Expand Down
5 changes: 5 additions & 0 deletions clang/include/indexstore/indexstore.h
Original file line number Diff line number Diff line change
Expand Up @@ -133,6 +133,11 @@ typedef void *indexstore_t;
INDEXSTORE_PUBLIC indexstore_t
indexstore_store_create(const char *store_path, indexstore_error_t *error);

INDEXSTORE_PUBLIC indexstore_t
indexstore_store_create_with_prefix_mapping(
const char *store_path, const char **PrefixMappings,
size_t NumMappings,indexstore_error_t *error);

INDEXSTORE_PUBLIC void
indexstore_store_dispose(indexstore_t);

Expand Down
57 changes: 44 additions & 13 deletions clang/lib/Index/IndexUnitReader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,18 @@ using namespace llvm;

namespace {

static std::string remapPath(std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>>
&PrefixMap, llvm::StringRef Path) {
if (PrefixMap.empty())
return Path.str();

SmallString<256> P = Path;
for (const auto &Entry : PrefixMap)
if (llvm::sys::path::replace_path_prefix(P, Entry.first, Entry.second))
break;
return P.str().str();
}

typedef function_ref<bool(const IndexUnitReader::DependencyInfo &)> DependencyReceiver;
typedef function_ref<bool(const IndexUnitReader::IncludeInfo &)> IncludeReceiver;

Expand All @@ -43,14 +55,15 @@ class IndexUnitReaderImpl {
bool IsSystemUnit;
bool IsModuleUnit;
bool IsDebugCompilation;
StringRef WorkingDir;
StringRef OutputFile;
StringRef SysrootPath;
SmallString<128> WorkingDir;
SmallString<128> OutputFile;
SmallString<128> SysrootPath;
StringRef ModuleName;
SmallString<128> MainFilePath;
StringRef Target;
std::vector<FileBitPath> Paths;
StringRef PathsBuffer;
std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>> PrefixMap;

struct ModuleInfo {
unsigned NameOffset;
Expand All @@ -60,15 +73,16 @@ class IndexUnitReaderImpl {
StringRef ModuleNamesBuffer;

bool init(std::unique_ptr<MemoryBuffer> Buf, sys::TimePoint<> ModTime,
std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>> PrefixMap,
std::string &Error);

StringRef getProviderIdentifier() const { return ProviderIdentifier; }
StringRef getProviderVersion() const { return ProviderVersion; }

sys::TimePoint<> getModificationTime() const { return ModTime; }
StringRef getWorkingDirectory() const { return WorkingDir; }
StringRef getOutputFile() const { return OutputFile; }
StringRef getSysrootPath() const { return SysrootPath; }
StringRef getWorkingDirectory() const { return WorkingDir.str(); }
StringRef getOutputFile() const { return OutputFile.str(); }
StringRef getSysrootPath() const { return SysrootPath.str(); }
StringRef getTarget() const { return Target; }

StringRef getModuleName() const { return ModuleName; }
Expand All @@ -88,6 +102,10 @@ class IndexUnitReaderImpl {
return PathsBuffer.substr(Offset, Size);
}

std::string getAndRemapPathFromBuffer(size_t Offset, size_t Size) {
return remapPath(PrefixMap, getPathFromBuffer(Offset, Size));
}

void constructFilePath(SmallVectorImpl<char> &Path, int PathIndex);

StringRef getModuleName(int ModuleIndex);
Expand Down Expand Up @@ -205,9 +223,9 @@ class IndexUnitBitstreamVisitor : public BitstreamVisitor<IndexUnitBitstreamVisi
break;
case UNIT_PATH_BUFFER:
Reader.PathsBuffer = Blob;
Reader.WorkingDir = Reader.getPathFromBuffer(WorkDirOffset, WorkDirSize);
Reader.OutputFile = Reader.getPathFromBuffer(OutputFileOffset, OutputFileSize);
Reader.SysrootPath = Reader.getPathFromBuffer(SysrootOffset, SysrootSize);
Reader.WorkingDir = Reader.getAndRemapPathFromBuffer(WorkDirOffset, WorkDirSize);
Reader.OutputFile = Reader.getAndRemapPathFromBuffer(OutputFileOffset, OutputFileSize);
Reader.SysrootPath = Reader.getAndRemapPathFromBuffer(SysrootOffset, SysrootSize);

// now we can populate the main file's path
Reader.constructFilePath(Reader.MainFilePath, MainPathIndex);
Expand Down Expand Up @@ -270,9 +288,12 @@ class IndexUnitBlockBitstreamVisitor : public BitstreamVisitor<IndexUnitBlockBit
} // anonymous namespace

bool IndexUnitReaderImpl::init(std::unique_ptr<MemoryBuffer> Buf,
sys::TimePoint<> ModTime, std::string &Error) {
sys::TimePoint<> ModTime,
std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>> PrefixMap,
std::string &Error) {
this->ModTime = ModTime;
this->MemBuf = std::move(Buf);
this->PrefixMap = PrefixMap;
llvm::BitstreamCursor Stream(*MemBuf);

if (Stream.AtEndOfStream()) {
Expand Down Expand Up @@ -374,6 +395,13 @@ void IndexUnitReaderImpl::constructFilePath(SmallVectorImpl<char> &PathBuf,
sys::path::append(PathBuf,
getPathFromBuffer(Path.Dir.Offset, Path.Dir.Size),
getPathFromBuffer(Path.Filename.Offset, Path.Filename.Size));
if (Path.PrefixKind == UNIT_PATH_PREFIX_NONE && !PrefixMap.empty()) {
SmallString<256> PathStr;
PathStr.assign(PathBuf);
std::string Remapped = remapPath(PrefixMap, PathStr.str());
PathBuf.clear();
PathBuf.append(Remapped.begin(), Remapped.end());
}
}

StringRef IndexUnitReaderImpl::getModuleName(int ModuleIndex) {
Expand All @@ -391,15 +419,18 @@ StringRef IndexUnitReaderImpl::getModuleName(int ModuleIndex) {
std::unique_ptr<IndexUnitReader>
IndexUnitReader::createWithUnitFilename(StringRef UnitFilename,
StringRef StorePath,
std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>> PrefixMap,
std::string &Error) {
SmallString<128> PathBuf = StorePath;
appendUnitSubDir(PathBuf);
sys::path::append(PathBuf, UnitFilename);
return createWithFilePath(PathBuf.str(), Error);
return createWithFilePath(PathBuf.str(), PrefixMap, Error);
}

std::unique_ptr<IndexUnitReader>
IndexUnitReader::createWithFilePath(StringRef FilePath, std::string &Error) {
IndexUnitReader::createWithFilePath(StringRef FilePath,
std::map<llvm::StringRef, llvm::StringRef, std::greater<llvm::StringRef>> PrefixMap,
std::string &Error) {
int FD;
std::error_code EC = sys::fs::openFileForRead(FilePath, FD);
if (EC) {
Expand Down Expand Up @@ -435,7 +466,7 @@ IndexUnitReader::createWithFilePath(StringRef FilePath, std::string &Error) {

std::unique_ptr<IndexUnitReaderImpl> Impl(new IndexUnitReaderImpl());
bool Err = Impl->init(std::move(*ErrOrBuf), FileStat.getLastModificationTime(),
Error);
PrefixMap, Error);
if (Err)
return nullptr;

Expand Down
Loading