[libSyntax] Add a reference counted version of OwnedString #18677

Merged
merged 1 commit into from
Aug 14, 2018

@ahoppen ahoppen commented Aug 13, 2018

We cannot use unowned strings for the token texts of incrementally parsed syntax trees, because the source buffer that reused nodes refer to will already have been freed. Always copying the token text whenever an OwnedString is passed around is too expensive. A reference-counted copy of the string lets us keep the token's string alive across incremental parses while eliminating unnecessary copies.
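
To illustrate the trade-off described above, here is a minimal standalone sketch. It is not the code from this PR: `SharedText` is a hypothetical stand-in for `OwnedString`, and it uses `std::shared_ptr` and `std::string_view` (C++17) in place of LLVM's types. It only shows the idea that a reference-counted copy is made once and then shared by every later copy of the handle.

```cpp
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for the idea in this PR, not swift::OwnedString.
class SharedText {
  std::shared_ptr<const std::string> Owner; // null when the text is unowned
  std::string_view Text;                    // pointer + length into the buffer

  SharedText(std::shared_ptr<const std::string> O, std::string_view T)
      : Owner(std::move(O)), Text(T) {}

public:
  // Borrow the caller's buffer. The caller must keep that buffer alive.
  static SharedText makeUnowned(std::string_view Str) {
    return SharedText(nullptr, Str);
  }

  // Copy the bytes once into shared storage. Later copies of the handle
  // only bump the reference count instead of copying the bytes again.
  static SharedText makeRefCounted(std::string_view Str) {
    std::shared_ptr<const std::string> Storage =
        std::make_shared<std::string>(Str);
    std::string_view View(*Storage);
    return SharedText(std::move(Storage), View);
  }

  std::string_view str() const { return Text; }
  long useCount() const { return Owner.use_count(); }
};
```

In this sketch, a handle created with `makeRefCounted` stays valid after the original source buffer is destroyed, while one created with `makeUnowned` would dangle.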

@ahoppen
Member Author

ahoppen commented Aug 13, 2018

@swift-ci Please smoke test

@ahoppen ahoppen force-pushed the ref-counted-owned-string branch from 6a34c28 to f38e33e Compare August 13, 2018 18:52
@harlanhaskins
Contributor

After this change, where do we use the Copied representation? It may be beneficial to move OwnedString to a purely refcounted/unowned model, using llvm::IntrusiveRefCntPtr. Doing that, we could also use llvm::TrailingObjects to move the string allocation into the owned representation directly, avoiding another level of indirection.
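
For readers unfamiliar with intrusive reference counting, the following is a rough standalone sketch of the concept. `RefCounted`, `IntrusivePtr`, and `Text` are hypothetical names written for this illustration; they are not LLVM's `RefCountedBase` or `IntrusiveRefCntPtr`, and the sketch is not thread-safe. The point is only that the count lives inside the object, so no separate control block is needed.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical base class: the reference count is stored in the object.
class RefCounted {
  mutable std::size_t Count = 0;

public:
  void retain() const { ++Count; }
  // Returns true when the last reference has been dropped.
  bool release() const { return --Count == 0; }
  std::size_t refCount() const { return Count; }

protected:
  ~RefCounted() = default;
};

// Hypothetical smart pointer that retains on copy and releases on destroy.
template <typename T> class IntrusivePtr {
  T *Obj = nullptr;

public:
  IntrusivePtr() = default;
  explicit IntrusivePtr(T *O) : Obj(O) {
    if (Obj)
      Obj->retain();
  }
  IntrusivePtr(const IntrusivePtr &Other) : Obj(Other.Obj) {
    if (Obj)
      Obj->retain();
  }
  // Moving steals the pointer, so the count is left untouched.
  IntrusivePtr(IntrusivePtr &&Other) noexcept : Obj(Other.Obj) {
    Other.Obj = nullptr;
  }
  IntrusivePtr &operator=(IntrusivePtr Other) noexcept {
    std::swap(Obj, Other.Obj);
    return *this;
  }
  ~IntrusivePtr() {
    if (Obj && Obj->release())
      delete Obj;
  }
  T *operator->() const { return Obj; }
  T *get() const { return Obj; }
};

// Example payload type that carries its own count.
struct Text : RefCounted {
  std::string Value;
  explicit Text(std::string V) : Value(std::move(V)) {}
};
```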

@ahoppen ahoppen force-pushed the ref-counted-owned-string branch from f38e33e to f9e417c Compare August 13, 2018 21:33
@ahoppen
Member Author

ahoppen commented Aug 13, 2018

@swift-ci Please smoke test

assert(substring && "expected successful malloc of copy");
public:
static TextOwner *make(StringRef Text) {
auto size = totalSizeToAlloc<char>(Text.size());
Contributor

Should these be NUL-terminated?

Member Author

Since StringRef already contains a length, there is no need to NUL terminate them.
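
As a small illustration of this point, the sketch below uses `std::string_view` (C++17) as a stand-in for `StringRef`; `roundTripWithoutTerminator` is a hypothetical helper, not part of the patch. It copies the text into a buffer of exactly `size()` bytes, with no terminator, and reads it back through a view that carries its own length.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: stores the text without a trailing NUL and reads it
// back through a (pointer, length) view.
std::string roundTripWithoutTerminator(std::string_view Text) {
  // Exactly Text.size() bytes, no room for a terminator.
  std::vector<char> Buffer(Text.begin(), Text.end());
  // The view knows where the text ends because it stores the length.
  std::string_view View(Buffer.data(), Buffer.size());
  return std::string(View);
}
```

The caveat is that such a buffer must never be handed to an API that expects a C string, since nothing marks where it ends.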

/// \c OwnedString will not take ownership of that buffer and will assume that
/// the buffer outlives its lifetime.
static OwnedString makeUnowned(StringRef Str) {
return OwnedString(Str.data(), /*OwnedPtr=*/nullptr);
Contributor

This is going to lose the length, which might be incorrect.

Member Author

Whoops. That carried over from the old implementation. Fixed now.
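
The hazard raised in this thread can be sketched in isolation. The example assumes the string is rebuilt from a bare `const char *`, in which case the length has to be recomputed by scanning for a NUL byte; `std::string_view` (C++17) stands in for `StringRef` here, and both helper functions are hypothetical.

```cpp
#include <string_view>

// Rebuilds a view from the data pointer alone. The length is recomputed by
// scanning for a NUL, so this is only safe on NUL-terminated buffers and can
// return more text than the original view covered.
std::string_view fromPointerOnly(std::string_view Str) {
  return std::string_view(Str.data());
}

// Keeps the pointer and the length together, so the view is unchanged.
std::string_view fromPointerAndSize(std::string_view Str) {
  return std::string_view(Str.data(), Str.size());
}
```

For a token that is a slice of a larger source buffer, the pointer-only version extends to the end of the buffer instead of stopping at the end of the token.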

return makeUnowned(Str);
} else {
llvm::IntrusiveRefCntPtr<TextOwner> OwnedPtr(TextOwner::make(Str));
return OwnedString(StringRef(OwnedPtr->getText(), Str.size()), OwnedPtr);
Contributor

Nitpick: using std::move here will avoid a retain/release.

Member Author

Good point. Thanks!
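
The effect of the suggested `std::move` can be shown with a small counting sketch. `Counters`, `Handle`, and `Holder` are hypothetical types written for this illustration and do not model `llvm::IntrusiveRefCntPtr` exactly; the sketch assumes only that copying a handle performs a retain and moving it does not.

```cpp
#include <utility>

// Records how many retain/release operations took place.
struct Counters {
  int Retains = 0;
  int Releases = 0;
};

// Hypothetical ref-counting handle: copying retains, moving does not.
class Handle {
  Counters *Stats;
  bool Owns;

public:
  explicit Handle(Counters &S) : Stats(&S), Owns(true) { ++Stats->Retains; }
  Handle(const Handle &O) : Stats(O.Stats), Owns(O.Owns) {
    if (Owns)
      ++Stats->Retains;
  }
  Handle(Handle &&O) noexcept : Stats(O.Stats), Owns(O.Owns) {
    O.Owns = false; // the moved-from handle no longer releases
  }
  Handle &operator=(const Handle &) = delete;
  ~Handle() {
    if (Owns)
      ++Stats->Releases;
  }
};

// Takes the handle by value, like a constructor storing an owner pointer.
struct Holder {
  Handle H;
  explicit Holder(Handle In) : H(std::move(In)) {}
};

// Passing the local handle by copy: one extra retain (and later release).
int retainsWhenCopied() {
  Counters C;
  {
    Handle Local(C);
    Holder Result(Local);
  }
  return C.Retains;
}

// Passing the local handle with std::move: no extra retain.
int retainsWhenMoved() {
  Counters C;
  {
    Handle Local(C);
    Holder Result(std::move(Local));
  }
  return C.Retains;
}
```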

@ahoppen ahoppen force-pushed the ref-counted-owned-string branch from f9e417c to ac512d4 Compare August 13, 2018 22:38
@ahoppen
Member Author

ahoppen commented Aug 13, 2018

@swift-ci Please smoke test

presence, /*Arena=*/nullptr, nodeId);
value = swift::RawSyntax::make(
tokenKind, swift::OwnedString::makeRefCounted(text), leadingTrivia,
trailingTrivia, presence, /*Arena=*/nullptr, nodeId);
Member

@rintaro rintaro Aug 14, 2018

How about making RawSyntax::make() receive a StringRef and construct the OwnedString inside it?

Member

Ah, you want to control Owned/Unowned at the call site. OK then.

Member Author

Yeah, that was my idea behind it.

@@ -454,7 +454,8 @@ struct MappingTraits<swift::syntax::TriviaPiece> {
% else:
StringRef text;
in.mapRequired("value", text);
-    return swift::syntax::TriviaPiece(kind, text);
+    return swift::syntax::TriviaPiece(
+        kind, swift::OwnedString::makeRefCounted(text));
Member

Ditto for TriviaPiece (receive StringRef).


memcpy(substring, Data, Length);
substring[Length] = '\0';
const char *getText() const { return getTrailingObjects<char>(); }
Contributor

Is there a specific reason to put the char data in a trailing object instead of a regular member?

Member Author

It allows us to allocate the storage with a single allocation instead of two, which should mean less overhead.
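
A rough sketch of the single-allocation idea, without LLVM: `TextBlock` is a hypothetical class that places the character data directly after the header object, which is roughly what `llvm::TrailingObjects` automates. The real helper also handles alignment and multiple trailing types; this sketch relies on `char` having an alignment of 1.

```cpp
#include <cstddef>
#include <cstring>
#include <new>
#include <string_view>

// Hypothetical header object whose text is stored immediately after it.
class TextBlock {
  std::size_t Length;

  explicit TextBlock(std::size_t Len) : Length(Len) {}

  // The trailing bytes start right past the end of the header.
  char *trailing() { return reinterpret_cast<char *>(this + 1); }
  const char *trailing() const {
    return reinterpret_cast<const char *>(this + 1);
  }

public:
  // One allocation covers both the header and the text bytes.
  static TextBlock *make(std::string_view Text) {
    void *Memory = ::operator new(sizeof(TextBlock) + Text.size());
    TextBlock *Block = new (Memory) TextBlock(Text.size());
    if (!Text.empty())
      std::memcpy(Block->trailing(), Text.data(), Text.size());
    return Block;
  }

  // Must be used instead of plain delete, since make() used operator new.
  static void destroy(TextBlock *Block) {
    Block->~TextBlock();
    ::operator delete(Block);
  }

  // No NUL terminator is stored; the length travels with the view.
  std::string_view text() const {
    return std::string_view(trailing(), Length);
  }
};
```

With a regular `char *` member, the header and the text would each need their own allocation, and reading the text would cost one more pointer dereference.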

Contributor

ah, OK. The patch LGTM!

@harlanhaskins
Contributor

LGTM now!

@ahoppen ahoppen merged commit 79e9113 into swiftlang:master Aug 14, 2018
@ahoppen ahoppen deleted the ref-counted-owned-string branch August 14, 2018 18:37
5 participants