-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[libSyntax] Add a unsafe but fast SyntaxDataRef version of SyntaxData #36250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Instead of having a heap-allocated RefCountedBox to store a SyntaxData's parent, reference-count SyntaxData itself. This has a couple of advantages: - When passing SyntaxData around, only a pointer needs to be passed instead of the entire struct contents. This is faster. - We can later introduce a SyntaxDataRef, which behaves similar to SyntaxData, but delegates the responsibility that the parent stays alive to the user. While sacrificing guaranteed memory safety, this means that SyntaxData can then be stack-allocated without any ref-counting overhead.
In contrast to SyntaxData, SyntaxDataRef is not memory-safe, but designed to be fast. In particular, the following guarantees from SyntaxData are being dropped: - SyntaxDataRef does not retain the SyntaxArena containing its RawSyntax. The user of SyntaxDataRef has to provide that guarantee. However, that's usually pretty easily done by just retaining the SyntaxArena of the tree's root node. - The parent of a SyntaxDataRef must outlive the child node. This is the more tricky constraint, but if a tree is just walked top to bottom with nodes stored on the stack, this is given by the way the stack is being unrolled.
@swift-ci Please smoke test |
I feel this class hierarchy is safer. WDYT? class SyntaxDataBase {
const AbsoluteRawSyntax AbsoluteRaw;
protected:
SyntaxDataBase(const AbsoluteRawSyntax & AbsoluteRaw)
public:
// ... accessors to 'AbsoluteRaw'
}
class SyntaxDataRef: public SyntaxDataBase {
const SyntaxDataRef *Parent;
SyntaxDataRef(const AbsoluteRawSyntax &AbsoluteRaw, const SyntaxDataRef *Parent)
public:
const SyntaxDataRef *getParent() const;
}
class SyntaxData: public llvm::ThreadSafeRefCountedBase<SyntaxData>, public SyntaxDataBase {
RC<const SyntaxData> Parent;
RC<SyntaxArena> Arena;
SyntaxData(const AbsoluteRawSyntax &AbsoluteRaw, const RC<const SyntaxData> &Parent)
SyntaxData(const AbsoluteRawSyntax &AbsoluteRaw)
public:
RC<const SyntaxData> getParent() const;
} |
The beauty of the the current design is that upcasting Foreshadowing to some future PRs, I’ll have a I agree that your design is definitely safer, but would argue that the unsafe bits are pretty localized and I wouldn’t like to loose the property described above. Do you agree or would you argue in favor of safety? |
Zero cost demotion is attractive. But I'm still not convinced that's worth it. |
It’s a fair point. Especially since we’re not using it (and I don’t see that we need to for the parsing parsing). Maybe we should go with your safer approach for now and migrate to this if it turns out that
/// If \c this node has a child at \p Index, write the child's \c
/// SyntaxDataRef to \p DataMem and return \c true. If no child exists at \p
/// Index, leave \p DataMem untouched and return \c false.
bool getChildRef(AbsoluteSyntaxPosition::IndexInParentType Index,
SyntaxDataRef *DataMem) const {
auto AbsoluteRaw = getAbsoluteRaw().getChild(Index);
if (AbsoluteRaw) {
new (DataMem) SyntaxDataRef(*AbsoluteRaw,
/*Parent=*/const_cast<SyntaxDataRef *>(this));
return true;
} else {
return false;
}
} |
Thinking about it some more, I think the current design has two issues (please correct me if you see more drawbacks @rintaro):
I’m personally not too fuzzed about the manual ref-counting of And I think we can eliminate all the issues around With this, I think I would like to keep the zero-cost demotion around as an option and stick with the current design. |
@swift-ci Please smoke test |
Optional<SyntaxData> getPreviousNode() const; | ||
#ifndef NDEBUG | ||
virtual bool isRef() const { return true; } | ||
virtual ~SyntaxDataRef() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a dumb C++ question. Could you explain why this is in #ifndef
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it’s #ifndef NDEBUG
everywhere else as well. I always assumed this was to make the default a debug build.
If you use #ifdef DEBUG
, the default is a non-debug build unless you define DEBUG
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I wasn't clear. I meant why the destructor must be guarded by the #ifndef
?
/// Get the child at \p Index if it exists. If the node does not have a child | ||
/// at \p Index, return \c None. Asserts that \p Index < \c NumChildren | ||
Optional<AbsoluteRawSyntax> | ||
getChild(AbsoluteSyntaxPosition::IndexInParentType Index) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we return isNull()
AbsoluteRawSyntax instead of Optional
? If not, what the purpose of isNull()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of the isNull()
AbsoluteRawSyntax
is so we can construct a null SyntaxDataRef
(which has a AbsoluteRawSyntax
member). We need this for the getChildRef
method, I already described here, which is used as follows:
SyntaxRef result;
someNode.getChildRef(1, result);
Technically, result
just needs to be uninitialized memory. If you know how to create uninitialized stack memory, we can remove isNull
altogether.
I though about returning a null AbsoluteRawSyntax
here, but decided against it, because I expect there will be a whole category of bugs where we would be accessing an optional child without check if it is null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m looking into whether we can make Optional<AbsoluteRawSyntax>
a zero-cost wrapper around AbsoluteRawSyntax
whose null value is internally backed by isNull()
and make isNull()
private. Then we would have the compile time safety of Optional
but the runtime behavior as if we returned an isNull()
AbsoluteRawSyntax
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the isNull
function and made Optional<AbsoluteRawSyntax>
a zero-cost wrapper. Since the change wasn't trivial (~250 LOC), I created a follow-up PR for it here.
Merged as part of #36313. |
In contrast to
SyntaxData
,SyntaxDataRef
is not memory-safe, but designed to be fast. In particular, the following guarantees fromSyntaxData
are being dropped:SyntaxDataRef
does not retain theSyntaxArena
containing itsRawSyntax
. The user ofSyntaxDataRef
has to make sure the arena stays alive while theSyntaxDataRef
is being used. However, that's usually pretty easily done by just retaining theSyntaxArena
of the tree's root node.SyntaxDataRef
must outlive the child node. This is the more tricky constraint, but if a tree is just walked top to bottom with nodes stored on the stack, this is given by the way the stack is being unrolled.