Skip to content

[TypeChecker] Issue error message when accessing subscript using keyword #5537

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

ahoppen
Copy link
Member

@ahoppen ahoppen commented Oct 30, 2016

Trying to access a subscript using foo.subscript(key) currently crashes the compiler. This PR creates an error message for it.

I actually wanted to provide a fixit replacing foo.subscript(key) with foo[key] but that would require knowledge about whether the result of foo.subscript will be called or not, which AFAIK is not present at the point where the member lookup was done.

Resolves SR-2575.

@ahoppen
Copy link
Member Author

ahoppen commented Oct 30, 2016

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - 6a9364c740b326f657deb8d2317a085a4acfc731
Test requested by - @ahoppen

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test Linux Platform
Git Commit - 6a9364c740b326f657deb8d2317a085a4acfc731
Test requested by - @ahoppen

@CodaFi
Copy link
Contributor

CodaFi commented Oct 30, 2016

I'm not sure Sema is the right place to diagnose this. I think there is a way to do this in Parse and provide the fixit. Last time I took a look at this the problem I needed a backtracking scope in some awkward place or other to find the bounds of the bracketed expr, but it's been a while so maybe it will be easier this time

@ahoppen
Copy link
Member Author

ahoppen commented Oct 30, 2016

Are subscripts actually fundamentally different than functions? If not, we could maybe just allow foo.subscript(key) and foo.subscript to refer to the subscript function itself.

@CodaFi
Copy link
Contributor

CodaFi commented Oct 30, 2016

They are not valid member refs in the surface language so I don't think we should try to translate them to anything; they should just be banned by diagnostic like you've got here. Take a look at how I disabled .dynamicType in Parse and we should be able to diagnose thereabouts. As an aside you should run ./swift/utils/resolve-crashes.py

@rintaro
Copy link
Member

rintaro commented Oct 31, 2016

I don't think this is the right way to go.
Please consider:

struct S {
    func `subscript`(x: Int) { }
}
let s = S()
s.subscript(x: 1)

This is definitely a valid Swift code.

Instead, I think what we should do is to change the "internal" identifier for subscript members.

@ahoppen ahoppen force-pushed the SR-2575-subscript-reference branch 2 times, most recently from 4ad3cb5 to 97db331 Compare October 31, 2016 10:01
@ahoppen
Copy link
Member Author

ahoppen commented Oct 31, 2016

Looking at the issues above, I get the feeling that the root cause is that subscripts are internally named subscript which is a valid identifier (even though it may need to be used in backticks). Any barrier we try to build to prevent this feels artificial. Wouldn't it be better to internally use an identifier for he subscript function that isn't available in the surface language like $subscript (similar to how $match is used in switch statements)? Without having implemented it so far, I think this should:

  • Make foo.subscript produce a diagnostic that foo has no member called subscript (which it really hasn't)
    • We may add a slightly modified diagnostic, hinting to access the subscript using []
    • I don't see an easy way for a fixit but would guess that it's not a common error anyway
  • Allow the above code sample to be written func subscript(x: Int) (no need for backticks anymore)
    • To avoid confusion, we could add a warning here that suggests removing the func keyword or wrapping subscript in backticks to silence the warning. But that also would be a purely cosmetic warning

I'll try this path unless somebody knows for sure that this is a no-go, because subscripts are by construction required to be named subscript (in SILGen maybe?)

@jrose-apple
Copy link
Contributor

@DougGregor and I were talking about this recently (for subscript, init, and deinit). Doug's idea was that we'd have special DeclNames for these special names, but continue using the usual Identifiers. We might have to change a few places in the compiler still using Identifier instead of DeclName, but there probably aren't many of those.

Oh, and the reason why you can't just use a subscript (or property) as a method is because it's unclear whether you mean the getter or the setter. I suppose you could figure that out from the argument list, but it's still not something we're interested in something that would have to go through the Swift Evolution Process if it were to change.

@rudkx
Copy link
Contributor

rudkx commented Oct 31, 2016

These used to fail look-up, e.g. in 7.3:

error: expected member name following '.'
String(str.characters.subscript(
                      ^

I wonder what changed here in 8.0 that results in us now finding the member function?

@rintaro
Copy link
Member

rintaro commented Oct 31, 2016

@rudkx It's SE-0071

@ahoppen
Copy link
Member Author

ahoppen commented Oct 31, 2016

@jrose-apple Just to make sure I understand you correctly: So you propose to have a special DeclNames for subscripts that don't carry an identifier but instead have some kind of flag set to identify them as subscript references?

@jrose-apple
Copy link
Contributor

Pretty much. My first thought had been to still have them carry an Identifier so that they would print right by default, but @DougGregor points out that DeclNames already control their own printing, and not doing that will help flush out bugs faster.

This is a preparation to use special DeclNames that don't carry
identifiers for subscripts, initializers, and destructors.
Once special DeclNames are used, their identity can no longer be
determined by the identifier they carry and thus cannot be encoded as
an opaque pointer anymore.
Use a special DeclName that doesn't have an identifier as the name of
subscripts. This prevents lookup issues where a subscript decl is found
when looking for a function named 'subscript'.

This fixes SR-2575
Provide a real error message when trying to access a subscript using
foo.subscript(key) instead of a generic one, saying that foo has no
member named 'subscript'.
@ahoppen
Copy link
Member Author

ahoppen commented Dec 8, 2016

Sorry for the delay on this PR, I was rather busy with uni last month and the changes were bigger than I expected.

In addition to this PR, PR #104 must be merged in the LLDB repo at the same time to apply the changes from Identifiers to DeclNames and to consider special DeclNames.

Changes

I changed a lot of places in the compiler to use DeclNames instead of Identifiers to represent a declaration's name, so that special a special DeclName, that is not backed by an identifier, can be used for subscripts. I plan on introducing special DeclNames for init and deinit in later pull requests.

Thus lookup for an identifier with name "subscript" no longer returns the subscript declaration, fixing the underlying issue of SR-2575.

I should note that since I changed the return value of ValueDecl::getBaseName() from Identifier to DeclName this PR will probably affect other PRs but the necessary changes to get them working again should be trivial.

Possible issues

I currently see three questions/issues that may need to be fixed to implement everything properly. All of these issues also apply to init and deinit once we represent them by special DeclNames as well.

Serialization

Serialization of modules (i.e. the thing done by DeclTableInfo in ModuleFile.cpp and Serialization.cpp, and BaseNameToEntitiesTableWriterInfo in SwiftLookupTable.cpp) currently serializes names as strings to disk. To avoid clashes with function of name subscript, I chose to represent subscripts as $subscript. Since this is not a valid identifier in the surface language, no collisions can occur.

I actually wanted to serialize the DeclName as its kind possibly followed by the identifier but couldn't work out how the serialization process works.

Since this changes the way modules are serialized, module caches may need to be emptied before compiling this PR.

Representation of subscripts in SIL

Subscripts are AFAIK currently represented in SIL as <Type>.subscript!(getter|setter). If subscripts no longer occupy the subscript identifier, I think this can lead to ambiguity if someone also defines a function named subscript (but am not quite sure). So, should subscripts be represented by a different syntax (we could again use the $subscript method even though it's not pretty)? Since I don't really know how SIL works, could maybe someone have a look at this.

Name mangling of subscripts

Currently, subscripts receive a mangled name of a getter function with name subscript. I don't really know anything about how the name mangling is supposed to work but if subscripts no longer occupy the subscript identifier in the compiler, it feels weird to have subscript baked into the mangled name and it would feel cleaner to me to have a special flag for them in the mangled name. Again, could maybe someone have a look at this.

@jrose-apple
Copy link
Contributor

Wow! I didn't expect you to go all the way on this.

I'll look at this more later, but I don't think we want to change the behavior of getName. Instead, we should just migrate that over to the existing getBaseName where it makes sense, and delete the old thing / sink it down to VarDecl or similar where we always have a real name. (I assume you got these two mixed up.)

@jrose-apple
Copy link
Contributor

Also, I think we do want to continue being able to treat DeclName as an opaque pointer value; we should be able to do that by jamming the "special kind" data in where an Identifier would have gone. And ObjCSelector shouldn't be net-affected at all by this change; it's never going to have one of these special kinds in it.

@ahoppen
Copy link
Member Author

ahoppen commented Dec 10, 2016

Maybe a short rationale for the changes I did and why I did them:

ValueDecl already stores its name as a DeclName but since all DelcNames currently have an underlying Identifier it was safe to return an Identifier in getName. Now that there are special DeclNames, I decided to whitelist a few hand-picked classes (TypeDecl, ParamDecl, EnumElementDecl) that are still guaranteed to have an identifier because their initializers take an Identifier. To these, I added a getIdentifier method that behaves exactly as the current getName method (we could maybe rename it to getName to keep it more consistent with the current API, I chose to rename it to highlight that an Identifier and no DeclName is returned). I deliberately chose not to include VarDecls in the whitelisted list since I thought that the current $match variable in pattern matching and $generator variable in foreach statements might actually be better represented by special DeclNames. (A quick scan tells me that there also seems to be $error and $defer which might also be candidates for special DeclNames in the future).

Whenever a ValueDecl can be guaranteed to have an identifier because of special knowledge, the identifier can be retrieved as getBaseName().getIdentifier() which will crash if the DeclName is special (e.g. pieces of an ObjC selector, operator precedence lookup). This needs to be currently done in 15 places in the compiler which I think is reasonable.

As to the opaque pointer value of DeclNames, I must admit that I didn't really understand what it is used for (one place used SmallPtrSet to store a set of DeclNames but that could be easily changed to SmallSet) so I decided to delete it. Could you maybe enlighten me on the purpose of opaque pointer values?

@jrose-apple
Copy link
Contributor

Could you maybe enlighten me on the purpose of opaque pointer values?

It's mostly an efficiency thing. A usual program has plenty of DeclNames, word-sized integers are faster to manipulate, etc etc. It also means DeclNames can themselves be put into PointerIntPairs, though I guess not if we use up all the spare bits.

I realize I misinterpreted your change; I thought getBaseName already returned a DeclName, and that you had actually changed getName. Removing it in favor of targeted getIdentifier seems reasonable.

This is of course a huge change and so it will take me/us a while to get through reviewing it. But thank you!

@ahoppen
Copy link
Member Author

ahoppen commented Dec 14, 2016

The only places this was used, though, was for the ObjC name of ObjC attributes and as an intermediate data structure in ImportName.cpp:mergeOverriddenNames. Both don't seem to be hotspots to me.

OK, should I try to keep the merge conflicts at a minimum and push new commits every now and then or would you like to review the commits in one chunk?

@jrose-apple
Copy link
Contributor

I'm going to review commit-by-commit anyway for a change like this, so maybe it makes sense to split it up into multiple PRs.

I really do want to keep DeclName to a single pointer, though. I think it's doable.

@ahoppen
Copy link
Member Author

ahoppen commented Dec 14, 2016

OK, then I'd suggest the following procedure:

  • We keep this PR as a reference for the general scheme of things but close it
  • I create a new PR that only contains the first commit (and will perform a final test that it works on its own before that, expect it in two days or so)
  • I keep that single-commit PR up-to-date and resolve merge conflicts as quickly as possible
  • Once the first PR is merged, I create a second one for the next commit and we do the same again
  • When we reach the commit where I removed the opaque pointer logic, you implement a new logic for it since you seem to have a general idea on how to do it and I don't ;-)

@jrose-apple
Copy link
Contributor

Sounds good to me. Thanks again for working on this, Alex!

@ahoppen
Copy link
Member Author

ahoppen commented Dec 15, 2016

First commit -> #6300
Second commit -> #6305

@ahoppen ahoppen deleted the SR-2575-subscript-reference branch June 26, 2017 17:30
MaxDesiatov pushed a commit that referenced this pull request Sep 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants