Skip to content

Fix caches of info that depends on inherited classes #2425

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
May 29, 2017

Conversation

odersky
Copy link
Contributor

@odersky odersky commented May 13, 2017

Change from the ground up the way we treat inherited information. We get rid of:

  • superId structures,
  • fingerprints
  • Frozen and FullyCompleted flags,
  • restrictions on when we can enter new symbols in scopes
  • most manual cache invalidation

We treat systematically the problem of cache dependencies by tracking reverse dependencies with weak hashmaps. We need cooperation from phases to declare what kind of inherited info they can change.

There are two new kinds of caches:

  • MemberNames contains various kinds of member name sets indexed by name filter
  • BaseData contains baseClasses (i.e. base class linearization), and BaseClassSet, which
    is a sorted-array-backed set for fast subclass checking. BaseClassSets replace the
    superId maps we had before.

The PR fixes several problems with inherited info. This manifested itself in the past in a fingerprint problem when function specialization is turned on. But fingerprints are just where the problem arises first; it is also possible to hit problems with the other caches.

odersky added 7 commits May 12, 2017 13:39
Also introduce dummy onBehalfs as sentinels to avoid
the need for `null` or `Option`.
After erasure all basedata computations were marked as provisional
because Object has empty parents then. This caused slowdowns of
up to 100% (~100% was observed for stdlib).
odersky added 12 commits May 13, 2017 17:54
 - transformers caches as long as they are valid in new phase
 - avoid state in invalid dummy caches, so that they don't need
   to be put in contexts
 - refactor caches into traits and implementation classes
The lgoxi is much nicer without it.
No longer needed because we have proper cache invalidation now
It's wow subject to the same invalidation as base data caches
(except that there's no need for recursive invalidation of caches
in subclasses)
@odersky odersky changed the title WIP: Fix inherited caches Fix caches of info that depends on inherited classes May 14, 2017
@odersky odersky requested a review from DarkDimius May 14, 2017 09:29
(reverted from commit c1254ed)

Last data over whole legacy test suite (running time: 337s):

BaseData cache fills per phase:

  0, 670675, 0, 0, 2648, 0, 12, 0, 2, 548, 1153, 1189, 190, 115, 0, 0, 0, 0, 28, 68, 0, 0, 129, 267, 0, 0, 0, 0, 0, 4, 0, 38, 0, 3, 13, 1, 6857, 809, 167147, 59961, 671, 246, 5417, 169, 114, 615, 17, 0, 2, 0, 1753, 0, 27, 0, 0, 8032, 26338, 0, 0, 736, 0, 0, 0, 4851, 69224, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0

Cache invalidations: 80

final def isValid(implicit ctx: Context): Boolean = valid && isValidAt(ctx.phase)

def invalidate(): Unit =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing an (implicit ctx: Context)

@smarter
Copy link
Member

smarter commented May 14, 2017

Nice that we can unify logic for all these caches! Did you measure any performance difference compared to master?

@DarkDimius
Copy link
Contributor

@smarter, I'm reviewing this PR now and there are things that may affect performance. I'd propose to benchmark after review, but before merging.

Copy link
Contributor

@DarkDimius DarkDimius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very nice systematization!

The high level comment is that changesMembers is not checked on all the paths. I guess it is because checking it is very slow.
I'd suggest to add such checks guarder by a Config flag, as it may save us a lot of time in case we discover subtle bugs.

Otherwise(and modulo @smarter's comment) looks very good to me.

@@ -331,6 +339,8 @@ object Phases {
final def refChecked = myRefChecked // Phase is after RefChecks
final def symbolicRefs = mySymbolicRefs // Phase is after ResolveSuper, newly generated TermRefs should be symbolic
final def labelsReordered = myLabelsReordered // Phase is after LabelDefs, labels are flattened and owner chains don't mirror this
final def membersGroup = myMembersGroup // group id for phases where all symbols have the same non-private members
Copy link
Contributor

@DarkDimius DarkDimius May 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either the doc & implementation are out of sync~, or this is a bug.

The implementation counts how many times where parents changed, it does not use the actual phase-ids.
I've had a look through usages, this value seems to only be used for equality comparisons, so the "number of times members could have changed" would behave correctly, but I'd still propose to make it return a PhaseID to ease understanding.

Same with parentsGroup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll change to phase ids, and use a more descriptive name.

val info1 = if (info != null) info else this.info
if (ctx.isAfterTyper && changedClassParents(info, info1, completersMatter = false))
assert(ctx.phase.changesParents, i"undeclared parent change at ${ctx.phase} for $this, was: $info, now: $info1")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a check that also checks correctness of changesMembers assumption here?
It may be slow, so I'd propose to guard it with config flag.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check is in enteredAfter. I think it would make little sense checking in info_= because typically we set the info with a clone of the previous scope and then add symbols later using enteredAfter.

@@ -1209,6 +1230,68 @@ object SymDenotations {

import util.LRUCache

// ----- caches -------------------------------------------------------

private[this] var myTypeParams: List[TypeSymbol] = null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to myTypeParamsCache for consistency?

private var memberNamesCache: MemberNames = MemberNames.None

private def memberCache(implicit ctx: Context): LRUCache[Name, PreDenotation] = {
if (myMemberCachePeriod != ctx.period) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd propose to check for inclusion instead of equality: !myMemberCachePeriod.contains(ctx.period)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is that mini-phase driver runs with ctx.period set to multiple phases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But myMemberCachePeriod is always set to a single phase period, right?

}

private def baseTypeRefCache(implicit ctx: Context): BaseTypeRefMap = {
if (myBaseTypeRefCachePeriod != ctx.period &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, contains instead of equality.

super.info_=(tp)
if (changedClassParents(infoOrCompleter, tp, completersMatter = true))
invalidateBaseDataCache()
invalidateMemberNamesCache()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd propose to add a check similar to changedClassParents for members, which is guarder by a Config flag.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

member names are relatively infrequently needed, so I think we lose little to invalidate them more aggressively than strictly needed. Also, names will usually be entered later using enteredAfter.

myMemberFingerPrint.include(sym.name)
if (myMemberCache != null)
myMemberCache invalidate sym.name
if (myMemberCache != null) myMemberCache.invalidate(sym.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here.

unforcedDecls.openForMutations.replace(prev, replacement)
if (myMemberCache != null)
myMemberCache invalidate replacement.name
if (myMemberCache != null) myMemberCache.invalidate(replacement.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here)

info.decls.openForMutations.unlink(sym)
myMemberFingerPrint = FingerPrint.unknown
if (myMemberCache != null) myMemberCache invalidate sym.name
if (myMemberCache != null) myMemberCache.invalidate(sym.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is caching private members worth it?

}

private def invalidateBaseDataCache() = {
baseDataCache.invalidate()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need both call to invalidate and assignment to an empty cache?

In copyCaches you copy over the refernce from previous cache. It means that invalidating a future cache will also invalidate an old cache, which is still correct.

Same in other invalidation methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. It was originally to save space, but I don't think that matters in practice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact no. It does matter. IsValid at only checks that the creation period is compatible with the current one. Invalidation does not affect the creation period. So we need another way to signal that the cache is nowhere valid. That's what the assignment does.

@odersky odersky removed the stat:wip label May 16, 2017
@felixmulder felixmulder merged commit 2b34272 into scala:master May 29, 2017
@allanrenucci allanrenucci deleted the fix-inherited-caches branch December 14, 2017 19:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants