Use getMetadataLayout offset instead of manual gep #29204

zoecarver · 2020-01-14T23:27:16Z

This patch updates getOffsetForIndex to use StructMetadataLayout::getFieldOffset instead of emitting a swift type metadata call. In some cases, this generates smaller/faster assembly because there is now only an alloc and gep instead of also having the previous swift type metadata calls.

All the IRGen tests succeed but I'm not very familiar with this part of the codebase so, there may be things I am missing (tests, other methods that need to be updated, etc.).

compnerd · 2020-01-15T00:34:25Z

This looks right to me. Thanks for the cleanup!

compnerd · 2020-01-15T00:34:30Z

@swift-ci please test

swift-ci · 2020-01-15T01:22:40Z

Build failed
Swift Test OS X Platform
Git Sha - e19f40a

zoecarver · 2020-01-15T01:29:14Z

@compnerd thanks!

I'll update the broken tests. Could you also run the benchmarks?

swift-ci · 2020-01-15T01:47:36Z

Build failed
Swift Test Linux Platform
Git Sha - e19f40a

zoecarver · 2020-01-15T05:22:07Z

I spent a bit of time debugging this. At least some of the tests are failing because TestSuite seems to be "broken." I'm not familiar with the standard library unit test framework, but the function _childProcess splits an input string and expects that there are at least two elements in that array. Somehow, this change made it so that there are no longer two elements in the input.

Anyway, I'll keep debugging, but I'm curious if anyone has ideas as to the source of the issue.

compnerd · 2020-01-15T19:57:57Z

@swift-ci please benchmark

zoecarver · 2020-01-16T21:31:43Z

I originally misdiagnosed the problem. I'm still not quite sure what the issue is but, somehow the standard library was miscompiled.

Here are a few examples of failures, both these fail when being jitted but not compiled. The following triggers the assertion: Can't form Range with upperBound < lowerBound (even when I switch the bounds).

let foo = (0 ..< 10).map { $0 }

And the following hits a bad access in _swift_retain_.

struct Foo { var x : Int }
let mirror = Mirror(reflecting: Foo(x: 0))

Still trying to find a reproducer that doesn't make use of the miscompiled standard library.

jckarter · 2020-01-16T21:35:13Z

You could try having the compiler generate both the old and new code patterns, and insert an assertion function call that the offsets match, to help track down where the behavior is changing.

zoecarver · 2020-01-16T21:41:58Z

@jckarter thanks for the suggestion! I tried doing something similar. I dumped the LLVM IR modules and looked at a diff. I couldn't spot any issues and they both compiled and ran fine. It seems the issue is only when they're executed by the JIT. Is there a way to give the swift JIT an LLVM IR file?

jckarter · 2020-01-16T22:32:27Z

IIRC you can pass the frontend -use-jit to force it into JIT mode, which could let you inspect whether changes in IR generation are affecting behavior. I don't think there's a way that's already implemented to feed an IR file back into the JIT, but you could probably hack one up.

jckarter · 2020-01-16T22:33:56Z

What I was thinking was that you could change the compiler to emit a runtime assertion that the two codegen paths produce equivalent results, which you could then use to run the test suite and hopefully catch a trap near the point where the runtime behavior diverges.

zoecarver · 2020-01-17T05:38:35Z

@jckarter thanks, the -use-jit did produce a binary that errored and I was able to track down the issue to the following bit of IR:

define i32 @main(i32, i8**) {
  %8 = alloca swifterror %swift.error*, align 8
  store %swift.error* null, %swift.error** %8, align 8
  ; ...
  %40 = call swiftcc %swift.bridge* @"$sSlsE3mapySayqd__Gqd__7ElementQzKXEKlF"(i8* bitcast (void (%TSi*, %TSi*, %swift.refcounted*, %swift.error**)* @"$sS2is5Error_pIgydzo_S2isAA_pIegnrzo_TRTA" to i8*), %swift.opaque* %32, %swift.type* %37, %swift.type* @"$sSiN", i8** %38, %swift.opaque* noalias nocapture swiftself %39, %swift.error** noalias nocapture swifterror dereferenceable(8) %8)
  %41 = load %swift.error*, %swift.error** %8, align 8
  %42 = icmp ne %swift.error* %41, null
  br i1 %42, label %bar, label %foo

foo:
  ; ...
  ret i32 0

bar:
  store %swift.error* null, %swift.error** %8, align 8
  call void @llvm.stackrestore(i8* %30)
  unreachable
}

The program errors because unreachable is reached meaning Collection.map updated %8.

I'll try adding Collection.map to my source file to see if I can get further.

Sorry, but I still don't quite understand what you are suggesting with the runtime assertion. How would you suggest I test the codegen paths?

jckarter · 2020-01-17T17:45:41Z

For example, you could emit the old way, take the result, emit the new way using getMetadataLayout, take that result, then emit a call void noreturn @__swift_assert_equal(i8* %old, i8* %new) afterward, and write a small function in the runtime that aborts if the values aren't equal. Sounds like you might've already tracked down the issue, though.

zoecarver · 2020-01-17T18:05:16Z

@jckarter ahh, I see! That's a really good idea. Now I see what you mean. I'll try that.

zoecarver · 2020-01-20T05:16:43Z

The assertion was extremely helpful when debugging this.

@jckarter What is StructMetadataLayout::getFieldOffsetVectorOffset() supposed to return? For the below example, it returns 40 for LazyMapSequence._transform which seems odd to me.

Example (from the standard library):

public func asHex<T : FixedWidthInteger>(_ x: T) -> String {
  return "0x\(String(x, radix: 16))"
}

public func asHex<S : Sequence>(_ x: S) -> String
  where
  S.Element : FixedWidthInteger {
  return "[ \(x.lazy.map { asHex($0) }.joined(separator: ", ")) ]"
}

Edit: I should add that the old code returned 16.

jckarter · 2020-01-21T19:02:18Z

getFieldOffsetVectorOffset returns the offset within the metadata record at which the field offset vector begins. You would still need to add the index of the particular field you're looking for to that base offset, and load the offset from the metadata record at that indexed offset, to get the field offset for a specific generic instance.

In the case of LazyMapSequence, the type layout is:

public struct LazyMapSequence<Base: Sequence, Element> {
  internal var _base: Base
  internal let _transform: (Base.Element) -> Element
}

so a metadata record layout is going to be like this:

struct LazyMapSequenceMetadata {
  // Common value type metadata header
  uintptr_t kind = MetadataKind::Struct;
  TypeContextDescriptor *descriptor = &`type context descriptor for LazyMapSequence`;
  
  // Generic arguments
  Metadata *Base;
  Metadata *Element;
  WitnessTable *Base_Sequence;

  // Start of field offsets, where getFieldOffsetVectorOffset refers to
  uint32_t _base, _transform;
}

so 40 looks to be right for getFieldOffsetVectorOffset to return. The offset for _transform would have to be loaded at metadata offset 44, since it's the second field.

zoecarver · 2020-01-21T19:31:36Z

@jckarter thanks a lot for that explanation. Very helpful.

I understand the problem now, I thought that I could get the _transform offset directly from StructMetadataLayout, but in reality, I have to keep the call to emitTypeMetadataRefForLayout and load it from there.

Anyway, I'll update the code later today, which should fix it. Thanks again for all the help/explaining.

zoecarver · 2020-01-22T03:54:13Z

All fixed. The tests are passing now.

@jckarter thanks again for the help.

Edit: could someone re-trigger the bots?

jckarter · 2020-01-22T17:42:25Z

@swift-ci Please test

swift-ci · 2020-01-22T17:44:34Z

Build failed
Swift Test OS X Platform
Git Sha - e19f40a

swift-ci · 2020-01-22T17:44:52Z

Build failed
Swift Test Linux Platform
Git Sha - e19f40a

zoecarver · 2020-01-22T20:22:32Z

This patch eliminates one GEP instruction in the llvm-ir. Do you think it would be worth running the benchmarks too? I know the original patch had code-size improvements but, I think those may not exist anymore (now that the codegen is correct).

jckarter · 2020-01-22T20:32:41Z

@swift-ci Please benchmark

swift-ci · 2020-01-22T21:28:18Z

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
RandomShuffleLCG2	416	464	+11.5%	0.90x (?)
ArrayPlusEqualFiveElementCollection	4255	4736	+11.3%	0.90x (?)

Improvement	OLD	NEW	DELTA	RATIO
Calculator	155	136	-12.3%	1.14x (?)
Chars2	3400	3050	-10.3%	1.11x (?)
NSStringConversion.LongUTF8	320	292	-8.7%	1.10x (?)

Code size: -O

Performance: -Osize

Regression	OLD	NEW	DELTA	RATIO
ArrayLiteral2	102	117	+14.7%	0.87x (?)
ArrayPlusEqualFiveElementCollection	4218	4736	+12.3%	0.89x (?)
RemoveWhereMoveInts	18	20	+11.1%	0.90x
DistinctClassFieldAccesses	180	197	+9.4%	0.91x
IterateData	870	948	+9.0%	0.92x (?)
ArrayInClass	845	915	+8.3%	0.92x (?)
Set.isStrictSubset.Int.Empty	49	53	+8.2%	0.92x (?)
Hanoi	2220	2400	+8.1%	0.93x
ArraySetElement	262	283	+8.0%	0.93x

Improvement	OLD	NEW	DELTA	RATIO
Calculator	155	137	-11.6%	1.13x
ObjectiveCBridgeStubToNSDate2	1070	950	-11.2%	1.13x (?)
PrefixWhileCountableRangeLazy	18	16	-11.1%	1.12x (?)
Chars2	3400	3050	-10.3%	1.11x

Code size: -Osize

Performance: -Onone

Improvement	OLD	NEW	DELTA	RATIO
DataSubscriptMedium	58	54	-6.9%	1.07x (?)

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

zoecarver · 2020-01-22T21:48:16Z

Intresting-- those results are not at all what I would have expected. The Osize performance regressions are especially unexpected. I wish there was also a table for Onone code size. Anyway, at least there's a small improvement in one of the Onone benchmarks.

Should I be worried about any of the regressions? Is there any way they're incorrect?

jckarter · 2020-01-22T22:05:07Z

The performance numbers all look like noise to me. I wouldn't worry about them. I think this is good to go.

zoecarver · 2020-01-22T22:05:57Z

Sounds good. Thanks for all the help with this patch!

Use getMetadataLayout offset instead of manual gep

e19f40a

zoecarver requested review from eeckstein and gottesmm January 14, 2020 23:29

Modify to catch issue

7720a62

stash

7f5f1c3

zoecarver added 4 commits January 21, 2020 16:04

Load offset from metadata instead of returning offset directly

bd3d743

Remove debug helpers

f99b53d

Fix tests

ed3442a

Remove newRet

b3a305b

jckarter merged commit ea1e219 into swiftlang:master Jan 22, 2020

Use getMetadataLayout offset instead of manual gep #29204

Use getMetadataLayout offset instead of manual gep #29204

Uh oh!

Conversation

zoecarver commented Jan 14, 2020

Uh oh!

compnerd commented Jan 15, 2020

Uh oh!

compnerd commented Jan 15, 2020

Uh oh!

swift-ci commented Jan 15, 2020

Uh oh!

zoecarver commented Jan 15, 2020

Uh oh!

swift-ci commented Jan 15, 2020

Uh oh!

zoecarver commented Jan 15, 2020

Uh oh!

compnerd commented Jan 15, 2020

Uh oh!

zoecarver commented Jan 16, 2020

Uh oh!

jckarter commented Jan 16, 2020

Uh oh!

zoecarver commented Jan 16, 2020

Uh oh!

jckarter commented Jan 16, 2020

Uh oh!

jckarter commented Jan 16, 2020

Uh oh!

zoecarver commented Jan 17, 2020

Uh oh!

jckarter commented Jan 17, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zoecarver commented Jan 17, 2020

Uh oh!

zoecarver commented Jan 20, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jckarter commented Jan 21, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zoecarver commented Jan 21, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zoecarver commented Jan 22, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jckarter commented Jan 22, 2020

Uh oh!

swift-ci commented Jan 22, 2020

Uh oh!

swift-ci commented Jan 22, 2020

Uh oh!

zoecarver commented Jan 22, 2020

Uh oh!

jckarter commented Jan 22, 2020

Uh oh!

swift-ci commented Jan 22, 2020

Performance: -O

Code size: -O

Performance: -Osize

Code size: -Osize

Performance: -Onone

Code size: -swiftlibs

Uh oh!

zoecarver commented Jan 22, 2020

Uh oh!

jckarter commented Jan 22, 2020

Uh oh!

zoecarver commented Jan 22, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

jckarter commented Jan 17, 2020 •

edited

Loading

zoecarver commented Jan 20, 2020 •

edited

Loading

jckarter commented Jan 21, 2020 •

edited

Loading

zoecarver commented Jan 21, 2020 •

edited

Loading

zoecarver commented Jan 22, 2020 •

edited

Loading

zoecarver commented Jan 22, 2020 •

edited

Loading