Skip to content

differentiate struct_extract instructions using VJP #21567

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Dec 31, 2018

Conversation

marcrasi
Copy link

Differentiates struct_extract instructions using the VJP of the corresponding getter.

Float._value has no VJP defined so AD can't differentiate public static prefix func - (x: ${Self}). This could be fixed by defining a VJP for Float._value or by adding a custom adjoint for -. The latter sounded less invasive so I did that.

@marcrasi marcrasi requested review from rxwei and dan-zheng December 30, 2018 01:58
@marcrasi
Copy link
Author

@swift-ci please test tensorflow

2 similar comments
@marcrasi
Copy link
Author

@swift-ci please test tensorflow

@marcrasi
Copy link
Author

@swift-ci please test tensorflow

Copy link
Contributor

@rxwei rxwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

SmallVector<SILValue, 8> args;
auto seed = getAdjointValue(sei);
auto *seedBuf = builder.createAllocStack(loc, seed.getType());
materializeAdjointIndirectHelper(seed, seedBuf);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of always allocating stack and emitting a load when the adjoint is direct, could you call materializeAdjointDirect directly? It'd simplify both your logic and possibly generated code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh cool, in fact, I should always be able to materializeAdjointDirect because struct_extract only works on object types. This simplifies everything.

SmallVector<SILValue, 8> args;
auto seed = getAdjointValue(sei);
auto *seedBuf = builder.createAllocStack(loc, seed.getType());
materializeAdjointIndirectHelper(seed, seedBuf);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helper materializeAdjointIndirectHelper is not supposed to be used outside of materializeAdjointIndirect. It expects the buffer argument to be the result of begin_access. Please switch to materializeAdjointIndirect if that does what you need.

void visitStructExtractInst(StructExtractInst *sei) {
auto *structDecl = sei->getStructDecl();
Copy link
Contributor

@rxwei rxwei Dec 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we have derived conformances to Differentiable, it'd be much more efficient to use the original struct_extract differentiation logic when we know that the cotangent type equals the original type and that the property vjp is compiler-synthesized (this can be checked via Decl::isImplicit(), I think).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, this gives me an idea. The original struct_extract differentiation logic does not need getter VJPs. So if I can get the original struct_extract differentiation logic working on structs with derived conformance to Differentiable (even those with cotangent type not equal to self), then I don't need to write any code that synthesizes getter VJPs.

I think that the original struct_extract logic can be made to work even on derived conformances to Differentiable with cotangent type not equal to self, because it should be possible to determine which fields correspond.

I will try out this approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable for internal structs. Public structs need property VJPs in any case because of resilience requirements.

loc, getterVJPRef, /*substitutionMap*/ {},
/*args*/ {getMappedValue(sei->getOperand())}, /*isNonThrowing*/ false);

// Get the VJP results (original results and pullback)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit

Suggested change
// Get the VJP results (original results and pullback)
// Get the VJP results (original results and pullback).

assert(getterDecl);
auto *getterFn = getContext().getModule().lookUpFunction(
SILDeclRef(getterDecl, SILDeclRef::Kind::Func));
if (!getterFn) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When will a getter not exist?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One example is the case that I added to test/AutoDiff/autodiff_diagnostics.swift.

I don't know why the getter doesn't exist in that case, and if I put the same code in a file an compile it with a plain call to swiftc, the getter does exist and the code ends up triggering the case where the getter exists but doesn't have a VJP.

I haven't investigated any farther than that, so I don't really know what's going on but the logic seems to handle all the cases I can think of correctly.

mapValue(sei, originalDirectResult);

// Checkpoint the original results.
getPrimalInfo().addStaticPrimalValueDecl(sei);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reminder to self: must add a retain here

    getBuilder().createRetainValue(loc, originalDirectResult,
                                   getBuilder().getDefaultAtomicity());

public static prefix func - (x: ${Self}) -> ${Self} {
return ${Self}(Builtin.fneg_FPIEEE${bits}(x._value))
}
}

// SWIFT_ENABLE_TENSORFLOW
extension ${Self} {
@inlinable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think inlinable is necessary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get a SILVerifier error about referencing a non-ABI-public function from a fragile function if I remove inlinable. (simple_math.swift triggers it)

I think what's happening is that _transparent doesn't make the function ABI public, and mandatory inlining isn't inlining the function because we're partially applying it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. If you just need it to be ABI-public without needing it serializable, just use @usableFromInline.

@marcrasi
Copy link
Author

@swift-ci please test tensorflow

2 similar comments
@marcrasi
Copy link
Author

@swift-ci please test tensorflow

@marcrasi
Copy link
Author

@swift-ci please test tensorflow

@marcrasi marcrasi merged commit ba9b37e into swiftlang:tensorflow Dec 31, 2018
@marcrasi marcrasi deleted the ad-struct-extract branch December 31, 2018 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants