Extended graph_op IRGen to support calling C APIs directly. #19524


Merged: 3 commits merged into swiftlang:tensorflow from irgen_3_capi_call on Sep 26, 2018

Conversation


@mhong mhong commented Sep 25, 2018

In this PR, we call the TF C API TFE_RunConstOp() instead of the previous
compiler runtime entry point @_silgen_name("_swift_tfc_RunEagerConstTest"), and
then call the compiler runtime entry point
@_silgen_name("_swift_tfc_CreateFloatTensorHandleFromCTensorHandle") to wrap
the result into a TensorHandle<Float>.

The main changes are:

  1. To create an llvm::Function object based on a function name such as
    "_swift_tfc_GetGlobalEagerContext", we first call silModule.findFunction() to
    get the SILFunction, and then call IGM.getAddrOfSILFunction() to get the
    llvm::Function.

To make this work, we fixed a bug in SILDeserializer::readSILFunctionChecked()
that prevented a function decl (not its body) from being deserialized in the
IRGen stage.

  2. From the llvm::Function objects, we obtain the LLVM type objects for the C
    data types TFE_Context* and TFE_TensorHandle*, and then generate bitcast
    instructions to cast the function params to the right types before issuing
    the relevant function call (see the sketch after this list).
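
For illustration, here is a minimal sketch of the lookup-and-cast flow
described above. This is not the literal patch: IGM, Builder, createHandleFn,
and cTensorHandle stand for the surrounding IRGen state, and the exact call
sites may differ.

// Hypothetical sketch; names are illustrative.
// 1. Find the SILFunction for the runtime entry point by name
//    (this deserializes the function decl if needed).
SILFunction *silFn = IGM.getSILModule().findFunction(
    "_swift_tfc_GetGlobalEagerContext", SILLinkage::PublicExternal);
assert(silFn && "compiler runtime entry point not found");

// 2. Get (or create) the corresponding llvm::Function declaration.
llvm::Function *contextFn = IGM.getAddrOfSILFunction(silFn, NotForDefinition);

// 3. The callee signatures carry the LLVM types standing in for the C types
//    TFE_Context* and TFE_TensorHandle*; bitcast values to those types
//    before issuing the calls.
llvm::Value *eagerContext = Builder.CreateCall(contextFn, {});
llvm::Type *handleParamTy = createHandleFn->getFunctionType()->getParamType(0);
llvm::Value *cTensorHandleTyped =
    Builder.CreateBitCast(cTensorHandle, handleParamTy);
llvm::Value *tensorHandle =
    Builder.CreateCall(createHandleFn, {cTensorHandleTyped});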

Next steps:

  1. Replace the TFE_RunConstOp() call with a sequence of TF eager C API calls
    (see the sketch after this list).
  2. Generalize the graph_op decoding logic to handle graph_ops other than
    Const.
  3. Support a generic TF datatype T instead of the hard-coded Float.
  4. Figure out a way to do scalar promotion even in the case of -Onone,
    since Tensor<Float>(1.0) becomes a pseudo graph_op "tfc.scalarToTensor",
    which should be turned into a "Const" graph_op.
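
For next step 1, a rough sketch of what the TFE_RunConstOp() replacement could
look like in terms of the TF eager C API (tensorflow/c/eager/c_api.h); ctx and
tensor are assumed to be a live TFE_Context* and TF_Tensor*, and error
checking is elided:

// Hypothetical sketch of building and running a "Const" op eagerly.
TF_Status *status = TF_NewStatus();
TFE_Op *op = TFE_NewOp(ctx, "Const", status);
TFE_OpSetAttrType(op, "dtype", TF_FLOAT);
TFE_OpSetAttrTensor(op, "value", tensor, status);
TFE_TensorHandle *retvals[1];
int num_retvals = 1; // in: capacity of retvals; out: number of results
TFE_Execute(op, retvals, &num_retvals, status);
TFE_DeleteOp(op);
TF_DeleteStatus(status);
// retvals[0] is the TFE_TensorHandle* to wrap into a TensorHandle<Float>.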

@mhong mhong (Author) commented Sep 25, 2018

@swift-ci please test tensorflow

@rxwei rxwei (Contributor) commented Sep 25, 2018

From PR description:

which pervious a function decl

Did you mean "prevents"?

@@ -427,8 +427,9 @@ SILDeserializer::readSILFunctionChecked(DeclID FID, SILFunction *existingFn,
     break;

   case SILStage::Lowered:
-    llvm_unreachable("cannot deserialize into a module that has entered "
-                     "Lowered stage");
+    if (!declarationOnly) // SWIFT_ENABLE_TENSORFLOW
Contributor

It's not super obvious to me this is a bug. Which line crashed when you didn't change this? Is it SILModule::findFunction?

Author

Yup. The crash stack trace is:

(gdb) bt
#0  SignalHandler (Sig=6)
    at /usr/local/google/home/hongm/ssd_part/git/swift-base/llvm/lib/Support/Unix/Signals.inc:314
#1  <signal handler called>
#2  0x00007fffe39fcfcf in raise () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fffe39fe3fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x0000000004c6083b in llvm::llvm_unreachable_internal (msg=<optimized out>, 
    file=0x555206e "/usr/local/google/home/hongm/ssd_part/git/swift-base/swift/lib/Serialization/DeserializeSIL.cpp", line=431)
    at /usr/local/google/home/hongm/ssd_part/git/swift-base/llvm/lib/Support/ErrorHandling.cpp:189
#5  0x0000000002218aeb in swift::SILDeserializer::readSILFunctionChecked (this=0x8de05b0, FID=..., 
    existingFn=0x0, name="_swift_tfc_StartTensorComputation", declarationOnly=true, 
    errorIfEmptyBody=true)
    at /usr/local/google/home/hongm/ssd_part/git/swift-base/swift/lib/Serialization/DeserializeSIL.cpp:430
#6  0x0000000002230346 in swift::SILDeserializer::lookupSILFunction (this=0x8de05b0, 
    name="_swift_tfc_StartTensorComputation", declarationOnly=true)
    at /usr/local/google/home/hongm/ssd_part/git/swift-base/swift/lib/Serialization/DeserializeSIL.cpp:2561
#7  0x000000000218d7b8 in swift::SerializedSILLoader::lookupSILFunction (this=0x8c89630, 
    Name="_swift_tfc_StartTensorComputation", declarationOnly=true, 
    Linkage=llvm::Optional is initialized = {...})
    at /usr/local/google/home/hongm/ssd_part/git/swift-base/swift/lib/Serialization/SerializedSILLoader.cpp:67

SerializedSILLoader::lookupSILFunction() is called by SILModule::findFunction().

I think it's a bug because the comment above says:

"We can't deserialize function bodies after ..."

Contributor

I think this is expected behavior since it's in IRGen, which turns the SIL stage to Lowered. Maybe SILModule::findFunction should not be used, because it'll try to deserialize a function when it's not in the current module.

Have you tried creating the declaration of _swift_tfc_GetGlobalEagerContext directly? This seems to be the standard approach in IRGen. See how IRGen creates the declaration of the runtime entry point for lowering retain_value_addr:
https://github.com/apple/swift/blob/c12db28b164e1207e918f6bd3f7119f80f0cdaa1/lib/IRGen/Outlining.cpp#L299

You could define an IRGenModule::getOrCreateTFCGetGlobalEagerContextFunction (same for the other entry points), and getTypeAndGenericSignatureForManglingOutlineFunction seems to be able to handle mangling for you. Right, it's a lot of code defining a getter for each entry point, but it seems to be the standard way in the compiler.
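
For reference, a minimal sketch of what such a getter could look like using
plain LLVM APIs (the helper itself is hypothetical, and modeling TFE_Context*
as i8* is an assumption; the patch uses SILFunction lookup instead):

// Hypothetical getter, not part of the patch: declare the runtime entry
// point directly in the LLVM module instead of looking up its SILFunction.
static llvm::Constant *getTFCGetGlobalEagerContextFn(llvm::Module &M) {
  // TFE_Context* is opaque at this level; model it as an i8*.
  auto *ctxPtrTy = llvm::Type::getInt8PtrTy(M.getContext());
  auto *fnTy = llvm::FunctionType::get(ctxPtrTy, /*isVarArg=*/false);
  // Returns the existing declaration if one was already created.
  return M.getOrInsertFunction("_swift_tfc_GetGlobalEagerContext", fnTy);
}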

Contributor

As Richard suggests, I think this is expected behavior in IRGen.

As an alternative to creating declarations explicitly, can we remember the relevant SILFunctions somewhere in the context of GraphOp and use them in getAddrOfSILFunction?

Author

Thanks for the pointers. I'd like to avoid explicit declarations if at all possible -- that would increase code complexity, and also maintenance overhead (when the Swift function signature of _swift_tfc_GetGlobalEagerContext changes, we need to change the IRGen compiler code accordingly as well).

Also, while we only have ~5 compiler runtime entry points for now, we'll be calling ~20 TF eager C APIs (see TF repo tensorflow/tensorflow/c/eager/c_api.h), and creating explicit decls for them could involve a lot more code complexity and maintenance overhead.

The referenced code for getOrCreateRetainFunction() (I also studied swift/include/swift/Runtime/RuntimeFunctions.def) covers a different scenario -- those runtime functions do not have their impls already defined as external C APIs or Swift functions (via silgen names).

I thought deserializing the SIL function decl (not the body) in the IRGen phase would still be safe. Do you have a specific concern?

@bgogul, it'd be possible to cache the SILFunction* at an earlier compiler phase (before we get into IRGen), but if the cached version is still valid to use, I believe the deserialized version (just the decl) would be safe to use as well, and the latter would be somewhat simpler.

Contributor

I think one concern with deserializing during the IRGen phase is that we don't get a chance to run some of the passes that would have run if we had deserialized that function at an earlier stage. Even though you are only using it here for getting the declaration, it goes against separation of concerns. Perhaps others can jump in if this is not a major concern.

Isn't it the case that we check that the TensorFlow module is imported at some point? Given that these functions are part of TensorFlow, why can't we cache them then and make sure they are valid?
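
A minimal sketch of the caching alternative suggested here (hypothetical, not
in the patch): record the entry points at an earlier phase, while
deserialization is still allowed, and consult the cache from IRGen:

// Hypothetical cache keyed by the entry point's silgen name.
struct TFCEntryPointCache {
  llvm::StringMap<SILFunction *> entryPoints;

  // Called at an earlier phase, e.g. when the TensorFlow module import
  // is checked.
  void record(SILFunction *fn) { entryPoints[fn->getName()] = fn; }

  // Called from IRGen; returns null if the entry point was never recorded.
  SILFunction *lookup(llvm::StringRef name) const {
    auto it = entryPoints.find(name);
    return it == entryPoints.end() ? nullptr : it->second;
  }
};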

Contributor

Is there a reason we need to codegen any TensorFlow C API calls from IRGen directly? Can they be moved and concentrated into one or two entry points in the runtime?

Author

Discussed offline. I'll submit this patch, and start a discussion with the core team on this "SILFunction lookup" based approach.

@mhong mhong (Author) commented Sep 25, 2018

Correct "pervious" to "prevents". Thanks.

auto tensorHandle = Builder.CreateCall(createHandleFn, {cTensorHandleTyped});

LLVM_DEBUG(
    llvm::dbgs() << "Done with IRGen for graph_op; setting explosion.\n");
Contributor

This all LGTM

@@ -1166,3 +1166,20 @@ public func _GetGlobalEagerContext() -> CTFEContext {
  return _ExecutionContext.global.eagerContext
}

// TODO: replace these functions with generic ones that do not hard-code Float.
Contributor

I think this will be more straightforward to do when there is a non-generic superclass of TensorHandle.

Author

Ack. Will refresh this once I use Richard's patch.

@lattner lattner (Contributor) left a comment

LGTM, thanks!

Mingsheng Hong added 2 commits September 25, 2018 16:53
@mhong mhong (Author) commented Sep 26, 2018

@swift-ci please test tensorflow

@mhong mhong merged commit ad7def2 into swiftlang:tensorflow Sep 26, 2018
@mhong mhong deleted the irgen_3_capi_call branch September 26, 2018 02:26