Extended graph_op IRGen to support calling C APIs directly. #19524
Conversation
@swift-ci please test tensorflow
From PR description:
Did you mean "prevents"?
@@ -427,8 +427,9 @@ SILDeserializer::readSILFunctionChecked(DeclID FID, SILFunction *existingFn,
       break;

     case SILStage::Lowered:
-      llvm_unreachable("cannot deserialize into a module that has entered "
-                       "Lowered stage");
+      if (!declarationOnly) // SWIFT_ENABLE_TENSORFLOW
+        llvm_unreachable("cannot deserialize into a module that has entered "
+                         "Lowered stage");
It's not super obvious to me this is a bug. Which line crashed when you didn't change this? Is it SILModule::findFunction?
Yup. The crash stack trace is:
(gdb) bt
#0 SignalHandler (Sig=6)
at /usr/local/google/home/hongm/ssd_part/git/swift-base/llvm/lib/Support/Unix/Signals.inc:314
#1 <signal handler called>
#2 0x00007fffe39fcfcf in raise () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fffe39fe3fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000004c6083b in llvm::llvm_unreachable_internal (msg=<optimized out>,
file=0x555206e "/usr/local/google/home/hongm/ssd_part/git/swift-base/swift/lib/Serialization/DeserializeSIL.cpp", line=431)
at /usr/local/google/home/hongm/ssd_part/git/swift-base/llvm/lib/Support/ErrorHandling.cpp:189
#5 0x0000000002218aeb in swift::SILDeserializer::readSILFunctionChecked (this=0x8de05b0, FID=...,
existingFn=0x0, name="_swift_tfc_StartTensorComputation", declarationOnly=true,
errorIfEmptyBody=true)
at /usr/local/google/home/hongm/ssd_part/git/swift-base/swift/lib/Serialization/DeserializeSIL.cpp:430
#6 0x0000000002230346 in swift::SILDeserializer::lookupSILFunction (this=0x8de05b0,
name="_swift_tfc_StartTensorComputation", declarationOnly=true)
at /usr/local/google/home/hongm/ssd_part/git/swift-base/swift/lib/Serialization/DeserializeSIL.cpp:2561
#7 0x000000000218d7b8 in swift::SerializedSILLoader::lookupSILFunction (this=0x8c89630,
Name="_swift_tfc_StartTensorComputation", declarationOnly=true,
Linkage=llvm::Optional is initialized = {...})
at /usr/local/google/home/hongm/ssd_part/git/swift-base/swift/lib/Serialization/SerializedSILLoader.cpp:67
SerializedSILLoader::lookupSILFunction() is called by SILModule::findFunction().
I think it's a bug because the comment above says: "We can't deserialize function bodies after ..."
I think this is expected behavior since it's in IRGen, which turns the SIL stage to Lowered. Maybe SILModule::findFunction should not be used, because it'll try to deserialize a function when it's not in the current module.
Have you tried creating the declaration of _swift_tfc_GetGlobalEagerContext directly? This seems to be the standard approach in IRGen. See how IRGen creates the declaration of the runtime entry point for lowering retain_value_addr:
https://github.com/apple/swift/blob/c12db28b164e1207e918f6bd3f7119f80f0cdaa1/lib/IRGen/Outlining.cpp#L299
You could define an IRGenModule::getOrCreateTFCGetGlobalEagerContextFunction (and the same for the other entry points), and getTypeAndGenericSignatureForManglingOutlineFunction seems to be able to handle the mangling for you. Right, it's a lot of code defining a getter for each entry point, but it seems to be the standard way in the compiler.
As Richard suggests, I think this is expected behavior in IRGen.
As an alternative to creating declarations explicitly, can we remember the relevant SILFunctions somewhere in the context of GraphOp and use them in getAddrOfSILFunction?
Thanks for the pointers. I'd like to avoid explicit declarations if at all possible -- that would increase code complexity, and also maintenance overhead (when the Swift function signature of _swift_tfc_GetGlobalEagerContext changes, we need to change the IRGen compiler code accordingly as well).
Also, while we only have ~5 compiler runtime entry points for now, we'll be calling ~20 TF eager C APIs (see TF repo tensorflow/tensorflow/c/eager/c_api.h), and creating explicit decls for them could involve a lot more code complexity and maintenance overhead.
The referenced code for getOrCreateRetainFunction() (I also studied swift/include/swift/Runtime/RuntimeFunctions.def) covers a different scenario -- those runtime functions do not have their impls already defined as external C APIs or Swift functions (via silgen names).
I thought deserializing the SIL function decl (not body) in IRGen phase would still be safe. Do you have a specific concern?
@bgogul, it'd be possible to cache the SILFunction* at an earlier compiler phase (before we get into IRGen), but if the cached version is still valid to use, I believe the deserialized version (just the decl) would be safe to use as well, and the latter would be somewhat simpler.
I think one concern with deserializing during the IRGen phase is that we don't get a chance to run some of the passes that would have run if we had deserialized that function at an earlier stage. Even though you are only using it here to get the declaration, it goes against separation of concerns. Perhaps others can jump in if this is not a major concern.
Isn't it the case that we check that the TensorFlow module is imported at some point? Given that these functions are part of TensorFlow, why can't we cache them then and make sure they are valid?
Is there a reason we need to codegen any TensorFlow C API calls from IRGen directly? Can they be moved and concentrated into one or two entry points in the runtime?
Discussed offline. I'll submit this patch, and start a discussion with the core team on this "SILFunction lookup" based approach.
Correct "pervious" to "prevents". Thanks.
  auto tensorHandle = Builder.CreateCall(createHandleFn, {cTensorHandleTyped});

  LLVM_DEBUG(
      llvm::dbgs() << "Done with IRGen for graph_op; setting explosion.\n");
This all LGTM
@@ -1166,3 +1166,20 @@ public func _GetGlobalEagerContext() -> CTFEContext {
   return _ExecutionContext.global.eagerContext
 }

// TODO: replace these functions with generic ones that do not hard-code Float. |
I think this will be more straightforward to do when there is a non-generic superclass of TensorHandle.
Ack. Will refresh this once I use Richard's patch.
LGTM, thanks!
In this PR, we call the TF C API TFE_RunConstOp() instead of the previous compiler runtime entry point @_silgen_name("_swift_tfc_RunEagerConstTest"), and then call the compiler runtime entry point @_silgen_name("_swift_tfc_CreateFloatTensorHandleFromCTensorHandle") to wrap the result into a TensorHandle<Float>.

The main changes are:

1. To create an llvm::Function object based on a function name such as "_swift_tfc_GetGlobalEagerContext", we first call silModule.findFunction() to get the SILFunction, and then call IGM.getAddrOfSILFunction() to get the llvm::Function. To make this work, we fixed a bug in SILDeserializer::readSILFunctionChecked(), which prevents a function decl (not body) from being deserialized in the IRGen stage.

2. We obtain from the llvm::Function objects the LLVM type objects for the C data types TFE_Context* and TFE_TensorHandle*, and then generate bitcast instructions to put the function params into the right types before issuing the relevant function call.

Next steps:

1. Replace the TFE_RunConstOp() call with a sequence of TF eager C API calls.
2. Generalize the graph_op decoding logic to handle graph_ops other than Const.
3. Support a generic TF datatype T instead of the hard-coded Float.
4. Figure out a way to do scalar promotion even in the case of -Onone, since Tensor<Float>(1.0) becomes a pseudo graph_op "tfc.scalarToTensor", which should be turned into a "Const" graph_op.
@swift-ci please test tensorflow