-
Notifications
You must be signed in to change notification settings - Fork 10.5k
Adding TPU + eager support to CompilerRuntime. #21782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Can you add to PR description an example unit test (focusing on how to set up the grpc address)?
if let idx = address.firstIndex(of: ":") { | ||
if let endIdx = address.index(idx, offsetBy: 3, limitedBy:address.endIndex) { | ||
if address[idx..<endIdx] == "://" { | ||
let `protocol` = address[address.startIndex..<idx] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what values can protocol
take? Please add as a comment.
also, if there is a small set of legal values, consider CHECK on them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Publicly only grpc, but I want to be able to add other values for development.
@swift-ci please test tensorflow |
let `protocol` = address[address.startIndex..<idx] | ||
let target = address[endIdx..<address.endIndex] | ||
_RuntimeConfig.session = .remote(serverDef: """ | ||
cluster { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indentation of the string depends on the closing """
. This code block should be indented to align with the rest of the code.
// SWIFT_TENSORFLOW_SERVER_ADDRESS, the TPUs will all be on task 1. | ||
self.tpuDeviceNamePrefix = "/job:localhost/replica:0/task:1/device:TPU:" | ||
} else { | ||
self.tpuDeviceNamePrefix = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need for this else
block. Optional members are nil
by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the same, but this is a let struct member, so this is needed or it complains that not all paths initialize tpuDeviceNamePrefix.
@@ -311,11 +337,12 @@ public final class _ExecutionContext { | |||
defer { TF_DeleteDeviceList(devices!) } | |||
|
|||
// Sanity check and gather/log device info. When `gpuCount` > 0, set | |||
// `self.gpuDeviceNamePrefix`. | |||
// `self.gpuDeviceNamePrefix`. Likewise with tpuCount. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Add backquotes around tpuCount
?
guard address.prefix(7) == "grpc://" else { | ||
fatalError("SWIFT_TENSORFLOW_SERVER_ADDRESS must start with 'grpc://'.") | ||
if let idx = address.firstIndex(of: ":") { | ||
if let endIdx = address.index(idx, offsetBy: 3, limitedBy:address.endIndex) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nested if-let chains are not great. This should use a single if
and multiple lets separated by a comma.
guard address.prefix(7) == "grpc://" else { | ||
fatalError("SWIFT_TENSORFLOW_SERVER_ADDRESS must start with 'grpc://'.") | ||
if let idx = address.firstIndex(of: ":") { | ||
if let endIdx = address.index(idx, offsetBy: 3, limitedBy:address.endIndex) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add space after limitedBy:
Eager mode takes a cluster config instead of a single server address. Construct this cluster config as a string based on the provided protocol://address string.
Example:
SWIFT_TENSORFLOW_SERVER_ADDRESS=grpc://127.0.0.1:1534 ./main