This repository was archived by the owner on Jul 1, 2023. It is now read-only.

Add Recurrent Layers #71

Merged 26 commits into tensorflow:master on Apr 17, 2019

Conversation

@tanmayb123 (Contributor) commented Apr 2, 2019

#52

@tanmayb123
Contributor Author

Would we prefer to have forgetWeight and forgetBias or forgetW and forgetB?

@dan-zheng
Member

Would we prefer to have forgetWeight and forgetBias or forgetW and forgetB?

The full names Weight and Bias are preferable.

@tanmayb123
Contributor Author

Done.

@dan-zheng (Member) left a comment

Please add a test to Tests/DeepLearningTests/LayerTests.swift!

@eaplatanios
Contributor

It would also be nice if we could have an RNN cell protocol, so that all valid cells have to conform to it and can be used in code that expects RNN cells (e.g., simple RNN, bidirectional RNN, etc.).

@tanmayb123
Contributor Author

@rxwei What do you think such a protocol would look like? i.e. What kind of functionality would it define?

@tanmayb123 changed the title from Add LSTM Cell to Add Recurrent Layers on Apr 2, 2019
@rxwei
Contributor

rxwei commented Apr 2, 2019

Just a heads-up: I probably won't have time to review this or make suggestions until late evening.

@rxwei
Contributor

rxwei commented Apr 6, 2019

Here's a sketch:

public struct RNNInput<StepInput: Differentiable, State: Differentiable>: Differentiable {
    public var stepInput: StepInput
    public var previousState: State
    public init(stepInput: StepInput, previousState: State) {
        self.stepInput = stepInput
        self.previousState = previousState
    }
}

public struct RNNState<CellState: Differentiable, HiddenState: Differentiable>: Differentiable {
    public var cell: CellState
    public var hidden: HiddenState
    public init(cell: CellState, hidden: HiddenState) {
        self.cell = cell
        self.hidden = hidden
    }
}

public protocol RNNCell: Layer
    where Input == RNNInput<StepInput, State>, Output == RNNState<CellState, HiddenState> {
    associatedtype StepInput: Differentiable
    associatedtype CellState: Differentiable
    associatedtype HiddenState: Differentiable
    typealias State = Output

    init(inputSize: Int, hiddenSize: Int)
    var zeroState: State { get }
}

public extension RNNCell {
    @differentiable
    func applied(to stepInput: StepInput, _ previousState: State) -> State {
        return applied(to: Input(stepInput: stepInput, previousState: previousState))
    }
}
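
For illustration, unrolling a cell over a sequence with that convenience method might look like this (cell and inputs here are hypothetical, not part of the sketch):

// Hypothetical usage: step a cell over a sequence of inputs,
// threading the state through each time step.
var state = cell.zeroState
for stepInput in inputs {
    state = cell.applied(to: stepInput, state)
}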

return State(cell: newCellState, hidden: newHiddenState)
}

public func zeroState() -> State {

Contributor

It would be more Swifty to define this as a computed property, since the computational complexity of this is relatively trivial.
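
For example (a sketch; it assumes the cell stores hiddenSize, as in the initializer excerpts elsewhere in this review):

// Zero state as a computed property rather than a method.
public var zeroState: State {
    let shape = TensorShape([1, Int32(hiddenSize)])
    return State(cell: Tensor(zeros: shape), hidden: Tensor(zeros: shape))
}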

Member

I think you might want to have zeroState (or maybe initialState) accept batchSize as the inputs are likely to be batched.
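
A sketch of that suggestion (again assuming hiddenSize is a stored property):

// Zero state parameterized on batch size, so batched inputs line up.
public func zeroState(batchSize: Int) -> State {
    let shape = TensorShape([Int32(batchSize), Int32(hiddenSize)])
    return State(cell: Tensor(zeros: shape), hidden: Tensor(zeros: shape))
}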

let forgetGate = sigmoid(matmul(gateInput, forgetWeight) + forgetBias)
let outputGate = sigmoid(matmul(gateInput, outputWeight))

let newCellState = (input.state.cell * forgetGate + inputGate * updateGate)

Contributor

Remove redundant parentheses.
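
That is, the assignment above would become:

let newCellState = input.state.cell * forgetGate + inputGate * updateGate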

self.forgetWeight = Tensor(glorotUniform: gateWeightShape)
self.forgetBias = Tensor(zeros: [Int32(hiddenSize)])
self.outputWeight = Tensor(glorotUniform: gateWeightShape)
self.stateShape = TensorShape([1, concatenatedInputSize])

Contributor

Always prefer literal conversion when a contextual type exists. In this case the contextual type is TensorShape and it conforms to ExpressibleByArrayLiteral.

Suggested change:
- self.stateShape = TensorShape([1, concatenatedInputSize])
+ self.stateShape = [1, concatenatedInputSize]

@tanmayb123
Contributor Author

Love the sketch, Richard - thanks :)
Quick question: only LSTMs have a "cell state". Others, like GRU and SimpleRNN, have only a "hidden state", so their "state" wouldn't be a struct but rather a single Tensor. That's why I was a bit confused about how to create that protocol.

@eaplatanios
Contributor

Sorry I wasn't able to respond earlier; I've been kept busy. I was thinking of something a bit more abstract, where RNNState does not need to consist of two parts but is rather a generic type itself (calling it just State should be fine), since it can vary between different RNN cells.

@rxwei
Contributor

rxwei commented Apr 7, 2019

Yeah, the cell state is definitely weird. I think that part may not need a fixed structure; as @eaplatanios said, it can just be a generic type.

public struct RNNInput<StepInput: Differentiable, State: Differentiable>: Differentiable {
    public var stepInput: StepInput
    public var previousState: State
    public init(stepInput: StepInput, previousState: State) {
        self.stepInput = stepInput
        self.previousState = previousState
    }
}

public protocol RNNCell: Layer where Input == RNNInput<StepInput, State> {
    associatedtype StepInput: Differentiable
    typealias State = Output

    init(inputSize: Int, hiddenSize: Int)
    var zeroState: State { get }
}

public extension RNNCell {
    @differentiable
    func applied(to stepInput: StepInput, _ previousState: State) -> State {
        return applied(to: Input(stepInput: stepInput, previousState: previousState))
    }
}
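
With State left fully generic, a cell whose state is a single tensor can conform directly. Here is a minimal sketch against the Layer API of that era (SimpleRNNCell and all of its members are illustrative, not code from this PR):

public struct SimpleRNNCell: RNNCell {
    public var weight: Tensor<Float>
    public var bias: Tensor<Float>
    @noDerivative public let hiddenSize: Int

    public init(inputSize: Int, hiddenSize: Int) {
        self.weight = Tensor(glorotUniform: [Int32(inputSize + hiddenSize), Int32(hiddenSize)])
        self.bias = Tensor(zeros: [Int32(hiddenSize)])
        self.hiddenSize = hiddenSize
    }

    // The state is just the hidden activation, so State == Output == Tensor<Float>.
    public var zeroState: Tensor<Float> {
        return Tensor(zeros: [1, Int32(hiddenSize)])
    }

    @differentiable
    public func applied(to input: RNNInput<Tensor<Float>, Tensor<Float>>) -> Tensor<Float> {
        // Concatenate the step input with the previous hidden state along the feature axis.
        let gateInput = input.stepInput.concatenated(with: input.previousState, alongAxis: 1)
        return tanh(matmul(gateInput, weight) + bias)
    }
}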

@rxwei
Contributor

rxwei commented Apr 7, 2019

I'm going to check the protocol in. If you can make these new layers work with the new protocol, that'd be really great!

@rxwei mentioned this pull request on Apr 7, 2019
@tanmayb123
Contributor Author

I'll do that shortly - thanks Richard!


self.inputWeight = Tensor(glorotUniform: gateWeightShape)
self.updateWeight = Tensor(glorotUniform: gateWeightShape)
self.forgetWeight = Tensor(glorotUniform: gateWeightShape)
self.forgetBias = Tensor(zeros: [Int32(hiddenSize)])

Member

A rule of thumb is to initialize the forget bias to ~1; see http://proceedings.mlr.press/v37/jozefowicz15.pdf

Also, is there a reason for not having a bias for other gates?
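
Following the forget-bias rule of thumb above, the excerpted initialization could become something like:

self.forgetBias = Tensor(ones: [Int32(hiddenSize)])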

@rxwei
Contributor

rxwei commented Apr 16, 2019

Hi @tanmayb123, we are preparing a Swift for TensorFlow v0.3 release this Wednesday. Would you like to update this PR and check it in so that it can be part of the release?

@tanmayb123
Contributor Author

Sure thing Richard, working on that now. Quick concern that I didn't notice before: RNNCellOutput has both an Output and a State, but that's a bit problematic for the cells, since they don't have any output per time step - their output is the state. What should I do in that case? Should I pass the new state to both Output and State?

@rxwei
Contributor

rxwei commented Apr 17, 2019

I didn't see James's comment. Definitely address his concerns first :)

@tanmayb123
Contributor Author

Also, related to this PR, I've opened #91 to explore how to handle sequential inputs and keep track of hidden states over time automatically.

@tanmayb123
Contributor Author

The build is failing, but not on any of the Recurrent Cell code.

@rxwei
Contributor

rxwei commented Apr 17, 2019

I'm going to merge it once tests pass.

@rxwei
Contributor

rxwei commented Apr 17, 2019

Tests are failing because your branch is still old. Could you pull and merge?

@tanmayb123
Contributor Author

Sure thing.

@rxwei merged commit 6dc373a into tensorflow:master on Apr 17, 2019
@rxwei
Contributor

rxwei commented Apr 17, 2019

Merged. Thanks for iterating on this!

@tanmayb123
Contributor Author

Of course :)

dan-zheng added a commit that referenced this pull request Apr 17, 2019
dan-zheng added a commit that referenced this pull request Apr 17, 2019
This reverts commit 6dc373a.
It exposed a differentiation crash (TF-440) and is blocking progress.
dan-zheng added a commit that referenced this pull request Apr 18, 2019
rxwei pushed a commit that referenced this pull request Apr 18, 2019
* Revert "Revert "Add Recurrent Layers (#71)" (#94)"

This reverts commit f75c5e0.

* Remove `@differentiable` from `zeroState`.

* Fix axis of concatenation.