This repository was archived by the owner on Jul 1, 2023. It is now read-only.

[WIP] Equations for losses #579

Merged
merged 23 commits into from
Feb 5, 2020

Conversation

joaogui1
Contributor

As discussed in the Google group, I'm adding equations for the losses, but I'm afraid that without LaTeX support they may be a little unreadable. What do you think?
I have added equations for the most common losses so far; after feedback I can write the rest.

joaogui1 and others added 7 commits December 13, 2019 10:18
Co-Authored-By: Brad Larson <[email protected]>
Co-Authored-By: Brad Larson <[email protected]>
Co-Authored-By: Brad Larson <[email protected]>
Co-Authored-By: Brad Larson <[email protected]>
Co-Authored-By: Brad Larson <[email protected]>
Co-Authored-By: Brad Larson <[email protected]>
@joaogui1
Contributor Author

Ok, so I guess that's all of the non-categorical losses, but I'm a bit confused about how to describe the categorical hinge and categorical cross-entropy losses. Can someone shed some light on this?

@8bitmp3
Contributor

8bitmp3 commented Dec 13, 2019

Hi @joaogui1 @eaplatanios @BradLarson. I just discussed this thread, and the one from the Swift Google group about API docs, with @saeta at an event we're attending offline.

I think this is awesome, and I'd like to suggest something to improve the user experience for the growing community. A logical approach is to follow very closely, or word for word where applicable, both the descriptions and the equations in the TF 2.x API docs, where they already exist.

Otherwise, reinventing the wheel can introduce inconsistency, which TF/S4TF users will notice, since tensorflow.org has a single site-wide search.

Example:

In TF 2.0 tf.keras.losses docs, the squared hinge loss is defined as follows (source):

class SquaredHinge(LossFunctionWrapper):
"""Computes the squared hinge loss between `y_true` and `y_pred`.
  `loss = square(maximum(1 - y_true * y_pred, 0))`
  `y_true` values are expected to be -1 or 1. If binary (0 or 1) labels are
  provided we will convert them to -1 or 1.
...

vs what is in one of the latest commits here:

/// Returns the squared hinge loss between predictions and expectations.
/// Given the predicted and expected, the Hinge loss is computed as follows:
///  `reduction(max(0, 1 - predicted * expected)^2)` 
///
/// - Parameters:
/// ...
public func squaredHingeLoss<Scalar: TensorFlowFloatingPoint>(
...) ...
}
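
For reference, both formulations above describe the same computation. A minimal Python sketch of the math (illustrative only, not the Keras or S4TF source):

```python
def squared_hinge_loss(y_true, y_pred):
    """Mean over the batch of square(maximum(1 - y_true * y_pred, 0))."""
    per_element = [max(1.0 - t * p, 0.0) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(per_element) / len(per_element)

# y_true values are expected to be -1 or 1.
loss = squared_hinge_loss([1.0, -1.0], [0.8, -0.3])  # ≈ 0.265
```

Here the `reduction` in the Swift doc comment corresponds to taking the mean over the per-element losses.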

The use of y_true and y_pred is pretty consistent and understandable across the TF 2.x APIs, and because the S4TF docs are hosted on the same .org site and under /tensorflow/ on GitHub, I think we should reuse what's already been reviewed and edited by @lamberta and the rest of the docs team. If something is more Swifty than Pythonic, we can certainly amend it on a case-by-case basis.

Maybe I'm wrong, let me know what you all think. cc @dynamicwebpaige

Can I also draw the S4TF community's attention to the TensorFlow docs style guide (https://www.tensorflow.org/community/contribute/docs_style) and the Google developer documentation style guide (https://developers.google.com/style)?

@joaogui1
Contributor Author

I changed the names to be compatible (so it's y_true and y_pred now) but didn't copy the documentation. Do you think it would be better to have identical docs?
Also, there's another difference: since we have a combined softmax + cross-entropy loss, calling the logits y_pred would be misleading (the real y_pred should be softmax(logits)). Do you think the name should be changed anyway?
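
To illustrate the logits point: the combined loss consumes raw logits, and the actual y_pred would be softmax(logits). A minimal Python sketch of the math (function names here are hypothetical, not the S4TF API):

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_cross_entropy(logits, label_index):
    # The combined op takes raw logits; the "real" y_pred is softmax(logits).
    probs = softmax(logits)
    return -math.log(probs[label_index])

logits = [2.0, 1.0, 0.1]
loss = softmax_cross_entropy(logits, 0)  # ≈ 0.417
```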

@joaogui1 joaogui1 changed the title from "[Draft] Equations for losses" to "[WIP] Equations for losses" on Dec 13, 2019
@joaogui1
Contributor Author

Pinging @bartchr808 @8bitmp3

@bartchr808
Contributor

Firstly thanks @joaogui1 for doing this and @8bitmp3 for bringing up the idea of "follow[ing] very closely...both descriptions and equations in the TF 2.x API docs"! 😄

do you guys think it would be better to have identical docs?

Personally, I agree with @8bitmp3: having as close to a 1:1 match between the documentation as possible would be ideal. As with code style, Python and Swift define functions in fairly different ways, so our API naming cannot be 1:1 if we want to stay Swifty. When it comes to writing the documentation, though, we could match the existing docs whenever possible, since they are generally TensorFlow-specific and language-agnostic. That way, anyone transitioning from Python to Swift (either directly or progressively through Python interop) will have an easier time recalling what certain functions do (e.g. you Google a sentence from the Python TensorFlow documentation and it matches the Swift for TensorFlow docs as well).

Additionally, I think this can make maintaining docs much easier for the community, @lamberta and the rest of the docs team.

As a small disclaimer: I've been busy with school since September, so I haven't been able to follow along with the progress being made in the community, and I may miss some details here and there! 😆

joaogui1 and others added 5 commits December 18, 2019 13:22
Co-Authored-By: Bart Chrzaszcz <[email protected]>
Co-Authored-By: Bart Chrzaszcz <[email protected]>
Co-Authored-By: Bart Chrzaszcz <[email protected]>
Co-Authored-By: Bart Chrzaszcz <[email protected]>
Co-Authored-By: Bart Chrzaszcz <[email protected]>
Co-Authored-By: Bart Chrzaszcz <[email protected]>
@BradLarson
Contributor

The general consensus seems to be that we would prefer to closely mirror the documentation from the Python implementations, but replace the variables such that they reflect the Swift code. For example, in the case that @8bitmp3 cites above, that would turn into:

/// Computes the squared hinge loss between `expected` and `predicted`.
///   `loss = square(maximum(1 - expected * predicted, 0))`

(I'm not sure if the subsequent two lines in the Keras implementation about the parameters apply to how we handle this, but if they do, we could bring those over too with parameter names replaced.)

@8bitmp3
Contributor

8bitmp3 commented Dec 20, 2019

Thumbs up and cheers @bartchr808 @BradLarson for your feedback and thanks @joaogui1 for kick-starting this process. To sum up, so that we're on the same page:

  • Mirror TensorFlow API docs as much as possible to help Python users and maintain consistency.
  • Amend variables and other bits, where applicable, to keep it Swifty.

And, for additional guidance, the Google docs and code style guides linked above should also be useful.

(I'll also conduct some deep exploration since the project is already at v0.6 and it's very cold outside.)


@8bitmp3
Contributor

8bitmp3 commented Dec 20, 2019

@saeta since you mentioned it in https://groups.google.com/a/tensorflow.org/forum/#!topic/swift/tQKFYL1Ykac

Update: I can add a few bits to CONTRIBUTING.md, since it already exists, if y'all don't mind. I'll raise it via a separate PR, fyi @BradLarson @bartchr808 @eaplatanios @dan-zheng.

And if you're all OK with it, this can be built upon and expanded in the existing CONTRIBUTING.md (https://github.com/tensorflow/swift-apis/blob/master/CONTRIBUTING.md):

An example of soon-to-be-proposed additions:

### Contribution guidelines and standards
...
#### General guidelines for Swift for TensorFlow API docs contribution

* Closely mirror Swift for TensorFlow API docs with the implementations in [tensorflow/tensorflow/python](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/python) to maintain consistency across platforms.
  - If required, replace variable names from the Python docs to reflect the Swift code. For example, `y_true` and `y_pred` in the TF 2.x API docs become `expected` and `predicted`, respectively, in Swift for TensorFlow.
* When you contribute a new feature to Swift for TensorFlow, the maintenance burden is (by default) transferred to the Swift for TensorFlow team. This means that the benefit of the contribution must be compared against the cost of maintaining the feature.

For your reference, this is what it says in https://github.com/tensorflow/tensorflow/blob/master/CONTRIBUTING.md:

### Contribution guidelines and standards
...
#### General guidelines and philosophy for contribution
...
* When you contribute a new feature to TensorFlow, the maintenance burden is (by default) transferred to the TensorFlow team. This means that the benefit of the contribution must be compared against the cost of maintaining the feature.

@BradLarson
Contributor

It's been a little while since we last discussed this. Are you still interested in working on these documentation additions? We're revisiting this after the holidays and wanted to check back in.

@joaogui1
Contributor Author

joaogui1 commented Jan 8, 2020

Hi, the idea was to finish it after the holidays, but I've been quite sick. If it's not a problem, I will finish it once I've recovered.

@BradLarson
Contributor

Please take your time, hope you feel better soon. We were just checking in on the list of pending pull requests to make sure we weren't neglecting anything. Whenever you feel ready to work on this, we'll be glad to review.

@RahulBhalley

Great to see that API documentation is now being discussed seriously! Very much 👍 for this effort.

@@ -262,6 +282,8 @@ func _vjpSoftmaxCrossEntropyHelper<Scalar: TensorFlowFloatingPoint>(
}

/// Returns the sigmoid cross entropy (binary cross entropy) between logits and labels.
/// Given the logits and probabilites, the sigmoid cross entropy computes `reduction(-sigmoid(logits) * log(p))`
/// Where sigmoid(x) = `1 / (1 + exp(-x))`
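
For comparison, the standard per-element sigmoid (binary) cross entropy from logits is `-y * log(sigmoid(z)) - (1 - y) * log(1 - sigmoid(z))`, usually computed in a numerically stable form. A minimal Python sketch of that math (illustrative only, not the S4TF implementation):

```python
import math

def sigmoid_cross_entropy(logit, label):
    # Numerically stable form of
    # -label * log(sigmoid(logit)) - (1 - label) * log(1 - sigmoid(logit)),
    # i.e. max(z, 0) - z * y + log(1 + exp(-|z|)).
    z, y = logit, label
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

loss = sigmoid_cross_entropy(2.0, 1.0)  # ≈ 0.1269, i.e. -log(sigmoid(2.0))
```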


Will equations like these be rendered as LaTeX?

Member


No, code enclosed in backticks appears with inline code formatting, just like in Markdown.
The Jazzy API documentation generator doesn't support LaTeX formatting.

Contributor


I think having no LaTeX, at least for now, is good for the user experience. @fchollet wrote an entire (best-selling) book on deep learning without any LaTeX:

"...we’ll steer away from mathematical notation, which can be off-putting for those without any mathematics background and isn’t strictly necessary to explain things well."


@8bitmp3 you just made me feel nervous 😬 because my book will include equations.

Contributor


#suntorytime

@joaogui1
Contributor Author

So, there are some points worth discussing about the hinge losses:

  1. In tf/keras, it seems the labels are converted from binary (0/1) to -1/1 if they are not already -1/1, and Swift doesn't do that. Should we?

  2. Our implementation of CategoricalHingeLoss seems to be both reversed and somewhat different from tf/keras: Swift's positive is Keras' negative, and Swift's negative is a little different from, but pretty similar to, Keras' positive. Should we worry?
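
For context on point 2, the tf.keras categorical hinge takes `pos` from the true-class scores and `neg` from the best score among the other classes. A minimal Python sketch of that formulation (illustrative only, not the Keras or S4TF source):

```python
def categorical_hinge(y_true, y_pred):
    # pos: score assigned to the true class(es); neg: best score among the rest.
    pos = sum(t * p for t, p in zip(y_true, y_pred))
    neg = max((1.0 - t) * p for t, p in zip(y_true, y_pred))
    return max(0.0, neg - pos + 1.0)

# One-hot y_true: pos = 0.9, neg = 0.1, so loss = max(0, 0.1 - 0.9 + 1) ≈ 0.2.
loss = categorical_hinge([0.0, 1.0, 0.0], [0.1, 0.9, 0.0])
```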

@joaogui1
Contributor Author

Ok, sorry again for the delay. I believe that's about it; feedback is welcome as always.

@joaogui1 joaogui1 requested a review from BradLarson January 21, 2020 08:08
@joaogui1
Contributor Author

Pinging @BradLarson

Co-Authored-By: Brad Larson <[email protected]>
@joaogui1
Contributor Author

joaogui1 commented Feb 3, 2020

So, there are some points worth discussing about the hinge losses:

  1. In tf/keras, it seems the labels are converted from binary (0/1) to -1/1 if they are not already -1/1, and Swift doesn't do that. Should we?
  2. Our implementation of CategoricalHingeLoss seems to be both reversed and somewhat different from tf/keras: Swift's positive is Keras' negative, and Swift's negative is a little different from, but pretty similar to, Keras' positive. Should we worry?

And what do you guys think of this @8bitmp3 @BradLarson ?

Contributor

@BradLarson BradLarson left a comment


Sorry it took me so long to review this; I wanted to get to the bottom of the categorical hinge loss question. As near as I can tell, our implementation matches that of both Keras and tf.keras:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/losses.py#L975

and it looks like the documentation for tf.keras might actually be wrong. If you submitted a PR correcting that on the tf.keras side, I bet they'd be happy to look into it.

Regarding converting binary values, let's pull that out as a separate issue to be discussed, so that the rest of these documentation improvements can be integrated. Thank you again for your work on this.
