Conversation
Co-Authored-By: Brad Larson <[email protected]>
Ok, so I guess that's all of the non-categorical losses, but I'm a bit confused about how to describe the categorical hinge and categorical crossentropy, can someone shine a light on this? |
Hi @joaogui1 @eaplatanios @BradLarson Just discussed this thread and the one from the Swift Google group about API docs with @saeta at an event we're attending offline. I think this is awesome. I would like to suggest something to improve the user experience for the growing community. Perhaps a logical approach is to follow very closely, or word for word where applicable, both the descriptions and equations in the TF 2.x API docs, where they already exist. Otherwise, reinventing the wheel can introduce inconsistency, which TF/S4TF users will notice, since www.tensorflow.org has one search. Example: in the TF 2.0 `tf.keras.losses` docs, the squared hinge loss is defined as follows (source):

```python
class SquaredHinge(LossFunctionWrapper):
  """Computes the squared hinge loss between `y_true` and `y_pred`.

  `loss = square(maximum(1 - y_true * y_pred, 0))`

  `y_true` values are expected to be -1 or 1. If binary (0 or 1) labels are
  provided we will convert them to -1 or 1.
  ...
```

vs. what is in one of the latest commits here:

```swift
/// Returns the squared hinge loss between predictions and expectations.
/// Given the predicted and expected, the hinge loss is computed as follows:
/// `reduction(max(0, 1 - predicted * expected)^2)`
///
/// - Parameters:
///   ...
public func squaredHingeLoss<Scalar: TensorFlowFloatingPoint>(
  ...) ...
}
```

The use of … Maybe I'm wrong, let me know what you all think. cc @dynamicwebpaige Can I also draw the S4TF community's attention to the TensorFlow docs style guide? |
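As a concrete reference for the formula quoted above, here is a minimal, standalone Python sketch of the squared hinge loss as that docstring describes it. The function name and the mean reduction are illustrative only, not part of either the tf.keras or S4TF API:

```python
def squared_hinge(y_true, y_pred):
    """Squared hinge loss: mean of square(maximum(1 - y_true * y_pred, 0)).

    Per the quoted tf.keras docstring, binary {0, 1} labels are first
    mapped to {-1, 1}.
    """
    # Convert binary 0/1 labels to -1/1, as the Keras docs describe.
    if all(t in (0, 1) for t in y_true):
        y_true = [2 * t - 1 for t in y_true]
    per_sample = [max(1.0 - t * p, 0.0) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(per_sample) / len(per_sample)

# A correct, confident prediction contributes 0; an uncertain or wrong one
# is penalized quadratically:
squared_hinge([-1, 1], [-1.0, 1.0])  # → 0.0
squared_hinge([-1, 1], [0.0, 1.0])   # → 0.5
```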
I changed the names to be compatible, (so it's y_true and y_pred now) but didn't copy the documentation, do you guys think it would be better to have identical docs? |
Pinging @bartchr808 @8bitmp3 |
Firstly thanks @joaogui1 for doing this and @8bitmp3 for bringing up the idea of "follow[ing] very closely...both descriptions and equations in the TF 2.x API docs"! 😄
Personally, I agree with @8bitmp3, and I feel that having as close to a 1:1 match between the two sets of documentation as possible would be ideal.

As with code style, Python and Swift have fairly different conventions for defining functions, so our API naming cannot be 1:1 if it is to stay Swifty. When it comes to writing the documentation, however, I think we could try to match the existing docs whenever possible, since they are generally TensorFlow-specific and language-agnostic. That way, anyone transitioning from Python to Swift (either a straight switch or progressively through Python interop) has an easier time recalling what certain functions do (e.g. you Google some sentence from the Python TensorFlow documentation, and it also matches the Swift for TensorFlow docs). Additionally, I think this can make maintaining docs much easier for the community, @lamberta, and the rest of the docs team.

As a small disclaimer, I've been busy with school since September and haven't been able to follow along with the progress being made in the community, so I may sometimes miss out on some details here and there! 😆 |
Co-Authored-By: Bart Chrzaszcz <[email protected]>
The general consensus seems to be that we would prefer to closely mirror the documentation from the Python implementations, but replace the variables such that they reflect the Swift code. For example, in the case that @8bitmp3 cites above, that would turn into:

```swift
/// Computes the squared hinge loss between `expected` and `predicted`.
/// `loss = square(maximum(1 - expected * predicted, 0))`
```

(I'm not sure if the subsequent two lines in the Keras implementation about the parameters apply to how we handle this, but if they do, we could bring those over too with the parameter names replaced.) |
Thumbs up and cheers @bartchr808 @BradLarson for your feedback and thanks @joaogui1 for kick-starting this process. To sum up, so that we're on the same page:
And, for additional guidance, these Google docs/code style guides should also be useful:
(I'll also conduct some deep exploration since the project is already at v0.6 and it's very cold outside.) |
@saeta since you mentioned it in https://groups.google.com/a/tensorflow.org/forum/#!topic/swift/tQKFYL1Ykac. Update: I can add a few bits to CONTRIBUTING.md, since it already exists, if y'all don't mind. I will raise this via a separate PR, fyi @BradLarson @bartchr808 @eaplatanios @dan-zheng. And if you're all ok with it, it can be built upon and expanded in the already existing CONTRIBUTING.md (https://github.com/tensorflow/swift-apis/blob/master/CONTRIBUTING.md). An example of soon-to-be-proposed additions:
For your reference, this is what it says in the TensorFlow 2.x contributor docs under https://github.com/tensorflow/tensorflow/blob/master/CONTRIBUTING.md:
|
It's been a little while since we last discussed this, are you still interested in working on these documentation additions? We're revisiting this after the holidays, and wanted to check back in. |
Hi, the idea was to finish it after the holidays, but I've been quite sick. If it's not a problem I will finish it once I've recovered |
Please take your time, hope you feel better soon. We were just checking in on the list of pending pull requests to make sure we weren't neglecting anything. Whenever you feel ready to work on this, we'll be glad to review. |
Great to see that API documentation is now being discussed seriously! Very much 👍 for this effort. |
Sources/TensorFlow/Loss.swift (outdated)

```diff
@@ -262,6 +282,8 @@ func _vjpSoftmaxCrossEntropyHelper<Scalar: TensorFlowFloatingPoint>(
 }

+/// Returns the sigmoid cross entropy (binary cross entropy) between logits and labels.
+/// Given the logits and probabilities, the sigmoid cross entropy computes `reduction(-sigmoid(logits) * log(p))`
+/// where sigmoid(x) = `1 / (1 + exp(-x))`
```
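As a side note on the docstring above: the expression `-sigmoid(logits) * log(p)` differs from the usual definition of binary cross entropy with logits, `-z * log(sigmoid(x)) - (1 - z) * log(1 - sigmoid(x))`, which TensorFlow's `tf.nn.sigmoid_cross_entropy_with_logits` evaluates in the numerically stable form `max(x, 0) - x * z + log(1 + exp(-abs(x)))`. A minimal standalone Python sketch of that stable form, for reference only (not the S4TF code):

```python
import math

def sigmoid_cross_entropy(logit, label):
    """Per-example binary cross entropy with logits, in the numerically
    stable form max(x, 0) - x*z + log1p(exp(-|x|)).

    Algebraically equivalent to -z*log(sigmoid(x)) - (1-z)*log(1-sigmoid(x)),
    but avoids overflow/underflow for large-magnitude logits.
    """
    x, z = logit, label
    return max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
```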
Will equations like these be rendered as LaTeX?
No, code enclosed in backticks appears with inline code formatting, just like in Markdown.
The Jazzy API documentation generator doesn't support LaTeX formatting.
I think having no LaTeX—at least for now—is good for user experience. @fchollet wrote an entire (best-selling) book on deep learning without any LaTeX:
"...we’ll steer away from mathematical notation, which can be off-putting for those without any mathematics background and isn’t strictly necessary to explain things well."
@8bitmp3 you just made me feel nervous 😬 because my book will include equations.
So, there are some points worth discussing about the hinge losses:
|
Ok, sorry again for the delay. I believe that's about it; feedback is welcome as always. |
Pinging @BradLarson |
Co-Authored-By: Brad Larson <[email protected]>
And what do you guys think of this @8bitmp3 @BradLarson ? |
Sorry it took me so long to review this, I wanted to get to the bottom of the categorical hinge loss question. Near as I can tell, our implementation matches that of both Keras and tf.keras:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/losses.py#L975
and it looks like the documentation for tf.keras might actually be wrong. If you submitted a PR correcting that on the tf.keras side, I bet they'd be happy to look into that.
Regarding converting binary values, let's pull that out as a separate issue to be discussed so that the rest of these documentation improvements can be integrated. Thank you again for your work on this.
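For anyone following the categorical hinge question in this thread: the linked Keras implementation computes, per one-hot example, `max(0, neg - pos + 1)`, where `pos` is the score predicted for the true class and `neg` is the largest score among the wrong classes. A minimal standalone Python sketch of that formulation (illustrative only, not the S4TF code):

```python
def categorical_hinge(y_true, y_pred):
    """Categorical hinge loss for one one-hot example, following the
    Keras formulation: max(0, neg - pos + 1)."""
    # Score assigned to the true class.
    pos = sum(t * p for t, p in zip(y_true, y_pred))
    # Highest score assigned to any wrong class (the true-class slot is
    # zeroed out by the (1 - t) factor).
    neg = max((1 - t) * p for t, p in zip(y_true, y_pred))
    return max(0.0, neg - pos + 1.0)

categorical_hinge([0, 1, 0], [0.1, 0.7, 0.2])  # → 0.5
```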
As discussed in the Google group, I'm adding equations for the losses, but I'm afraid that without LaTeX support they may be a little unreadable. What do you think?
I have added equations for the most common losses so far; after feedback I can write the rest.