Removing softmax as terminal activation function from two models #232
The LeNet-5 and DenseNet models were previously defined with a softmax activation function on their final layer. This caused problems during training because those models are trained with `softmaxCrossEntropy(logits:labels:)`, which applies softmax internally, so softmax ended up being applied twice. With this modification, the LeNet-5 model trained on MNIST matches the loss and accuracy of a reference Python TF 2.0 implementation of the same model at every training step (see the sketch below).

Additionally, the LeNet-MNIST example was reporting the sum of the per-batch losses rather than the average loss. This has been corrected.
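As a rough illustration of the change (not the actual model code from this repository; the layer sizes and `TinyClassifier` name are hypothetical), the final `Dense` layer now emits raw logits and lets `softmaxCrossEntropy(logits:labels:)` apply softmax once, internally:

```swift
import TensorFlow

// Hypothetical minimal classifier illustrating the change: the final Dense layer
// uses the default identity activation so the model outputs raw logits.
struct TinyClassifier: Layer {
    // Before this change the layer looked roughly like:
    //   var output = Dense<Float>(inputSize: 84, outputSize: 10, activation: softmax)
    // which, combined with softmaxCrossEntropy(logits:labels:), applied softmax twice.
    var output = Dense<Float>(inputSize: 84, outputSize: 10)

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        return output(input)
    }
}

let model = TinyClassifier()
let features = Tensor<Float>(randomNormal: [32, 84])
let labels = Tensor<Int32>(zeros: [32])

// softmaxCrossEntropy(logits:labels:) expects raw logits and applies softmax internally.
let loss = softmaxCrossEntropy(logits: model(features), labels: labels)
```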
Finally, leftover TODO comments from a completed fix have been removed.