Add float32 tests to travis #2264


Merged: 12 commits into pymc-devs:master on Jun 6, 2017

Conversation

kyleabeauchamp
Contributor

No description provided.

@kyleabeauchamp
Contributor Author

The test failures are exactly as expected: the float32 jobs fail, while the float64 jobs succeed.

PS: I made the choice of running py27 + float32 and py3k + float64, under the assumption that it is unlikely that any errors will be correlated with BOTH py version and float precision.

@kyleabeauchamp
Contributor Author

Another possibility would be to separately add 3 more lines to the env matrix so that we have py27+float32, py27+float64, and py36+float64. That would take more CI time, but would let us see whether py27 and py36 are working for float64.

@ColCarroll
Member

Erg -- I like the idea of these tests, but I'm not sure how to get everything to run. I'd guess the tests could run about twice as fast if we looked over them all and figured out what is actually needed.

If I recall correctly, the free tier of Travis has a limit of 50 minutes per job and 4 hours across all jobs. The last successful build of master was 6 jobs of between 19 and 38 minutes, totaling 2:36.

All this is to say that I like your approach here, and wish there were a way to see whether failures were due to python2 vs python3 or not. I wonder if there's a sneaky test fixture we could use that

  • catches theano TypeErrors,
  • reruns with float64
  • fails with the original message along with (and {passes | fails} with float64) appended
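A minimal sketch of that fixture idea, assuming a plain decorator (rather than a pytest fixture) and that the float32 problem surfaces as a TypeError raised from theano; the helper name and message format are illustrative, not part of the PR. Note that flipping theano.config.floatX at runtime only affects graphs compiled afterwards, so this is at best a rough diagnostic:

import functools

import theano


def rerun_with_float64(test_func):
    # Hypothetical helper: catch a theano TypeError raised under float32,
    # rerun the same test body under float64, and re-raise with a note on
    # whether the float64 rerun passed or failed.
    @functools.wraps(test_func)
    def wrapper(*args, **kwargs):
        try:
            return test_func(*args, **kwargs)
        except TypeError as err:
            original_floatX = theano.config.floatX
            theano.config.floatX = 'float64'
            try:
                test_func(*args, **kwargs)
                outcome = 'passes'
            except Exception:
                outcome = 'fails'
            finally:
                theano.config.floatX = original_floatX
            raise AssertionError('%s (and %s with float64)' % (err, outcome))
    return wrapper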

@kyleabeauchamp
Contributor Author

OK, I tried adding the 3 extra lines of tests. This will let us see py27/36 errors orthogonally from float32 errors. Hopefully the wall time is still within the limits.

@twiecki
Member

twiecki commented Jun 3, 2017

I agree with @ColCarroll. We would need to fix all tests before merging this. A different approach would be to fix individual tests and add a fixture for those that already work.

@kyleabeauchamp
Contributor Author

We don't need to fix all tests first. We have 3 separate indicators for the float32 tests, which will allow you to ignore them at will.

The alternative is not testing float32, which will inevitably lead to regressions and duplicated efforts.

@ColCarroll
Member

What about using pytest's xfail decorator? It allows you to mark expected failures given a condition; it won't fail the suite on them, but will flag tests that "unexpectedly pass". As more of the float32 tests start passing, the decorator could be removed to prevent regressions.

@ColCarroll
Member

ColCarroll commented Jun 3, 2017

The unexpected pass seems super useful for figuring out where the decorator can be removed, and you might even add the strict argument so that unexpected passes fail the test suite, forcing the committer to remove the decorator greedily.
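As a rough illustration of that suggestion (the test name and assertion are made up, not taken from the PR), a marker along these lines only applies under float32 and, via strict=True, turns an unexpected pass into a suite failure:

import numpy as np
import pytest
import theano


@pytest.mark.xfail(theano.config.floatX == 'float32',
                   reason='known float32 precision failure',
                   strict=True)
def test_tiny_increment_is_not_lost():
    x = np.asarray(1.0, dtype=theano.config.floatX)
    eps = np.asarray(1e-12, dtype=x.dtype)
    # Holds in float64; in float32 the increment rounds away and the
    # assertion fails, which the marker records as an expected failure.
    assert x + eps != x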

@kyleabeauchamp
Contributor Author

We have several tests that are run using parameterized, which IIRC won't work with xfail decorations.

Also, we're currently almost at the point where we should remove the decorator to prevent regressions. I basically did everything that I could without getting feedback from the owners of individual tests. If float32 support is something that is desired, then exposing the test failures on Travis is a good way to put floatX behavior on everyone's radar.

@twiecki
Member

twiecki commented Jun 4, 2017

@kyleabeauchamp What do you think we should do about the failing tests then? Keep float32 failing for the time being until we address all of them?

@kyleabeauchamp
Contributor Author

I guess I'm proposing that we could keep them failing, with the idea that we then judge new features by success on float64 and by not increasing the number of float32 failures.

@ColCarroll
Member

Hm... I haven't used xfail myself, so I'm not familiar with its difficulties, but this suggests you can mark individual parameters as failing.
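For reference, a hedged sketch of marking a single parametrized case as an expected failure (the parameter values and test body are illustrative, and pytest.param needs a reasonably recent pytest):

import numpy as np
import pytest
import theano


# Only the last case is expected to fail under float32; the other
# parameters must still pass everywhere, and strict=True makes an
# unexpected pass fail the suite.
@pytest.mark.parametrize('scale', [
    1.0,
    1e-10,
    pytest.param(1e-50, marks=pytest.mark.xfail(
        theano.config.floatX == 'float32',
        reason='underflows to zero in float32',
        strict=True)),
])
def test_scale_survives_cast(scale):
    x = np.asarray(scale, dtype=theano.config.floatX)
    assert x > 0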

The problem with merging failing tests is that it becomes way more onerous to see why they're failing, and increases the chance of missing a regression. I think especially when trying to push out a 3.1 release, we need all the automated help we can get.

I'm suggesting having a single commit activate the float32 tests and xfail(strict=True) all the currently failing cases. As those remaining failures get fixed (inadvertently or not), the strict flag will naturally force removing those decorators. I have a pretty good test setup on my machine and can make a PR on your branch with those decorators if it would help?

I just feel like I should also emphasize that I think adding float32 to the test matrix is a great idea for getting float32 support working (and that it would be awesome if everything ran smoothly on the GPU). Maybe that can be a/the focus for 3.2, @twiecki?

@twiecki
Member

twiecki commented Jun 5, 2017

@kyleabeauchamp Merged the previous two PRs, want to rebase?

@kyleabeauchamp
Contributor Author

So the failing tests seem to be running fine on my machine:

 pytest -x pymc3/tests/test_text_backend.py 
=============================================================================================== test session starts ================================================================================================
platform linux -- Python 3.6.0, pytest-3.0.7, py-1.4.32, pluggy-0.4.0
rootdir: pymc3, inifile: setup.cfg
collected 113 items 

pymc3/tests/test_text_backend.py ..X....X....X..XXXXXxXXXXXXXXXXXXXXXXXXXXxXXXXXXXXXXXXXXXXXXXXxXXXXXXXXXXXXXXXXX...XXXxXXXXXXXXXXXXXXxXXXXXXXXXXX

================================================================================ 15 passed, 5 xfailed, 93 xpassed in 20.74 seconds =================================================================================

@kyleabeauchamp
Contributor Author

Never mind, I still see issues with the sqlite backend.

@kyleabeauchamp
Contributor Author

So the previous commit fixes a problem that I only see on Travis, but not on my local setup. :(

@kyleabeauchamp
Contributor Author

OK this seems to be working now.

@@ -430,7 +430,7 @@ def test_bound_normal(self):
         PositiveNormal = Bound(Normal, lower=0.)
         self.pymc3_matches_scipy(PositiveNormal, Rplus, {'mu': Rplus, 'sd': Rplus},
                                  lambda value, mu, sd: sp.norm.logpdf(value, mu, sd),
-                                 decimal=select_by_precision(float64=6, float32=0))
+                                 decimal=select_by_precision(float64=6, float32=-1))
Member

Crazy that float32 matches so little...
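For context on that diff: select_by_precision is the pymc3 test helper that picks the comparison tolerance for the active precision. Its behavior is roughly the following (reconstructed from how it is called above, not copied from the repo), so float32=-1 holds the float32 run to a far looser tolerance than the float64 run:

import theano


def select_by_precision(float64, float32):
    # Return the 'decimal' tolerance matching the active theano precision:
    # the float64 value under float64, otherwise the (looser) float32 value.
    return float64 if theano.config.floatX == 'float64' else float32

Assuming the value ends up as the decimal argument of numpy's assert_almost_equal, decimal=-1 only requires the two log-probabilities to agree to within about 1.5 * 10**1, which is what the comment above is reacting to.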

@twiecki
Member

twiecki commented Jun 6, 2017

@kyleabeauchamp This looks fantastic, thanks for the herculean effort.

twiecki merged commit 21511ad into pymc-devs:master on Jun 6, 2017