Add float32 tests to travis #2264
Conversation
The test failures are exactly as expected: the float32 tests fail, while the float64 tests succeed. PS: I made the choice of running py27 + float32 and py3k + float64, under the assumption that it is unlikely that any errors will be correlated with BOTH the Python version and the float precision.
Another possibility would be to separately add 3 more lines to the env matrix so that we have py27+float32, py27+float64, and py36+float64. That would take more time for CI, but it would allow us to see whether py27 and py36 are working for float64.
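For context, the thread doesn't show how the env matrix reaches the tests. One common way is an environment variable consumed at session start; a minimal sketch, assuming a hypothetical FLOATX variable and a conftest.py hook (the actual PR may wire this up differently):

```python
# conftest.py -- hypothetical sketch; the PR under discussion may do this differently.
# Each CI job exports an environment variable (called FLOATX here) and the test
# session sets Theano's precision from it before any tests run.
import os

import theano


def pytest_configure(config):
    # Default to float64 so local runs keep their old behaviour; a job that
    # exercises single precision would export FLOATX=float32.
    theano.config.floatX = os.environ.get('FLOATX', 'float64')
```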
Erg -- I like the idea of these tests, but I'm not sure how to get everything to run. I'd guess the tests could run about twice as fast if we looked over them all and figured out what is needed. If I recall correctly, the free tier of Travis has a limit of 50 minutes per job and 4 hours over all the jobs. The last successful build of master was 6 jobs of between 19 and 38 minutes, totaling 2:36. All this is to say that I like your approach here, and wish there was a way to see if failures were due to python2 vs python3 or not. I wonder if there's a sneaky test fixture we could use that …
OK, I tried adding the 3 extra lines of tests. This will let us see py27/36 errors orthogonally from float32 errors. Hopefully the wall time is still within the limits.
I agree with @ColCarroll. We would need to fix all tests before merging this. A different approach would be to fix individual tests and add a fixture for those that already work.
We don't need to fix all tests first. We have 3 separate indicators for the float32 tests, which will allow you to ignore them at will. The alternative is not testing float32, which will inevitably lead to regressions and duplicated efforts.
What about using pytest's xfail decorator? It allows you to mark expected failures given a condition; it will not fail those tests, but it will flag tests that "unexpectedly pass". As more of the float32 tests start passing, the decorator could be removed from them to prevent regressions.
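For reference, a minimal sketch of a conditional xfail mark; the test body, name, and tolerance below are invented for illustration:

```python
import numpy as np
import pytest
import theano


# Hypothetical test: under float32 it still runs, but a failure is reported as
# "xfail" instead of breaking the build, and an unexpected pass shows up as
# "xpass" -- the signal that the mark can be removed.
@pytest.mark.xfail(theano.config.floatX == 'float32',
                   reason='known precision loss under float32')
def test_sum_of_thirds_is_accurate():
    x = np.ones(10000, dtype=theano.config.floatX) / 3.0
    np.testing.assert_almost_equal(float(x.sum()), 10000 / 3.0, decimal=7)
```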
The unexpected pass seems super useful in figuring out where the decorator could be removed, and you might even add the …
We have several tests that are run using parameterized, which IIRC won't work with xfail decorations. Also, we're currently almost at the point where we should remove the decorator to prevent regressions. I basically did everything that I could without getting feedback from the owners of individual tests. If float32 support is something that is desired, then exposing the test failures on Travis is a good way to put floatX behavior on everyone's radar.
@kyleabeauchamp What do you think we should do about the failing tests then? Keep float32 failing for the time being until we address all of them?
I guess I'm proposing that we could keep them failing, with the idea that we then judge new features by success on float64 and by not increasing the number of float32 failures.
Hm... I haven't used xfail myself, so I'm not familiar with the difficulties, but this suggests you can mark individual parameters as failing. The problem with merging failing tests is that it becomes way more onerous to see why they're failing, and it increases the chance of missing a regression. I think especially when trying to push out a 3.1 release, we need all the automated help we can get. I'm suggesting having a single commit activate the … Just feel like I should also emphasize that I think adding float32 to the test matrix is a great idea for getting float32 support working (and that it would be awesome if everything ran smoothly on the GPU). Maybe that can be a/the focus for 3.2, @twiecki?
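On marking individual parameters: newer pytest versions allow per-parameter marks via pytest.param. A sketch, with the mark condition, parameters, and tolerances invented for illustration:

```python
import numpy as np
import pytest
import theano

# Hypothetical mark that only applies when running in single precision.
float32_xfail = pytest.mark.xfail(theano.config.floatX == 'float32',
                                  reason='loses precision under float32')


# Only the strict-tolerance case is expected to fail under float32, so only
# that parameter carries the mark; the loose-tolerance case must always pass.
@pytest.mark.parametrize('decimal', [
    3,
    pytest.param(7, marks=float32_xfail),
])
def test_sum_of_tenths(decimal):
    x = np.ones(10000, dtype=theano.config.floatX) * 0.1
    np.testing.assert_almost_equal(float(x.sum()), 1000.0, decimal=decimal)
```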
@kyleabeauchamp Merged the previous two PRs, want to rebase?
So the failing tests seem to be running fine on my machine.
Never mind, I still see issues with the sqlite backend.
So the previous commit fixes a problem that I only see on Travis, but not on my local setup. :( |
OK this seems to be working now. |
@@ -430,7 +430,7 @@ def test_bound_normal(self):
         PositiveNormal = Bound(Normal, lower=0.)
         self.pymc3_matches_scipy(PositiveNormal, Rplus, {'mu': Rplus, 'sd': Rplus},
                                  lambda value, mu, sd: sp.norm.logpdf(value, mu, sd),
-                                 decimal=select_by_precision(float64=6, float32=0))
+                                 decimal=select_by_precision(float64=6, float32=-1))
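For context, select_by_precision is a small helper in the pymc3 test suite that picks the comparison tolerance based on the active Theano precision; roughly (a sketch, the real definition may differ):

```python
import theano


def select_by_precision(float64, float32):
    """Return the float64 tolerance when Theano runs in double precision,
    otherwise the (usually looser) float32 tolerance."""
    return float64 if theano.config.floatX == 'float64' else float32
```

With this helper, decimal=-1 means numpy's assert_almost_equal tolerates absolute errors on the order of 10 under float32, which is what the comment below is reacting to.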
Crazy that float32 matches so little...
@kyleabeauchamp This looks fantastic, thanks for the herculean effort.