[cctbxbb] [git/cctbx] master: rename test files, remove them from run_tests (23a4a6fe4)
ndevenish at gmail.com
Sat Mar 3 04:11:47 PST 2018
Plethora of additional points in support of pytest:
- Classification: bootstrap dials, then run the tests. All will say "OK"
but in reality only about 1/3 of them ran for various skip reasons. This is
*horribly* misleading and has led to broken builds in the past. I proposed
a patch a while ago to add a skip classification but this was rejected
because some tests (two, I think?) in phenix classified as 'SKIP' instead
of 'OK'. There is a system to classify as 'WARN', but *only* if
phenix_regression is installed, it misclassifies several items in
dials/cctbx (apparently OK showing as WARN is acceptable where OK as SKIP
wasn't) and in any case relies on a horrible whitelist of "text that looks
like warnings but shouldn't classify the test as a warning". In pytest,
things like skip/warn/expected failure are easy (e.g. you can mark tests as
"this is currently expected to fail" rather than leaving your CI^H^Hdaily
build failing permanently)
- Technical debt/NIH: It's one framework for you to learn, sure. But it's N
'frameworks' to learn for the next N developers - and since pytest is very
common, there is a good chance that they'll either know it already, or the
knowledge will be usefully applicable to other projects in the future.
Every single new developer has to go through the WTF-laden process of
learning the 'historical' idiosyncrasies.
- Categorisation: A dumb 'list of tests' is okay until you have e.g. very
long regression tests that don't need to be run as frequently as unit tests
- Documentation: How does something work/how do you do something
unfamiliar? I'll let you compare the documentation for libtbx approx_equal
pytest's approx https://docs.pytest.org/en/latest/builtin.html#pytest.approx
- Maintenance/bit rot: Somebody else writes this documentation, somebody
else finds and fixes the bugs, and somebody else is responsible for making
things better. Something untouched for a decade that people are afraid to
or prevented from improving is not a good thing.
Additionally, I'd want to argue great improvements and benefits in things
like locality, parametrization, granularity (separation of tests),
formality, discoverability, as well as lots of other things, but I think
you get my point. Basically, every possible improvement a formal testing
suite could have over what is effectively a dumb list of files to run.
Python's unittest would have most of these benefits, but with the
disadvantage of a little more boilerplate. Pytest has less boilerplate and
syntax, and for almost equivalent total functionality you only have to
remember a) "call the files test_something.py and the functions
test_something" b) use assert (you can even use libtbx.approx_equal if you
On Sat, Mar 3, 2018 at 11:03 AM, <markus.gerstel at diamond.ac.uk> wrote:
> Hi Pavel,
> pytest was suggested* by a couple of people in 2015 when I asked about
> unittest. Your recommendation at that time was to go ahead, so thats what I
> did - with the result that in April 2016 the pytest compatibility layer was
> introduced to cctbx and pytest included in the base build, so we can use
> pytests alongside libtbx tests. In our packages we have been using pytests
> for well over 2 years now, and just one month ago converted xia2 to pytest
> As to "why pytest" I would like to point you to our wiki page "Why
> pytest?" at https://github.com/dials/dials/wiki/Why-pytest%3F which does
> make the case in more detail and includes instructions to enable pytest for
> cctbx modules. However I'm more than happy to demonstrate part of it here
> using your example.
> First of all, the example you gave is not quite correct. You are missing
> the import for approx_equal and missed the crucial line in run_tests.py
> that ensures the test is actually run. This is one of the drawbacks of
> libtbx testing: there is no automatic test discovery. And this tends to
> happen quite a lot, cf. https://github.com/dials/dials/issues/506
> In pytest your example would look like this:
> import pytest
> def test_that_2_times_2_is_4():
> x = 2
> result = x*x
> assert result == pytest.approx(4)
> which I would argue has the exact same documentation/example value.
> Let's change the expectation value from 4 to 5, forget everything we know
> about the code, run it and compare the output:
> $ python tst_multiply.py
> approx_equal eps: 1e-06
> approx_equal multiplier: 10000000000.0
> 4.0 approx_equal ERROR
> 5.0 approx_equal ERROR
> Traceback (most recent call last):
> File "tst_multiply.py", line 10, in <module>
> File "tst_multiply.py", line 7, in exercise
> assert approx_equal(result, 5., 1.e-6)
> *Something* went wrong, it has to do with 1e-06, a very large number (I
> still don't understand what that means), two errors for 4.0 and 5.0, and a
> function called exercise(), which tells me exactly nothing. I have to look
> into the code to understand what went wrong and why. From the outset I
> don't know what the code is supposed to do.
> $ pytest
> =================================== test session starts
> platform linux2 -- Python 2.7.13, pytest-3.1.3, py-1.4.34, pluggy-0.4.0
> rootdir: /dls/tmp/wra62962/directories/lh0UFeF6, inifile:
> plugins: xdist-1.20.1, timeout-1.2.0, forked-0.2, cov-2.5.1
> collected 2 items
> test_multiplication.py .F
> ======================================== FAILURES
> ___________________________ test_that_two_times_two_equals_five
> def test_that_two_times_two_equals_five():
> x = 2
> result = x*x
> > assert result == pytest.approx(5)
> E AssertionError: assert 4 == 5 +- 5.0e-06
> E + where 5 +- 5.0e-06 = <class '_pytest.python.approx'>(5)
> E + where <class '_pytest.python.approx'> = pytest.approx
> test_multiplication.py:11: AssertionError
> =========================== 1 failed, 1 passed in 0.04 seconds
> Oh look, a test to ensure two times two equals five failed. Apparently it
> compared 4 to 5 +- 5e-06.
> You also see that the other test in the file that I left in (which
> compares 2*2 to 4) worked. You didn't know that from the libtbx-style test
> Now I can fix the test, and pytest allows me to just rerun the failed
> tests, not everything.
> If you want to know how, have a look at https://github.com/dials/
> dials/wiki/pytest which contains a lot more information about running
> pytest and converting tests.
> If you don't want to use it that is fine, too. Thanks to the compatibility
> layer you can still use libtbx.run_tests_parallel, and you will still get
> the more useful assertion messages. I would encourage you to try it though.
> You might find it useful.
> * And rightly so; pytest requires much less boilerplate, produces cleaner
> code, and is overall just better - Thank you, Luc. Thank you, Jamasi.
> PS: On fable specifically: thanks to the conversion I already found and
> fixed one broken test which didn't fail, and another test with race
> From: cctbxbb-bounces at phenix-online.org [cctbxbb-bounces at phenix-online.org]
> on behalf of Pavel Afonine [pafonine at lbl.gov]
> Sent: Saturday, March 03, 2018 07:13
> To: cctbx mailing list; Winter, Graeme (DLSLtd,RAL,LSCI)
> Subject: Re: [cctbxbb] [git/cctbx] master: rename test files, remove them
> from run_tests (23a4a6fe4)
> I'd say at least because:
> - the first 10+ years of CCTBX did not use pytest. AFAIK, the first
> attempt was by our postdoc Youval Dar back in 2015 (correct me if I'm
> wrong). I feel adding different testing styles are only to make the
> code-base inconsistent (very much like mixing flex and np arrays isn't
> cool, in my opinion!).
> - originally tests were considered as simple usage examples for
> functionalities they are testing; this is because writing and (most
> importantly!) maintaining the proper documentation was not provisioned.
> A simple test like
> def exercise():
> """ Make sure 2*2 is 4. """
> assert approx_equal(result, 4., 1.e-6)
> if(__name__ == "__main__"):
> print "OK"
> is much easier to grasp rather than the same cluttered with the stuff
> (that, to add to the trouble, one needs to learn in the first place!).
> All the best,
> On 3/3/18 14:36, Graeme.Winter at diamond.ac.uk wrote:
> > What’s bad about pytest?
> >> On 3 Mar 2018, at 02:26, Pavel Afonine <pafonine at lbl.gov> wrote:
> >> Just to make sure: you are converting to use pytest this particular
> codes (fable), correct?
> >> Pavel
> >> P.S.: I'm allergic to pytest.
> >> On 3/3/18 07:46, CCTBX commit wrote:
> >>> This in preparation for pytestification.
> >> _______________________________________________
> >> cctbxbb mailing list
> >> cctbxbb at phenix-online.org
> >> http://phenix-online.org/mailman/listinfo/cctbxbb
> cctbxbb mailing list
> cctbxbb at phenix-online.org
> This e-mail and any attachments may contain confidential, copyright and or
> privileged material, and are for the use of the intended addressee only. If
> you are not the intended addressee or an authorised recipient of the
> addressee please notify us of receipt by returning the e-mail and do not
> use, copy, retain, distribute or disclose the information in or attached to
> the e-mail.
> Any opinions expressed within this e-mail are those of the individual and
> not necessarily of Diamond Light Source Ltd.
> Diamond Light Source Ltd. cannot guarantee that this e-mail or any
> attachments are free from viruses and we cannot accept liability for any
> damage which you may sustain as a result of software viruses which may be
> transmitted in or with the message.
> Diamond Light Source Limited (company no. 4375679). Registered in England
> and Wales with its registered office at Diamond House, Harwell Science and
> Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom
> cctbxbb mailing list
> cctbxbb at phenix-online.org
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the cctbxbb