Introduction
We recently discussed how tests of Beman projects should be implemented. There was a post (by Louis Dionne) somewhere I can’t find arguing that tests based on assert() would potentially allow standard library implementations to use them: reusing the tests from the corresponding Beman projects seems really attractive, even if they are only part of the tests actually used. The tests for these libraries can’t really depend on testing frameworks which themselves depend on the standard library, especially if they have a link-time component: standard libraries are frequently tested in multiple configurations which aren’t necessarily binary compatible. As a result, something like Google Test is pretty much out of the question.
On the other hand, I’d rather have people contribute projects and test them with Google Test than either not have the corresponding projects contributed or not have them tested - both would be worse outcomes. Thus, the approach to testing will certainly be a guideline from which deviations are allowed. It is still worth outlining what is desirable to have from tests and what approaches can be used to ideally achieve all of it.
Desirable Features of Tests
The primary objective of tests is to improve confidence that the tested software is of good quality. There are various ways to achieve that, and depending on how it is done, different features can be achieved. Below are the features I consider worth striving for. This list is probably not complete and I guess I’ll remember an important feature milliseconds after pressing the “Create Topic” button. It could be a starting point, though.
CMake Integration
It seems the direction for Beman projects is to build with CMake: whatever is used for testing should be able to integrate with CMake (a sketch of how that can look follows the list). That is:
- Tests are rebuilt whenever the test or anything it depends on changes.
- Tests are run as part of the build.
- A test failure results in a build failure.
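As a sketch of the above (the target and file names are made up, and projects may well organize this differently), a CMakeLists.txt fragment along these lines rebuilds the tests together with their dependencies, runs them as part of every build, and turns a test failure into a build failure:
# hypothetical CMakeLists.txt fragment; names are placeholders
enable_testing()

add_executable(beman.something.tests something.test.cpp)
target_link_libraries(beman.something.tests PRIVATE beman.something)

add_test(NAME beman.something.tests COMMAND beman.something.tests)

# Running ctest from a custom target that is part of "ALL" means the tests
# are executed on every build and a failing test fails the build.
add_custom_target(beman.something.run-tests ALL
    COMMAND ${CMAKE_CTEST_COMMAND} --output-on-failure
    DEPENDS beman.something.tests)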
Testing Different Implementations
The components in Beman projects are supposed to live in a namespace like Beman::Something26. A future standard library implementation would likely live in namespace std or namespace std::something but certainly not in namespace Beman::Something26. If tests are implemented in terms of Beman::Something26, they’ll need to be rewritten on some level. This rewrite may be mostly mechanical, but it may be easier to test projects using an indirection coming from a test configuration file:
// file: beman-test-config.hpp
namespace tn = Beman::Something26;
The actual tests would then use tn (or whatever name is deemed appropriate) to refer to the entities under test, using explicit qualification or a using directive. Anybody wanting to test their implementation of the same interface/specification in a different namespace just provides a different configuration header. A project-specific configuration header could declare similar aliases for the packages it depends on and use the corresponding qualification where necessary to also test with different combinations of dependencies:
namespace on = Beman::Optional26;
For dependencies a similar approach may even be reasonable in the actual implementation: an implementation initially using Beman::Optional26 can then easily be migrated to use the standard component once it becomes available.
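To make the idea concrete, here is a minimal sketch (the file name and the optional-like interface are assumptions, not taken from an actual project): the test refers to the component only through the alias from the configuration header, so retargeting it at a different implementation only requires swapping that header.
// file: optional.test.cpp (hypothetical)
// beman-test-config.hpp is assumed to contain: namespace tn = Beman::Optional26;
#include "beman-test-config.hpp"
#include <cassert>

int main() {
    tn::optional<int> o;            // resolves to whatever namespace tn aliases
    assert(!o.has_value());
    o = 42;
    assert(o.has_value() && *o == 42);
}
Pointing tn at namespace std instead would exercise a standard library’s optional with the same test source.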
Assert Only
When testing an actual standard library it shouldn’t depend on, well, a standard library. Using different build configurations immediately causes grief if there is an object file, as the build configurations may use different ABIs. Instead, a really basic test mechanism using, essentially, just <cassert>'s assert(condition) as the basis is preferable. However, ideally even that can be replaced: reporting test failures may require more than just bailing out at a specific place and relying on whatever information the assert() macro provides or a debugger, e.g., when testing on a free-standing platform (see below). If there is a configuration, it could also provide some tools which are a bit more fancy (since assert() is a macro taking a single argument, the sketch below forwards the condition as a variadic parameter so that conditions containing unparenthesized commas still work):
// file: beman-test-config.hpp
#if BEMAN_HAS_TEST_OVERRIDE
# include "local-test-config.hpp"   // a platform or vendor can substitute its own tools
#else
# include <cassert>
// Forward the condition as __VA_ARGS__ so conditions containing commas
// (e.g., template argument lists) can be passed without extra parentheses.
# define BEMAN_ASSERT(id, ...) assert((__VA_ARGS__))
#endif
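For illustration, a test using only this macro could look like the following (the file name and the identifiers are made up):
// file: example.test.cpp (hypothetical)
#include "beman-test-config.hpp"

template <typename T, typename U>
constexpr bool same_size() { return sizeof(T) == sizeof(U); }

int main() {
    BEMAN_ASSERT("arithmetic", 2 + 2 == 4);
    // the unparenthesized comma in the template argument list is fine
    // because the condition is forwarded as a variadic argument:
    BEMAN_ASSERT("sizes", same_size<int, unsigned>());
}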
Free-Standing
At least some parts of the standard C++ library will target free-standing environments: I want my library to also run on an ESP32. I currently don’t have much experience but it seems the work-flow for embedded systems often is:
- Connect the device using a USB cable to a computer, possibly holding down a [tiny] button on the device, and transfer the program to a drive appearing in the file system.
- Disconnect the device and connect it to a power source to run the program. Maybe it can communicate results back using USB, Wi-Fi, or Bluetooth, but it may also just blink an LED fast or slow to indicate success or failure.
Testing with such a process probably means that abort()ing on an assertion failure is really annoying and you may want to get information about multiple failures, at least. Also, running a separate program per test case is pretty much out of the question: if there are many test cases you’d still ideally run all of them. The implication is that it is probably desirable to build all tests into one executable and run all of them. On the other hand, on a hosted environment having many small programs may be easier to manage.
To support both use cases I could imagine some kind of test driver (lit?) which just overrides the name of the entry point appropriately and provides a main() calling into that entry point on a hosted environment. Maybe tests could even be written to be viable Google Tests without actually requiring the use of Google Test.
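One possible shape of that idea, purely as a sketch (the macro name and the build wiring are assumptions): the test defines its entry point through a macro which defaults to main(), and an embedded build defines the macro to some other name and has its startup code call that function instead.
// file: smoke.test.cpp (hypothetical)
#include "beman-test-config.hpp"

#ifndef BEMAN_TEST_ENTRY
# define BEMAN_TEST_ENTRY main   // hosted default: the test is a complete program
#endif

int BEMAN_TEST_ENTRY() {
    BEMAN_ASSERT("smoke", 1 + 1 == 2);
    return 0;
}
An embedded build could compile many such files, each with a distinct entry point name, into one image and have a tiny driver call them all, while a hosted build keeps one small program per test.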
Component-Level Coverage
A common approach for checking coverage is to run all tests of a project collecting coverage information, e.g., using gcov, and then reporting the aggregated results. Doing so is certainly better than not considering coverage at all, but this kind of coverage report includes “accidental” coverage: any component used while testing another component may receive additional coverage. However, that use of the code isn’t necessarily tested at all! Thus, it is desirable to somehow declare which component is actually tested by a given test and to only aggregate coverage of a component from running its associated tests.
An approach to achieve that is to implement components in component-specific files and to consider tests in files matching these names to be the tests of the component. A test driver may then need to carefully aggregate the coverage reports. The result is hopefully higher confidence in the test results, assuming suitable overall coverage is achieved.
Standard I/O Channels
Tests are ideally just evaluating code and don’t depend on anything external. However, some component constantly writing something to standard output or standard error because some write(1, "I'm here\n", 9); statement wasn’t removed is still a problem (I’m sure nobody would ever do that, of course!). Testing should ideally be able to verify a program’s side effects. In a previous life I used dejagnu to test my implementation of standard library components: it does allow full control over the standard input, output, and error channels (I’m not recommending dejagnu - in the nearly 30 years since I did that we have [hopefully] moved on).
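As an illustration of what verifying such side effects could mean on a hosted POSIX system, a test driver might capture everything a test body writes to standard output and insist that it stays empty. This is only a sketch under those assumptions (the helper name is made up, and output larger than the pipe buffer would need more care):
// hypothetical helper for a hosted, POSIX-only test driver
#include <unistd.h>
#include <cstdio>
#include <string>

template <typename Fn>
std::string capture_stdout(Fn&& body) {
    std::fflush(stdout);
    int fds[2];
    pipe(fds);                 // fds[0]: read end, fds[1]: write end
    int saved = dup(1);        // remember the real stdout
    dup2(fds[1], 1);           // anything written to fd 1 now goes into the pipe
    close(fds[1]);

    body();                    // run the test
    std::fflush(stdout);       // push buffered stdio output into the pipe

    dup2(saved, 1);            // restore stdout; this closes the pipe's write end
    close(saved);

    std::string out;
    char buffer[256];
    for (ssize_t n; (n = read(fds[0], buffer, sizeof(buffer))) > 0; )
        out.append(buffer, n);
    close(fds[0]);
    return out;
}
A test could then check BEMAN_ASSERT("silent", capture_stdout([]{ exercise_component(); }).empty()), where exercise_component stands for whatever the test actually drives.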
Shared Source of Test Infrastructure
Ideally, the bulk of the testing infrastructure isn’t duplicated in every project. Instead, there should be one central place with hopefully easy-to-follow instructions for using tests in a Beman (and possibly other) project. Simply using Google Test achieves some of these objectives but certainly not all.
What To Use?
After listing desirable features, I admit that I don’t have a ready solution addressing all of them. I think I know how most of these things can be implemented and it doesn’t seem hard. For past implementations of standard library facilities I did run my own testing infrastructure, but I wouldn’t want to advertise any of those!
Should we create a Beman testing framework, or is there something that already reasonably covers the features we agree are desirable?