Test Infrastructure: preferably not Google Test!

Introduction

We recently discussed how tests of Beman projects should be implemented. There was a post (by Louis Dionne) somewhere I can’t find arguing that tests based on assert() could potentially be used by standard library implementations: reusing the tests from the corresponding Beman projects seems really attractive, even if they are only part of the tests actually used. The tests for these libraries can’t really depend on testing frameworks which themselves depend on the standard library, especially if they have a link-time component: standard libraries are frequently tested with multiple configurations which aren’t necessarily binary compatible. As a result, something like Google Test is pretty much out of the question.

On the other hand, I’d rather have people contribute projects tested with Google Test than either not have the corresponding projects contributed or not have them tested at all - both would be worse outcomes. Thus, the approach to testing will certainly be a guideline from which deviations are allowed. It is still worth outlining what is desirable from tests and what approaches can be used to ideally achieve all of it.

Desirable Features of Tests

The primary objective of tests is to improve confidence that the software tested is of good quality. There are various ways to achieve that and, depending on how it is done, different features can be achieved. Here are the features I consider worth having. This list is probably not complete and I guess I’ll remember an important one milliseconds after pressing the “Create Topic” button. It could be a starting point, though.

CMake Integration

It seems the direction for Beman projects is to build with CMake: whatever is used for testing should be able to integrate with CMake. That is:

  • Tests are rebuilt whenever the test or anything it depends on changes.
  • Tests are run as part of the build.
  • A test failure results in a build failure.

Testing Different Implementations

The components in Beman projects are supposed to live in a namespace like Beman::Something26. A future standard library implementation would likely live in namespace std or namespace std::something but certainly not in namespace Beman::Something26. If tests are implemented in terms of Beman::Something26 they’ll need to be rewritten on some level. This rewrite may be mostly mechanical but it may be easier to test projects using an indirection coming from a test configuration file:

// file: beman-test-config.hpp
namespace tn = Beman::Something26;

The actual tests would then use tn (or whatever name is deemed appropriate) to refer to the entities under test, using explicit qualification or a using directive. Anybody wanting to test their implementation of the same interface/specification in a different namespace just provides a different configuration header. A project-specific configuration header could declare similar aliases for packages it depends on and use the corresponding qualification where necessary, to also test with different combinations of dependencies:

namespace on = Beman::Optional26;

For dependencies a similar approach may even be reasonable in the actual implementation: an implementation initially using Beman::Optional26 can then easily be migrated to use the standard component once it becomes available.
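For illustration, a test written against such a configuration header might look like this (a minimal sketch; it assumes the component under test is an optional-like type reachable through the tn alias above):

// file: optional.pass.cpp (hypothetical test)
#include "beman-test-config.hpp"
#include <cassert>

int main() {
    tn::optional<int> o;    // Beman::Something26::optional today,
    assert(!o.has_value()); // std::optional with a different config header
    o = 42;
    assert(o.has_value() && *o == 42);
}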

Assert Only

When testing an actual standard library it shouldn’t depend on, well, a standard library. Using different build configurations immediately causes grief if there is an object file, as the build configurations may use different ABIs. Instead, a really basic test mechanism using, essentially, just <cassert>'s assert(condition) as the basis is preferable. However, ideally even that can be replaced: reporting test failures may require more than just bailing out at a specific place and relying on whatever information the assert() macro provides or a debugger, e.g., when testing on a free-standing platform (see below). If there is a configuration, it could also provide some tools which are a bit more fancy (the macro below is variadic because a plain assert(cond) parameter doesn’t cope with conditions containing unparenthesized commas, and that needs to be covered):

// file: beman-test-config.hpp
#if BEMAN_HAS_TEST_OVERRIDE
#    include "local-test-config.hpp"
#else
#    include <cassert>
     // variadic so that conditions containing unparenthesized commas
     // (e.g., from template arguments) expand correctly
#    define BEMAN_ASSERT(id, ...)  assert((__VA_ARGS__))
#endif
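For illustration, a hypothetical local-test-config.hpp could report failures and keep going instead of aborting - closer to what the free-standing scenario below needs (a sketch; all names are made up):

// file: local-test-config.hpp (hypothetical override)
#include <cstdio>

inline int beman_test_failures = 0;

// record and report a failure but keep running subsequent tests
#define BEMAN_ASSERT(id, ...)                                  \
    ((__VA_ARGS__) ? void()                                    \
                   : (void)(++beman_test_failures,             \
                            std::printf("FAIL %s: %s\n",       \
                                        #id, #__VA_ARGS__)))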

Free-Standing

At least some parts of the standard C++ library will target free-standing environments: I want my library to also run on an ESP32. I currently don’t have much experience, but it seems the workflow for embedded systems often is:

  1. Connect the device using a USB cable to a computer, possibly holding down a [tiny] button on the device, and transfer the program to a drive appearing in the file system.
  2. Disconnect the device and connect it to a power source to run the program. Maybe it can communicate results back using USB, WiFi, or Bluetooth, but it may also just blink an LED fast or slow to indicate success or failure.

Testing with such a process probably means that abort()ing on an assertion failure is really annoying, and you’d want to get information about multiple failures, at least. Also, running one program per test case is pretty much out of the question: if there are many test cases you’d ideally still run all of them. The implication is that it is probably desirable to build all tests into one executable and run all of them. On the other hand, on a hosted environment having many small programs may be easier to manage.

To support both use cases I could imagine some kind of test driver (lit?) which just overrides the name of the entry point appropriately and provides a main() calling into the entry point on a hosted environment. Maybe tests could even be written to be viable Google Tests without actually requiring the use of Google Test.
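A minimal sketch of that idea (BEMAN_TEST_ENTRY is an assumption, not an existing convention):

// hypothetical: on a hosted build the entry point simply stays main()
// and every test is its own program
#ifndef BEMAN_TEST_ENTRY
#    define BEMAN_TEST_ENTRY main
#endif

int BEMAN_TEST_ENTRY() {
    // ... test body using BEMAN_ASSERT(...) ...
    return 0;
}

For an embedded image the driver would instead compile each file with something like -DBEMAN_TEST_ENTRY=test_optional_assign, link all test files into one executable, and generate a main() calling every entry point in turn.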

Component-Level Coverage

A common approach for checking coverage is to run all tests of a project collecting coverage information, e.g., using gcov, and then to report the aggregated results. Doing so is certainly better than not considering coverage at all, but this kind of coverage report includes “accidental” coverage: any component used while testing another component may receive additional coverage. However, that use of the code isn’t necessarily tested at all! Thus, it is desirable to somehow declare which component is actually tested by a given test and to only aggregate coverage of a component from running its associated tests.

One approach to achieve that is implementing components in component-specific files and considering tests in files matching these names to be the tests of that component. A test driver may then need to carefully aggregate the coverage report. The result is hopefully higher confidence in the test results, assuming suitable overall coverage is achieved.

Standard I/O Channels

Tests ideally just evaluate code and don’t depend on anything external. However, some component constantly writing to standard output or standard error because some write(1, "I'm here\n", 9); statement wasn’t removed is still a problem (I’m sure nobody would ever do that, of course!). Testing should ideally be able to verify a program’s side effects. In a previous life I used dejagnu to test my implementation of standard library components: it does allow full control over the standard input, output, and error channels (I’m not recommending the use of dejagnu - in the nearly 30 years since then we have [hopefully] moved on).
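Absent such a driver, even a crude in-process check is conceivable (a sketch only; a driver-level check along the lines of dejagnu is much cleaner):

// hypothetical: redirect stdout into a file, exercise the component,
// then verify that nothing was written
#include <cstdio>

int main() {
    std::freopen("stdout.txt", "w", stdout);
    // ... exercise the component under test here ...
    std::fflush(stdout);

    std::FILE* f = std::fopen("stdout.txt", "r");
    char c;
    bool silent = f && std::fread(&c, 1, 1, f) == 0;
    if (f) std::fclose(f);
    return silent ? 0 : 1; // stdout is redirected: report via exit code
}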

Shared Source of Test Infrastructure

Ideally, the bulk of the testing infrastructure isn’t duplicated in every project. Instead, there should be one central place with hopefully easy-to-follow instructions for using tests in a Beman (and possibly other) project. Simply using Google Test achieves some of these objectives but certainly not all.

What To Use?

Having listed desirable features, I admit that I don’t have a ready solution addressing everything I want. I think I know how most of these things can be implemented and it doesn’t seem hard. For past implementations of standard library facilities I ran my own testing infrastructure, but I wouldn’t want to advertise any of it!

Should we create a Beman testing framework, or is there something that reasonably covers the features we agree are desirable?

1 Like

Louis Dionne made this comment describing Lit with clang-verify, which I believe he was advocating for at C++Now.

Having standard workflows for testing specific projects is the important part to me. One should not need to peruse README files to understand how (or even if!) a specific project provides regression testing.

The specific test driver technology is less important to me. I could see googletest working well in some cases and something like lit working better for others, perhaps even in the same project.

More important is what volunteers we get to set up automation workflows to support testing. And what volunteers we get to write and maintain the tests themselves.

Note that cmake actually doesn’t give you any of this. Neither ctest nor the test target that cmake generates in the build system depends on the builds. Workflow automation outside cmake needs to be set up.

The object file problem is why GoogleTest recommends vendoring it into your project, and discourages packaging it. Bloomberg does so, but we also control ABI strictly within a package distro, so we can get away with it.

Optional26 uses gtest because it’s what I’m most familiar with, it’s wired into my scratch template, and it mostly worked. However, testing the static_assert rendering of “Mandates” is a major problem for googletest, and probably for most testing frameworks. lit is probably the best in this space, but it’s very much unlike most xUnit-style frameworks.

I’m not entirely convinced that testing standard library level components should be in terms of even more primitive components. It’s a philosophical argument I’ve been having with Lakos for the last couple decades, and unlikely to be resolved any time soon.

re: indirection for tests -
I’ve done it with macros, but that’s awful: Compiler Explorer (near the top of the code). But it did make comparison a lot easier. Abusing #include to copy/paste tests might work too, now that compiler-explorer supports multi-file projects.
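A guess at what that #include abuse could look like (everything here is made up; optional-tests.inc would contain only the test statements, referring to the implementation as TEST_NS):

#include <cassert>
#include <optional>

#define TEST_NS Beman::Optional26
void test_beman() {
#include "optional-tests.inc" // e.g.: TEST_NS::optional<int> o; assert(!o);
}
#undef TEST_NS

#define TEST_NS std
void test_std() {
#include "optional-tests.inc"
}
#undef TEST_NS

int main() { test_beman(); test_std(); }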

The last release of dejagnu was 3 years ago. LLVM has some more modern tools here, but the space is … underserved?

Considering that we’re going to want to build a matrix of compilers, sanitizers, debug/release mode flags, fuzzers(?) we probably will have to rebuild any linked infrastructure every time. So we’re either vendoring into the build tree via git submodule or subtree, or just a commit, or doing a FetchContent of some kind, which is still per build, but a bit more hidden. We won’t be able to rely on any binary packaging system.

1 Like

Attempting to port Beman.Optional26 to use lit would be a great initial project for a new contributor.

Are there any time constraints on this? I’d like to start contributing, but I only have a limited amount of time I can spend on it.

I’m working on a writeup on Lit that concludes that its adoption probably isn’t ideal for us. I hope to have that done today, and it could help inform the next steps we’d take.

1 Like

It seems to be the policy of at least libc++ and libstdc++, so whatever its merits as a philosophical discussion, the reality is that the actual std::lib tests that might benefit from Beman tests do not (and will not) use gtest. That’s not likely to change soon either.

2 Likes

As someone who has used lit before, and wished it had better integration with CMake/CTest for familiarity and things like compile_commands.json integration, I was wondering if the main functionality could be implemented in a bit of plain CMake code with CTest as the test runner. This is the result: https://github.com/jiixyj/cmake-assert-tests. It supports .pass, .compile.pass, .compile.fail and .verify tests for now.

It supports both “merged” (with the code in src/) and “separated” (top-level tests/ folder) test placement. With a folder structure like this:

.
├── src
│   ├── CMakeLists.txt
│   ├── main.cpp
│   ├── testlib
│   │   ├── add.cpp
│   │   ├── add.h
│   │   ├── add.test
│   │   │   ├── add.pass.cpp
│   │   │   ├── constexpr_add.pass.cpp
│   │   │   ├── constructor.compile.fail.cpp
│   │   │   ├── constructor.compile.pass.cpp
│   │   │   └── constructor.verify.cpp
│   │   ├── fibonacci.h
│   │   └── fibonacci.test
│   │       └── fibonacci.pass.cpp
│   ├── testlib.h
│   └── testlib.test
│       └── include.compile.pass.cpp
└── tests
    ├── CMakeLists.txt
    └── testlib
        └── add
            └── add.pass.cpp

…you get those tests registered with CTest:

  Test #1: testlib:include.compile.pass
  Test #2: testlib.add:add.pass
  Test #3: testlib.add:constexpr_add.pass
  Test #4: testlib.add:constructor.compile.fail
  Test #5: testlib.add:constructor.compile.pass
  Test #6: testlib.add:constructor.verify
  Test #7: testlib.fibonacci:fibonacci.pass
  Test #8: testlib/add/add.pass

It’s just a PoC/experiment for now, but maybe something like this could fill the niche of “tests without a test framework” but without a “heavy” test runner. Of course, the danger is that it will over time morph into an ad hoc, informally-specified, bug-ridden, slow implementation of half of lit…

I am unsure if that is the best idea. Builds and rebuilds should be fast. We probably do not want to pay for the unit test run every time. Also, if some test is already failing (because of another bug), it might be hard to prove that our unrelated change compiles fine.

What we typically do is incorporate a test run into the CI runs, and also run tests while building a project or creating a package with package managers. See the following for implementation details:
mp-units/conanfile.py at ed9e67537bbfa5395975e6c9403c92eb7429edd0 · mpusz/mp-units · GitHub.

With the above, conan create and conan build will run unit tests and header standalone tests right after the CMake build. But when we rebuild the project in the IDE as a result of incremental feature development, we do not want to pay for tests to run every single time. Most IDEs allow running popular test frameworks as a separate step inside their GUI.

Of course, we can also run CTest to run all of the tests from the command line.

As I already mentioned header standalone tests, I would strongly recommend those for Beman projects as well. More can be found in the CMake documentation.
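In essence, such a test is a generated translation unit per header that includes nothing else, so a header that is not self-contained fails to compile (sketch; the header path is made up):

// hypothetical generated TU for a header standalone test: any missing
// includes inside the header surface as errors here
#include <beman/optional/optional.hpp>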

A static_assert() is a perfect test framework for compile-time-enabled libraries. If the project compiles, it is correct 🙂

I have thousands of static_asserts in my unit test files in mp-units. With them, I can test for observable runtime behavior and verify that the library generates proper types at compile time. I also get undefined behavior verification for free.
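A flavor of what such tests can look like (a sketch, not mp-units code): behavior and resulting types are both checked at compile time, and undefined behavior inside a constant expression is rejected by the compiler.

#include <concepts>

constexpr auto sum(auto a, auto b) { return a + b; }

static_assert(sum(1, 2) == 3);                              // observable behavior
static_assert(std::same_as<decltype(sum(1, 2.0)), double>); // resulting type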

Another thing to test in a modern library is negative tests for constraints: we need to throw some types that do not satisfy the concepts at the interface under test and ensure that the code does not compile. That is trivial to do with static_assert as well.
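For example (a sketch; twice is a stand-in for the constrained interface):

#include <concepts>

template <std::integral T>
T twice(T v) { return v + v; }

static_assert(requires(int i) { twice(i); });      // constraint satisfied
static_assert(!requires(double d) { twice(d); });  // correctly rejected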

Using alternative runtime frameworks for such libraries may not catch all the issues. For example, we recently had to remove inplace_vector from the plenary session because adding constexpr to some functions made the feature unimplementable. This would not have been caught with GTest.

Of course, this does not mean that we do not need runtime unit testing at all. Some libraries will not be fully compile-time enabled. Even for mp-units, I have to test std::format at runtime as we still can’t format text at compile time. Also, some math functions are not constexpr until C++26.
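For instance, the formatting output can only be checked at run time today (plain C++20):

#include <cassert>
#include <format>

int main() {
    assert(std::format("{:>6}", 42) == "    42");
}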

1 Like