Considering Lit for the Beman Project

Abstract

The LLVM Integrated Tester (lit) is being proposed as the recommended test framework for Beman Projects. While it supports compilation failure testing, it has limited usage experience, introduces complexity, and has ergonomics challenges. We instead suggest recommending a more popular framework, such as GTest, supplemented with CMake functions adding negative compilation testing capabilities.

Link to full doc…

That sounds like the right direction, thank you for exploring that.
As discussed during the meeting, for a simple go-to solution we could recommend “doctest” (a somewhat improved version of gtest):

I agree with the findings in the paper. Seems like there’s no one-size-fits-all testing solution. Having gtest, catch2, or doctest as a default recommendation sounds sensible. In cases where these are not an option (testing standard lib) or are missing features (assert compile failure) authors can go for a more suitable solution.

I don’t have any experience with doctest. I’ve used gtest and catch2 and was happy with both. I think catch2 was a bit easier to use compared to gtest and I like the expressiveness of the Given-When-Then-style tests in catch2.

I’m fine with the conclusion of not using lit. The Python dependency and the complexity of deploying lit are real issues.

However, I’m not sure about recommending tools like GTest since that will make it more difficult for standard library vendors to adopt Beman tests. None of the big three vendors use a testing framework like GTest, catch2, or doctest. Instead, they have a bunch of TUs, each with their own main. If Beman tests follow that pattern, then it is easy to move the tests into each respective STL vendor’s test suite. Those frameworks are all great for most library uses, but they are less great if the goal is to get into a standard library. I’ll also note that using one of the big test frameworks will end up as a barrier for freestanding testing.

My suggestion would be main tests, orchestrated by CMake code. Much of that work would need to be done for negative compilation testing anyway, so making it work for the positive testing too doesn’t seem like much of a stretch.

I’ll note that lit is used by all of clang / llvm for testing, and not just the libc++ portion of the project. That demonstrates more of the flexibility of the framework. A long time ago (2015) I was able to use lit to cross-compile libc++ tests on my host machine, then run the tests on a simulator.

One of the benefits of lit that wasn’t mentioned was in-test build configuration control. The test framework can attempt to build and run each test under multiple configurations, and the tests can indicate whether they support that configuration or not. In libc++, this most commonly manifests as saying that a test only supports certain standards, but other features can be tested for (example). I can imagine Beman tests needing to annotate that they are expected to fail on certain implementations.

I do feel that the adoption and documentation roadblock points are a bit overblown. If anything, the concern is in the other direction (i.e. the test frameworks are harder to understand). With lit tests (or other main tests), you write a main, add some asserts, and you are done (example). With the frameworks, you need to see how this framework defines its tests, how this framework spells its asserts and expectations, and so on.


Don’t those kinds of CMake functions (such as icm) have even less usage experience than lit, which is used by all of LLVM and not just by libc++ as stated in the doc?

icm’s build failure testing capability certainly has less usage experience than lit’s. A GitHub code search revealed only a couple of open source projects using it. It was interesting to see NVIDIA/stdexec as one of those.

By the way, I updated the document to clarify that all of LLVM uses lit and that icm has had little uptake in the open source world.


In my opinion, header-only libraries should have the option to be tested with header-only test frameworks.
Those frameworks should also generate almost no warnings when included.

That limits the possibilities a lot; as far as I know, GTest and Catch2 are not header-only.

Doctest is header-only (though I have never used it), and so are Boost.LightweightTest and mu-t (which seems experimental).
Boost.Test can be included as header-only, but the compile times are abysmal and it generates lots of warnings.

I’m not sure about other possibilities, but this is something to take into account for header-only proposals, which I expect to be the common case.

I’m not terribly concerned about the Python dependency, as Python is easily available and has lots of tools for managing projects. See the other thread on pre-commit. I should have a draft PR soon for example and optional.
Lit, on the other hand, doesn’t seem to be packaged, so it would be very complicated to get working portably, unless I’ve missed something. An LLVM development install is a huge ask.

However, a lot of tests are going to look just like the source for lit tests. Making the negative tests (the failure-to-compile tests) more reusable should be a goal. That’s outside what xUnit-style frameworks like gtest and catch2 do, so we’re going to have to do something different in any case.

Welcome @correaa!

Could you elaborate a bit on the use case you have in mind? People using header-only libraries I know usually paste the headers in their repository somewhere. Will your users be using CMake? How would they build the tests?

This is a good point. One solution would be to forgo the unit test framework niceties and stick with assert. Although we wouldn’t be using lit, I expect tests written that way to be extremely portable.

Yes, what I am saying is that, in general, for someone developing a header-only library, it is unreasonable to also require (or strongly suggest) a test framework that needs compilation or binaries.

This is my experience:
I developed a Boost-like library (an array library, to be specific) that is header-only.
I started using Catch2 when it was header-only and all was fine.
Then Catch2 changed in a way that needed compilation.

Given the complexity, and since this was going to be a Boost library anyway, I changed to Boost.Test, which also needed compilation.
This worked well for a while, because Boost.Test comes precompiled on many systems and it is as nice as Catch2.

However, this became a burden when I needed to test the library against mildly exotic systems: 32-bit systems, Apple M-series systems, and Windows (MSVC, clang, gcc).
In these cases, preparing Boost.Test for these systems required compiling Boost.Test (or, in the case of Windows, downloading large binaries), which was too heavy for CI.

Boost.Test can be used in header-only mode, but it is not designed for that, so compilation times were extremely long because I have a few dozen cpp files in the test.

In the end, I recently switched to Boost.LightweightTest (part of Boost.Core), which doesn’t have many features but is very lightweight, and its complexity is proportional to the library I am trying to test (it is not overkill).

I am not suggesting B.LWT specifically; I am trying to communicate the value of having the option of a lightweight, header-only framework for a small, header-only library, which I guess is going to be a common case for Beman.

For large projects, I guess it is justified to use compiled test frameworks, because the upfront cost is amortized by the complexity and compilation times of the large projects. They also have more features, like tests for thrown exceptions and nicer diagnostics.

I don’t know what the best way to proceed is. Complex libraries will benefit from feature-rich frameworks like GTest, but I have the impression, from experience, that for the majority it will add unnecessary complexity.
At the same time, proposing (as a template) two different test frameworks, one header-only and one compiled, can be beneficial in this sense, but can also be confusing.

I just learned from a comment in this thread that doctest is (or can be used as) header-only, so it could be what we are looking for. (Just keep an eye on compilation times if one has many individual cpp tests.)

Regarding your comment: “People using header-only libraries I know usually paste the headers in their repository somewhere.” I think what I am saying is independent of this use pattern. In any case, a header-only library can still benefit from using CMake (I do it this way, but if someone still wants to use it by copying the headers to a directory, that is fine with me).

I’m late to the comment chain here, but we don’t seem to have come to any resolution. I agree with @correaa here that most of the test frameworks are way too heavy. For boost.datetime I wrote a framework of fewer than 100 SLOC that did all I needed. It uses real functions instead of macros because I hate debugging macros. It’s not really suitable, but it’s an example of the sort of thing a library author might resort to if we don’t have a recommendation on a header-only framework.

boost.lwt looks pretty good (except for the macros), and is obviously way more comprehensive.

Professionally, we’ve switched to using boost.ut. It’s macro-free (yay!), but it does require C++20. You can read the docs for yourself, but it scales up from ‘very simple’ to extremely feature-rich. We’ve encountered zero issues with it. It would be nice if Kris would actually put it into Boost, but I can understand his reluctance to bother at this point.
