Make Beman Packageable

Disclaimers

Apologies for CMake Mortals

Before I dig into the issue below, I want to apologize up front to anyone not deep into CMake features. There are a lot of arcane features at play here, which I do not like, but they are the options we have to choose from. Feel free to ask questions and I’ll do my best to provide references to CMake upstream docs and interpretations of them.

To be clear, I don’t use all these features, and I’ve been teaching myself about them quite a bit while working on the Problem Statement here.

CMake Upstream Needs Work

I think there are upstream CMake issues to sort out in all this, especially with respect to how one declares a dependency in CMakeLists.txt. I can start conversations in those contexts, especially the CMake Discourse, but I don’t think the Beman Project will want to wait around for those concerns to play out.

Problem Statement

To start, Beman projects should be fully packageable, full stop. If Beman projects cannot ship to Debian, Ubuntu, Arch Linux, Conan, Vcpkg, etc., it dramatically undermines a key goal of the whole endeavor: to get usage experience for these APIs before they are standardized.

Right now, the Beman Exemplar and possibly other Beman libraries are not packageable, but even more simply, they are not designed to consume their dependencies as packages. Instead, they are hardcoded to fetch dependencies, especially GoogleTest, from github.com and build them from source. This isn’t wrong, especially to get started, but it is limiting and causes problems.

For instance:

  • Users may want to use an existing copy of GoogleTest because:
    • Other dependencies require that version
    • They need to patch GoogleTest to work in their environment
    • They have the full GoogleTest source available as a package (this is a thing at least on Ubuntu)
  • Users may not want their build systems performing off-machine I/O because:
    • They are concerned about supply chain attacks and/or need to produce supply chain documentation such as Software Bills of Materials
    • They are concerned about reproducibility of builds
    • They’re coding on an airplane or train or otherwise have a bad ISP
  • Build pessimization
    • Speed - downloading and building GoogleTest from source is slower than linking against a prebuilt one
    • Reliability - a prebuilt GoogleTest has fewer moving parts – you’re not running through all the GoogleTest CMakeLists.txt in addition to the ones in the repo you’re building

What to do?

We have some options, unfortunately. There isn’t a clear consensus in the wider CMake world on which one to use.

Option 1: New option per library (per dependency?) such as -DBEMAN_EXEMPLAR_FETCH_GOOGLETEST

This would be an option() in CMakeLists.txt per library. It would be used in an if statement controlling whether to call FetchContent or find_package for GoogleTest.
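A minimal sketch of what Option 1 could look like, not taken from any Beman repository. The default value of the option, the GoogleTest tag, and the test target name in the trailing comment are assumptions made for illustration only.

```cmake
# Hypothetical fragment of a Beman library's CMakeLists.txt.
# Assumption: the option defaults to OFF so that packagers get find_package
# behavior without passing any extra flags.
option(BEMAN_EXEMPLAR_FETCH_GOOGLETEST
  "Fetch and build GoogleTest from source instead of calling find_package"
  OFF)

if(BEMAN_EXEMPLAR_FETCH_GOOGLETEST)
  include(FetchContent)
  FetchContent_Declare(
    googletest
    GIT_REPOSITORY https://github.com/google/googletest.git
    GIT_TAG v1.14.0 # assumed tag, chosen only for illustration
  )
  FetchContent_MakeAvailable(googletest)
else()
  # Packagers and users with an existing GoogleTest take this branch.
  find_package(GTest REQUIRED)
endif()

# Either branch is expected to provide the same target names, e.g.:
#   target_link_libraries(some_test PRIVATE GTest::gtest_main)
```

The cost of this design is that every library and every dependency grows its own switch, which is part of why the other options are on the table.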

Option 2: Always use FetchContent

This PR uses this approach. It calls only FetchContent in CMakeLists.txt and uses FETCHCONTENT_TRY_FIND_PACKAGE_MODE to redirect to find_package.

However, it seems blocked unless we work with upstream (GoogleTest and/or CMake) to fix name collisions inside the GoogleTest project on Windows when using FETCHCONTENT_TRY_FIND_PACKAGE_MODE.
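For readers unfamiliar with the mechanism, here is a rough sketch of the FetchContent-only shape, assuming CMake 3.24 or newer. It is not the code from the PR; the tag is an assumption, and the `FIND_PACKAGE_ARGS NAMES GTest` clause is there because the installed package is named `GTest` rather than `googletest`.

```cmake
# Hypothetical fragment; requires CMake >= 3.24 for FIND_PACKAGE_ARGS.
include(FetchContent)

FetchContent_Declare(
  googletest
  GIT_REPOSITORY https://github.com/google/googletest.git
  GIT_TAG v1.14.0 # assumed tag, chosen only for illustration
  # Lets FetchContent_MakeAvailable try find_package(GTest) first.
  FIND_PACKAGE_ARGS NAMES GTest
)

# Depending on FETCHCONTENT_TRY_FIND_PACKAGE_MODE (OPT_IN, ALWAYS, or NEVER),
# this either resolves to an installed GTest package or downloads and builds
# the sources declared above.
FetchContent_MakeAvailable(googletest)
```

A user who never wants the find_package redirection could configure with `-DFETCHCONTENT_TRY_FIND_PACKAGE_MODE=NEVER`, and one who always wants it tried could use `ALWAYS`.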

Option 3: Always use find_package

This PR uses this approach. It attempts not to break existing build-from-source use cases by leveraging standard CMake mechanisms to intercept find_package(GTest) and redirect it to equivalent dependency fetching using FetchContent.

Users and CI would pass -DCMAKE_PROJECT_TOP_LEVEL_INCLUDES=./cmake/use-fetch-content.cmake as needed, whenever they want to use FetchContent, git, etc. to build GoogleTest from source.

Use of a lockfile.json is demonstrated in the PR, though we could hardcode FetchContent calls in use-fetch-content.cmake if that’s preferred.
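To make the interception mechanism concrete, here is a simplified sketch of a dependency provider, assuming CMake 3.24 or newer. It is not the file from the PR: it hardcodes the FetchContent declaration instead of reading lockfile.json, and the macro name and tag are hypothetical.

```cmake
# Hypothetical, simplified cmake/use-fetch-content.cmake.
# Requires CMake >= 3.24 (dependency providers).
include(FetchContent)

FetchContent_Declare(
  googletest
  GIT_REPOSITORY https://github.com/google/googletest.git
  GIT_TAG v1.14.0 # assumed tag, chosen only for illustration
)

# Called by CMake for every find_package() in the project.
# A macro is used so that GTest_FOUND is set in the caller's scope.
macro(beman_provide_dependency method package_name)
  if("${package_name}" STREQUAL "GTest")
    FetchContent_MakeAvailable(googletest)
    # Tell find_package(GTest) that the request has been satisfied.
    set(GTest_FOUND TRUE)
  endif()
  # For any other package nothing is set, so CMake falls back to the
  # built-in find_package() behavior.
endmacro()

cmake_language(
  SET_DEPENDENCY_PROVIDER beman_provide_dependency
  SUPPORTED_METHODS FIND_PACKAGE
)
```

With a file like this, CMakeLists.txt only ever says find_package(GTest REQUIRED). The provider is active only when the file is named in CMAKE_PROJECT_TOP_LEVEL_INCLUDES, so packagers who omit the flag get an ordinary find_package lookup.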

Option 4: Develop a Better API

I’m mostly throwing this in for completeness. It’s a great design, but innovating in build tooling is pretty tangential to the end goals of the Beman project. If you like this option, it’s probably because none of the options above really interest you, to the point that you don’t care about helping CMake broadly decide how things should work with the current options.

xkcd: Standards, but at $dayjob, we actually don’t do any of the above. What we do looks a lot like Option 3 involving an abstract dependency provider API, but we use the target names instead of the find_package package names. Also we can do them in a batch command:

require_targets(
  GTest::main
  Beman::Optional
)
# ... later ...
target_link_libraries(some-application-test PRIVATE
  GTest::main
  Beman::Optional
)

If we’re really excited about this approach, I could implement a new pass at this as an open source CMake module in a week or two. It would interoperate fine with find_package and FetchContent because target names are first-class citizens in CMake.
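As a very rough illustration of the interface only, a require_targets command could start out as a plain assertion that the named targets exist. This is a hypothetical sketch, not the $dayjob implementation, which would also consult a dependency provider to resolve missing targets rather than just failing.

```cmake
# Hypothetical sketch: assert that every named target is already defined.
# A fuller implementation would try to resolve missing targets before failing.
function(require_targets)
  set(missing "")
  foreach(target IN LISTS ARGN)
    if(NOT TARGET "${target}")
      list(APPEND missing "${target}")
    endif()
  endforeach()

  if(NOT "${missing}" STREQUAL "")
    list(JOIN missing ", " missing_text)
    message(FATAL_ERROR "Required targets are not available: ${missing_text}")
  endif()
endfunction()
```

Because it keys on target names, a check like this does not care whether a target came from find_package, FetchContent, or add_subdirectory.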

I’m talking to Ben Boeckel about this on CppLang Slack. Without weighing in on specifics, he recommends we use find_package(GTest) as a rule.

If I had to choose, I would go with option 3 since find_package(GTest) is the norm. Also, having a fallback to fetch and build from source isn’t a bad choice. However, if I am not mistaken, the proposed way is to provide a toolchain file in the configuration step. This file is usually generated by the package manager. The concept is that you consume packages the way you want your package to be consumed! That way you create a consistent user experience. So, I think failing to find GTest if a toolchain file is not provided is not a bad thing at all. CMake should express “what” is needed while the package manager should tell “where” and “how” to obtain it. If presets are used, the toolchain file can be provided to a base preset that is inherited by other presets. That way it is not a burden during development, and the three problems are (or should be) solved by the package manager.

Just to keep the conversation in one spot, @river expressed some concerns about the complexity of the PR in Option 3. I think that’s a fair description of using FetchContent in general. It is a very simple dependency management system, and we’d need to innovate in different ways to fill in feature gaps like portability to packaging systems.

I replied that there is a variant of Option 3 in which we prioritize initial Conan or vcpkg support. I would expect that to work reasonably well on all platforms, with a fairly optimal balance of up-front implementation cost and ongoing maintenance cost.

I could make a third PR demonstrating one or both of those approaches in a couple hours. Later. Let me know if that’s interesting, but I expect it would mostly look like (1) removing the cmake/use-fetch-content.cmake, (2) removing the lockfile.json, (3) adding at least one of conanfile.txt and vcpkg.json, and (4) updating the README.md to provide new instructions for users who might not be familiar with Conan or vcpkg.

EDIT: Of course, someone else feel free to beat me to making that PR.

At the moment I’m persuaded that option 3 is the best of the bad options. I’m not a wizard at cmake or package management, but I think the refactoring of fetch content into its own .cmake makes it easier to see the specifics of that part of any library’s approach. We do something similar with conan rules at the day job in combo with cmake. I understand @river’s concerns. However, this is only going to get more complicated as real conan and vcpkg support gets added, unless something in the world fundamentally changes…

note: I moved the thread to Beman Project Development category bc it’s not about a specific Beman library.

This cannot be stressed hard enough! This is the preferred approach in my opinion: All projects should simply assert that their dependencies are correctly resolved. How they are resolved in the end is completely orthogonal.

As a library author, it is important to keep in mind that there are multiple ways your clients will want to use your library:

  • Install the library from a system package and use find_package or pkg-config.
  • Vendor the code as subtree and use add_subdirectory.
  • Fetch the code with FetchContent, from a git repository or a tarball.
  • Use a non-public package manager.

It is your responsibility to ensure that your library supports all of the above. That means you should:

  • Provide both awesome_lib.pc and AwesomeLibConfig.cmake.
  • Make sure that the build interface matches the install interface of your library. If the installed library is supposed to be used as beman::awesome_lib, make sure to define an alias with the same name.
  • Make sure to respect build options set by a super-project. Don’t override compile flags etc. If possible, don’t make a decision whether your libraries are static or dynamic.
  • Don’t produce any clutter. Make sure to prefix project specific cmake options so that they can be easily grouped in the cache editor. Don’t print extensive output like a configuration summary.
  • Hardcode the version information in the project() command! Don’t execute git during the build in order to retrieve the version (the git history is not available in the tarball).
  • Make no assumptions about the build environment. The source directory is not guaranteed to be writable. The build directory is not guaranteed to be a subdirectory of the source. Network connectivity may be restricted.

Most importantly: Make no assumption how your clients will resolve the dependencies of your library. Give them all the information they need, but don’t constrain their options.
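Several of the points above can be illustrated in one short CMakeLists.txt. This is a sketch under stated assumptions, not a complete packaging setup: the project name, version, option name, and source path are hypothetical, and generating awesome_lib.pc and the package config file is left out.

```cmake
cmake_minimum_required(VERSION 3.21)

# Version is hardcoded here, not derived from git at configure time.
project(beman_awesome_lib VERSION 0.1.0 LANGUAGES CXX)

include(GNUInstallDirs)

# Project-specific options carry a common prefix so they group together
# in the cache editor. Tests default to ON only for top-level builds.
option(BEMAN_AWESOME_LIB_BUILD_TESTS "Build the tests" ${PROJECT_IS_TOP_LEVEL})

# No STATIC or SHARED keyword: the super-project decides via BUILD_SHARED_LIBS.
add_library(beman_awesome_lib src/awesome_lib.cpp) # hypothetical source file
target_include_directories(beman_awesome_lib PUBLIC
  $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include>
  $<INSTALL_INTERFACE:${CMAKE_INSTALL_INCLUDEDIR}>)

# Build interface matches install interface: both expose beman::awesome_lib.
add_library(beman::awesome_lib ALIAS beman_awesome_lib)
set_target_properties(beman_awesome_lib PROPERTIES EXPORT_NAME awesome_lib)

install(TARGETS beman_awesome_lib EXPORT beman_awesome_lib-targets)
install(EXPORT beman_awesome_lib-targets
  NAMESPACE beman::
  DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/beman_awesome_lib)

if(BEMAN_AWESOME_LIB_BUILD_TESTS)
  enable_testing()
  # State what is needed; how it gets resolved is left to the client.
  find_package(GTest REQUIRED)
endif()
```

A client that vendors the code with add_subdirectory and a client that installs it and calls find_package can then both link against beman::awesome_lib without changing their own CMake code.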

Hi everyone, I know I’m new to the project so I don’t want to impose, but @river suggested I might offer an opinion here. Looking over the options, I think I agree w/@Jeff-Garland that Option 3 looks pretty good. As others have said, find_package is the canonical way for a library to declare a dependency. It’s up to the person building the library to provide the dependency, and we shouldn’t make any assumptions on their behalf. To that end, providing an opt-in dependency provider that forwards find_package calls to FetchContent, as @bretbrownjr has proposed, seems like a good compromise between convenience and flexibility.