Make Beman Packageable

Disclaimers

Apologies for CMake Mortals

Before I dig into the issue below, I want to apologize up front to anyone not deep into CMake features. There are a lot of arcane features at play here, which I do not like, but they are the options we have to choose from. Feel free to ask questions and I’ll do my best to provide references to CMake upstream docs and interpretations of them.

To be clear, I don’t use all these features, and I’ve been teaching myself about them quite a bit while working on the Problem Statement here.

CMake Upstream Needs Work

I think there are upstream CMake issues to sort out in all this, especially with respect to how one declares a dependency in CMakeLists.txt. I can start conversations in those contexts, especially the CMake Discourse, but I don’t think the Beman Project will want to wait around for those concerns to play out.

Problem Statement

To start, Beman projects should be fully packageable, full stop. If Beman projects are not able to ship to Debian, Ubuntu, Arch Linux, Conan, Vcpkg, etc., it dramatically undermines a key goal of the whole endeavor: to get usage experience for these APIs before they are standardized.

Right now, the Beman Exemplar and possibly other Beman libraries are not packageable, but even more simply, they are not designed to consume their dependencies as packages. Instead, they are hardcoded to fetch dependencies, especially GoogleTest, from github.com and build them from source. This isn’t wrong, especially to get started, but it is limiting and causes problems.

For instance:

  • Users may want to use an existing copy of GoogleTest because:
    • Other dependencies require that version
    • They need to patch GoogleTest to work in their environment
    • They have the full GoogleTest source available as a package (this is a thing at least on Ubuntu)
  • Users may not want their build systems performing off-machine I/O because:
    • They are concerned about supply chain attacks and/or need to produce supply chain documentation such as Software Bills of Materials
    • They are concerned about reproducibility of builds
    • They’re coding on an airplane or train or otherwise have a bad ISP
  • Build pessimization
    • Speed - downloading and building GoogleTest from source is slower than linking against a prebuilt one
    • Reliability - a prebuilt GoogleTest has fewer moving parts – you’re not running through all the GoogleTest CMakeLists.txt in addition to the ones in the repo you’re building

What to do?

We have some options, unfortunately. There isn’t a clear consensus in the wider CMake world.

Option 1: New option per library (per dependency?) such as -DBEMAN_EXEMPLAR_FETCH_GOOGLETEST

This would be an option() in CMakeLists.txt per library. It would be used in an if statement controlling whether to call FetchContent or find_package for GoogleTest.
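A minimal sketch of what Option 1 could look like, assuming the option name from the heading above. The default value, the project name, and the pinned GoogleTest tag are illustrative assumptions, not what Exemplar actually does.

```cmake
cmake_minimum_required(VERSION 3.14)
project(option1_sketch LANGUAGES CXX)

# Option name comes from the heading above; the OFF default is an assumption.
option(BEMAN_EXEMPLAR_FETCH_GOOGLETEST
       "Download and build GoogleTest instead of using an installed package" OFF)

if(BEMAN_EXEMPLAR_FETCH_GOOGLETEST)
  # Build-from-source path: requires network access at configure time.
  include(FetchContent)
  FetchContent_Declare(
    googletest
    GIT_REPOSITORY https://github.com/google/googletest.git
    GIT_TAG v1.14.0 # illustrative pin, not a recommendation
  )
  FetchContent_MakeAvailable(googletest)
else()
  # Packaged path: GoogleTest must already be discoverable on this machine.
  find_package(GTest REQUIRED)
endif()

# Both branches are expected to provide the GTest::gtest_main target.
```

The cost of this approach is that every library grows one such switch per dependency, and each consumer has to learn the switch names.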

Option 2: Always use FetchContent

This PR uses this approach. It calls only FetchContent in CMakeLists.txt and uses FETCHCONTENT_TRY_FIND_PACKAGE_MODE to redirect to find_package.

However, it seems blocked unless we work with upstream (GoogleTest and/or CMake) to fix name collisions inside the GoogleTest project on Windows when using FETCHCONTENT_TRY_FIND_PACKAGE_MODE.
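For readers unfamiliar with this mechanism, here is a rough sketch of Option 2, assuming CMake 3.24 or newer. The project name and the GoogleTest tag are illustrative, and the linked PR may be structured differently.

```cmake
cmake_minimum_required(VERSION 3.24)
project(option2_sketch LANGUAGES CXX)

include(FetchContent)
FetchContent_Declare(
  googletest
  GIT_REPOSITORY https://github.com/google/googletest.git
  GIT_TAG v1.14.0 # illustrative pin
  # Opt in to trying find_package() first, searching under the name GTest.
  # FIND_PACKAGE_ARGS has to be the last keyword in the declaration.
  FIND_PACKAGE_ARGS NAMES GTest
)

# Uses an installed GoogleTest if find_package() succeeds,
# otherwise downloads and builds the declared sources.
FetchContent_MakeAvailable(googletest)
```

The person running the build can then steer the behavior from the command line: -DFETCHCONTENT_TRY_FIND_PACKAGE_MODE=NEVER forces the from-source path, while ALWAYS tries find_package even for declarations that did not opt in with FIND_PACKAGE_ARGS.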

Option 3: Always use find_package

This PR uses this approach. It attempts to not break existing build-from-source use cases by leveraging standard CMake mechanisms to intercept find_package(GTest) and redirect to equivalent dependency fetching using FetchContent.

Users and CI would pass -DCMAKE_PROJECT_TOP_LEVEL_INCLUDES=./cmake/use-fetch-content.cmake as needed, when they want to use FetchContent, git, etc. to build GoogleTest from source.

Use of a lockfile.json is demonstrated in the PR, though we could hardcode FetchContent calls in use-fetch-content.cmake if that’s preferred.
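To make the interception mechanism concrete, here is a sketch of the hardcoded variant of use-fetch-content.cmake, built on CMake's dependency provider feature (3.24+). The macro name and the pinned tag are hypothetical, and the real file in the PR reads its data from lockfile.json instead.

```cmake
# cmake/use-fetch-content.cmake (sketch, requires CMake 3.24 or newer)
# Must be loaded through CMAKE_PROJECT_TOP_LEVEL_INCLUDES, because a
# dependency provider can only be set during the first project() call.
include(FetchContent)

# Hypothetical provider. A macro (not a function) is used so that
# GTest_FOUND is set in the scope of the find_package() caller.
macro(sketch_provide_dependency method package_name)
  if("${package_name}" STREQUAL "GTest")
    FetchContent_Declare(
      googletest
      GIT_REPOSITORY https://github.com/google/googletest.git
      GIT_TAG v1.14.0 # illustrative pin
    )
    FetchContent_MakeAvailable(googletest)
    # Tell find_package() that the request has been fulfilled.
    set(GTest_FOUND TRUE)
  endif()
  # Any other package is left unfulfilled, so find_package()
  # falls back to its normal search.
endmacro()

cmake_language(
  SET_DEPENDENCY_PROVIDER sketch_provide_dependency
  SUPPORTED_METHODS FIND_PACKAGE
)
```

With this in place, CMakeLists.txt only ever says find_package(GTest REQUIRED). Whether that resolves to a system package or to a source build is decided by whoever invokes cmake.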

Option 4: Develop a Better API

I’m mostly throwing this in for completeness. It’s a great design, but innovating in build tooling is pretty tangential to the end goals of the Beman project. If you like this option, it’s probably because none of the options above interest you, to the point that you don’t care about helping the broader CMake community decide how things should work with the current options.

xkcd: Standards, but at $dayjob, we actually don’t do any of the above. What we do looks a lot like Option 3 involving an abstract dependency provider API, but we use the target names instead of the find_package package names. Also we can do them in a batch command:

require_targets(
  GTest::main
  Beman::Optional
)
# ... later ...
target_link_libraries(some-application-test PRIVATE
  GTest::main
  Beman::Optional
)

If we’re really excited about this approach, I could implement a new pass at this as an open source CMake module in a week or two. It would interoperate fine with find_package and FetchContent because target names are first-class citizens in CMake.

I’m talking to Ben Boeckel about this on CppLang Slack. Without weighing in on specifics, he recommends we use find_package(GTest) as a rule.

If I had to choose, I would go with Option 3, since find_package(GTest) is the norm. Also, having a fallback to fetch and build from source isn’t a bad choice. However, if I am not mistaken, the proposed way is to provide a toolchain file in the configuration step. This file is usually generated by the package manager. The concept is that you consume packages the way you want your package to be consumed! That way you create a consistent user experience. So I think failing to find GTest when a toolchain file is not provided is not a bad thing at all. CMake should express “what” is needed, while the package manager should tell “where” and “how” to obtain it. If presets are used, the toolchain file can be provided to a base preset that is inherited by the other presets. That way it is not a burden during development, and the three problems should be solved by the package manager.

Just to keep the conversation in one spot, @river expressed some concerns about the complexity of the PR in Option 3. I think that’s a fair description of using FetchContent in general. It is a very simple dependency management system, and we’d need to innovate in different ways to fill in feature gaps like portability to packaging systems.

I replied that there is a variant of Option 3 in which we prioritize initial Conan or vcpkg support. I would expect that to work reasonably well on all platforms, with a fairly optimal amount of up-front implementation cost and ongoing maintenance cost.

I could make a third PR demonstrating one or both of those approaches in a couple hours. Later. Let me know if that’s interesting, but I expect it would mostly look like (1) removing the cmake/use-fetch-content.cmake, (2) removing the lockfile.json, (3) adding at least one of conanfile.txt and vcpkg.json, and (4) updating the README.md to provide new instructions for users who might not be familiar with Conan or vcpkg.

EDIT: Of course, someone else feel free to beat me to making that PR.

At the moment I’m persuaded that Option 3 is the best of the bad options. I’m not a wizard at CMake or package management, but I think the refactoring of FetchContent into its own .cmake file makes it easier to see the specifics of that part of any library’s approach. We do something similar with Conan rules at the day job in combo with CMake. I understand @river’s concerns. However, this is only going to get more complicated as real Conan and vcpkg support gets added, unless something in the world fundamentally changes…

note: I moved the thread to Beman Project Development category bc it’s not about a specific Beman library.

This cannot be stressed hard enough! This is the preferred approach in my opinion: All projects should simply assert that their dependencies are correctly resolved. How they are resolved in the end is completely orthogonal.

As a library author, it is important to keep in mind that there are multiple ways your clients will want to use your library:

  • Install the library from a system package and use find_package or pkg-config.
  • Vendor the code as subtree and use add_subdirectory.
  • Fetch the code with FetchContent, from a git repository or a tar ball.
  • Use a non-public package manager.

It is your responsibility to ensure that your library supports all of the above. That means you should:

  • Provide both awesome_lib.pc and AwesomeLibConfig.cmake.
  • Make sure that the build interface matches the install interface of your library. If the installed library is supposed to be used as beman::awesome_lib, make sure to define an alias with the same name.
  • Make sure to respect build options set by a super-project. Don’t override compile flags etc. If possible, don’t make a decision whether your libraries are static or dynamic.
  • Don’t produce any clutter. Make sure to prefix project specific cmake options so that they can be easily grouped in the cache editor. Don’t print extensive output like a configuration summary.
  • Hardcode the version information in the project() command! Don’t execute git during the build in order to retrieve the version (the git history is not available in the tar ball).
  • Make no assumptions about the build environment. The source directory is not guaranteed to be writable. The build directory is not guaranteed to be a subdirectory of the source. Network connectivity may be restricted.

Most importantly: Make no assumption how your clients will resolve the dependencies of your library. Give them all the information they need, but don’t constrain their options.
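Several of these points can be sketched in one small CMakeLists.txt. This is an illustration only: the version number, the src/ and include/ layout, and the install destinations are assumptions, and the pkg-config file is left out for brevity.

```cmake
cmake_minimum_required(VERSION 3.14)
# Version is hardcoded here, not derived from git.
project(awesome_lib VERSION 1.2.3 LANGUAGES CXX)

include(GNUInstallDirs)

# No STATIC or SHARED keyword: the consumer decides via BUILD_SHARED_LIBS.
add_library(awesome_lib src/awesome_lib.cpp)

# The alias makes the in-tree name match the installed name, so
# add_subdirectory/FetchContent users and find_package users link the same way.
add_library(beman::awesome_lib ALIAS awesome_lib)

target_include_directories(awesome_lib PUBLIC
  $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include>
  $<INSTALL_INTERFACE:${CMAKE_INSTALL_INCLUDEDIR}>)

install(TARGETS awesome_lib EXPORT awesome_lib-targets)
install(DIRECTORY include/ DESTINATION ${CMAKE_INSTALL_INCLUDEDIR})

# Exporting straight to AwesomeLibConfig.cmake is enough for a library
# without dependencies; NAMESPACE yields the target beman::awesome_lib.
install(EXPORT awesome_lib-targets
  FILE AwesomeLibConfig.cmake
  NAMESPACE beman::
  DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/AwesomeLib)
```

A consumer can then write target_link_libraries(app PRIVATE beman::awesome_lib) regardless of whether the library was vendored, fetched, or installed.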

Hi everyone, I know I’m new to the project so I don’t want to impose, but @river suggested I might offer an opinion here. Looking over the options, I think I agree w/@Jeff-Garland that Option 3 looks pretty good. As others have said, find_package is the canonical way for a library to declare a dependency. It’s up to the person building the library to provide the dependency, and we shouldn’t make any assumptions on their behalf. To that end, providing an opt-in dependency provider that forwards find_package calls to FetchContent, as @bretbrownjr has proposed, seems like a good compromise between convenience and flexibility.

I’m going to reply to things as I get through them.

The best way to do this is to make a Unix-style install. “Everything” can consume them and there doesn’t need to be any special logic in the project.

The best way to do this is to use find_package for dependencies. Packaging systems can intercept them using dependency providers to wire things up as they need.

One may use vendored versions, but it is best to also support consuming things via external sources (see VTK’s vtk_module_third_party support for an example framework for providing internal targets that bridge to either vendored or external suppliers of the dependency in question).

Agreed. My preferred way these days is to have user options to enable “user-facing” features (e.g., MPI support) and then to assert dependencies based on those flags. Naming them on the dependencies directly can be…confusing if there are intermingled dependencies involved. Best to just name them after what the user is actually interested in and wire things up internally. I have come to loathe the “let’s inspect the system for X, Y, and Z and change behavior based on their ambient discoverability state” as it just exacerbates the “but those configure flags work on my machine” problems.
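As a rough illustration of that pattern, using the MPI example from the paragraph above: the option is named after the feature, and the dependency is asserted only when the feature is requested. The project, option, source file, and compile definition names are all hypothetical.

```cmake
cmake_minimum_required(VERSION 3.14)
project(feature_flag_sketch LANGUAGES CXX)

# Named after what the user wants (MPI support), not after a package.
option(AWESOME_LIB_ENABLE_MPI "Build the MPI-backed parts of the library" OFF)

add_library(awesome_lib src/awesome_lib.cpp)

if(AWESOME_LIB_ENABLE_MPI)
  # The user asked for the feature, so a missing MPI is a hard error
  # rather than a silent change in behavior.
  find_package(MPI REQUIRED)
  target_link_libraries(awesome_lib PUBLIC MPI::MPI_CXX)
  target_compile_definitions(awesome_lib PUBLIC AWESOME_LIB_HAS_MPI=1)
endif()
```

The same command line then yields the same configuration on every machine, or fails loudly, instead of depending on what happens to be installed.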

I agree with all of these things, but will add notes to some.

Beware of this issue where property manipulation needs to happen on the “real” target, not the alias target. I’d love to get it fixed some day…
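A small sketch of one way to work around this, assuming the issue in question is the restriction that commands which modify a target reject alias names. The target names are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.14)
project(alias_sketch LANGUAGES CXX)

add_library(awesome_lib INTERFACE)
add_library(beman::awesome_lib ALIAS awesome_lib)

# Calling target_compile_features(beman::awesome_lib ...) would be an error,
# so resolve the alias first. ALIASED_TARGET is unset for non-alias targets,
# in which case the result is a NOTFOUND value that evaluates as false.
get_target_property(real_target beman::awesome_lib ALIASED_TARGET)
if(NOT real_target)
  set(real_target beman::awesome_lib)
endif()

# Modify the real target; the alias reflects the change.
target_compile_features(${real_target} INTERFACE cxx_std_17)
```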

One can always add to them though (e.g., by using a “linked everywhere” internal target that provides consistent warning flags in addition to anything the user ends up requesting).
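One possible shape for such a “linked everywhere” target is sketched below. The target name, source file, and the specific warning flags are assumptions chosen for illustration.

```cmake
cmake_minimum_required(VERSION 3.15)
project(warnings_sketch LANGUAGES CXX)

# Hypothetical internal target that only carries build flags.
add_library(awesome_lib_build_flags INTERFACE)
target_compile_options(awesome_lib_build_flags INTERFACE
  "$<$<CXX_COMPILER_ID:GNU,Clang,AppleClang>:-Wall;-Wextra>"
  "$<$<CXX_COMPILER_ID:MSVC>:/W4>")

add_library(awesome_lib src/awesome_lib.cpp)

# These flags are added to whatever CMAKE_CXX_FLAGS the user or a
# super-project provided, they do not replace them. BUILD_INTERFACE keeps
# the flags target out of the library's installed link interface.
target_link_libraries(awesome_lib
  PRIVATE $<BUILD_INTERFACE:awesome_lib_build_flags>)
```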

git executions are indeed less than ideal, but file(READ "version.txt" version) isn’t the end of the world.
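A minimal sketch of the version.txt approach, assuming a file next to CMakeLists.txt that holds a single line such as 1.2.3. The file name comes from the sentence above; the project name is hypothetical.

```cmake
cmake_minimum_required(VERSION 3.14)

# version.txt ships in the tarball, so no git history is needed.
file(READ "${CMAKE_CURRENT_LIST_DIR}/version.txt" version)
# Drop the trailing newline before handing the value to project().
string(STRIP "${version}" version)

project(awesome_lib VERSION ${version} LANGUAGES CXX)
message(STATUS "Configuring awesome_lib ${PROJECT_VERSION}")
```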

Update

I know we’ve forked the deliberations and discussions out into a few different contexts. I’ll post updates here so everyone can follow what’s happening. The bottom line is that we’re making progress.

The Beman Standard replaced its [CMAKE.USE_FETCH_CONTENT] recommendation with a [CMAKE.USE_FIND_PACKAGE] recommendation. That means that all Beman projects are encouraged to use find_package in a similar way to how Exemplar does. This should match most CMake tutorials you will encounter on the web, and it’s in line with the advice that @ben.boeckel contributed to this discussion thread.

I’ve published Release Build System Improvements · bemanproject/exemplar as a new release of Beman Exemplar. The release notes are editable, so feel free to make suggestions for improvements if you have any.

@river already created an issue that can serve to track next steps for Beman packaging: Show that exemplar can be used as a packaged dependency · Issue #107 · bemanproject/exemplar

Note that I’ve made two sub-issues on River’s issue, each with its own sub-issue. That is to track separate features for Conan and vcpkg support.

I created Support Conan dependencies by bretbrownjr · Pull Request #136 · bemanproject/exemplar to add initial support for fetching dependencies from Conan. See the TODO items in the PR description if you’re interested in contributing something.

I also created Support vcpkg dependencies by bretbrownjr · Pull Request #137 · bemanproject/exemplar to do likewise for vcpkg. It has similar TODO items to contribute to if anyone has interest.

If it’s possible, it would be great to get review from our contacts in the industry (thanks @ben.boeckel !) to make sure we’re really following best practices.

I have been drafting a PR to the vcpkg registry to push exemplar 1.0.0 to vcpkg.

I can handle the CI part of vcpkg. I worked on it before, and I can dig it up from my local fork to see what the roadblock was.

I am busy these two weeks preparing for my interviews. I can work on this maybe this weekend.

Understood. Interviews are definitely more important.

If you get stuck on vcpkg support, let us know so we can try to help out.

This is something I’ve never been able to figure out how to do. Any ideas on how to do this?

It is very easy. add_library(foo STATIC) creates a static library.
A dynamic library is created with add_library(foo SHARED).
add_library(foo) creates a static or a shared library depending on the value of BUILD_SHARED_LIBS.

There are cases where a library has to be static, like when it provides a main function such as the gtest_main library. There are also situations when a library has to be dynamic. But in the general case, it is better to let the consumer of the library decide.
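Put together, that might look like the following sketch. The project, target, and source file names are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.14)
project(foo_sketch LANGUAGES CXX)

# No library type given: static by default, shared when the consumer
# configures with -DBUILD_SHARED_LIBS=ON.
add_library(foo src/foo.cpp)

# A library that provides main(), in the style of gtest_main,
# is pinned to STATIC no matter what BUILD_SHARED_LIBS says.
add_library(foo_main STATIC src/foo_main.cpp)
target_link_libraries(foo_main PUBLIC foo)
```

For example, cmake -S . -B build -DBUILD_SHARED_LIBS=ON builds foo as a shared library without any change to the project itself.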

Using BUILD_SHARED_LIBS works when consuming a library as a sub-project, but I think the more interesting (and challenging) question is how to package shared, static, and, more generally, any number of other library variants in a way that makes it easy to select the “flavor” of the library when the package is consumed. IIUC, Conan goes some way to solving this problem, but a pure CMake option is possible, too. That’s what I was trying to achieve in this MR.

Partial support for “variants” is in CMake 4.0 via the new CPS features, though it is modelled as configurations, AKA build types.

But even then, there isn’t a coherent way to ensure N libraries all behave consistently when they see a “DEBUG_ASAN_STATIC_OPTIMIZE” variant. Some of that is well covered with toolchain files, but we don’t want N toolchain files across M projects, especially given package managers also provide toolchain files.

My current thinking is that we need an “install_beman_library” CMake function, shipped via its own CMake module, that provides inversion of control for most of these features, and that we trust toolchain files to cover the other concerns. It seems likely that a feature like that turns into an upstream CMake feature or a standalone open source project if it proves as useful as I expect it would be.

Possibly a good project to put in the backlog for C++Now library-in-a-week projects.

Neat. What’s the status of that PR?

I think @bretbrownjr had some concerns w/the implementation. It’s fairly easy to come up w/a variant naming convention that we could use across the Beman ecosystem, but it’s definitely harder to come up w/a variant naming standard that would apply outside of the Beman ecosystem. I don’t know that that’s really a problem that Beman should be trying to solve, though. I think we should be trying to deliver something that provides a reasonable set of default variants, but that allows library consumers to opt out of the defaults and to build and publish any number of variants using whatever naming convention they want.

For example, we don’t need to decide what the DEBUG_ASAN_STATIC_OPTIMIZE configuration is or what that name means. We just need to allow the library consumer to build a variant named DEBUG_ASAN_STATIC_OPTIMIZE if they want to. It should be up to the library consumer to decide what that name means (in terms of the compile options, link options, etc), to use the name consistently across their own ecosystem, and to select that variant when consuming the build artifacts.

I agree w/@bretbrownjr that we should extract most of this logic into functions so it can be consistently shared/applied across all Beman projects.