We are talking about the build matrix, linters, debugging on CI, etc. My preference would be to have a dedicated repo for infrastructure that each repo can use as a dependency (e.g., optional26 would have this infrastructure repo as a submodule; I’m open to any dependency-like mechanism, though I need to test this idea).
Advantages:
no duplication of infrastructure
fast way to deliver updates
we can have a stable branch/version which is pulled by other repos and keep main as a development branch.
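A minimal sketch of the submodule variant with a stable branch. The repo URL, the `stable` branch name, and the `infra` checkout path are all assumptions (nothing in this thread settles them). The commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: INFRA_URL, INFRA_BRANCH and INFRA_DIR are assumed names, not agreed conventions.
INFRA_URL="${INFRA_URL:-https://example.com/beman/infra.git}"  # placeholder URL
INFRA_BRANCH="${INFRA_BRANCH:-stable}"                          # assumed stable branch name
INFRA_DIR="${INFRA_DIR:-infra}"                                 # assumed checkout path
DRY_RUN="${DRY_RUN:-1}"                                         # 1 = print, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# One-time setup: add the infra repo as a submodule that tracks the stable branch.
add_infra() {
    run git submodule add -b "$INFRA_BRANCH" "$INFRA_URL" "$INFRA_DIR"
}

# Later: move the submodule to the current tip of the tracked branch.
update_infra() {
    run git submodule update --remote "$INFRA_DIR"
}
```

Each library would call `update_infra` and commit the new submodule pointer when it is ready, so the infra repo's main branch stays free for development.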
We didn’t have time to discuss this at the sync. Open earlier discussion here.
I would say yes, for the reasons you listed. It will greatly reduce duplication and also make it easier for existing Beman projects that are missing some aspects of infrastructure to integrate it.
Making it easier to support library layout expectations on different environments
Making GitHub Actions less repetitive by having common build-run-and-test scripts, etc.
I have had in mind adding dependencies to account for the above also. But I have been hesitant to do so before we get some reasonable support for Conan and vcpkg.
It would be possible to use FetchContent for this, I believe. Perhaps that is the best option for now. But we should be willing to prioritize support of packaging, even if we have to rethink our use of git-submodules, for instance.
The downside of having it outside the project is that projects get broken by changes outside their span of control. With a distributed project we can’t even do a forklift upgrade of everything on GitHub, because that still means people’s working branches are broken.
If the infrastructure is shared, it also tends to be frozen because of the perceived costs of change.
I think this idea is worthy of further investigation, but it’s hard to know if it would be better without knowing exactly how it is implemented. Submodules confuse people, it’s unclear how FetchContent would work with pre-commit, .clang-format wants to live in the top-level directory, incremental dependency upgrade paths, etc.
@neatudarius do you have a concrete idea for how a separate infrastructure repo would work? Without that I’m afraid it’s tough to give feedback.
Currently we have different degrees of infrastructure tooling across Beman projects. Can we sync up and propagate these improvements between projects to minimize duplicated code?
I will go first:
Exemplar has:
Lint infrastructure + reviewdog
Preset of GCC variants
Single platform CI testing across major C++ version and sanitizers
Enable/disable examples build
Enable/disable testing
Devcontainer/ codespace support
Exemplar will benefit from:
Clang-tidy support (currently in the works, would benefit from a reference implementation)
Multi-platform CI testing (Windows in the works, macOS pending PR by @ClausKlein)
[CMAKE.CONFIG] (in progress/stale PR by @neatudarius, I think this is implemented elsewhere)
Code coverage infrastructure
Doxygen (I think this is implemented elsewhere)
vcpkg support (need expertise, especially on the CI side)
Compiler version testing matrix on CI (I think this is implemented elsewhere)
Let me know if the project you are working on would benefit from anything in exemplar (I will shoot a PR over), and whether you have implemented any other tools that exemplar doesn’t currently have.
@neatudarius
Would you mind deleting this thread and moving your exact message to “Should we have a dedicated repo for infrastructure?” (or at least deleting it temporarily)? Otherwise, please at least defer any work on copying infrastructure until we finish the other thread.
I think we need reusable infrastructure first; then your call to action would be possible in a scalable way.
Hello @neatudarius. I wish I had seen this topic earlier; I came to the same conclusion and wrote this topic here. All in all, we need a core infrastructure repo that evolves with new standards and best practices, adding new pipelines and improving existing ones. The starting template (exemplar) will need a way to update its core in a manner that is non-intrusive to the library developer. Exemplar alone is a good step forward, but it doesn’t scale with the number of new libraries, since a change to exemplar has to be repeated manually in every library that was forked from it. To make this happen, we have to draw a line around what functionality the core infrastructure should own, what portion of it can be modified by the developer of a library forked from exemplar, and what the interface (folder structure, configuration files) between the two is.
I’m supportive of one or more repositories to support infrastructure needs. I don’t mind a single repo for all utilities for now, though we should keep an eye on whether it makes sense to deliver beman-query, CMake utilities, etc. in one repo or in a few related ones.
For a current use case, I’m attempting to tweak exemplar to make it packageable and otherwise friendly to build from a machine in “airplane mode”. The main issue right now is figuring out how to depend on “GoogleTest” in a simple way in CMakeLists.txt that doesn’t imply git operations (FetchContent does this by default).
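For reference, CMake already has cache variables that keep FetchContent off the network: `FETCHCONTENT_FULLY_DISCONNECTED` and `FETCHCONTENT_SOURCE_DIR_<NAME>`. Below is a sketch of a configure helper that uses them. It assumes the dependency is declared under the name `googletest` in `FetchContent_Declare` and that a local copy of the GoogleTest sources exists at a path you supply. Both are assumptions for illustration, not a description of what exemplar currently does.

```shell
#!/bin/sh
# Sketch: print the cmake arguments for an offline configure, one per line.
# Assumes CMakeLists.txt declares the dependency as "googletest" (hypothetical here).
offline_cmake_args() {
    gtest_src="$1"   # path to a local GoogleTest source tree
    if [ -z "$gtest_src" ]; then
        echo "usage: offline_cmake_args <googletest-source-dir>" >&2
        return 1
    fi
    # Never download or update anything that FetchContent manages.
    printf '%s\n' "-DFETCHCONTENT_FULLY_DISCONNECTED=ON"
    # Use this directory instead of cloning the declared repository.
    printf '%s\n' "-DFETCHCONTENT_SOURCE_DIR_GOOGLETEST=$gtest_src"
}

# Example (not run here; the path is a placeholder and must not contain spaces):
#   cmake -S . -B build $(offline_cmake_args /opt/src/googletest)
```

This leaves CMakeLists.txt untouched, but it pushes the burden onto whoever configures the build, which is part of why a packaging-friendly approach is still worth pursuing.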
One option to provide packaging support (see this PR) is a lot simpler in terms of style in CMakeLists.txt, but it does require an additional utility (named use-fetch-content.cmake in the current iteration of the PR) in each Beman repo. And to be clear I have serious concerns when I see non-declarative logic copy/pasted across every Beman repo, even at the scale of single conditionals or loops in CMakeLists.txt, so entire toolchain files (which we have) and other CMake utilities should be refactored out to a common infrastructure repository as soon as we reasonably can.
For another example, we have hardcoded support for Windows package layout in the Beman Exemplar. If we expect every Beman repository to maintain install rules that specific, we should really refactor that logic out and maintain it commonly for all repos.
That reminds me of an old presentation @mpusz did https://youtu.be/S4QSKLXdTtA?t=1730 where he used the Conan cache to build new projects. However, this assumes that the library was used (by another project?) in the past and is still in Conan’s cache. It is quite an old presentation and the technical details might need some updating since Conan 2.0 was released, but I think the core idea is the same.
Right now it has EXEMPLAR all over the place, and I’ve changed that in the draft find_package branch for optional. I think two uses is enough to indicate it should live in one place before we get a third.
I started experimenting here with how we can pull infrastructure changes, using subtree, from infra into any library derived from the exemplar template. I created a pull_infra.sh script that fetches some files and folders.
The initial plan was to pull all files inside the .devcontainer/ and .github/ folders (using a vendor workflow) because the infrastructure code should reside in those folders. The assumption I made is that all library repos (sooner or later) should have the same containers and pipelines, and that the development of the infrastructure will be done in the infra repo. The library developers would then run the script manually to pull the latest changes as they wish, and/or a pipeline would run it only after a successful release of the library.
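Roughly, the vendor workflow described above could look like the following. This is a sketch, not the actual pull_infra.sh: the remote URL, the `main` ref, and the `infra` prefix are assumptions, and the commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only, not the real pull_infra.sh. All defaults below are assumptions.
INFRA_URL="${INFRA_URL:-https://example.com/beman/infra.git}"  # placeholder URL
INFRA_REF="${INFRA_REF:-main}"                                  # ref to vendor from
INFRA_PREFIX="${INFRA_PREFIX:-infra}"                           # where the subtree lives
DRY_RUN="${DRY_RUN:-1}"                                         # 1 = print, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

pull_infra() {
    # First run adds the subtree; later runs pull new upstream commits into it.
    if [ -d "$INFRA_PREFIX" ]; then
        run git subtree pull --prefix="$INFRA_PREFIX" --squash "$INFRA_URL" "$INFRA_REF" || return 1
    else
        run git subtree add --prefix="$INFRA_PREFIX" --squash "$INFRA_URL" "$INFRA_REF" || return 1
    fi
    # Copy the vendored folders to the locations the tooling reads them from.
    for dir in .devcontainer .github; do
        run rsync -a "$INFRA_PREFIX/$dir/" "$dir/" || return 1
    done
}
```

Because the library decides when to call `pull_infra`, updates arrive as ordinary commits that can be reviewed like any other change.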
However, my assumption doesn’t look correct, since the contents of the .devcontainer folder in the infra and exemplar repos are entirely different. That left me wondering what the purpose of the infra repo is (the readme file was not that enlightening). @neatudarius since you grasped the idea, can you explain the purpose of the infra repo? Thank you in advance.
Everyone I have talked to wants to see a consistent experience across the Beman repos.
I’ll let Darius speak from his perspective, but my expectation is that we just haven’t gotten that far yet.
I was reading up on publishing GitHub actions that can be reused this morning. I’m chewing over how and when we would get the .github/ directory in beman.exemplar simplified, not if.
It would be interesting to see how published GitHub actions could be reused and automatically picked up by all library repos derived from exemplar. I suspect that after a new action version is released, a small version increment has to be made in every library repo. If so, that becomes less convenient as the number of libraries grows. So my experimentation is about how to update all libraries in bulk.
A possible scenario is 1) Create a new or update a GitHub action and publish it. 2) Update the workflow files in the infra repo. 3) Let library developers pull those changes without interrupting their development lifecycle.
I think that not forcing the library repos to get infrastructure updates when there is a new publication, and letting them pull when they are ready, is important. On the other hand, going on every repo and committing a new version is not fun, so I wanted to automate this. Let me know if you find an existing solution to that using just github actions.
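One way to script the per-repo version increment is a helper that rewrites the pinned ref in every workflow file of a checked-out library. This is a sketch that assumes the shared actions or workflows are referenced as `uses: <owner>/<repo>/...@<ref>`. The `bemanproject/infra` slug and the `v2` tag in the usage comment are hypothetical.

```shell
#!/bin/sh
# bump_action_ref REPO_DIR ACTION_SLUG NEW_REF
# Rewrites the ref after "@" on every "uses:" line that starts with ACTION_SLUG.
# Caveat: the slug is matched as a prefix and treated as a regular expression.
bump_action_ref() {
    repo_dir="$1"
    slug="$2"
    new_ref="$3"
    for wf in "$repo_dir"/.github/workflows/*.yml "$repo_dir"/.github/workflows/*.yaml; do
        # Skip glob patterns that matched nothing.
        [ -f "$wf" ] || continue
        # Write to a temporary file first so the edit works with any POSIX sed.
        sed "s|\(uses: *$slug[^@]*\)@[^ #]*|\1@$new_ref|" "$wf" > "$wf.tmp" \
            && mv "$wf.tmp" "$wf"
    done
}

# Example (not run here; slug and tag are hypothetical):
#   bump_action_ref ~/src/optional bemanproject/infra v2
```

A driver could loop over local checkouts and open one pull request per library, so each maintainer still chooses when to merge the update.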
Btw, I think the steps in the approach above can work with publishing Docker images too, but we have to wait and see.
Hello @Sdowney, I used subtree in this PR, “Reuse infra by Jason5480 · Pull Request #150 · bemanproject/exemplar”, as you suggested. The goal is to prove that we can pull changes from the infra repo into exemplar (and any other Beman library). I’d love it if you reviewed the changes I made. However, I will not merge until we decide what the relation between the infra and exemplar repos should be. Also, I want your opinion on the infrastructure needed to run the pull_infra.sh script. I wrote in the prerequisites that git subtree and rsync are needed, but I had some problems running it inside the container, since subtree was not part of the core git installation and I had to install it manually. If we decide to do it that way, we should find a more elegant solution for installing subtree. Thank you in advance!
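Until there is a cleaner way to install it, the script could at least fail early with a clear message. Below is a sketch of such a check. It relies on `git subtree` being available as a `git-subtree` program, either in git's exec path or on `PATH`, so treat that as an assumption about the container image. The function names are made up for illustration.

```shell
#!/bin/sh
# Return 0 if the named program is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Return 0 if git should be able to dispatch "git subtree".
have_git_subtree() {
    have_cmd git || return 1
    [ -x "$(git --exec-path)/git-subtree" ] || have_cmd git-subtree
}

# Report every missing prerequisite at once instead of failing on the first.
check_prerequisites() {
    missing=""
    have_cmd rsync || missing="$missing rsync"
    have_git_subtree || missing="$missing git-subtree"
    if [ -n "$missing" ]; then
        echo "missing prerequisites:$missing" >&2
        return 1
    fi
}
```

Calling `check_prerequisites || exit 1` at the top of the script would turn the problem I hit in the container into a clear error message.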