Commit graph

10 commits

Author SHA1 Message Date
Adrien Ferrand
84178e2773
Do not reuse existing builds on Launchpad when executing snapcraft remote-build (#8719)
We observed recently several unexpected behavior during the execution of snap jobs in Azure. In particular it seems that `snapcraft remote-build` is tending to reattach to the latest builds on Launchpad triggered by the nightly builds on master, independently from the actual branch, status of the code, or targeted architectures.

Primarily if the builds on Launchpad are stalled for some reason, it blocks effectively any other Azure snap jobs until someone manually cancel the builds on Launchpad. Secondarily it means that the outcome of the builds may be inconsistent, because they can be the result of a build for the master source even if you are on a PR that modifieds these sources (including `snapcraft.yaml`).

After digging in `snapcraft` source code, I realized that the signature computed to understand if a build should be resumed, is not based one some hashes against the snapcraft working directory content, but is simply a hash of the working directory absolute path *itself*. It means that every builds triggered from the working directory `/my/path/certbot` for instance, are recognized as the same unique build on Launchpad side, and may be resumed if they already exist, and so independently from the source code, `snapcraft.yaml` or targeted archs.

For the record, relevant parts in `snapcraft` source code:
82024d3748/snapcraft/project/_project.py (L44)
82024d3748/snapcraft/project/_project.py (L86-L89)
82024d3748/snapcraft/cli/remote.py (L128-L132)

This PR makes effectively the resume build mechanism effectively a noop by moving the source code first in a temporary directory with random name before running `snapcraft remote-build`. This way the signature is never the same and builds are always recognized as brand new builds.

* Invalidate snapcraft remote-build cache by using a temporary workspace.

* Capture one more state in the build
2021-03-22 10:39:09 -07:00
Adrien Ferrand
0962b0fc83
Kill current snapcraft build when a "Chroot problem" is encountered (#8442)
* Kill snapcraft build when a "Chroot problem" is encountered

* Display specific helper for "Chroot problem" status and cancel retry mechanism in this case.

* Isolate build tmp directories

* Configure XDG_CACHE_HOME

* Kill snapcraftctl with chroot problem is encountered
2021-03-11 13:08:20 -08:00
Adrien Ferrand
67b65bb2c0
Deprecate acme.typing_magic module, stop using it in certbot (#8643)
* Deprecate acme.magic_typing, stop to use it in certbot

* Isort

* Add a changelog entry

Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
2021-03-09 16:12:32 -08:00
Adrien Ferrand
3889311557
Setup a timeout to the remote snap build process (#8484)
This PR adds a `--timeout` flag to `tools/snap/build_remote.py` in order to fail the process if the time execution reaches the provided timeout. It is set to 5h30 on the relevant Azure job, while the job itself has a timeout of 6h managed on Azure side. This allows a slightly better output for these jobs when the snapcraft build stales for any reason.
2020-12-11 12:33:11 -08:00
Brad Warren
e570e8ad32
Generate plugin snap configs as needed (#8411)
While reviewing https://github.com/certbot/certbot/pull/8404, it occurred to me that we're keeping both the generated files and the script used to generate them in `git`. Keeping both around seems unnecessary and is almost asking for the files to get out of sync at some point in the future. I fixed that by removing the files, adding them to `.gitignore`, and updating `build_remote.py` to generate them as needed.

* Remove generated files.

* Add generated files to gitignore.

* Reuse generate_dnsplugins_all.sh in build_remote
2020-10-30 14:12:57 -07:00
Brad Warren
1be005289a
Print more output from snapcraft remote-build (#8321)
* Print more output from snapcraft remote-build.

* Include the build target in the output.
2020-09-25 18:58:04 +02:00
Brad Warren
e8a232297d
Pin non-cb-auto dependencies in our plugin snaps (#8217)
This PR fixes our [Azure failures](https://dev.azure.com/certbot/certbot/_build/results?buildId=2492&view=results) by pinning our Python dependencies that are not included in certbot-auto.

This is done using the same approach as our [snap README](575092d603/tools/snap (build-the-snaps)) and [Docker images](575092d603/tools/docker/core/Dockerfile (L24-L25)) with some minor details changed to hopefully make the Python code more readable.

You can see tests passing with this change at https://dev.azure.com/certbot/certbot/_build/results?buildId=2495&view=results.
2020-08-17 11:54:29 -07:00
Brad Warren
9bbcc0046c
fix --archs default (#8195) 2020-08-11 12:52:24 -07:00
Adrien Ferrand
a6f2061ff7
Improve log dump in snaps remote builds when an unexpected behavior is detected. (#8173)
Fixes #8169

This PR improves snaps remote builds script by dumping the output of `snapcraft remote-build` when unexpected behavior is detected:
* when all builds for a project finish with a zero status code, and none of them are marked as failed, we expect to have all the associated snap files available locally.
* when some builds are marked as failed, we expect to have a build output for each of them available locally.

In these two situations, if the expectation are not matched, then the script will display the output of `snapcraft remote-build` itself. I added also a control error to handle nicely the absence of an expected build output on the local machine.

* Improve log dump in snaps remote builds when an unexpected behavior is detected

* Use the manager

* Update tools/snap/build_remote.py

Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
2020-07-27 12:01:51 -07:00
Adrien Ferrand
14dfbdbea5
Build snaps using the remote-build feature (#8153)
Snapcraft has a feature name `remote-build`. It allows to compile snaps using the Canonical dedicated build architecture for several architectures. Compared to the QEMU-enabled Docker approach used currently, the remote build has several advantages:
* the builds are done on the native architecture, making them basically faster than what can be achieved on QEMU
* it avoids to depend on `adferrand/snapcraft` (which could be otherwise be fixed with the merge of https://github.com/snapcore/snapcraft/pull/3144, but this will not happen in the short term)
* when everything is good, all snaps build can be run in parallel and then can be orchestrated by one single Azure Pipeline job, since the heavy tasks are done remotely.

This PR makes the necessary ajustements to use the remote build feature instead of the QEMU-enabled docker approach.

One complex task was to be able to compile the `certbot` snap on `arm64` and `armhf`. Indeed on these architectures the pre-compiled wheel for `cffi` is not available. So it needs to be compiled during the snap build. Sadly, the current version of the python plugin in snapcraft is limited by the fact that `wheels` is not installed in the virtual environment set up to build the python packages, and there is no easy way to change that except by overridding the whole build process.

In the long term, I think I will open a PR on `snapcraft` Git repository to provide a consistent solution. But for the short term, I used the possibility to provide arguments to the `venv` module, to add the flag `--system-site-packages`. With it, the virtual environment can use the system site package, where `wheel` is available.

The other significant additions are in `tools/snap/build_remote.py` script. If invoking the remote build on a local machine is quite straight-forward, it is another story on the CI because we need build auditability and resiliency during these non-interactive actions. In particular we should avoid as possible inconsistent results on the nightly pipeline and the release pipeline.

So this script wraps the `snapcraft` call into a retry logic, and improves its logs in the context of parallel builds.

For the minor modifications, it is mainly about ensuring that plugins can be built (some of them also need `cffi` for instance), and simplify the Azure Pipeline since all snaps are retrieved in one go.

Please note that the `test-` branches still run only the `amd64` architecture. Indeed I noticed that builds on `arm64` and `armhf` are tending to be very slow to start (up to 40 min) while the `amd64` ones wait at max 10 mins, and usually 30 seconds only when the overall load on Canonical side is low.

To work on `certbot/certbot` repository, one secured file needs to be added, because `snapcraft` needs to be authenticated against Launchpad with credentials allowing remote builds. To do so, from a local machine that have this capability, one can extract the existing file at `$HOME/.local/share/snapcraft/provider/launchpad/credentials`, and register it as a secured file in Azure Pipeline with the name `snapcraftRemoteBuildCredentials`.

* Define scripts

* Setup pipeline to use remote builds

* Focus on packaging builds

* Set credentials

* Setup git

* Launch all builds in parallel

* Add dev dependencies to build cffi and cryptography

* Convert to a python logic

* Reorganize the pipeline

* Handle the fact that snap builds may be taken from cache

* Generate constraints

* Exit code

* Check existence

* Try to handle better non zero exit code

* Add --system-site-packages to get wheel in the venv

* Add executable permissions

* Troubleshoot

* Dynamic display, take the maximum timeout for snap build job

* Allow retries if the remote build does not start

* Trigger only amd64 builds for test branches

* Exit properly

* Update snapcraft.yaml

* Fix snap run

* Set secured file name

* Update .azure-pipelines/templates/jobs/packaging-jobs.yml

Co-authored-by: Brad Warren <bmw@users.noreply.github.com>

* Update .azure-pipelines/templates/jobs/packaging-jobs.yml

Co-authored-by: Brad Warren <bmw@users.noreply.github.com>

* Update .azure-pipelines/templates/jobs/packaging-jobs.yml

Co-authored-by: Brad Warren <bmw@users.noreply.github.com>

* Move order in deps

* Reactivate all builds

* Use Manager() as a context manager

* Use Pool as a context manager

* Some nice refactorings

* Check snapcraft execution interruption with exit codes

* Use f-string and format expressions

* Start log

* Consistent use of single/double quotes

* Better loop to extract lines

* Retry on build failures

* Few optimizations

Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
2020-07-22 16:05:20 -07:00