This PR adds a `--timeout` flag to `tools/snap/build_remote.py` that fails the process if its execution time reaches the provided timeout. It is set to 5h30 on the relevant Azure job, while the job itself has a timeout of 6h managed on the Azure side. This gives slightly better output for these jobs when the snapcraft build stalls for any reason.
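A minimal sketch of how such a flag could behave, assuming the build runs as a single child process. The `run_with_timeout` and `build_parser` helpers and the choice of seconds as the unit are hypothetical and are not taken from `build_remote.py`:

```python
import argparse
import subprocess
import sys


def run_with_timeout(command, timeout):
    """Run command; return its exit code, or 1 if it exceeds `timeout` seconds."""
    try:
        return subprocess.run(command, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills and reaps the child before re-raising.
        print(f'Build did not finish within {timeout} seconds.', file=sys.stderr)
        return 1


def build_parser():
    """Hypothetical CLI: the timeout is expressed in seconds (5h30 is 19800)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--timeout', type=float, default=None)
    return parser
```

Keeping the script's timeout below the 6h job timeout means the script gets to print its own diagnostics before Azure cancels the job.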
Fixes https://github.com/certbot/certbot/issues/8134.
* Test on Python 3.9.
* Mention Python 3.9 support in changelog.
* s/\( *'Pro.*3\.\)8\(',\)/\18\2\n\19\2/
* undo changes to tox.ini
* Move more tests to Python 3.9
* Update PyYAML and packages which pinned it back
* Upgrade typed-ast
* Use <= to "pin" dnspython
* Fix lint by telling pylint it cannot be trusted
* Disable mypy on RFC plugin
* add comment about <= support
* tests: add certbot-dns-rfc2136 integration tests
* dont use 'with' form of socket.socket
fixes py2 crash
* address some feedback:
- conftest: make DNS server a global resource
- conftest: add dns_xdist parameter into node config
- conftest: add --dns-server=bind flag
- conftest: if configured, point the ACME server to the DNS server
- dnsserver: make it sort-of compatible with xdist (future-proofing)
- context: parameterize dns-rfc2136 credentials file (future proofing)
- context: reduce dns-rfc2136 propagation time to speed up tests
- tox: add an integration-dns-rfc2136 target
- rfc2136: add a test/zone for subdelegation
- rfc2136: skip tests if no DNS server is configured
* try add integration-dns-rfc2136 to CI
* mock recursive dns via RPZ
* update --dns-server args and tox.ini args
* address more feedback:
- dns_server: rename rfc2136 creds file to .tpl
- dns_server: dont vary dns server port, instead we will vary zone names (#8455)
- dns_server: log error if bind9 fails to stop cleanly
- dns_server: replace assert with raise
- context: remove redundant _worker_id
- context: remove redundant cleanup override
- context: fix seek/flush in credentials context manager
- context: rename skip_if_no_server -> ...bind_server
- context: add newline EOF
* conftest: document _setup_primary_node sideeffects
* ci: rfc2136-integration from standard->nightly
* fix _stop_bind (function was renamed to stop)
* ignore errors from shutil.rmtree during cleanup
* dns_server: check for crash while polling
* remove --dry-run from rfc2136 test
Do we have any specific reason to run the standard Linux integration tests on Python 2.7?
If not, we should move to a more recent version of Python. This PR does it for Python 3.8.
* Add a new, simplified version of pipstrap.
* Use tools/pipstrap.py
* Uncomment code
* Refactor pip_install.py and provide hashes.
* Fix test_sdists.sh.
* Make code work on Python 2.
* Call strip_hashes.py using Python 3.
* Pin the oldest version of httplib2 used in distros
* Strip enum34 dependency.
* Remove pip pinnings from dev_constraints.txt
* Correct pipstrap docstring.
* Don't set working_dir twice.
* Add comments
Fixes #8202
This PR adds an Azure Pipeline job to execute `certbot plugins --prepare` for each Docker image created during the CI on amd64.
* Prepare basic integration tests for certbot dockers
* Add a displayName for the integration tests task
* add set -e to all bash instances in deploy-stage.yml
* retry uploading snap if we fail
* Add the rest of the set -e calls for bash in azure while we're here
* use retry based on travis_retry
* add set -e to the script: sections that run on macOS/Linux
* actually don't fail on result
* reset result before running command because bash short circuits or conditionals
* remove inapplicable comment
Partial fix for #8256
This PR makes tox call pipstrap before any command is executed, and makes Azure Pipelines call pipstrap when appropriate (when an actual call to pip is made).
* Invoke pipstrap in tox and during the CI
* Set default value for PYTHON_VERSION and always set python interpreter
* Set Python for snaps_build also
* Fix the build for Windows installer
* Add a warning comment for pinned versions in pipstrap
* Rebuild letsencrypt-auto
* Same version as the installer build
* Let's update to latest pip for installer tests
Fixes https://github.com/certbot/certbot/issues/8022, https://github.com/certbot-docker/certbot-docker/issues/25, and https://github.com/certbot-docker/certbot-docker/issues/20.
This PR builds on https://github.com/certbot/certbot/pull/8192 to set up similar builds in Azure to what we currently have at release time as well as nightly builds allowing us to catch problems in these images before a release. It also fully automates our Docker deployments removing a manual step from our release process. We'll need to update our release instructions once this PR lands.
If you're not familiar with our `certbot-docker` setup, you can read about how these scripts customize the build process on Docker Hub at https://docs.docker.com/docker-hub/builds/advanced/.
You can see the process working properly at:
* Nightly build on my fork: https://dev.azure.com/bmw0523/bmw/_build/results?buildId=345&view=logs&j=70ac378a-cb1f-50d1-b328-169807afbcfa
* Release build on my fork: https://dev.azure.com/bmw0523/bmw/_build/results?buildId=346&view=logs&j=70ac378a-cb1f-50d1-b328-169807afbcfa
* Nightly build on Certbot's Azure setup: https://dev.azure.com/certbot/certbot/_build/results?buildId=2426&view=logs&j=70ac378a-cb1f-50d1-b328-169807afbcfa
The builds on my fork pushed to https://hub.docker.com/u/certbottest. The credentials for this account are in our shared vault in 1password if you want to play around with this.
While the scripts will (almost?) always be run in CI, I tested them successfully on macOS, Ubuntu 18.04, and Ubuntu 20.04. However, **the scripts do not seem to work when using the Docker snap, at least on Ubuntu 20.04.** They do work with the `docker.io` packages from `apt`. I was able to make things work by no longer setting `DOCKER_BUILDKIT`, but as I described in the code comments, this breaks things on Azure.
When writing this PR, I tried to make the minimal modifications to our current setup to get the behavior we want. I'm planning on splitting the Docker builds into different Azure jobs so they don't increase the overall build time, but this isn't trivial, so I figured it would be best done in a separate PR.
* Remove license.
* update build scripts
* write deploy code
* Remove unused READMEs.
* rewrite readme
* Make testing on a fork easier.
* Set up Azure automation.
* fix typo
* Make output more verbose.
* clean up cleanup...everybody everywhere
* separate build and deploy
* Document docker-hub credentials
* Use Docker BuildKit when building.
* Remove unneeded .gitignore files.
* Fix tools/docker/README.md grammar
Co-authored-by: ohemorange <ebportnoy@gmail.com>
* Clarify <TAG> in README.
* no docker snap
* rename docker job
Co-authored-by: Erica Portnoy <ebportnoy@gmail.com>
* Improve github release creation process
* Comment file
* Update tools/create_github_release.py
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
* run chmod +x on tools/create_github_release.py
* Add description of create github release method
* remove references to unnecessary azure credential
* remove unnecessary import
* Add reminders to update other file to definitions in .azure-pipelines
* Raise an error if we fail to fetch the artifact from azure
* Create github release as a draft, upload artifacts, then un-draft, for hooks to be run at the right point
* get the version number from the release
* add new packages to dev3_extras so they're installed by tools/venv3.py
* remove unnecessary import
* fun fact: tempdirs behave differently when used as a context manager
* Move comment to construct.py
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
Snapcraft has a feature named `remote-build`. It compiles snaps for several architectures on Canonical's dedicated build infrastructure. Compared to the QEMU-enabled Docker approach currently used, the remote build has several advantages:
* the builds are done on the native architecture, making them faster than what can be achieved with QEMU
* it avoids depending on `adferrand/snapcraft` (a dependency that could otherwise be removed by merging https://github.com/snapcore/snapcraft/pull/3144, but this will not happen in the short term)
* when everything goes well, all snap builds can run in parallel and be orchestrated by a single Azure Pipeline job, since the heavy tasks are done remotely.
This PR makes the necessary adjustments to use the remote build feature instead of the QEMU-enabled Docker approach.
One complex task was compiling the `certbot` snap on `arm64` and `armhf`. On these architectures no pre-compiled wheel for `cffi` is available, so it needs to be compiled during the snap build. Sadly, the current version of the python plugin in snapcraft does not install `wheel` in the virtual environment it sets up to build the Python packages, and there is no easy way to change that short of overriding the whole build process.
In the long term, I think I will open a PR on the `snapcraft` Git repository to provide a consistent solution. For the short term, I used the ability to pass arguments to the `venv` module to add the `--system-site-packages` flag. With it, the virtual environment can use the system site packages, where `wheel` is available.
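To illustrate what the flag changes, here is a small sketch using the standard library `venv` API directly rather than snapcraft's plugin. The `create_env` helper is hypothetical, and `with_pip=False` is only there to keep the example fast:

```python
import pathlib
import tempfile
import venv


def create_env(env_dir):
    """Create a virtual environment that can see the system site packages."""
    # Similar to `python -m venv --system-site-packages --without-pip env_dir`.
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
    builder.create(env_dir)
    return pathlib.Path(env_dir) / 'pyvenv.cfg'


with tempfile.TemporaryDirectory() as tmp:
    config = create_env(pathlib.Path(tmp) / 'venv').read_text()
    print(config)
```

The generated `pyvenv.cfg` records the choice in its `include-system-site-packages` setting, which is what lets packages installed system-wide, such as `wheel`, be imported from inside the environment.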
The other significant additions are in the `tools/snap/build_remote.py` script. While invoking the remote build on a local machine is quite straightforward, it is another story on the CI, because we need build auditability and resiliency during these non-interactive actions. In particular, we should avoid inconsistent results on the nightly and release pipelines as much as possible.
So this script wraps the `snapcraft` call in retry logic and improves its logs in the context of parallel builds.
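The general idea can be sketched as follows. This is not the code of `build_remote.py`: the `run_with_retries` helper, its `label` argument, and the default of three attempts are assumptions made for illustration:

```python
import subprocess
import sys


def run_with_retries(command, label, retries=3):
    """Run command up to `retries` times; return the last exit code."""
    exit_code = 1
    for attempt in range(1, retries + 1):
        process = subprocess.Popen(command, stdout=subprocess.PIPE,
                                   stderr=subprocess.STDOUT,
                                   universal_newlines=True)
        for line in process.stdout:
            # Prefix each line so interleaved output from parallel builds
            # can still be attributed to one build.
            print(f'[{label}] {line.rstrip()}')
        exit_code = process.wait()
        if exit_code == 0:
            break
        print(f'[{label}] attempt {attempt}/{retries} failed '
              f'with exit code {exit_code}')
    return exit_code
```

A call such as `run_with_retries(['snapcraft', 'remote-build'], 'certbot')` would then retry the whole command, which is the simplest policy. Retrying only on specific failure modes would require inspecting the output.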
The minor modifications are mainly about ensuring that plugins can be built (some of them also need `cffi`, for instance) and simplifying the Azure Pipeline, since all snaps are retrieved in one go.
Please note that the `test-` branches still build only the `amd64` architecture. I noticed that builds on `arm64` and `armhf` tend to be very slow to start (up to 40 minutes), while the `amd64` ones wait at most 10 minutes, and usually only 30 seconds when the overall load on Canonical's side is low.
To work on the `certbot/certbot` repository, one secured file needs to be added, because `snapcraft` needs to be authenticated against Launchpad with credentials allowing remote builds. To do so, from a local machine that has this capability, one can extract the existing file at `$HOME/.local/share/snapcraft/provider/launchpad/credentials` and register it as a secured file in Azure Pipelines with the name `snapcraftRemoteBuildCredentials`.
* Define scripts
* Setup pipeline to use remote builds
* Focus on packaging builds
* Set credentials
* Setup git
* Launch all builds in parallel
* Add dev dependencies to build cffi and cryptography
* Convert to a python logic
* Reorganize the pipeline
* Handle the fact that snap builds may be taken from cache
* Generate constraints
* Exit code
* Check existence
* Try to handle better non zero exit code
* Add --system-site-packages to get wheel in the venv
* Add executable permissions
* Troubleshoot
* Dynamic display, take the maximum timeout for snap build job
* Allow retries if the remote build does not start
* Trigger only amd64 builds for test branches
* Exit properly
* Update snapcraft.yaml
* Fix snap run
* Set secured file name
* Update .azure-pipelines/templates/jobs/packaging-jobs.yml
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
* Update .azure-pipelines/templates/jobs/packaging-jobs.yml
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
* Update .azure-pipelines/templates/jobs/packaging-jobs.yml
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
* Move order in deps
* Reactivate all builds
* Use Manager() as a context manager
* Use Pool as a context manager
* Some nice refactorings
* Check snapcraft execution interruption with exit codes
* Use f-string and format expressions
* Start log
* Consistent use of single/double quotes
* Better loop to extract lines
* Retry on build failures
* Few optimizations
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
Fixes #8041
This PR makes Azure Pipelines build the DNS plugin snaps for the 3 architectures during the CI.
It leverages the existing logic for building the Certbot snap to deploy a QEMU environment with Docker, and uses the local PyPI index to speed up the build when installing `cffi` and `cryptography`.
All DNS plugin snaps are built in a single Docker container, which saves the time required to install the system dependencies upon first start of `snapcraft` and so speeds up the build significantly.
In the end, all `amd64` DNS plugin snaps are built within 6 minutes. For `arm64` and `armhf`, it takes around 40 minutes, which is in fact quite fast considering that 14 DNS plugin snaps are built.
However, building for all 3 architectures is still an extremely heavy task, even for Azure Pipelines and its 10 parallel jobs. That is why I made the `arm64` and `armhf` builds skipped for the `full-test-suite` and let them run only for `nightly` and `release`. This means, however, that these builds will not be done for the release branches. If this is a problem, I can write a more elaborate suspend condition to trigger the builds in this case.
All snaps are stored in the pipeline artifacts storage, making them available for publication during a `release` pipeline.
The PR is set as Draft for now, because I am temporarily using `pr_test-suite` to validate the packaging jobs when commits are pushed. Once the PR is ready, I will revert it to the normal configuration (run the standard tests).
* Configure a script to build DNS snaps
* Focus on packaging
* Trigger all architectures
* Add extra index
* Prepare conditional suspend
* Set final suspend logic
* Set final suspend value
* Loop for publication
* Use python3
* Clean before build
* Add a test
* Add test job in Azure
* Preserve env
* Apply normal config for pipelines
* Skip QEMU jobs only for test branches
* Make snap run tests also depend on the Certbot snap build
* Update .azure-pipelines/templates/jobs/packaging-jobs.yml
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
* Update .azure-pipelines/templates/stages/deploy-stage.yml
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
* More accurate way to get the plugin snap name
* Integrate DNS snap tests into certbot-ci
* Fixes
* Update certbot-ci/snap_integration_tests/conftest.py
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
* Update certbot-ci/snap_integration_tests/conftest.py
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
* Clean an `__init__.py` file
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
Fixes #8071 and fixes https://github.com/certbot/certbot/issues/8110.
This PR migrates every job from Travis to Azure Pipelines.
This PR essentially converts the Travis jobs into Azure Pipelines jobs with complete functional parity (unless I made a mistake). The jobs are added to the relevant existing pipelines (`main`, `nightly`, `advanced-test`, `release`). A global refactoring based on the templating system greatly reduces the verbosity of the pipeline descriptions.
A specific feature (not present in Travis) is added: the `On_Failure` stage. Using the Mattermost API directly, it notifies a Mattermost channel of pipeline failures, with a link to the failed pipeline that does not require authenticating to Microsoft.
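As a rough illustration of such a notification, the sketch below builds (without sending) a POST request, assuming a Mattermost incoming webhook that accepts a JSON body with a `text` field. The PR may well use a different Mattermost endpoint. The `build_notification` helper and both URLs are placeholders:

```python
import json
import urllib.request


def build_notification(webhook_url, pipeline_name, build_url):
    """Build (without sending) a POST request announcing a failed pipeline."""
    payload = {'text': f'Pipeline {pipeline_name} failed: {build_url}'}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


request = build_notification(
    'https://mattermost.example.com/hooks/placeholder',  # placeholder URL
    'nightly',
    'https://dev.azure.com/example/_build/results?buildId=1',  # placeholder URL
)
# urllib.request.urlopen(request) would actually send the notification.
```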
See https://github.com/certbot/certbot/pull/8098#issuecomment-649873641 for the post merge actions to do at the end of this work.
After getting a +1 from everyone on the team, this PR removes the use of `codecov` from the Certbot repo because we keep having problems with it.
Two noteworthy things about this PR are:
1. I left the text at 4ea98d830b/.azure-pipelines/INSTALL.md (add-a-secret-variable-to-a-pipeline-like-codecov_token) because I think it's useful to document how to set up a secret variable in general.
2. I'm not sure what the text "Option -e makes sure we fail fast and don't submit to codecov." in `tox.cover.py` refers to but it seems incorrect since `-e` isn't accepted or used by the script so I just deleted the line.
As part of this, I said I'd open an issue to track setting up coveralls (which seems to be the only real alternative to codecov) which is at https://github.com/certbot/certbot/issues/7810.
With my change, failure output looks something like:
```
$ tox -e py27-cover
...
Name                                    Stmts   Miss  Cover   Missing
------------------------------------------------------------------------------------------
certbot/certbot/__init__.py                 1      0   100%
certbot/certbot/_internal/__init__.py       0      0   100%
certbot/certbot/_internal/account.py      191      4    98%   62-63, 206, 337
...
certbot/tests/storage_test.py             530      0   100%
certbot/tests/util_test.py                374     29    92%   211-213, 480-484, 489-499, 504-511, 545-547, 552-554
------------------------------------------------------------------------------------------
TOTAL                                   14451    647    96%
Command '['/path/to/certbot/dir/.tox/py27-cover/bin/python', '-m', 'coverage', 'report', '--fail-under', '100', '--include', 'certbot/*', '--show-missing']' returned non-zero exit status 2
Test coverage on certbot did not meet threshold of 100%.
ERROR: InvocationError for command /Users/bmw/Development/certbot/certbot/.tox/py27-cover/bin/python tox.cover.py (exited with code 1)
_________________________________________________________________________________________________________________________________________________________ summary _________________________________________________________________________________________________________________________________________________________
ERROR: py27-cover: commands failed
```
I printed the exception just so we're not throwing away information.
I think it's also possible that we fail for a reason other than the coverage not meeting the threshold, but I've personally never seen this. The `coverage report` output is not being captured, so hopefully that would inform devs if something else is going on, and saying something like "Test coverage probably did not..." seems like overkill to me personally.
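The failure handling described above can be sketched like this. It is an approximation rather than the actual `tox.cover.py` code, and the `check_coverage` helper is hypothetical:

```python
import subprocess
import sys


def check_coverage(command, package, threshold):
    """Run a coverage report command; return 0 on success, 1 on failure."""
    try:
        # Output is not captured, so the report itself stays visible.
        subprocess.check_call(command)
    except subprocess.CalledProcessError as err:
        # Print the exception first so no information is thrown away.
        print(err)
        print(f'Test coverage on {package} did not meet '
              f'threshold of {threshold}%.')
        return 1
    return 0
```

With the command from the output above, the call would be `check_coverage([sys.executable, '-m', 'coverage', 'report', '--fail-under', '100', '--include', 'certbot/*', '--show-missing'], 'certbot', 100)`.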
* remove codecov
* remove unused variable group
* remove codecov.yml
* Improve tox.cover.py failure output.
I want to do what I did in https://github.com/certbot/certbot/pull/7733 to our Azure Pipelines setup, but unfortunately this isn't currently possible. The only filters available for service hooks for the "build completed" trigger are the pipeline and build status. See

To accomplish this, I propose splitting the "advanced" pipeline into two cases. One is for builds on protected branches where we want to be notified if they fail while the other is just used to manually run tests on certain branches.
As discussed in #7539, we need proper tests of the Windows installer itself in order to verify that all the logic contained in a production-grade runtime of Certbot on Windows is correctly set up by each version of the installer, and to do so on a variety of Windows OSes.
This PR handles this requirement. The new `windows_installer_integration_tests` module in `certbot-ci` will:
* run the given Windows installer
* check that Certbot is properly installed and working
* check that the scheduled renew task is set up
* check that the scheduled task actually launches the Certbot renew logic
The Windows nightly tests are updated accordingly, in order to have the tests run on Windows Server 2012R2, 2016 and 2019.
These tests will evolve as we add more logic to the installer.
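For the "installed and working" check, one option is to match the output of `certbot --version` against a regular expression. The sketch below assumes the output has the form `certbot X.Y.Z`, optionally followed by a suffix. The pattern and the `is_valid_version_output` helper are illustrative and are not taken from the test module:

```python
import re

# Assumed output format: "certbot 1.0.0" or "certbot 1.1.0.dev0".
VERSION_PATTERN = re.compile(r'^certbot \d+\.\d+\.\d+.*$')


def is_valid_version_output(output):
    """Return True if output looks like a `certbot --version` line."""
    return VERSION_PATTERN.match(output.strip()) is not None
```

A regex is more robust here than comparing against a fixed string, since the installer under test changes version with every release.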
* Configure an integration test testing the windows installer
* Write the test module
* Configurable installer path, prepare azure pipelines
* Fix option
* Update test_main.py
* Add confirmation for this destructive test
* Use regex to validate certbot --version output
* Explicit dependency on a log output
* Use an exception to ask confirmation
* Use --allow-persistent-changes
This PR extends the installer tests on Azure Pipelines to run the integration tests on a Certbot instance installed with the Windows installer on several Windows versions, matching the versions Certbot supports:
* Windows Server 2012 R2
* Windows Server 2016
* Windows Server 2019
One can see the result on: https://dev.azure.com/adferrand/certbot/_build/results?buildId=311
* Try specific installer-build step
* Install Python manually
* Add tests on windows 2019
Clean up some places missed by #7544.
Found this when running test farm tests. They were working as of 5d90544, and I would be truly shocked if subsequent changes (all to the windows installer) made them stop working.
* Release script needs to target new CHANGELOG location
* Clean up various other CHANGELOG path references
* Update windows paths for new certbot location
* Add certbot to packages list for windows installer
In Travis, the full test suite doesn't run on PRs for point release branches, just on commits for them. I think this behavior makes sense because what we actually want to test before a point release is the exact commit we want to release after any squashing/merging has been done. This PR modifies Azure to match this behavior.
After this PR lands, I need to update the tests required to pass on GitHub.
This PR creates a pipeline triggered on tag pushes matching v0.* (e.g. v0.40.0).
Once triggered, this pipeline builds the Windows installer and runs integration tests on it, as the nightly pipeline does.
I also added a simple script that extracts the relevant part of the CHANGELOG.md file to put it in the body of the GitHub release. I believe it makes things nicer.
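A minimal sketch of such an extraction, assuming each release section starts with a `## <version> - <date>` heading. The `extract_release_notes` helper is hypothetical, and the sample changelog, including its dates and entries, is invented for the example:

```python
import re

# Invented sample data; dates and entries are placeholders.
SAMPLE = """\
# Certbot change log

## 0.41.0 - master

### Changed

* Nothing yet.

## 0.40.0 - 2000-01-02

### Added

* Example entry.

## 0.39.0 - 2000-01-01

### Fixed

* Older entry.
"""


def extract_release_notes(changelog, version):
    """Return the body of the `## <version> - ...` section, or None."""
    pattern = re.compile(
        r'^## {} - [^\n]*\n(.*?)(?=^## |\Z)'.format(re.escape(version)),
        re.MULTILINE | re.DOTALL)
    match = pattern.search(changelog)
    return match.group(1).strip() if match else None


print(extract_release_notes(SAMPLE, '0.40.0'))
```

The lookahead stops at the next second-level heading, so third-level headings such as `### Added` stay inside the extracted body.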
* Create release pipeline
* Relax condition on tags
* Put beta keyword
* Update job name
* Fix release pipeline