Related to https://github.com/certbot/certbot/issues/10581
Following up on https://github.com/certbot/certbot/pull/10631,
https://github.com/certbot/certbot/pull/10622, and
https://github.com/certbot/certbot/pull/10634, this PR converts the
release
[pipeline](https://dev.azure.com/certbot/certbot/_build?definitionId=3)
from Azure to Github Actions.
While this is the last migration PR, I don't think we should close the
issue, as we're still using launchpad for armhf builds. I plan to
continue investigating that and at minimum write up my findings.
To test
[notifications](https://opensource.eff.org/eff-open-source/pl/nfgh6obi8tfqikn4ydp7dshakr)
and creating a [github
release](https://github.com/ohemorange/Things-that-are-gr9/releases/tag/v1.0.19),
I ran workflows that no-oped most the other jobs [in a test
repo](https://github.com/ohemorange/Things-that-are-gr9/actions/runs/25938082721)
(I've made minor changes to names and comments since then, but no code
changes). Everything else is basically the same as nightly, with
different tags. For the docker deployment, `${{ github.ref_name }}` is
the tag name, so `v1.2.3`.
Why not parametrize the tests a bit more, by putting an `env` at the top
with `dockerTag: ${{ github.ref_name }}` and `snapReleaseChannel: beta`?
Because the `env` context is [not
available](https://docs.github.com/en/actions/reference/workflows-and-actions/contexts#context-availability)
to `with`; only `github, needs, strategy, matrix, inputs, vars`. We
could use `vars`, by creating `docker_tag_release` or whatever in the
[variable
section](https://github.com/certbot/certbot/settings/variables/actions)
of the repo settings by using the web interface, but that seems worse to
me than having it in the file but twice.
You will note that the contents of `release.yml` are very similar to
`nightly.yml`. While it would be nice to factor that out and reuse the
code, github actions would then flatten everything in the grouped code
together, making the results much harder to check. You can see what that
flattening would look like
[here](https://github.com/ohemorange/Things-that-are-gr9/actions/runs/25941524972)
(if we put them all in one workflow). Currently, it will look something
like
[this](https://github.com/certbot/certbot/actions/runs/25688414262),
which is much more readable.
We could split it into "stages" like we had in azure pipelines (probably
1. standard and extended tests (and changelog?), 2. snap package and
deploy (depend on tests), 3. docker package and deploy (depend on
tests), 4. github release (depend on snap package and deploy), 5-6.
notify (depend on github release)), but in addition to a minor slowdown
(currently github release only depends on snap and docker package, not
deploy, just so we're trying to do all the deployments simultaneously
and not partially in case of build failures), it would still be only
like three fewer jobs, since we'd still want all the info passing and
dependency relationships.
While it would be nice from a UX perspective to group the two
notification jobs together, you can't do that cleanly using the built-in
`if` key, and I don't think it's worth switching to a messier github
api-based version just to group them.
For mattermost notifications, we currently get the person to tag by
running `AUTHOR_NAME="$(git log -1 --pretty=format:'%an')"` and mapping
that to mattermost username. We could instead use `github.actor` to get
the github handle, and map that instead. I didn't bother since we
already have working, tested code, but can if a reviewer thinks it's
worth it.
---------
Co-authored-by: Will Greenberg <willg@eff.org>
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
These tests are failing because the runner migrated ownership to github,
and that somehow broke it even though it wasn't supposed to. They're not
responding to issues [other
than](https://github.com/actions/runner-images/issues/14100) "a CVE or
vulnerability" during the transition period (ends june 12). Let's just
turn these off for now. We weren't even running this at all in azure,
it's fine.
Related to https://github.com/certbot/certbot/issues/10581
Following up on #10631 and
https://github.com/certbot/certbot/pull/10622, this PR converts the
`nightly`
[pipeline](https://dev.azure.com/certbot/certbot/_build?definitionId=5)
from Azure to Github Actions.
`schedule` and `workflow_dispatch` triggers only work on merged
branches, not PRs. To see these tests running, I temporarily added a
`push` trigger in commit
[a2e9c43](a2e9c4303e).
You can see the results of those tests
[here](https://github.com/certbot/certbot/actions/runs/25688414262).
I did not split each file into its own commit this time because I feel
like the general idea is clear. The relevant files in azure pipelines to
reference are:
- the deleted `.azure-pipelines/nightly.yml` -->
`.github/workflows/nightly.yml`
- `.azure-pipelines/templates/jobs/common-deploy-jobs.yml` -->
`.github/workflows/deploy_docker_images.yml` and
`.github/workflows/deploy_snaps.yml`
- `.azure-pipelines/templates/stages/changelog-stage.yml` -->
`.github/workflows/create_changelog.yml`
I chose to split `common-deploy-jobs` into `deploy_docker_images` and
`deploy_snaps`. This is because the docker arm32v6 build takes a long
time, but uploading to docker is quick, while the armhf snaps build
varies but is often quicker, but uploading the snaps can take some time.
By splitting them, we can specify the dependencies more precisely, and
hopefully shave some time off the total. Without the split, tests took
[53 minutes
total](https://github.com/certbot/certbot/actions/runs/25684264622).
After the split, tests took [33 minutes
total](https://github.com/certbot/certbot/actions/runs/25688414262)!
As before, the "nightly deploy stage" from azure has been omitted for
clarity.
`rerun.yml` did not exist before. There's not a great built-in way to
rerun individual jobs in github actions, which I wanted for the snap
builds specifically, since other timeouts can still happen. I could have
made an action or additional workflow and wrapped that in a script to
retry it, but I figured actually it's nicer to have the ability to rerun
anything. This is equivalent to clicking "rerun all failed jobs," which
I feel is usually what we want. Unfortunately, I am pretty sure that to
test it, the rerun script will need to be merged first, since it relies
on `workflow_dispatch`.
You can see that packages were successfully uploaded to
[dockerhub](https://hub.docker.com/r/certbot/certbot/tags) and the [snap
store](https://dashboard.snapcraft.io/stores/snaps/); looking at the
timestamps is probably the easiest way to confirm (about 11:45am
Monday).
Related to https://github.com/certbot/certbot/issues/10581
Following up on #10622, this PR converts the `full-test-suite`
[pipeline](https://dev.azure.com/certbot/certbot/_build?definitionId=4)
from Azure to Github Actions.
Nightly test changes for context not included in this PR are available
[here](https://github.com/certbot/certbot/compare/test-convert-full-pipeline...convert-all-pipelines).
Since this branch is named `test-convert-full-pipeline`, these tests
will show up in the checks section of this PR.
The major changes I made here are splitting the docker and snaps tests
for a better github actions UX, and removing the intermediate "stage"
file, since stages are not a concept in GHA. This means that we get the
nice dropdowns for the different categories on the left bar of the [test
run
page](https://github.com/certbot/certbot/actions/runs/25139155528/job/73684692548)
so it's easier to see each type of test. The very slight drawback is
that the four jobs listed in `.github/workflows/full_test_suite.yml` do
need to be duplicated in `nightly.yml`, but that's a reasonable tradeoff
to me.
Also, we now test our certbot and dns plugin snaps on all architectures
for the first time (using `dpkg --add-architecture` to run armhf tests
on an arm64 machine), which is very nice and in my opinion worth the
very slightly extra time and code.
In this PR, we build arm64 and amd64 snaps directly on github's runners.
armhf snaps are built using launchpad as before. This makes the workflow
file a little long. There are perhaps some micro-optimizations for code
deduplication I could make, like creating an action to install
dependencies based on the architecture, but I don't think it's super
worth it, especially since the dependencies vary enough that we'd still
need some code (for example, even between installing deps for certbot
and dns runs, we'd still need to additionally install `nginx-light`).
A very slight potential time improvement we could make here would be to
optionally depend on the different architectures before running their
respective tests. I'm not sure if this can be done without writing
different jobs, and since once those jobs start they run in parallel, it
didn't really seem worth looking into for me. I am of course open to
alternate points of view here and in general.
Another potential change to bring the two build strategies more in line
would be to stop using the python script to send off all the launchpad
builds, and instead put each in a separate, matrixed job like the github
jobs. We could even continue retrying the builds within each job. This
would mean that if one dns plugin build happens to fail three times, all
the builds wouldn't have to be retried. While I think that's not the
worst idea, I personally think that belongs in a separate PR, as this PR
is already quite long.
Speaking of the PR length, I can undo the changes made here to build
arm64 and amd64 snaps on github actions, to have a simpler
conversion-only PR to review. Some of the choices I made here,
particularly around UX, were based on the fact that the jobs would look
like this, so it might not be as clear why I made those choices, but if
it's easier to review it's no problem to put it back. I could also
remove the code that tests the other snaps since it's new, but I figured
it'd be nice to show that they are in fact being built correctly, since
otherwise the built snaps wouldn't be consumed anywhere.
---------
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
Related to https://github.com/certbot/certbot/issues/10581
This is the first step of migrating to github actions.
Nightly and full tests have been converted on branch
`convert-all-pipelines`; you can see additional changes to do those for
context
[here](https://github.com/certbot/certbot/compare/convert-pr-tests...convert-all-pipelines).
Some notes:
- All github workflows must be flat in the `.github/workflows/`
directory.
- Github actions doesn't have a concept of "stages." Instead, it
generates a dependency graph, which is kind of nice. You can see an
example of a more complicated one
[here](https://github.com/certbot/certbot/actions/runs/24580625688).
- I don't know why the actions in the left bar (under Actions tab -->
All workflows) are using the path instead of the listed name. I suspect
it has something to do with not being run on main. Once it's merged, if
the name doesn't change, we can delete previous runs and that will clear
the entry on the left.
- "permissions" is for the fine-grained github PAT. contents: read is
needed for the "checkout" action, which basically everything uses. it's
still best practice to define per-workflow. it can also be defined
per-job, but per-workflow seemed nicer to me.
[This](https://docs.github.com/en/actions/reference/workflows-and-actions/workflow-syntax#permissions)
is the best permissions explanation I've found; [some
actions](https://github.com/actions/checkout) mention what permissions
they need.
- For definitions of the keywords to `on`, see
[here](https://docs.github.com/en/actions/reference/workflows-and-actions/events-that-trigger-workflows).
- Some of the potential inputs in tox steps are not used in this PR
because we're not running the AWS tests. It seemed messier to take them
out here and put them back later when the extended tests need them, but
I can do that on request.
We currently have a `main` [protection
rule](https://github.com/certbot/certbot/settings/branch_protection_rules/5466)
set that Azure pipelines PR test suite must pass before merging.
Obviously I don't want to turn that off before this PR is reviewed. In
github actions, it can only require a specific job to pass, though you
can have multiple. To address this, I've created a job that requires all
other jobs to pass, and that can be set at the required job. We probably
do not want to list every individual job, as that includes every job
generated by a matrix strategy. To find it in the protection rules page,
start typing "PR test suite success" and it will show up.
---------
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
Co-authored-by: Will Greenberg <willg@eff.org>
this fixes https://github.com/certbot/certbot/issues/10462 and i
personally think this is a nice change worth trying
now when people open (non-draft) PRs, one of us will automatically be
assigned hopefully helping to both automate and speed up the review
process
i personally don't think this PR requires two reviews
this pr is in response to https://words.filippo.io/compromise-survey/.
ohemorange and i read this late on a friday to (speaking for myself at
least) much panic as it has some very strong words to say about the
github actions trigger pull_request_target which we use. looking into
the issue more, i also found that the popular static analysis tool
[zizmor](https://github.com/zizmorcore/zizmor) flags any github actions
workflow that uses the pull_request_target trigger with the message:
```
error[dangerous-triggers]: use of fundamentally insecure workflow trigger
pull_request_target is almost always used insecurely
```
this only added to my concern
the general problem with pull_request_target is that it runs with
additional privileges (e.g. potential write access, access to secrets)
in an environment containing values that can be set by an attacker.
these values include things such as references to the arbitrary code
contained in the triggering pr and pr titles which have been used to
perform shell injection attacks. not carefully treating these values
like the untrusted data it is while executing code in the privileged
environment given to pull_request_target has resulted in many supply
chain attacks
that's not to say that pull_request_target CAN'T be used securely.
zizmor even has [an
issue](https://github.com/zizmorcore/zizmor/issues/1168) brainstorming
how to not warn about all uses of the trigger as some are clearly fine
and the only way to accomplish what the user wants. i'm going to argue
that our uses of the trigger are ok
looking through the links provided by filippo's blog and [zizmor's
docs](https://docs.zizmor.sh/audits/#dangerous-triggers), i think we can
break down attacks used against pull_request_target into roughly 2
categories:
1. shell injection: "Nx S1ingularity" and "Ultralytics" from filippo's
blog
2. checking out and running a PR's code: "Kong Ingress Controller" and
"Rspack" from filippo's blog and https://ptrpa.ws/nixpkgs-actions-abuse
from zizmor docs
i think none of our pull_request_target workflows have these problems.
none of them use a shell (the [zizmor
issue](https://github.com/zizmorcore/zizmor/issues/1168) i linked
earlier suggests that any pull_request_target workflow that uses a run
block should always be flagged as insecure). instead, our workflows just
call action-mattermost-notify which can be [pretty easily
audited](https://github.com/mattermost/action-mattermost-notify/blob/2.0.0/src/main.js)
(as all the other files in the repo are boilerplate). passing possible
attacker controlled values directly to an action written in another
language is one of the approaches for mitigating script injection
[recommended by
github](https://docs.github.com/en/actions/reference/security/secure-use#use-an-action-instead-of-an-inline-script).
our workflows also do not check out the triggering pr's code
despite all that, i took this opportunity to cleanup and harden things a
bit. i reduced the permissions for each workflow and confirmed they each
still work on my fork. i also pinned the mattermost action to an exact
version and added some inline documentation
with these changes, our github workflows trigger few to no
warnings/errors when checked with zizmor,
[octoscan](https://github.com/synacktiv/octoscan), and [openssf
scorecard](https://github.com/ossf/scorecard)
if this pr is approved, i'll make similar changes to our josepy repo
blast from the past! resurrects
https://github.com/certbot/certbot/pull/9803 with all of @bmw's changes.
i figured instead of force-pushing a basically brand new branch and
obliterating the old review, i'd just start from a clean slate
fixes#8272
---------
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
Co-authored-by: Brad Warren <bmw@eff.org>
We need this to create issues to track work like "update venv.py to
address upcoming pip build system deprecation" since we no longer have a
blank issue template.
Adding automation for team triage meetings for when PRs or Issues are
assigned. You can see an example in the "Test" channel.
---------
Co-authored-by: ohemorange <erica@eff.org>
- Better labels upon an issue going stale will help triage better. There
other PRs with "needs update" that are manually put and therefore we
can't explicitly filter for stalebot.
- For management purposes, being able to view how many issues are
auto-closed helps as well.
Pasted from the old one. Maybe we can just rename it but this is what
github's web interface led me to create.
I want to make sure that they at least create the template so that they
read it. If they then choose to ignore it that's fine, but it should
always pop up. Basically I want to keep the old behavior. Open to
alternatives.
We could also play around with the new issue forms:
https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/syntax-for-issue-forms
Or label this one the "bug" template, and create a second one that is
blank but has the header text paragraph. I haven't seen a way to make
something appear in all templates, including the "blank" one, other than
just turning off blank templates.
We're a few years behind the curve on this one, but using "master" as a
programming term is a callous practice that explicitly uses the
historical institution of slavery as a cheap, racist metaphor. Switch to
using "main", as it's the new default in git and GitHub.