* [VAULT-42245] Add IBM license update to enos upgrade scenario (#12661)
* initial changes
* more changes
* test
* test changes
* Fix test
* try ignoring customer id
* clean up
* more clean up
* lint
* PR comments
* make edition a variable
* lint
* PR comments
* add default for customer id
* fix script and lint
* specify license file
* Apply suggestion from @ryancragun
Co-authored-by: Ryan Cragun <me@ryan.ec>
* always configure ibm license
* Update enos/modules/verify_log_secrets/main.tf
Co-authored-by: Ryan Cragun <me@ryan.ec>
* lint
---------
Co-authored-by: Ryan Cragun <me@ryan.ec>
* lint
---------
Co-authored-by: Jenny Deng <jenny.deng@hashicorp.com>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* Add LDAP secrets engine blackbox tests
* Format
* format
* cleanup environment
* Install ldap-utils in CI for LDAP domain provisioning
* wrap in eventually
* debugging
* fix ip issues
Co-authored-by: Luis (LT) Carbonell <lt.carbonell@hashicorp.com>
- actions/cache => v5.0.4
Dep updates
- actions/download-artifact => v8.0.1
Support for CJK characters
- dorny/paths-filter => v4.0.1
Node 24, support for merge queues
- hashicorp/action-setup-enos => v1.52
Security release for downstream vuln
- pnpm/action-setup => v5.0.0
Node 24, support for native caching
- slackapi/slack-github-action => v3.0.1
Node 24, lots of internal dep updates, ability to run Slack commands
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* Fix GitHub Actions expression evaluation error in build workflow
- Add hcp-setup job with explicit step-by-step parameter validation
- Replace problematic inline expressions with debuggable logic steps
- Use proper fallback values (0 instead of '') for number type inputs
- Resolve 'Unexpected value' error on scheduled runs
- Maintain existing workflow logic and conditional behavior
- Add clear logging for troubleshooting parameter resolution
* Fix type conversion for pull-request number in build workflow
- Use fromJSON() to convert string output to number type
- Resolves type mismatch error in reusable workflow input
Co-authored-by: Luis (LT) Carbonell <lt.carbonell@hashicorp.com>
* actions: pull in gotestsum when executing the cloud scenario
* cloud: add 'hcp' changed-file group and trigger cloud scenario when the files change
* slightly simplify expression
---------
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
- docker/setup-buildx-action v3.12.0 => v4.0.0
Node 24 upgrade, switch to ESM, some deprecated inputs have been
removed.
- docker/build-push-action v6.19.2 => v7.0.0
Node 24 upgrade, switch to ESM, some deprecated envs have been
removed.
- actions/setup-node v6.2.0 => v6.3.0
Bug fixes, internal dep updates, support for parsing `devEngines`.
- action-setup-enos v1.50 => v1.51
Use enos 0.0.36
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* rework UI CI workflow to partition JS tests (#11967)
* add setup-pnpm action
* remove reading vault keys from vault server output
* update ci workflow to build app and go binary first, then run tests in partitions
* fix errant tests
* address PR feedback
* Apply suggestions from code review
Co-authored-by: Ryan Cragun <me@ryan.ec>
* more feedback changes
* restore test-helper.js
* restore auth test helpers
* check in ui/tests/helpers/vault-keys.js
* use v7 of download-artifact action
* make test-ui reusable workflow
* add status job
---------
Co-authored-by: Ryan Cragun <me@ryan.ec>
* update new UI tests to run CE tests on the CE branch (#12537)
---------
Co-authored-by: Matthew Irish <39469+meirish@users.noreply.github.com>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* adding ibm tests for ent files
* adding debug commands
* adding code changes
* adding reload tests
* remove settings.json
* remove ryboe q
* changing isHashicorpLicense to isIBMLicense and moving DiagnoseCheckLicenseGeneration to core_util_common.go
* fix test
* reverting non-license related tests
* reverting non-license related tests
* removing hashicorp license test
* modify reload server_ent_test.go
* change ibm-license paths
* adding census reload server test
* moving LicensingEntitlementSelectionConfig to core_util_common.go
* add EntReloadLicenseAndConfig to stubs
* fix operator diagnose bug
* move bug fix into ce and ent files
* add more ibm test cases
* Update command/command_testonly/server_testonly_ent_test.go
* address comments
* make fmt
---------
Co-authored-by: akshya96 <87045294+akshya96@users.noreply.github.com>
Co-authored-by: Jenny Deng <jenny.deng@hashicorp.com>
* rough draft
* add some stuff for dynamic secrets
* add some more helpers and sample tests
* new helpers, new tests, refactoring
* Add Basic Smoke SDK Scenario (#11678)
* Add simple test for stepdown election
* Add a smoke_sdk scenario
* add script to run tests locally
* fix up a few things
* VAULT-39746 - Add Tests to Smoke SDK and Cloud Scenarios (#11795)
* Add some go verification steps in enos sdk test run script
* formatting
* Add a smoke_sdk scenario userpass secret engine create test (#11808)
* Add a smoke_sdk scenario userpass secret engine create test
* Add some additional tests
* Add Smoke tests to Cloud Scenario (#11876)
* Add a smoke_sdk scenario userpass secret engine create test
* Add some additional tests
* Add smoke testing to cloud
* Add test results to output and test filtering
* comment
* fix test
* fix the smoke scenario
* Address some various feedback
* missed cleanup
* remove node count dependency in the tests
* Fix test perms
* Adjust the testing and clean them up a bit
* formatting
* fmt
* fmt2
* more fmt
* formatting
* tryagain
* remove the docker/hcp divide
* use the SHA as ID
* adjust perms
* Add transit test
* skip blackbox testing in test-go
* copywrite
* Apply suggestion from @brewgator
* Add godoc
* grep cleanup
---------
Co-authored-by: Josh Black <raskchanky@gmail.com>
Co-authored-by: Luis (LT) Carbonell <lt.carbonell@hashicorp.com>
Update to the latest actions. The primary motivation here is to get the
latest action-setup-enos.
- actions/cache => v5.0.3: security patches
- actions/checkout => v6.0.2: small fixes to git user-agent and tag
fetching
- hashicorp/action-setup-enos => v1.50: security patches
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
- actions/checkout -> v6.0.2: some minor changes around setting the
ACTIONS_ORCHESTRATION_ID and some fixes to `fetch-tags`.
- actions/setup-python -> v6.2.0: Node 24 compat
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* [VAULT-41857] pipeline(find-artifact): add support for finding artifacts from branches (#11799)
Add support for finding matching workflow artifacts from branches rather than PRs. This allows us to trigger custom HCP image builds from a branch rather than a PR. It also enables us to build and test the HCP image on a scheduled nightly cadence, which we've also enabled.
As part of these changes I also added support for specifying which environment you want to test and threaded it through the cloud scenario now that there are multiple variants. We also make the testing workflow workflow_dispatch-able so that we can trigger HVD testing for any custom image in any environment without building a new image.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* actions(runners): add backup self-hosted runner types
We've previously added backup runner types for various self-hosted
runners but were not exhaustive. This change adds at least one backup
instance type to each specified on-demand runner type.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
- actions/cache -> v5.0.2: A bugfix around not retrying cache entries on
429s.
- actions/setup-go -> v6.2.0: NodeJS bump and internal actions/cache
bump. We don't use the caching in setup-go so this ought to have no
impact for us.
- actions/setup-node -> v6.2.0: internal bump of actions/cache.
- pnpm/action-setup -> v4.2.0: Adds support for .npmrc file.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
Sometimes our CI Slack message reports the wrong information, most
notably a data race failure when only the UI tests ran and it was the UI
tests that failed. While fixing this false positive I noticed several
other error cases we didn't consider when creating the notification.
Now the message reports only the failures that were actually detected.
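As a rough illustration of the idea (not the workflow's actual implementation), the message can be built only from jobs whose result is `failure`, so skipped jobs never contribute a line. The job names and the `failure_summary` helper below are hypothetical; the result strings mirror the values GitHub Actions exposes for `needs.<job>.result`.

```python
def failed_jobs(results: dict[str, str]) -> list[str]:
    """Return the names of jobs whose result is exactly 'failure', sorted for stable output."""
    return sorted(name for name, result in results.items() if result == "failure")


def failure_summary(results: dict[str, str]) -> str:
    """Build a one-line summary that mentions only the jobs that actually failed."""
    failed = failed_jobs(results)
    if not failed:
        return "No failures detected"
    return "Failures detected: " + ", ".join(failed)


# Hypothetical run where only the UI tests executed and failed.
example = {"test-go": "skipped", "test-go-race": "skipped", "test-ui": "failure"}
print(failure_summary(example))
```

Because skipped and cancelled jobs are filtered out, a run that only exercises the UI tests cannot be reported as a data race failure.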
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* move from yarn to pnpm for package management
* remove lodash.template patch override
* remove .yarn folder
* update GHA to use pnpm
* add @babel/plugin-proposal-decorators
* remove .yarnrc.yml
* add lock file to copywrite ignore
* add @codemirror/view as a dep for its types
* use more strict setting about peerDeps
* address some peerDep issues with ember-power-select and ember-basic-dropdown
* enable TS compilation for the kubernetes engine
* enable TS compilation in kv engine
* ignore workspace file
* use new headless mode in CI
* update enos CI scenarios
* add qs and express resolutions
* run 'pnpm up glob' and 'pnpm up js-yaml' to upgrade those packages
* run 'pnpm up preact' because posthog-js had a vulnerable install. see https://github.com/advisories/GHSA-36hm-qxxp-pg3
* add work around for browser timeout errors in test
* update other references of yarn to pnpm
Co-authored-by: Matthew Irish <39469+meirish@users.noreply.github.com>
* Remove esoteric builds
Builds we want gone:
- NetBSD (386/amd64/arm)
- OpenBSD (386/amd64/arm)
- Solaris
- FreeBSD (arm)
- Linux (arm)
* trying to make the linter happy
Co-authored-by: Josh Black <raskchanky@gmail.com>
This change does a few things that might not be obvious:
- We stop requesting the previous runner image. This will result in us
using Docker 29 instead of 28. With this comes changes in our
container build system, most notably that container images are now
exported as OCI images. Every container runtime that we support also
supports OCI images so this ought to have no meaningful impact to
downstream users. One noticeable change is that the image layers are
now compressed so the final image size on disk will be considerably
smaller than before.
- Upgrade `hashicorp/action-setup-enos` to the latest version. This is not
strictly required for this change but as we just released a new version of
the CLI it makes sense to update it here. We should also note that we
recently released a new version of `terraform-provider-enos`, which contains
changes necessary for this change: our docker and kind resources needed to be
updated to handle both OCI and Docker exported images. Previously they relied
on files that existed only in Docker images.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
Update the base images for all scenarios:
- RHEL: upgrade base image for 10 to 10.1
- RHEL: upgrade base image for 9 to 9.7
- SLES: upgrade base image for 15 to 15.7
- SLES: add SLES 16.0 to the matrix
- OpenSUSE: remove OpenSUSE Leap from the matrix
I ended up removing OpenSUSE because the images that we were on were rarely updated and that resulted in very slow scenarios because of package upgrades. Also, despite the latest release being in October I didn't find any public cloud images produced for the new version of Leap. We can consider adding it back later but I'm comfortable just leaving SLES 15 and 16 in there for that test coverage.
I also ended up fixing a bug in our integration host setup where we'd provision three nodes instead of one. That ought to result in many fewer instance provisions per scenario. I also had to make a few small tweaks in how we detected whether or not SELinux is enabled, as the prior implementation did not work for SLES 16.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* use 'stable' instead of .go-version for the security scanner
if we don't do this, the security scanner might not run because it's
using a different version of Go than what we have on whatever release
branch this is running on.
* update branches the scanner runs on
Co-authored-by: Josh Black <raskchanky@gmail.com>
Fix an incompatibility where we check out the repository with
checkout@v6 and then attempt to check it out again at checkout@v5 in the
set-product-version action.
* update enos directory to trigger lint
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
This was started to remove a trailing " that would show up when UI tests
failed. Since I was here I normalized our emoji to use `flashing-light`
instead of `rotating_light` because the former is rendered better in the
new Slack instance.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* actions(setup-enos): update action-setup-enos to pull in enos 0.0.34 (#10561)
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
When a pull request is created against a CE branch and it has changed any files in the `gotoolchain` group, we'll automatically diff every Go module file in the repo against its equivalent in the corresponding enterprise branch. If there's a delta in equivalent configuration, the `build/ce-checks` job will automatically fail. A complete explanation of the diff is written to both the step output and the `build/ce-checks` job step summary.
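A minimal sketch of what such a comparison could look like follows. It assumes the configuration of interest is the `go` and `toolchain` directives of each `go.mod` file; that is an assumption drawn from the `gotoolchain` group name, not something this change states. The function names are hypothetical.

```python
import re

# Matches top-level `go` and `toolchain` directives, tolerating a trailing comment.
DIRECTIVE = re.compile(r"^(go|toolchain)[ \t]+(\S+)[ \t]*(?://.*)?$", re.MULTILINE)


def toolchain_directives(go_mod_text: str) -> dict[str, str]:
    """Extract the `go` and `toolchain` directive values from go.mod text."""
    return {name: value for name, value in DIRECTIVE.findall(go_mod_text)}


def toolchain_delta(ce_text: str, ent_text: str) -> dict[str, tuple]:
    """Return {directive: (ce_value, ent_value)} for every directive that differs."""
    ce = toolchain_directives(ce_text)
    ent = toolchain_directives(ent_text)
    return {
        name: (ce.get(name), ent.get(name))
        for name in sorted(ce.keys() | ent.keys())
        if ce.get(name) != ent.get(name)
    }
```

An empty result would mean the two files agree; any entry could be rendered into the step summary and used to fail the check.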
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
Migrate all slack notifications to the `ibm-hashicorp` workspace. This
required creating three new `incoming-webhook` configurations which are
capable of posting into three different Slack channels, depending on the
workflow.
As they all use the `incoming-webhook` event, many of our integrations
had to be migrated from `chat.postMessage` and those changes are
reflected here.
Of note, there are lots of changes to the `release-procedure-ent`
workflow as it has by far the most uses of the Slack integrations. In
some cases it was to appease `actionlint` issues, in others I made small
idiomatic tweaks. I translated all of the payload messages to YAML
instead of JSON, both because YAML fits better into our existing
workflows and because most of the payload messages were invalid JSON
altogether.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
Our service users now have compatible use cases that allow us to use
the service user credentials everywhere. Drop `action-doormat` so that
our workflows execute correctly in the `hashicorp/vault` context.
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* SECVULN-22299: Use Doormat GitHub Action in CI
* remove step id
* remove step id
* grab aws account id in separate step
* add oidc perms
* add perms for other workflows
* remove usages of aws login creds
* add conditions for CE vs ent
* fix lint
* test perms
* add perms
* fix metadata
* update role arn
* use ci role arn
* print secret
* try again
* try workaround
* update all arns
* remove echo step
* cleanup
* cleanup
* address feedback
* re-add perms
* use service account
* fix conflict
* address feedback
* add read permission
* use write-all
* expose role arn
Co-authored-by: Charles Nwokotubo <charles.nwokotubo@hashicorp.com>
* actions: use self-hosted runners in hashicorp/vault
While it is recommended that we use self-hosted runners for every
workflow in private and internal accounts, this change was primarily
motivated by different runner types using different cache paths. By
using the same runner type everywhere we can avoid double caches of the
internal Vault tools.
* disable the terraform wrapper in ci-bootstrap to handle updated action
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>
* [VAULT-39671] tools: use github cache for external tools
We currently have some ~13 tools that we need available both locally for
development and in CI for building, linting, formatting, and testing Vault.
Each branch that we maintain often uses the same set of tools but often pinned
to different versions.
For development, we have a `make tools` target that will execute the
`tools/tools.sh` installation script for the various tools at the correct pin.
This works well enough but is cumbersome if you’re working across many branches
that have divergent versions.
For CI the problem is speed and repetition. For each build job (~10) and Go test
job (16-52) we have to install most of the same tools. As we have extremely
limited GitHub Actions cache we can’t afford to cache the entire Vault Go build
cache, so if we were to build the tools from source each time we would incur a
penalty of downloading all of the modules and building each tool from source.
This yields about an extra 2 minutes per job to install all of the tools. We’ve
worked around this problem by writing composite actions that download pre-built
binaries of the same tools instead of building them from source. That usually
takes a few seconds. The downside of that approach is rate limiting, which
Github has become much more aggressive in enforcing.
That leads us to where we are before this work:
- For builds in the compatibility docker container: the tools are built from
source and cached as a separate builder image layer. (usually fast as we get
cache hits, slow on cache misses)
- For builds that compile directly on the runner: the tools are installed on
each job runner by composite github actions (fast, uses API requests, prone
to throttling)
- For tests, they use the same composite actions to install the tools on each
job. (fast, uses API requests, prone to throttling)
This also leads to inconsistencies since there are two sources of truth: the
composite actions have their own version pin outside of those in `tools.sh`.
This has led to drift.
We previously tried to save some API requests and move all builds into
the container. That almost works but docker's build container had a hard
time with some esoteric builds. We could special case it but it's a band-aid at
best.
A prior version of this work (VAULT-39654) investigated using `go tool`, but
there were some showstopper issues with that workflow that make it a non-starter
for us. Instead, we’ll attempt to use more actions cache to resolve the
throttling. This will allow us to have a single source of truth for tools, their
pins, and afford us the same speed on cache hits as we had previously without
downloading the tools from github releases thousands of times per day.
We add a new composite github action for installing our tools.
- On cache misses it builds the tools and installs them into a cacheable path.
- On cache hits it restores the cacheable path.
- It adds the tools to the GITHUB_PATH to ensure runner based jobs can find
them.
- For Docker builds it mounts the tools at `/opt/tools/bin` which is
part of the PATH in the container.
- It uses a cache key of the SHA of the tools directory along with the
working directory SHA which is required to deal with actions/cache
issues.
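To make the cache key idea concrete, here is a hedged sketch of how a key could be derived from the tools directory contents plus the working directory. The helper names, the `tools` prefix, the 16-character truncation, and the reading of "working directory SHA" as a hash of the working directory path are all assumptions for illustration; the real action may compute these differently.

```python
import hashlib
from pathlib import Path


def directory_digest(root: Path) -> str:
    """Return a SHA-256 hex digest over the relative path and contents of every file under root."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def tools_cache_key(tools_dir: Path, working_dir: str, prefix: str = "tools") -> str:
    """Combine the tools directory digest with a digest of the working directory path."""
    # Assumption: the working directory component is a hash of its path string.
    workdir_digest = hashlib.sha256(working_dir.encode()).hexdigest()
    return f"{prefix}-{directory_digest(tools_dir)[:16]}-{workdir_digest[:16]}"
```

With a key like this, any change to a tool pin or install script produces a new key and therefore a cache miss, while jobs sharing the same tools directory and working directory reuse a single cache entry.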
This results in:
- A single source of truth for tools and their pins
- A single cache for tools that can be re-used between all CI and build jobs
- No more Github API calls for tooling. *_Rate limiting will be a thing of
the past._*
Signed-off-by: Ryan Cragun <me@ryan.ec>
Co-authored-by: Ryan Cragun <me@ryan.ec>