Commit graph

324 commits

Author SHA1 Message Date
Jesse Hallam
b60ba8b6b4
chore(ci): allow build-server-image to build and push from release branches (#36716) 2026-05-22 14:15:23 -04:00
Jesse Hallam
834a86b5e3
MM-68248: Support OpenSearch v3 (#36617)
* Remove build-opensearch-image.yml -- the published image is unused

* MM-68248: Support OpenSearch v3

* MM-68248: Add CI job to test OpenSearch v2 backwards compatibility

* MM-68248: Handle missing indexes gracefully before reindex

OpenSearch v3 rejects _update_by_query and _delete_by_query with no index
argument (405), and returns index_not_found_exception (404) when querying
an exact index name that hasn't been created yet. Both arise before any
reindex has run, since indexes are created on first document write.

Return nil/empty instead of an error from all affected operations, and add
test coverage for each in the no-indexes state.

* MM-68248: Fix copy-paste operation names in DeleteFilesBatch

* MM-68248: Add i18n string for delete_files_batch error

* Revert "MM-68248: Add i18n string for delete_files_batch error"

This reverts commit e885678088.

* Revert "MM-68248: Fix copy-paste operation names in DeleteFilesBatch"

This reverts commit 4b7caacf59.

* Revert "MM-68248: Handle missing indexes gracefully before reindex"

This reverts commit 2d2d522f86.
2026-05-22 15:37:50 +00:00
Christopher Poile
03f2eaaa0b
[MM-68400] Four plugin hooks and ChannelGuard enforcement (#36152)
* allow workflow_dispatch trigger for Server CI (for plugins CI)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [MM-68402] MBE Phase 2: declare four generic plugin hooks (#36291)

* new hooks-only phase 2

* remove ChannelWillBeMoved

* remove RecapWillBeProcessed and MessageWillBeRewrittenByAI

Drop the AI/recap hooks from the new-hook surface; AI-LLM paths
remain uncovered in tech preview and are documented as residuals.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [MM-68403] MBE Phase 3: ChannelGuards primitive (storage + cache + plugin API) (#36365)

* phase 3

* phase 3: register ChannelGuard mock in test setup helper

NewChannels' startup-time call to reloadGuardCache invokes
s.ChannelGuard().GetAll(); without an expectation on the mock store,
every test that sets up the server with GetMockStoreForSetupFunctions
panics during init.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* phase 3: register ChannelGuard mock in retrylayer test

retrylayer.New walks every store getter to wrap it; without the mock
expectation on ChannelGuard, TestRetry panics during layer construction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* use rctx properly in the store methods

* phase 3: match rctx arg in testlib ChannelGuard mock

GetAll now takes request.CTX, so the testify expectation must include
mock.Anything; otherwise the call panics under the mocked store.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* phase 3: set api.ctx in TestChannelGuardLowercaseNormalization

The test constructs PluginAPI directly without a ctx, which used to
work when App.RegisterChannelGuard built its own EmptyContext. Now
that the App methods take rctx from the caller, the nil ctx panics
inside RequestContextWithMaster.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [MM-68404] MBE Phase 4: App-layer plugin hook wiring (#36407)

* phase 4

* Fix nil rctx in TestChannelGuardLowercaseNormalization

The PluginAPI struct literal was missing ctx: rctx after a refactor
moved the rctx declaration below the struct construction, leaving
api.ctx as nil. This caused a nil pointer dereference in reloadGuardCache
when RegisterChannelGuard called store.RequestContextWithMaster(nil).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* Remove ChannelWillBeMoved hook call from MoveChannel (phase 4)

The hook and its ID were removed from mbe-phase-2 but the call site in
MoveChannel and its i18n string were not cleaned up during the rebase.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* remove channel will be moved test

* Remove RecapWillBeProcessed and MessageWillBeRewrittenByAI hook calls (phase 4)

The hooks and their IDs were removed from mbe-phase-2 but the call sites
in ProcessRecapChannel and RewriteMessage, their i18n strings, and their
tests were not cleaned up during the rebase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Revert channel_id plumbing on rewrite endpoint (phase 4)

The channel_id field on RewriteRequest was added in phase 4 to feed the
synthetic post passed to MessageWillBeRewrittenByAI. With that hook
removed from mbe-phase-2, channel_id has no consumer; revert the field,
the api4 validation, the app.RewriteMessage parameter, and the
corresponding webapp client + hook plumbing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [MM-68555] MBE Phase 5: Channel-guard enforcement + two-phase dispatch (#36473)

* phase 5

* Bake plugin counter-file paths into source instead of env vars

t.Setenv panics when an ancestor test calls t.Parallel, so the two
channel-guard tests broke under ENABLE_FULLY_PARALLEL_TESTS in CI.
Build each plugin source per-subtest with its temp file path embedded
as a Go literal — same pattern as TestPluginUploadsAPI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Remove guarded helpers and tests for dropped hooks (phase 5)

The runGuardedRecapWillBeProcessed and runGuardedMessageWillBeRewrittenByAI
helpers were never wired (their app-layer call sites were already removed
in the phase-4 cleanup), and the corresponding sub-tests across panic /
allow / reject / partial plugins reference hooks that no longer exist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [MM-68405] MBE Phase 6: fire MessagesWillBeConsumed on the edit path (#36475)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* rebase onto master

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-21 18:16:05 +00:00
Eva Sarafianou
5cacd26776
Fix config-change-checker to use merge-base for per-file diffs (#36670)
Automatic Merge
2026-05-21 10:23:44 +02:00
Maria A Nunez
2db507464d
Add auth token to flaky test webhook (#36636)
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-19 20:34:27 -04:00
Amy Blais
0675d0ea0b
Automations for config.json, API, audit log event, and Go release notes (#36075)
* Create config-change-checker.yml

* Create check_config_changes_ci.py

* Update config-change-checker.yml

* Update check_config_changes_ci.py

* Update check_config_changes_ci.py

* Update check_config_changes_ci.py

* Update check_config_changes_ci.py

* Update config-change-checker.yml

* Update check_config_changes_ci.py

* Update config-change-checker.yml

* Update config.go

* Fix check_api to detect multi-line and multi-method endpoints

The previous implementation matched the .Handle(...).Methods(...) regex
line-by-line against diff lines. This silently missed two real and
common patterns in api4/:

  1. Multi-line .Handle(...) declarations — e.g. group.go has 18 of
     them, where the path lives on one line and the wrapper/handler on
     the next. The regex never matched, so PRs adding such endpoints
     produced empty release-note entries.

  2. Multi-method declarations like
     .Methods(http.MethodGet, http.MethodHead) (4 instances in file.go)
     — the old regex required a closing paren immediately after the
     first method.

The fix:

  - Add a file_at(ref, path) helper that snapshots a file at a git ref
    via 'git show', so checkers can compare full file states instead of
    pattern-matching diff text.
  - Add _scan_endpoints() that whitespace-collapses the file before
    matching, letting the regex span what were originally multiple
    lines.
  - Loosen _HANDLE_RE to capture the methods list as a substring and
    extract individual HTTP verbs with a known-method allowlist, so
    multi-method declarations produce one entry per verb.
  - Switch check_api to set-diff (after - before) / (before - after)
    on the parsed endpoint sets. This also cleanly handles routes
    that move within a file (no fragile add/remove dedup needed).
  - Anchor the new/deleted file detection to '^new file mode \d+' to
    avoid false positives from stray text in source files.
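
The check_api approach described above (collapse whitespace, match, then set-diff) can be sketched roughly as follows. This is an illustrative sketch only: `HANDLE_RE`, `HTTP_METHODS`, `parse_methods`, `scan_endpoints`, and `diff_endpoints` are hypothetical stand-ins, and the script's actual `_HANDLE_RE` and helpers may differ.

```python
import re

# Hypothetical stand-ins for the script's _HTTP_METHODS and _HANDLE_RE.
HTTP_METHODS = {"GET", "HEAD", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"}
HANDLE_RE = re.compile(
    r'\.Handle\(\s*"([^"]*)"\s*,'   # route path
    r'(?:(?!\.Handle\().)*?'        # wrapper/handler, never crossing into the next .Handle(
    r'\)\s*\.Methods\(([^)]*)\)'    # raw methods list
)


def parse_methods(raw):
    """Extract HTTP verbs from e.g. 'http.MethodGet, http.MethodHead' or '"GET"'."""
    verbs = set()
    for token in re.findall(r"[A-Za-z]+", raw):
        verb = token.removeprefix("Method").upper()
        if verb in HTTP_METHODS:  # the allowlist drops stray identifiers such as 'http'
            verbs.add(verb)
    return verbs


def scan_endpoints(source):
    """Return the set of (verb, path) pairs declared in a Go source string."""
    # Collapsing whitespace lets the regex span declarations split across lines.
    collapsed = " ".join(source.split())
    return {
        (verb, path)
        for path, raw in HANDLE_RE.findall(collapsed)
        for verb in parse_methods(raw)
    }


def diff_endpoints(before, after):
    """Set-diff two file snapshots into (added, removed) endpoint lists."""
    old, new = scan_endpoints(before), scan_endpoints(after)
    return sorted(new - old), sorted(old - new)
```

Because the comparison is between parsed sets rather than diff lines, a route that only moves within the file appears in neither list.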

Made-with: Cursor

* Track enclosing struct in check_config to avoid dedup collisions

The previous check_config keyed its add/remove dedup on the bare field
name. The dedup intent was to ignore fields that were merely reordered
within config.go (which appear in the diff as both '-Foo' and '+Foo').

But because the key was just the field name, an unrelated rename in one
struct could silently cancel out a real new field with the same name in
a different struct. For example, in a single PR:

    -    EnableFoo *bool   // removed from ServiceSettings
    +    EnableFooV2 *bool

    -    EnableBar *bool   // removed from EmailSettings
    +    EnableFoo *bool   // newly added — but wrongly cancelled below

The dedup would see 'EnableFoo' in both lists and drop both entries,
hiding the brand-new EmailSettings.EnableFoo from the release-note
output.

The fix tracks each field's enclosing struct using a brace-depth stack
that walks the file at BASE_SHA and HEAD_SHA. Fields are keyed as
(struct_name, field_name) tuples, so identically-named fields in
different structs are distinct, and the dedup only collapses true
reorderings. As a side benefit the rendered output is now
'StructName.FieldName' which is much more useful to reviewers.

Switching to file-at-revision scanning + set diff also removes the
custom dedup logic entirely — set arithmetic handles "moved within
file" naturally.

Made-with: Cursor

* Switch remaining checkers to file-at-revision style; drop lines_by_sign

check_audit_events and check_go_version still parsed +/- diff lines
directly, with the same brittle dedup-and-cancel logic that was used in
the previous check_config. After the previous two commits the rest of
the file uses the file_at(ref, path) helper to compare full file
states between BASE_SHA and HEAD_SHA, which:

  - removes the entire moved-within-file dedup dance (set arithmetic
    handles it for free),
  - aligns all four checkers on a single, easy-to-reason-about pattern,
  - is robust to whitespace-only or reordering edits in the watched
    files.

For Dockerfile.buildenv the helper also avoids a subtle case where the
old code only inspected +/- lines: an edit to an unrelated RUN line
that didn't touch the FROM line could in theory leave both old_ver and
new_ver as None even though the version was effectively unchanged.
Reading the file at each revision compares the actual current and
previous FROM line directly.
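
As a rough illustration of comparing the FROM line at each revision, the sketch below takes the two file contents as strings (in the script they would come from the file_at helper). The `golang:` image name pattern, `go_version`, and `describe_go_change` are assumptions made for this example; the real Dockerfile.buildenv and checker may look different.

```python
import re

# Assumed shape of the FROM line: any image reference ending in golang:<version>.
FROM_RE = re.compile(r"^FROM\s+\S*golang:(\d+(?:\.\d+)*)", re.MULTILINE)


def go_version(dockerfile):
    """Return the Go version on the first matching FROM line, or None."""
    match = FROM_RE.search(dockerfile or "")
    return match.group(1) if match else None


def describe_go_change(before, after):
    """Compare full file contents at two revisions instead of +/- diff lines."""
    old, new = go_version(before), go_version(after)
    if new is None or old == new:
        return None  # nothing to report, e.g. only an unrelated RUN line changed
    if old is None:
        return f"Go runtime set to {new}."  # hypothetical wording for a new file
    return f"Go runtime updated from {old} to {new}."
```

An edit that touches only a RUN line yields `None` here, since both snapshots carry the same FROM version.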

The lines_by_sign helper now has no callers, so remove it.

Made-with: Cursor

* Update config.go

* Update config.go

* Update check_config_changes_ci.py

* Update check_config_changes_ci.py

* Update check_config_changes_ci.py

* Update check_config_changes_ci.py

* Tighten check_config_changes_ci.py: regex coverage + idempotency

- Restore tolerant `_HANDLE_RE` so 2-arg wrappers (e.g. `api.APISessionRequired(handler, handlerParamFileAPI)`)
  are not silently dropped from the api4 endpoint scan; broaden the `.Methods(...)`
  capture so string-literal variants (`Methods("GET")`) work too. Filtering moves
  back to the `_HTTP_METHODS` allowlist in `_parse_methods` to keep stray
  identifiers from being treated as HTTP verbs.
- Make `strip_old_note` also remove auto-generated lines that landed outside
  the ```release-note fence (the inject_note fallback paths) so reruns no
  longer accumulate duplicates when a PR has no fence.
- Skip the GitHub PATCH when the PR description is already up to date, so
  every commit no longer triggers an unconditional write.
- Wire up `check_go_version`'s `additions` path in `_format_lines` and
  `_AUTO_LINE_RE` so a freshly-added Dockerfile.buildenv emits a note.
- Remove the now-dead `CheckResult.to_markdown` method (replaced by
  `_format_lines`).

Made-with: Cursor

* Restore ExperimentalSettings.EnableWatermark

The field was removed in f71527f0b1 but `server/config/client.go`,
`server/config/client_test.go`, and `server/public/model/config_test.go`
still reference it (added on master in #36025). Restoring the field
makes the branch compile again so CI can go green.

Made-with: Cursor

* Replace placeholder release-note content (NONE / N/A) on injection

The script previously appended its auto-detected lines INSIDE the
```release-note fence but never displaced template placeholders, so PRs
that only had `NONE` ended up with output like:

    NONE
    Added `Foo.Bar` configuration setting.
    Go runtime updated from 1.25.8 to 1.25.9.

When the existing fence content is empty or consists only of placeholder
tokens (NONE, N/A, NA, dashes — case-insensitive), replace it entirely
with the auto-detected entries. User-written human content is still
preserved by appending instead.

Idempotent: stripping followed by re-injection keeps the placeholder
visible when there's nothing to inject, and replaces it again when there
is.
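
The placeholder handling described above might look roughly like this. It is a sketch under stated assumptions: `FENCE_RE`, `PLACEHOLDER_RE`, `is_placeholder`, and this `inject_note` are hypothetical and cover only the case where the PR body already contains a release-note fence.

```python
import re

FENCE = "`" * 3  # built at runtime so this example does not embed a literal fence

# Hypothetical patterns for the fence and for placeholder-only lines.
FENCE_RE = re.compile(
    "(" + FENCE + r"release-note[ \t]*\n)(.*?)(\n?" + FENCE + ")", re.DOTALL
)
PLACEHOLDER_RE = re.compile(r"^(none|n/?a|-+)$", re.IGNORECASE)


def is_placeholder(content):
    """True when content is empty or made up only of placeholder tokens."""
    lines = [ln.strip() for ln in content.splitlines() if ln.strip()]
    return all(PLACEHOLDER_RE.match(ln) for ln in lines)


def inject_note(body, auto_lines):
    """Replace placeholder fence content, or append to human-written content."""
    match = FENCE_RE.search(body)
    if not auto_lines or match is None:
        return body  # nothing to inject: any placeholder stays visible
    existing = match.group(2).strip("\n")
    if is_placeholder(existing):
        content = "\n".join(auto_lines)
    else:
        content = existing + "\n" + "\n".join(auto_lines)
    return (
        body[: match.start()] + match.group(1) + content + "\n" + FENCE
        + body[match.end():]
    )
```

A body whose fence holds only `NONE` ends up with just the auto-detected lines, while a fence with a hand-written note keeps that note and gains the new lines after it.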

Made-with: Cursor

* Update config-change-checker.yml

* Update check_config_changes_ci.py

---------

Co-authored-by: Your Name <eva.sarafianou@gmail.com>
Co-authored-by: Mattermost Build <build@mattermost.com>
2026-05-19 12:04:25 +03:00
sabril
9d318dc4cd
refactor: speed up E2E test workflows and eliminate npm cache-restore failures (#36599)
Workers no longer run `npm ci` — `node_modules` and framework binaries
are restored from actions/cache populated once by a new `prep-deps` job.
This closes the intermittent EEXIST/ENOENT failure inside npm's own
cacache writer that occasionally fails `npm ci` on a runner. Removing
`npm ci` from workers also cuts ~5 min of duplicated install work per worker.

dispatch-begin now runs as its own job after prep-deps so it fires once the
per-worker test-server setup is the only remaining work before dispatch-run.
2026-05-19 11:26:39 +08:00
sabril
1d1580cb3c
chore: update reusable workflows to specific commit sha (#36600) 2026-05-19 10:57:53 +08:00
sabril
8eb97fa6c3
refactor: remove redundant status update jobs from E2E test workflows (#36579)
* refactor: remove redundant status update jobs from E2E test workflows

* refactor: rename context-name to commit-status-context in E2E test workflows
2026-05-16 10:26:00 +08:00
Maria A Nunez
d75155b39d
Add flaky test webhook notification (#36573)
* Add flaky test webhook notification

Co-authored-by: Cursor <cursoragent@cursor.com>

* Bound flaky test webhook request time

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-15 12:10:05 -04:00
David Krauser
4aa1c58e37
ci: invalidate poisoned shard-timing cache and guard future saves (#36568) 2026-05-14 16:01:49 +00:00
yasser khan
47d4720ff4
chore(ci): consolidate openldap runner prep into a composite action (#36563)
2026-05-14 14:28:51 +05:30
Jesse Hallam
e3fbf8711f
MM-68149: Upgrade to Go 1.26.2 (#36418)
* MM-68149: upgrade to Go 1.26.2

Update go directive in go.mod and .go-version.

* MM-68149: replace pointer helpers with Go 1.26 new()

Go 1.26 extends the built-in new() to accept an initial value expression,
making typed-pointer helpers like model.NewPointer(x), bToP(x), and boolPtr(x)
redundant. Replace every call site with new(x) and remove the now-unused
helper functions and their //go:fix inline directives.

* MM-68149: apply go fix for reflect API and format-string changes

- reflect.Ptr → reflect.Pointer (renamed in Go 1.18, deprecated alias removed in 1.26)
- reflect range-over-struct: for i := 0; i < t.NumField(); i++ → for field := range t.Fields()
  and the equivalent for Methods() and interface types
- Fix format-string concatenation and variadic-arg mismatches flagged by go vet

* MM-68149: update JPEG fixtures and test infrastructure for Go 1.26 encoder

Go 1.26 ships a new image/jpeg encoder that produces slightly different output.
Regenerate all JPEG fixture files and switch the comparison helpers from
byte-equality to pixel-level comparison with a small per-channel tolerance,
so minor encoder drift across patch versions is handled automatically.

Add -update-fixtures flag to make it easy to regenerate fixtures after future
major Go upgrades. Document the update procedure in tests/README.md.
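
The switch from byte-equality to pixel-level comparison can be illustrated with a small sketch. It is written in Python rather than the project's Go, operates on plain lists of channel tuples instead of decoded images, and the default tolerance of 2 is an assumed value; `images_match` is a hypothetical name, not the test helper from the commit.

```python
def images_match(expected, actual, tolerance=2):
    """Compare two pixel sequences channel by channel within a tolerance.

    Each image is a sequence of equal-length channel tuples, e.g. (r, g, b).
    """
    if len(expected) != len(actual):
        return False  # different dimensions can never match
    for want, got in zip(expected, actual):
        if len(want) != len(got):
            return False
        # Any single channel drifting past the tolerance fails the comparison.
        if any(abs(a - b) > tolerance for a, b in zip(want, got)):
            return False
    return True
```

Small per-channel differences of the kind a changed encoder might produce pass, while a visibly different pixel or a size mismatch still fails.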

* MM-68149: CI check that go fix ./... produces no changes

* Fix real bugs flagged by CodeRabbit review

- group.go: set newGroup.MemberCount not group.MemberCount (member count
  was populated on the wrong variable and lost before publish/return)
- file_test.go: guard compareImage(GetFilePreview) on the preview slice
  length, not the thumbnail slice length (copy-paste error)
- config_test.go: remove duplicate MinimumLength assignment

* fixup! Fix real bugs flagged by CodeRabbit review
2026-05-12 15:59:12 +00:00
Maria A Nunez
dbaaa36416
ci: gate flaky PR comments on zero merged failures (#36474)
Only comment when action-junit-report reports no failing tests after retry
merging, so all-failure retries are not labeled flaky.

Remove skip/JIRA guidance from the comment body and use neutral wording
for the workflow run link.

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2026-05-11 10:43:05 -04:00
sabril
52c400ed1f
Update E2E test workflows to use context names and server images and bump playwright workers to 10 (#36496)
* Update E2E test workflows to use context names and server images and bump playwright workers to 10

* refactor: update branch naming conventions in E2E test workflows for better aggregation
2026-05-11 11:57:52 +08:00
yasser khan
b052f3463a
E2E/Playwright: balance shard timing by enabling fullyParallel in CI (#36054) 2026-05-08 21:04:32 +00:00
sabril
c3322b3a05
fix: permission required by e2e test-system-io actions (#36478) 2026-05-08 10:56:30 +08:00
sabril
33ddd8a47b
fix: permission required by test-system-io actions (#36477) 2026-05-08 10:36:46 +08:00
sabril
91de3d2383
SEC-10179 Integrate test system IO for Playwright and Cypress (#36376)
* change retry to 1, fixed and disabled failed tests

* add v2 templates for Cypress and Playwright E2E tests with test system io integration

* add commenting to pr

* identify more playwrights to fix separately

* disable deletion-report.spec for separate fix

---------

Co-authored-by: Mattermost Build <build@mattermost.com>
2026-05-08 02:15:49 +00:00
sabril
154286f53f
fix: only run e2e tests for fips for versions v11+ (#36374) 2026-05-04 23:11:55 +08:00
Eva Sarafianou
b7a97f4bdc
ci: disable fullyparallel for unsharded weekly Postgres jobs (#36390)
Restore the `fullyparallel: false` override for the unsharded
`Postgres with binary parameters` and `Postgres FIPS` jobs in the
weekly workflow. The override was originally added to the binary
parameters job in #35995 to prevent resource exhaustion on a single
runner, but was dropped when both jobs moved into
server-ci-weekly.yml in #36036, leaving them on the template default
of `true`.

Without it, the hosted runner is overwhelmed (too many server
instances, WebSocket hubs, and DB connections) and the runner agent
itself loses communication with GitHub mid-run, surfacing as
"hosted runner lost communication with the server" at ~55-60 min
into the Run Tests step. Both runs on April 27 and May 4 failed
this way; the sharded FIPS variant retained for FIPS-touching PRs
in server-ci.yml is unaffected because each shard handles only a
fraction of the packages.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:52:35 +03:00
Amy Blais
24f9da39cd
Update docs-impact-review.yml (#36260) 2026-04-27 14:57:52 +03:00
Jesse Hallam
5817a6d687
Simplify PULL_REQUEST_TEMPLATE.md and document it in AGENTS.md (#36239) 2026-04-24 09:44:44 -03:00
David Krauser
e8c9e525e1
Fix silent test discovery failure in sharded CI (#36222)
2026-04-22 22:25:56 -04:00
Nuno Simões
e5a7230242
ci: fix cypress statuses perms (#36220)
Error: Action failed with error: HttpError: Resource not accessible by integration
https://github.com/mattermost/mattermost/pull/36178
2026-04-22 21:36:39 +00:00
Jesse Hallam
b82cfe1a4e
ci: collect coverage inline on Postgres job, remove duplicate Coverage job (#36216)
The Coverage job ran the same tests as Postgres with only ENABLE_COVERAGE=true
and a Codecov upload as differences. Enable coverage directly on the Postgres
job under the same release-branch skip condition, eliminating 4x 8-core runner
hours per PR.
2026-04-22 13:26:04 -03:00
Amy Blais
50b8e1086f
Quick fixes for docs label automation (#36185)
* Update docs-impact-review.yml

* Update docs-impact-review.yml
2026-04-22 10:32:02 +03:00
Nuno Simões
29bab2184d
e2e: adjust some pipeline settings (#36178)
2026-04-21 21:25:01 +02:00
Pavel Zeman
2d7a71b018
ci: fix startup_failure in nightly race and weekly workflows (#36198)
* ci: fix startup_failure in nightly race and weekly workflows

Add id-token: write permission to both server-ci-nightly-race.yml and
server-ci-weekly.yml. The reusable server-test-template.yml declares
id-token: write in its permissions block (needed for FIPS Docker Hub
login via OIDC). GitHub requires that caller workflows grant at least
the permissions declared by any reusable workflow they invoke —
regardless of whether the steps using those permissions are skipped at
runtime. Both new workflows only declared contents: read, causing
immediate startup_failure with zero jobs created.

Release Note
```release-note
NONE
```

Co-authored-by: Claude <claude@anthropic.com>

* ci: remove unused id-token: write from server-test-template

The template declared id-token: write but nothing in the workflow uses
OIDC token exchange — the FIPS Docker Hub login uses plain secrets
(DOCKERHUB_USERNAME/DOCKERHUB_TOKEN), not an OIDC identity token.

Removing it from the template means caller workflows (nightly race,
weekly, and the main server-ci) no longer need to grant id-token: write
either, following the principle of least privilege.

This is the actual root cause fix: the previous commit added id-token
to the callers as a workaround, but the real issue was the template
requesting a permission it never uses.

Co-authored-by: Claude <claude@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
2026-04-21 12:22:44 -04:00
Pavel Zeman
9c8191c3b8
ci: add yamllint workflow to detect duplicate YAML keys (#36010)
* ci: add yamllint workflow to detect duplicate YAML keys

Add a yamllint check for workflow files to catch duplicate keys that
YAML parsers silently accept but GitHub Actions rejects on the default
branch.

Release Note
NONE

Co-authored-by: Claude <claude@anthropic.com>

* ci: address review feedback on yamllint workflow

- Add permissions: contents: read (least privilege)
- Bump checkout to v6.0.2 with persist-credentials: false
- Remove pip install step (yamllint is pre-installed on ubuntu-22.04)

Co-authored-by: Claude <claude@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
2026-04-20 17:25:12 -04:00
Pavel Zeman
c8f0a31425
ci: move FIPS and binary params tests to weekly schedule (#36036)
* ci: move FIPS and binary params tests to weekly schedule

Move low-regression-risk test suites out of the per-push/PR Server CI
workflow into a new weekly scheduled workflow (Monday 1am EST / 5am UTC):

- Postgres with binary parameters (1x 8-core runner)
- Postgres FIPS sharded tests (4x 8-core runners + merge job)
- mmctl FIPS tests (1x 8-core runner)

This reduces the per-push 8-core runner demand from 14 concurrent jobs
to 5 (4 Postgres shards + 1 ES), which should significantly reduce
queue times that currently reach 90+ minutes during peak hours.

The weekly workflow also supports workflow_dispatch for manual triggering
when urgent FIPS or binary parameter verification is needed.

#### Release Note
```release-note
NONE
```

Co-authored-by: Claude <claude@anthropic.com>

* ci: move coverage shards to 2-core runners

Add a 'runner' input to server-test-template.yml (defaults to
ubuntu-latest-8-cores for backward compatibility) and set coverage
shards to ubuntu-22.04 (2-core).

Coverage is non-blocking (allow-failure: true) so longer runtime
doesn't impact PR feedback. Estimated ~20-30 min per shard on 2-core
vs ~7-9 min on 8-core, but frees 4 more 8-core slots per push.

Combined with the FIPS/binary-params weekly move, per-push 8-core
demand drops from 14 → 4 (just the Postgres test shards + ES v8).

Co-authored-by: Claude <claude@anthropic.com>

* ci: decouple race detector from binary params, add nightly race job

The race detector was accidentally bundled with binary params via
the fullyparallel=false → RACE_MODE coupling in the test template.
These test different things:
- Binary params: Postgres driver binary encoding mode
- Race detector: Go data race detection

Changes:
- Add explicit 'race-enabled' input to server-test-template.yml
- Remove implicit fullyparallel→race coupling from template
- Binary params now runs with fullyparallel: true (default)
- New server-ci-nightly-race.yml runs -race nightly at 2am EST
  on ubuntu-22.04 (2-core) to avoid 8-core contention

Co-authored-by: Claude <claude@anthropic.com>

* ci: add push trigger for release-* branches to weekly workflow

FIPS and binary params validation must run automatically on release
branch pushes, not just on the weekly schedule. Without this trigger,
release branches would lose FIPS/binary coverage entirely.

Co-authored-by: Claude <claude@anthropic.com>

* ci: use ET instead of EST in schedule comments

Cron runs at fixed UTC times regardless of DST. Use ~ET to avoid
implying exact EST/EDT correspondence.

Co-authored-by: Claude <claude@anthropic.com>

* ci: restore conditional FIPS on per-push, unshard weekly FIPS

Per review feedback from @lieut-data:

1. Restore FIPS jobs in server-ci.yml with conditional execution:
   run on all pushes (master/release) and on PRs when go.mod changed
   or branch name contains 'fips'. This ensures Go upgrades and
   explicit FIPS work get immediate feedback.

2. Remove sharding from weekly FIPS — no speed pressure on a weekly
   schedule, so a single unsharded job is simpler (eliminates the
   4-shard matrix + merge job).

3. Restore gomod-changed detection step in the go job.

Both per-push (conditional, unsharded) and weekly (unconditional,
unsharded) FIPS runs use single jobs now, reducing complexity.

Co-authored-by: Claude <claude@anthropic.com>

* ci: restore FIPS sharding for PR runs, remove from push events

FIPS tests in server-ci.yml now only trigger on PRs where the branch
name contains 'fips' or go.mod changed. Sharding (4 shards + merge)
restored for fast iteration on FIPS-related PRs. Regular FIPS coverage
provided by the weekly workflow (unsharded).

This addresses lieut-data's review feedback to restore sharding where
it matters most: during active PR iteration.

Co-authored-by: Claude <claude@anthropic.com>

* ci: add explicit permissions to weekly and nightly workflows

Set minimum required permissions (contents: read) on both new workflow
files per review feedback. Reusable workflows called via 'uses' inherit
the caller's permissions.

Co-authored-by: Claude <claude@anthropic.com>

* ci: keep coverage shards on 8-core runners

Comment out the 2-core runner override for coverage shards per
Eva's feedback. Coverage stays on the default 8-core runners.

Co-authored-by: Claude <claude@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
2026-04-20 16:38:46 -04:00
sabril
d16579964c
add override for e2e test on fips (#36128) 2026-04-16 10:40:07 +08:00
Amy Blais
0fcf3b5ef2
Update docs-impact-review.yml (#36105) 2026-04-15 14:46:10 +03:00
Jesse Hallam
9be83a998a
Fix command injection in server-test-template workflow (#36080)
Replace the unquoted heredoc (which embedded GITHUB_HEAD_REF into a
generated script) with a cp of the existing run-shard-tests.sh, which
already handles the light-only case. Pass BUILD_NUMBER and TEST_TARGET
as explicit docker env vars instead of interpolating them into script
content.
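
The same principle can be sketched in Go rather than in the workflow's shell: the script text stays fixed, and the untrusted value travels only through the environment. This is a minimal illustration, not the workflow's actual code. `safeCommand` and `script` are hypothetical names, and only the `TEST_TARGET` variable name comes from the commit message.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// script is fixed text. It never contains caller-supplied data, so a hostile
// value cannot change which commands the shell runs.
const script = `printf '%s\n' "$TEST_TARGET"`

// safeCommand (hypothetical helper) passes the untrusted value as an
// environment variable instead of interpolating it into the script.
func safeCommand(target string) *exec.Cmd {
	cmd := exec.Command("sh", "-c", script)
	cmd.Env = append(os.Environ(), "TEST_TARGET="+target)
	return cmd
}

func main() {
	// The value contains shell metacharacters, but the shell only ever sees
	// it as the contents of a quoted variable.
	out, err := safeCommand("test-server; echo injected").Output()
	if err != nil {
		fmt.Println("run failed:", err)
		return
	}
	fmt.Print(string(out))
}
```

Because the value is expanded inside double quotes, the shell treats it as data; nothing in it is parsed as a command.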
2026-04-14 18:57:51 +02:00
sabril
4dcf6916f7
feat: automatically trigger fips e2e tests for relevant prs (#36014)
Co-authored-by: Mattermost Build <build@mattermost.com>
2026-04-13 10:58:00 +00:00
Jesse Hallam
4d028d557b
Support Elasticsearch v9 alongside v8 (#35781) 2026-04-10 11:15:07 -03:00
Amy Blais
1574bda362
Server: Docs label prompt fix (#36020)
* Update docs-impact-review.yml

* Update docs-impact-review.yml
2026-04-10 16:58:46 +03:00
Pavel Zeman
78b2980ed5
fix: remove duplicate allow-failure input in server test template (#36004)
The allow-failure input was defined twice in the workflow_call inputs,
causing GitHub Actions to reject the workflow with 0 jobs on master push.
Duplicate was introduced in #35743 merge.

Release Note
NONE

Co-authored-by: Claude <claude@anthropic.com>
2026-04-09 15:47:20 -04:00
Pavel Zeman
860df69621
ci: re-enable server test coverage with 4-shard parallelism (#35743)
* ci: re-enable server test coverage with 4-shard parallelism

The test-coverage job was disabled due to OOM failures when running all
tests with coverage instrumentation in a single process. Re-enable it
by distributing the workload across 4 parallel runners using the shard
infrastructure from the sharding PRs.

Changes:
- Replace disabled single-runner test-coverage with 4-shard matrix
- Add merge-coverage job to combine per-shard cover.out files
- Upload merged coverage to Codecov with server flag
- Skip per-shard Codecov upload when sharding is active
- Add coverage profile merging to run-shard-tests.sh for multi-run shards
- Restore original condition: skip coverage on release branch PRs
- Keep fullyparallel=true (fast within each shard)
- Keep continue-on-error=true (coverage never blocks PRs)
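
As a rough illustration of what combining per-shard `cover.out` files involves, the sketch below merges Go cover profiles held as strings. It is not the logic in run-shard-tests.sh or the merge-coverage job. `mergeProfiles` is a hypothetical helper, and it assumes the standard text profile format: a `mode:` header followed by one `file:range numStmts count` line per block.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// mergeProfiles (hypothetical helper) combines several Go cover profiles.
// Counts for the same block are summed; in "set" mode they are clamped to 1.
func mergeProfiles(profiles ...string) (string, error) {
	mode := ""
	counts := map[string]int{} // "file:range numStmts" -> hit count
	for _, p := range profiles {
		sc := bufio.NewScanner(strings.NewReader(p))
		for sc.Scan() {
			line := strings.TrimSpace(sc.Text())
			if line == "" {
				continue
			}
			if strings.HasPrefix(line, "mode: ") {
				m := strings.TrimPrefix(line, "mode: ")
				if mode != "" && m != mode {
					return "", fmt.Errorf("mixed cover modes %q and %q", mode, m)
				}
				mode = m
				continue
			}
			if mode == "" {
				return "", fmt.Errorf("data before mode line: %q", line)
			}
			// The hit count is the last space-separated field.
			i := strings.LastIndex(line, " ")
			if i < 0 {
				return "", fmt.Errorf("malformed line: %q", line)
			}
			n, err := strconv.Atoi(line[i+1:])
			if err != nil {
				return "", fmt.Errorf("bad count in %q: %v", line, err)
			}
			block := line[:i]
			counts[block] += n
			if mode == "set" && counts[block] > 1 {
				counts[block] = 1
			}
		}
		if err := sc.Err(); err != nil {
			return "", err
		}
	}
	if mode == "" {
		return "", fmt.Errorf("no profiles to merge")
	}
	// Sort blocks so the output is deterministic.
	blocks := make([]string, 0, len(counts))
	for b := range counts {
		blocks = append(blocks, b)
	}
	sort.Strings(blocks)
	var sb strings.Builder
	sb.WriteString("mode: " + mode + "\n")
	for _, b := range blocks {
		fmt.Fprintf(&sb, "%s %d\n", b, counts[b])
	}
	return sb.String(), nil
}

func main() {
	shard0 := "mode: set\na.go:1.1,2.2 1 0\n"
	shard1 := "mode: set\na.go:1.1,2.2 1 1\nb.go:3.1,4.2 2 0\n"
	merged, err := mergeProfiles(shard0, shard1)
	if err != nil {
		fmt.Println("merge failed:", err)
		return
	}
	fmt.Print(merged)
}
```

A block counts as covered in the merged profile if any shard covered it, which is why each shard can run a different set of packages.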

Co-authored-by: Claude <claude@anthropic.com>

* fix: disable fullyparallel for coverage shards

t.Parallel() + t.Setenv() panics kill entire test binaries under
fullyparallel mode. With 4-shard splitting, serial execution within
each shard should still be fast enough (~15 min). We can re-enable
fullyparallel once the incompatible tests are fixed.

Co-authored-by: Claude <claude@anthropic.com>

* fix: add checkout to coverage merge job for Codecov file mapping

Codecov needs the source tree to map coverage data to files.
Without checkout, the upload succeeds but reports 0% coverage
because it can't associate cover.out lines with source files.

Co-authored-by: Claude <claude@anthropic.com>

* ci: add codecov.yml and retain merged coverage artifact

Add codecov.yml with:
- Project coverage: track against parent commit, 1% threshold, advisory
- Patch coverage: 50% target for new code, advisory (warns, doesn't block)
- Ignore generated code (retrylayer, timerlayer, serial_gen, mocks,
  storetest, plugintest, searchtest) — these inflate the denominator
  from 146K to 100K statements, rebasing coverage from 36% to 53%
- PR comments on coverage changes with condensed layout

Save merged cover.out as artifact with 30-day retention (~3.5MB/run).
90-day retention was considered (~6.3GB total vs ~2.1GB at 30 days)
but deferred to keep storage costs low.

#### Release Note
```release-note
NONE
```

Co-authored-by: Claude <claude@anthropic.com>

* ci: add codecov.yml to exclude generated code and enable PR comments (#35748)

* ci: add codecov.yml to exclude generated code and enable PR comments

Add Codecov configuration to improve coverage signal quality:

- Exclude generated code from coverage denominator:
  - store/retrylayer (~10k stmts, auto-generated retry wrappers)
  - store/timerlayer (~14k lines, auto-generated timing wrappers)
  - *_serial_gen.go (serialization codegen)
  - **/mocks (mockery-generated mocks)
- Exclude test infrastructure:
  - store/storetest (~63k lines, test helpers not production code)
  - plugin/plugintest (plugin test helpers)
- Exclude thin wrappers:
  - model/client4.go (~4k stmts, HTTP client methods tested via integration)
- Enable PR comments with condensed layout
- Set project threshold at 0.5% drop tolerance
- Set patch target at 60% for new/changed lines

This rebases the effective coverage metric from ~33.8% to ~43% by
removing ~50k non-production statements from the denominator, giving
a more accurate picture of actual test coverage.

Co-authored-by: Claude <claude@anthropic.com>

* Update codecov.yml

---------

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Jesse Hallam <jesse.hallam@gmail.com>

* fix: bump upload-artifact to v7 and add client4.go to codecov ignore

- Align upload-artifact pin with the rest of the workflow (v4 → v7)
- Add model/client4.go to codecov.yml ignore list as documented in PR description

Co-authored-by: Claude <claude@anthropic.com>

* fix(ci): address Jesse review feedback on coverage sharding

- Remove client4.go from codecov ignore list (coverage is meaningful)
- Remove historical comment block above test-coverage job
- Set fullyparallel back to true (safe per-shard since each runs
  different packages; parallel test fixes tracked in #35751)
- Replace merge-coverage job with per-shard Codecov uploads using
  flags parameter; configure after_n_builds: 4 so Codecov waits for
  all shards before reporting status
- Add clarifying comment in run-shard-tests.sh explaining intra-shard
  coverage merge (multiple gotestsum runs) vs cross-shard merge
  (handled natively by Codecov)
- Simplify codecov.yml: remove verbose comments, use informational
  status checks, streamlined ignore list

Co-authored-by: Claude <claude@anthropic.com>

* fix(ci): set fullyparallel back to false for coverage shards

Coverage shards 1-3 failed with hundreds of test failures because
fullyparallel: true causes panics and races in tests that use
t.Setenv, os.Setenv, and os.Chdir without parallel-safe alternatives.

The parallel-safety fixes are tracked in a separate PR chain:
- #35746: t.Setenv → test hooks
- #35749: os.Setenv → parallel-safe alternatives
- #35750: os.Chdir → t.Chdir
- #35751: flip fullyparallel: true (final step)

Once that chain merges, fullyparallel can be enabled for coverage too.

Co-authored-by: Claude <claude@anthropic.com>

* fix(ci): split fullyparallel and allow-failure into separate inputs

Previously fullyparallel controlled both parallel test execution AND
continue-on-error, meaning disabling parallelism also made coverage
failures blocking. Split into two independent inputs:

- fullyparallel: controls ENABLE_FULLY_PARALLEL_TESTS (test execution)
- allow-failure: controls continue-on-error (advisory vs blocking)

Coverage shards now run with fullyparallel: true (Claudio's original
approach) and allow-failure: true (failures don't block PRs until
parallel-safety fixes land in #35746 through #35751).

Co-authored-by: Claude <claude@anthropic.com>

* ci: use per-flag after_n_builds for server and webapp coverage

Replace the global after_n_builds: 2 with per-flag values:
- server: after_n_builds: 4 (one per shard)
- webapp: after_n_builds: 1 (single merged upload)

Tag the webapp Codecov upload with flags: webapp so each flag
independently waits for its expected upload count. This prevents
Codecov from firing notifications with incomplete data when the
webapp upload arrives before all server shards complete.

Addresses review feedback from @esarafianou.

Co-authored-by: Claude <claude@anthropic.com>

* fix: consolidate codecov config into .github/codecov.yml

Move all codecov configuration into the existing .github/codecov.yml
instead of introducing a duplicate file at the repo root. Merges
improvements from the root file (broader ignore list, informational
statuses, require_ci_to_pass: false) while preserving the webapp flag
from the original config. Updates after_n_builds to 5 (4 server + 1
webapp).

Co-authored-by: Claude <claude@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Jesse Hallam <jesse.hallam@gmail.com>
2026-04-09 15:27:50 -04:00
Pavel Zeman
cf102afc17
ci: disable fullyparallel for binary parameters job (#35995)
Binary parameters tests run unsharded on a single runner. With
fullyparallel enabled, all ~755 api4 tests run concurrently, causing
resource exhaustion (too many server instances, WebSocket hubs, and DB
connections). The test binary gets killed after 11 minutes with no
individual test failures — just overwhelmed resources.

Disabling fullyparallel for this specific job lets binary parameters
tests pass while we evaluate moving them to a nightly/weekly schedule.

Co-authored-by: Claude <claude@anthropic.com>
2026-04-09 05:44:50 -04:00
Pavel Zeman
6fdef8c9cc
ci: enable fullyparallel mode for server tests (#35816)
* ci: enable fullyparallel mode for server tests

Replace os.Setenv, os.Chdir, and global state mutations with
parallel-safe alternatives (t.Setenv, t.Chdir, test hooks) across
37 files. Refactor GetLogRootPath and MM_INSTALL_TYPE to use
package-level test hooks instead of environment variables.

This enables gotestsum --fullparallel, allowing all test packages
to run with maximum parallelism within each shard.

Co-authored-by: Claude <claude@anthropic.com>

* ci: split fullyparallel from continue-on-error in workflow template

- Add new boolean input 'allow-failure' separate from 'fullyparallel'
- Change continue-on-error to use allow-failure instead of fullyparallel
- Update server-ci.yml to pass allow-failure: true for test coverage job
- Allows independent control of parallel execution and failure tolerance

Co-authored-by: Claude <claude@anthropic.com>

* fix: protect TestOverrideLogRootPath with sync.Mutex for parallel tests

- Replace global var TestOverrideLogRootPath with mutex-protected functions
- Add SetTestOverrideLogRootPath() and getTestOverrideLogRootPath() functions
- Update GetLogRootPath() to use thread-safe getter
- Update all test files to use SetTestOverrideLogRootPath() with t.Cleanup()
- Fixes race condition when running tests with t.Parallel()

Co-authored-by: Claude <claude@anthropic.com>

* fix: configure audit settings before server setup in tests

- Move ExperimentalAuditSettings from UpdateConfig() to config defaults
- Pass audit config via app.Config() option in SetupWithServerOptions()
- Fixes audit test setup ordering to configure BEFORE server initialization
- Resolves CodeRabbit's audit config timing issue in api4 tests

Co-authored-by: Claude <claude@anthropic.com>

* fix: implement SetTestOverrideLogRootPath mutex in logger.go

The previous commit updated test callers to use SetTestOverrideLogRootPath()
but didn't actually create the function in config/logger.go, causing build
failures across all CI shards. This commit:

- Replaces the exported var TestOverrideLogRootPath with mutex-protected
  unexported state (testOverrideLogRootPath + testOverrideLogRootMu)
- Adds exported SetTestOverrideLogRootPath() setter
- Adds unexported getTestOverrideLogRootPath() getter
- Updates GetLogRootPath() to use the thread-safe getter
- Fixes log_test.go callers that were missed in the previous commit
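
The pattern described above might look roughly like the following. The identifier names come from the commit message, but the signatures and the fallback behaviour of `GetLogRootPath` (the `MM_LOG_PATH` lookup and the `"logs"` default) are assumptions made for illustration, not the actual contents of config/logger.go.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// Unexported state guarded by a mutex, replacing an exported package variable
// that parallel tests could read and write at the same time.
var (
	testOverrideLogRootMu   sync.Mutex
	testOverrideLogRootPath string
)

// SetTestOverrideLogRootPath sets the override; pass "" to clear it.
func SetTestOverrideLogRootPath(path string) {
	testOverrideLogRootMu.Lock()
	defer testOverrideLogRootMu.Unlock()
	testOverrideLogRootPath = path
}

// getTestOverrideLogRootPath reads the override under the same lock.
func getTestOverrideLogRootPath() string {
	testOverrideLogRootMu.Lock()
	defer testOverrideLogRootMu.Unlock()
	return testOverrideLogRootPath
}

// GetLogRootPath prefers the test override. The environment lookup and the
// default directory below are assumed for this sketch.
func GetLogRootPath() string {
	if p := getTestOverrideLogRootPath(); p != "" {
		return p
	}
	if p := os.Getenv("MM_LOG_PATH"); p != "" {
		return p
	}
	return "logs"
}

func main() {
	SetTestOverrideLogRootPath("/tmp/test-logs")
	fmt.Println(GetLogRootPath())
	SetTestOverrideLogRootPath("") // a test would do this in t.Cleanup
	fmt.Println(GetLogRootPath())
}
```

The mutex removes the data race, but the override is still process-wide: two parallel tests that set different paths would still interfere with each other.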

Co-authored-by: Claude <claude@anthropic.com>

* fix(test): use SetupConfig for access_control feature flag registration

InitAccessControlPolicy() checks FeatureFlags.AttributeBasedAccessControl
at route registration time during server startup. Setting the flag via
UpdateConfig after Setup() is too late — routes are never registered
and API calls return 404.

Use SetupConfig() to pass the feature flag in the initial config before
server startup, ensuring routes are properly registered.

Co-authored-by: Claude <claude@anthropic.com>

* fix(test): restore BurnOnRead flag state in TestRevealPost subtest

The 'feature not enabled' subtest disables BurnOnRead without restoring
it via t.Cleanup. Subsequent subtests inherit the disabled state, which
can cause 501 errors when they expect the feature to be available.

Add t.Cleanup to restore FeatureFlags.BurnOnRead = true after the
subtest completes.

Co-authored-by: Claude <claude@anthropic.com>

* fix(test): restore EnableSharedChannelsMemberSync flag via t.Cleanup

The test disables EnableSharedChannelsMemberSync without restoring it.
If the subtest exits early (e.g., require failure), later sibling
subtests inherit a disabled flag and become flaky.

Add t.Cleanup to restore the flag after the subtest completes.

Co-authored-by: Claude <claude@anthropic.com>

* Fix test parallelism: use instance-scoped overrides and init-time audit config

  Replace package-level test globals (TestOverrideInstallType,
  SetTestOverrideLogRootPath) with fields on PlatformService so each test
  gets its own instance without process-wide mutation. Fix three audit
  tests (TestUserLoginAudit, TestLogoutAuditAuthStatus,
  TestUpdatePasswordAudit) that configured the audit logger after server
  init — the audit logger only reads config at startup, so pass audit
  settings via app.Config() at init time instead.

  Also revert the Go 1.24.13 downgrade and bump mattermost-govet to
  v2.0.2 for Go 1.25.8 compatibility.

* Fix audit unit tests

* Fix MMCLOUDURL unit tests

* Fixed unit tests using MM_NOTIFY_ADMIN_COOL_OFF_DAYS

* Make app migrations idempotent for parallel test safety

  Change System().Save() to System().SaveOrUpdate() in all migration
  completion markers. When two parallel tests share a database pool entry,
  both may race through the check-then-insert migration pattern. Save()
  causes a duplicate key fatal crash; SaveOrUpdate() makes the second
  write a harmless no-op.
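
  The difference between the two calls can be sketched with an in-memory stand-in for the store. `systemStore` and its methods are hypothetical simplifications made for illustration; the real `System().Save()` and `System().SaveOrUpdate()` signatures are not shown in this message.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errDuplicateKey = errors.New("duplicate key")

// systemStore is a hypothetical in-memory stand-in for the systems table.
type systemStore struct {
	mu   sync.Mutex
	rows map[string]string
}

func newSystemStore() *systemStore {
	return &systemStore{rows: map[string]string{}}
}

// Save behaves like a plain INSERT: a second write of the same key fails.
func (s *systemStore) Save(name, value string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.rows[name]; ok {
		return errDuplicateKey
	}
	s.rows[name] = value
	return nil
}

// SaveOrUpdate behaves like an upsert: repeating the write is harmless.
func (s *systemStore) SaveOrUpdate(name, value string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rows[name] = value
	return nil
}

func main() {
	s := newSystemStore()
	const key = "migration_done" // hypothetical completion marker

	// Two "tests" race through the same migration and both write the marker.
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := s.SaveOrUpdate(key, "true"); err != nil {
				fmt.Println("unexpected error:", err)
			}
		}()
	}
	wg.Wait()

	// A plain Save of the same marker now reports the conflict.
	fmt.Println("Save after marker exists:", s.Save(key, "true"))
}
```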

* test: address review feedback on fullyparallel PR

- Use SetLogRootPathOverride() setter instead of direct field access
  in platform/support_packet_test.go and platform/log_test.go (pvev)
- Restore TestGetLogRootPath in config/logger_test.go to keep
  MM_LOG_PATH env var coverage; test uses t.Setenv so it runs
  serially which is fine (pvev)
- Fix misleading comment in config_test.go: code uses t.Setenv,
  not os.Setenv (jgheithcock)

Co-authored-by: Claude <claude@anthropic.com>

* fix: add missing os import in post_test.go

The os import was dropped during a merge conflict resolution while
burn-on-read shared channel tests from master still use os.Setenv.

Co-authored-by: Claude <claude@anthropic.com>

---------

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: wiggin77 <wiggin77@warpmail.net>
Co-authored-by: Mattermost Build <build@mattermost.com>
2026-04-08 20:48:36 -04:00
Jesse Hallam
71ca373de7
Generate instead of hard-coding test passwords, enforce new minimum for FIPS, shard CI, fix FIPS builds (#35905)
* Replace hardcoded test passwords with model.NewTestPassword()

Add model.NewTestPassword() utility that generates 14+ character
passwords meeting complexity requirements for FIPS compliance. Replace
all short hardcoded test passwords across the test suite with calls to
this function.
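
As an illustration only, a generator of this kind could look like the sketch below. This is not the actual model.NewTestPassword() implementation: the function name `newTestPassword`, the local constants, and the choice of three character classes are assumptions made for the example.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// Hypothetical local constants; the real code is said to share its own.
const (
	lowercaseLetters = "abcdefghijklmnopqrstuvwxyz"
	uppercaseLetters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	numbers          = "0123456789"
	minLength        = 14 // the FIPS floor mentioned in this commit
)

// pick returns one random byte from charset using crypto/rand.
func pick(charset string) byte {
	n, err := rand.Int(rand.Reader, big.NewInt(int64(len(charset))))
	if err != nil {
		panic(err)
	}
	return charset[n.Int64()]
}

// newTestPassword returns a password of exactly minLength characters
// containing at least one lowercase letter, one uppercase letter, and
// one digit. The first three positions are fixed by class, which is
// acceptable for test fixtures but not for real credentials.
func newTestPassword() string {
	all := lowercaseLetters + uppercaseLetters + numbers
	out := []byte{pick(lowercaseLetters), pick(uppercaseLetters), pick(numbers)}
	for len(out) < minLength {
		out = append(out, pick(all))
	}
	return string(out)
}

func main() {
	fmt.Println(newTestPassword())
}
```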

* Enforce FIPS compliance for passwords and HMAC keys

FIPS OpenSSL requires HMAC keys to be at least 14 bytes. PBKDF2 uses
the password as the HMAC key internally, so short passwords cause
PKCS5_PBKDF2_HMAC to fail.

- Add FIPSEnabled and PasswordFIPSMinimumLength build-tag constants
- Raise the password minimum length floor to 14 when compiled with
  requirefips, applied in SetDefaults only when unset and validated
  independently in IsValid
- Return ErrMismatchedHashAndPassword for too-short passwords in
  PBKDF2 CompareHashAndPassword rather than a cryptic OpenSSL error
- Validate atmos/camo HMAC key length under FIPS and lengthen test
  keys accordingly
- Adjust password validation tests to use PasswordFIPSMinimumLength
  so they work under both FIPS and non-FIPS builds
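
The length guard in the comparison path can be sketched as follows. This is a simplified model, not the project's code: a single HMAC-SHA256 over the salt stands in for the full PBKDF2 derivation, and `compareHashAndPassword`, `deriveMAC`, and `fipsMinKeyLength` are hypothetical names. The point is only that the password is used as the HMAC key, so a short password is rejected as a mismatch before the keyed hash is ever invoked.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"errors"
	"fmt"
)

// fipsMinKeyLength mirrors the 14-byte floor described above (assumed name).
const fipsMinKeyLength = 14

var errMismatchedHashAndPassword = errors.New("hash and password do not match")

// deriveMAC is a stand-in for PBKDF2: it keys HMAC-SHA256 with the
// password, which is why the password length is the HMAC key length.
func deriveMAC(password string, salt []byte) []byte {
	m := hmac.New(sha256.New, []byte(password))
	m.Write(salt)
	return m.Sum(nil)
}

// compareHashAndPassword rejects too-short passwords up front when fips
// is set, returning the ordinary mismatch error rather than letting the
// underlying crypto library fail with a less helpful one.
func compareHashAndPassword(hash, salt []byte, password string, fips bool) error {
	if fips && len(password) < fipsMinKeyLength {
		return errMismatchedHashAndPassword
	}
	if !hmac.Equal(hash, deriveMAC(password, salt)) {
		return errMismatchedHashAndPassword
	}
	return nil
}

func main() {
	salt := []byte("example-salt")
	long := "Correct-Horse-42"
	fmt.Println(compareHashAndPassword(deriveMAC(long, salt), salt, long, true))
	fmt.Println(compareHashAndPassword(deriveMAC("short", salt), salt, "short", true))
}
```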

* CI: shard FIPS test suite and extract merge template

Run FIPS tests on PRs that touch go.mod or have 'fips' in the branch
name. Shard FIPS tests across 4 runners matching the normal Postgres
suite. Extract the test result merge logic into a reusable workflow
template to deduplicate the normal and FIPS merge jobs.

* more

* Fix email test helper to respect FIPS minimum password length

* Fix test helpers to respect FIPS minimum password length

* Remove unnecessary "disable strict password requirements" blocks from test helpers

* Fix CodeRabbit review comments on PR #35905

- Add server-test-merge-template.yml to server-ci.yml pull_request.paths
  so changes to the reusable merge workflow trigger Server CI validation
- Skip merge-postgres-fips-test-results job when test-postgres-normal-fips
  was skipped, preventing failures due to missing artifacts
- Set guest.Password on returned guest in CreateGuestAndClient helper
  to keep contract consistent with CreateUserWithClient
- Use shared LowercaseLetters/UppercaseLetters/NUMBERS/PasswordFIPSMinimumLength
  constants in NewTestPassword() to avoid drift if FIPS floor changes

https://claude.ai/code/session_01HmE9QkZM3cAoXn2J7XrK2f

* Rename FIPS test artifact to match server-ci-report pattern

The server-ci-report job searches for artifacts matching "*-test-logs",
so rename from postgres-server-test-logs-fips to
postgres-server-fips-test-logs to be included in the report.
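
The effect of the rename can be checked with a small glob test. The sketch below uses Go's path.Match as an approximation of the artifact pattern matching; the helper name `matchesReportPattern` is hypothetical.

```go
package main

import (
	"fmt"
	"path"
)

// matchesReportPattern reports whether an artifact name matches the
// "*-test-logs" pattern quoted in the commit message.
func matchesReportPattern(name string) bool {
	ok, err := path.Match("*-test-logs", name)
	return err == nil && ok
}

func main() {
	names := []string{
		"postgres-server-test-logs-fips", // old name: suffix is "-fips"
		"postgres-server-fips-test-logs", // new name: suffix is "-test-logs"
	}
	for _, name := range names {
		fmt.Printf("%s -> %v\n", name, matchesReportPattern(name))
	}
	// prints:
	// postgres-server-test-logs-fips -> false
	// postgres-server-fips-test-logs -> true
}
```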

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-08 16:49:43 -03:00
sabril
993c3cb5d2
fix: test analysis override (#35987) 2026-04-09 01:24:08 +08:00
sabril
c303dd8e08
fix: test analysis (#35986) 2026-04-09 00:50:06 +08:00
sabril
fba382d5bd
feat(test analysis): using reusable workflow (#35852)
* feat(test analysis): using reusable workflow

* address comment

* pin to main during initial rollout

---------

Co-authored-by: Mattermost Build <build@mattermost.com>
2026-04-09 00:20:28 +08:00
Eva Sarafianou
252eb9661d
Update docs-impact workflow to keep stale comment instead of deleting (#35940)
* Update docs-impact workflow to keep stale comment instead of deleting

When a re-run of the docs impact analysis determines that documentation
updates are no longer needed, the previous bot comment was deleted while
the Docs/Needed label was kept. This left the label without context.

Instead of deleting the comment, update it to explain that a previous
analysis had flagged the PR but the latest run found no docs impact.
This preserves the audit trail and gives maintainers context to decide
whether to remove the label.

Made-with: Cursor

* Update .github/workflows/docs-impact-review.yml

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2026-04-07 13:20:03 +03:00
Jesse Hallam
4d20645a5b
Inline mattermost-govet into the monorepo (#35869)
* inline mattermost-govet

* fix style issues

* simplify the openApiSync spec test

* README.md tweaks

* fix missing licenses

* simplify README.md

* trigger server-ci on tools/mattermost-govet/**

* Apply 470cf78253

---------

Co-authored-by: Mattermost Build <build@mattermost.com>
2026-04-01 13:24:22 +00:00
Eva Sarafianou
aaefd4109b
MM-68120 - Use repo checkout for build files in server-ci-artifacts (#35842)
* Use repo checkout for Dockerfile in server-ci-artifacts build-docker job

The build-docker job was downloading server-build-artifact to get the
Dockerfile and supporting files for every PR. Switch to checking out
server/build/ directly from the repo for external PRs, while keeping
the artifact-based flow for same-repo PRs so that Dockerfile changes
can be tested before merge.

Only upload server-build-artifact when the PR comes from the same repo,
since external PRs no longer use it.
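
The selection rule described above can be sketched as a pair of small functions. The `pullRequest` type, its fields, and the function names are hypothetical; the actual logic lives in workflow conditions, not Go code.

```go
package main

import "fmt"

// buildFileSource names where the build-docker job gets its build files.
type buildFileSource string

const (
	fromArtifact buildFileSource = "server-build-artifact"
	fromCheckout buildFileSource = "repo checkout of server/build/"
)

// pullRequest carries the two fields this sketch needs (hypothetical).
type pullRequest struct {
	headRepo string
	baseRepo string
}

// sameRepo is true when the PR branch lives in the base repository.
func (pr pullRequest) sameRepo() bool {
	return pr.headRepo == pr.baseRepo
}

// buildFilesFor keeps the artifact flow for same-repo PRs, so Dockerfile
// changes can be tested before merge, and uses the repo checkout for
// external PRs.
func buildFilesFor(pr pullRequest) buildFileSource {
	if pr.sameRepo() {
		return fromArtifact
	}
	return fromCheckout
}

// shouldUploadArtifact mirrors the rule that only same-repo PRs upload
// server-build-artifact.
func shouldUploadArtifact(pr pullRequest) bool {
	return pr.sameRepo()
}

func main() {
	internal := pullRequest{headRepo: "org/repo", baseRepo: "org/repo"}
	external := pullRequest{headRepo: "fork/repo", baseRepo: "org/repo"}
	fmt.Println(buildFilesFor(internal), shouldUploadArtifact(internal))
	fmt.Println(buildFilesFor(external), shouldUploadArtifact(external))
}
```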

Made-with: Cursor

* retrigger pipelines

* undo previous commit

---------

Co-authored-by: Mattermost Build <build@mattermost.com>
2026-04-01 14:12:56 +03:00
Amy Blais
47d2c6074d
Docs impact fixes (#35877)
* Update docs-impact-review.yml

* Update docs-impact-review.yml

* Update docs-impact-review.yml

* Apply suggestions from code review

Co-authored-by: Eva Sarafianou <eva.sarafianou@mattermost.com>

* Update .github/workflows/docs-impact-review.yml

Co-authored-by: Eva Sarafianou <eva.sarafianou@gmail.com>

* Update docs-impact-review.yml

* Update docs-impact-review.yml

* Update docs-impact-review.yml

---------

Co-authored-by: Eva Sarafianou <eva.sarafianou@mattermost.com>
Co-authored-by: Eva Sarafianou <eva.sarafianou@gmail.com>
Co-authored-by: Mattermost Build <build@mattermost.com>
2026-04-01 09:44:52 +03:00
yasser khan
2550ecd87b
ci: post success to required e2e status contexts when no relevant changes (#35880)
* ci: post correct skip status from within cypress/playwright reusable workflows

The 'Required Status Checks' ruleset requires e2e-test/cypress-full/enterprise
and e2e-test/playwright-full/enterprise on master and release-*.* branches.
When a PR has no E2E-relevant changes, the jobs were silently skipped, leaving
required statuses unset and the PR permanently blocked.

Architecture fix: instead of a separate skip-e2e job in the caller that
hardcodes status context names, the skip logic now lives inside the reusable
workflows that already own and compute those context names.
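
A rough model of the resulting control flow, written as Go for illustration only. The real logic is expressed as workflow job conditions; the `decide` function and the `action` values are hypothetical, and the inputs mirror the PR_NUMBER and should_run values named in this commit.

```go
package main

import "fmt"

// action is what the E2E pipeline does for a given trigger.
type action string

const (
	doNothing      action = "nothing"          // caller job is not started
	runTests       action = "run tests"        // reusable workflow runs the suite
	postSkipStatus action = "post skip status" // skip job reports success
)

// decide models the two layers: the caller starts the reusable workflow
// whenever a PR number is present, and the reusable workflow either runs
// the tests or posts a success status to its own computed context name
// when should_run is 'false'. Any other value, including the default
// 'true', runs the tests.
func decide(prNumber, shouldRun string) action {
	if prNumber == "" {
		return doNothing
	}
	if shouldRun == "false" {
		return postSkipStatus
	}
	return runTests
}

func main() {
	fmt.Println(decide("", "true"))
	fmt.Println(decide("35880", "false"))
	fmt.Println(decide("35880", "true"))
}
```

The required status is then always set for a PR, either by the test jobs or by the skip job, so the PR is no longer left blocked on an unset context.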

Changes:
- e2e-tests-cypress.yml: add should_run input (default 'true') + skip job
  that uses the dynamically-computed context_name when should_run == 'false'
- e2e-tests-playwright.yml: same pattern
- e2e-tests-ci.yml: change e2e-cypress/e2e-playwright job conditions from
  should_run == 'true' to PR_NUMBER != '' (always run when there's a PR),
  pass should_run as input to both reusable workflows
2026-04-01 10:34:55 +05:30