* ci: re-enable server test coverage with 4-shard parallelism
The test-coverage job was disabled due to OOM failures when running all
tests with coverage instrumentation in a single process. Re-enable it
by distributing the workload across 4 parallel runners using the shard
infrastructure from the sharding PRs.
Changes:
- Replace disabled single-runner test-coverage with 4-shard matrix
- Add merge-coverage job to combine per-shard cover.out files
- Upload merged coverage to Codecov with server flag
- Skip per-shard Codecov upload when sharding is active
- Add coverage profile merging to run-shard-tests.sh for multi-run shards
- Restore original condition: skip coverage on release branch PRs
- Keep fullyparallel=true (fast within each shard)
- Keep continue-on-error=true (coverage never blocks PRs)
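For reference, the intra-shard profile merge can be sketched in Go. This is only an illustration of the idea, not the logic in run-shard-tests.sh (which is a shell script): the function name mergeProfiles and the sample paths are hypothetical. It assumes the standard text profile format, a "mode:" line followed by "file:start,end numStmts count" lines, and that all inputs share one cover mode.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// mergeProfiles combines the text of several Go coverage profiles into one.
// Counts for the same block are summed, or OR-ed when the mode is "set".
func mergeProfiles(profiles ...string) (string, error) {
	mode := ""
	counts := map[string]int{} // key: "<file:block> <numStmts>", value: hit count
	for _, p := range profiles {
		sc := bufio.NewScanner(strings.NewReader(p))
		for sc.Scan() {
			line := strings.TrimSpace(sc.Text())
			if line == "" {
				continue
			}
			if strings.HasPrefix(line, "mode:") {
				m := strings.TrimSpace(strings.TrimPrefix(line, "mode:"))
				if mode != "" && m != mode {
					return "", fmt.Errorf("mixed cover modes: %q and %q", mode, m)
				}
				mode = m
				continue
			}
			i := strings.LastIndex(line, " ")
			if i < 0 {
				return "", fmt.Errorf("malformed line: %q", line)
			}
			n, err := strconv.Atoi(line[i+1:])
			if err != nil {
				return "", fmt.Errorf("malformed count in %q: %w", line, err)
			}
			key := line[:i]
			if mode == "set" {
				// "set" only records covered or not covered.
				if n > 0 || counts[key] > 0 {
					counts[key] = 1
				} else {
					counts[key] = 0
				}
			} else {
				counts[key] += n
			}
		}
		if err := sc.Err(); err != nil {
			return "", err
		}
	}
	if mode == "" {
		return "", fmt.Errorf("no mode line found")
	}
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic output order
	var b strings.Builder
	fmt.Fprintf(&b, "mode: %s\n", mode)
	for _, k := range keys {
		fmt.Fprintf(&b, "%s %d\n", k, counts[k])
	}
	return b.String(), nil
}

func main() {
	// Two hypothetical gotestsum runs within one shard.
	runA := "mode: atomic\napp/user.go:10.2,12.3 2 1\napp/user.go:14.2,16.3 1 0\n"
	runB := "mode: atomic\napp/user.go:14.2,16.3 1 3\n"
	merged, err := mergeProfiles(runA, runB)
	if err != nil {
		panic(err)
	}
	fmt.Print(merged)
}
```

Keeping exactly one "mode:" line matters because a second one in the middle of the file makes the profile invalid for go tool cover.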
Co-authored-by: Claude <claude@anthropic.com>
* fix: disable fullyparallel for coverage shards
Tests that call t.Setenv() after t.Parallel() panic, and under
fullyparallel mode a single such panic kills the entire test binary.
With 4-shard splitting, serial execution within each shard should still
be fast enough (~15 min). We can re-enable fullyparallel once the
incompatible tests are fixed.
Co-authored-by: Claude <claude@anthropic.com>
* fix: add checkout to coverage merge job for Codecov file mapping
Codecov needs the source tree to map coverage data to files.
Without checkout, the upload succeeds but reports 0% coverage
because it can't associate cover.out lines with source files.
Co-authored-by: Claude <claude@anthropic.com>
* ci: add codecov.yml and retain merged coverage artifact
Add codecov.yml with:
- Project coverage: track against parent commit, 1% threshold, advisory
- Patch coverage: 50% target for new code, advisory (warns, doesn't block)
- Ignore generated and test-helper code (retrylayer, timerlayer,
  serial_gen, mocks, storetest, plugintest, searchtest), which inflates
  the denominator. Excluding it shrinks the denominator from 146K to
  100K statements, rebasing coverage from 36% to 53%
- PR comments on coverage changes with condensed layout
Save merged cover.out as artifact with 30-day retention (~3.5MB/run).
90-day retention was considered (~6.3GB total vs ~2.1GB at 30 days)
but deferred to keep storage costs low.
Co-authored-by: Claude <claude@anthropic.com>
* ci: add codecov.yml to exclude generated code and enable PR comments (#35748)
* ci: add codecov.yml to exclude generated code and enable PR comments
Add Codecov configuration to improve coverage signal quality:
- Exclude generated code from coverage denominator:
  - store/retrylayer (~10k stmts, auto-generated retry wrappers)
  - store/timerlayer (~14k lines, auto-generated timing wrappers)
  - *_serial_gen.go (serialization codegen)
  - **/mocks (mockery-generated mocks)
- Exclude test infrastructure:
  - store/storetest (~63k lines, test helpers not production code)
  - plugin/plugintest (plugin test helpers)
- Exclude thin wrappers:
  - model/client4.go (~4k stmts, HTTP client methods tested via integration)
- Enable PR comments with condensed layout
- Set project threshold at 0.5% drop tolerance
- Set patch target at 60% for new/changed lines
This rebases the effective coverage metric from ~33.8% to ~43% by
removing ~50k non-production statements from the denominator, giving
a more accurate picture of actual test coverage.
Co-authored-by: Claude <claude@anthropic.com>
* Update codecov.yml
---------
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Jesse Hallam <jesse.hallam@gmail.com>
* fix: bump upload-artifact to v7 and add client4.go to codecov ignore
- Align upload-artifact pin with the rest of the workflow (v4 → v7)
- Add model/client4.go to codecov.yml ignore list as documented in PR description
Co-authored-by: Claude <claude@anthropic.com>
* fix(ci): address Jesse review feedback on coverage sharding
- Remove client4.go from codecov ignore list (coverage is meaningful)
- Remove historical comment block above test-coverage job
- Set fullyparallel back to true (safe per-shard since each runs
  different packages; parallel test fixes tracked in #35751)
- Replace merge-coverage job with per-shard Codecov uploads using
  flags parameter; configure after_n_builds: 4 so Codecov waits for
  all shards before reporting status
- Add clarifying comment in run-shard-tests.sh explaining intra-shard
  coverage merge (multiple gotestsum runs) vs cross-shard merge
  (handled natively by Codecov)
- Simplify codecov.yml: remove verbose comments, use informational
  status checks, streamlined ignore list
Co-authored-by: Claude <claude@anthropic.com>
* fix(ci): set fullyparallel back to false for coverage shards
Coverage shards 1-3 failed with hundreds of test failures because
fullyparallel: true causes panics and races in tests that use
t.Setenv, os.Setenv, and os.Chdir without parallel-safe alternatives.
The parallel-safety fixes are tracked in a separate PR chain:
- #35746: t.Setenv → test hooks
- #35749: os.Setenv → parallel-safe alternatives
- #35750: os.Chdir → t.Chdir
- #35751: flip fullyparallel: true (final step)
Once that chain merges, fullyparallel can be enabled for coverage too.
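
As background on why these calls block parallel execution: t.Setenv panics in a test that has called t.Parallel, and os.Setenv mutates process-wide state shared by every test in the binary. One parallel-safe pattern is to inject the environment lookup, sketched below. This is an illustration only and not the code from the PRs listed above; Settings, lookupEnv, fakeEnv, EXAMPLE_SITE_URL, and the default URL are all hypothetical names and values.

```go
package main

import (
	"fmt"
	"os"
)

// Settings reads configuration through an injectable lookup function
// instead of calling os.Getenv directly.
type Settings struct {
	lookupEnv func(string) (string, bool) // hypothetical hook, defaults to os.LookupEnv
}

// NewSettings returns Settings backed by the real process environment.
func NewSettings() *Settings {
	return &Settings{lookupEnv: os.LookupEnv}
}

// SiteURL returns the value of the hypothetical EXAMPLE_SITE_URL
// variable, or an arbitrary default when it is not set.
func (s *Settings) SiteURL() string {
	if v, ok := s.lookupEnv("EXAMPLE_SITE_URL"); ok {
		return v
	}
	return "http://localhost"
}

// fakeEnv builds a lookup function backed by a map, so each test can
// supply its own environment without touching the process-wide one.
func fakeEnv(vars map[string]string) func(string) (string, bool) {
	return func(key string) (string, bool) {
		v, ok := vars[key]
		return v, ok
	}
}

func main() {
	s := NewSettings()
	// In a test, this replaces a t.Setenv call and stays safe under t.Parallel.
	s.lookupEnv = fakeEnv(map[string]string{"EXAMPLE_SITE_URL": "https://example.test"})
	fmt.Println(s.SiteURL()) // prints https://example.test
}
```

Because each Settings value carries its own lookup function, tests running in parallel cannot observe each other's overrides.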
Co-authored-by: Claude <claude@anthropic.com>
* fix(ci): split fullyparallel and allow-failure into separate inputs
Previously fullyparallel controlled both parallel test execution AND
continue-on-error, meaning disabling parallelism also made coverage
failures blocking. Split into two independent inputs:
- fullyparallel: controls ENABLE_FULLY_PARALLEL_TESTS (test execution)
- allow-failure: controls continue-on-error (advisory vs blocking)
Coverage shards now run with fullyparallel: true (Claudio's original
approach) and allow-failure: true (failures don't block PRs until
parallel-safety fixes land in #35746 → #35751).
Co-authored-by: Claude <claude@anthropic.com>
* ci: use per-flag after_n_builds for server and webapp coverage
Replace the global after_n_builds: 2 with per-flag values:
- server: after_n_builds: 4 (one per shard)
- webapp: after_n_builds: 1 (single merged upload)
Tag the webapp Codecov upload with flags: webapp so each flag
independently waits for its expected upload count. This prevents
Codecov from firing notifications with incomplete data when the
webapp upload arrives before all server shards complete.
Addresses review feedback from @esarafianou.
Co-authored-by: Claude <claude@anthropic.com>
* fix: consolidate codecov config into .github/codecov.yml
Move all codecov configuration into the existing .github/codecov.yml
instead of introducing a duplicate file at the repo root. Merges
improvements from the root file (broader ignore list, informational
statuses, require_ci_to_pass: false) while preserving the webapp flag
from the original config. Updates after_n_builds to 5 (4 server + 1
webapp).
Co-authored-by: Claude <claude@anthropic.com>
---------
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Jesse Hallam <jesse.hallam@gmail.com>
* TestPool
* Store infra
* Store tests updates
* Bump maximum concurrent postgres connections
* More infra
* channels/jobs
* channels/app
* channels/api4
* Protect i18n from concurrent access
* Replace some use of os.Setenv
* Remove debug
* Lint fixes
* Fix more linting
* Fix test
* Remove use of Setenv in drafts tests
* Fix flaky TestWebHubCloseConnOnDBFail
* Fix merge
* [MM-62408] Add CI job to generate test coverage (#30284)
* Add CI job to generate test coverage
* Remove use of Setenv in drafts tests
* Fix flaky TestWebHubCloseConnOnDBFail
* Fix more Setenv usage
* Fix more potential flakiness
* Remove parallelism from flaky test
* Remove conflicting env var
* Fix
* Disable parallelism
* Test atomic covermode
* Disable parallelism
* Enable parallelism
* Add upload coverage step
* Fix codecov.yml
* Add codecov.yml
* Remove redundant workspace field
* Add Parallel() util methods and refactor
* Fix formatting
* More formatting fixes
* Fix reporting