Commit graph

5699 commits

Author SHA1 Message Date
Kubernetes Prow Robot
db63a581ca
Merge pull request #134366 from tallclair/feature-gates-test
Set multiple feature gates simultaneously in test
2025-10-13 13:11:33 -07:00
Kubernetes Prow Robot
b393d87d16
Merge pull request #134440 from pohly/e2e-volumebinding-watch-fix
integration test volume: fix restarting of watch
2025-10-09 03:05:09 -07:00
Kubernetes Prow Robot
7891d35ccf
Merge pull request #134399 from aojea/slice_headless
add integration test for endpoint and endpointslice controller labels propagation and headless services
2025-10-07 13:35:09 -07:00
Kubernetes Prow Robot
3a53784ecb
Merge pull request #133876 from kei01234kei/make_v1_version_fist_priotiry_inresource
make v1 resource version first priority in resource
2025-10-07 08:55:02 -07:00
Patrick Ohly
13cd40d718 E2E volume: fix restarting of watch
Presumably
https://github.com/kubernetes/kubernetes/pull/127260/files#r2405215911
was meant to continue polling after a watch was closed by the apiserver.
This is something that can happen under load.

However, returning the error has the effect that polling stops.
This can be seen as test failures when testing with race
detection enabled:

    persistent_volumes_test.go:1101: Failed to wait for all claims to be bound: watch closed
2025-10-07 10:22:35 +02:00
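The difference between "return the error" and "keep polling" can be sketched as follows. This is a minimal, stdlib-only illustration and not the code from the commit: `poll`, `waitForBound` and `openWatch` are hypothetical names, and the watch is modeled as a plain channel that may be closed before the expected event arrives.

```go
package main

import (
	"errors"
	"fmt"
)

// poll is a hypothetical, simplified polling helper: it calls condition up
// to attempts times and stops early on success or on any returned error.
func poll(attempts int, condition func() (bool, error)) error {
	for i := 0; i < attempts; i++ {
		done, err := condition()
		if err != nil {
			return err // Any error ends polling immediately.
		}
		if done {
			return nil
		}
	}
	return errors.New("timed out")
}

// waitForBound opens a fresh (simulated) watch on every poll iteration and
// succeeds once a "bound" event is seen.
func waitForBound(openWatch func() <-chan string, attempts int) error {
	return poll(attempts, func() (bool, error) {
		for event := range openWatch() {
			if event == "bound" {
				return true, nil
			}
		}
		// The watch was closed before the event arrived. Returning an
		// error here would stop polling, so report "not done yet" and
		// let the next iteration open a new watch.
		return false, nil
	})
}

func main() {
	calls := 0
	// Simulated server: the first two watches are closed without events.
	openWatch := func() <-chan string {
		calls++
		ch := make(chan string, 1)
		if calls >= 3 {
			ch <- "bound"
		}
		close(ch)
		return ch
	}
	err := waitForBound(openWatch, 5)
	fmt.Println(err, calls)
}
```

In this sketch a closed watch costs one poll iteration instead of failing the whole wait.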
Antonio Ojea
2b220dffa7 add integration test for endpointslice controller headless services 2025-10-06 13:39:12 +00:00
Kubernetes Prow Robot
389507c723
Merge pull request #134294 from ania-borowiec/test_for_rollback
Fix for incorrect activation of preemptor pod waiting for deletion of victim, plus integration test verifying the fix
2025-10-03 03:38:58 -07:00
Ania Borowiec
7c59672213
Fix in code and integration test that verifies that when victim pod is stuck in binding, preemptor pod remains waiting in unschedulable queue until deletion of the victim pod is completed 2025-10-03 09:42:50 +00:00
Antonio Ojea
0b0a5974f8 integration test: webhook proxy behavior
Adds a new integration test to verify that the API server's egress
to admission webhooks correctly respects the standard `HTTPS_PROXY`
and `NO_PROXY` environment variables.

It also adds a new test utility implementing a fake DNS server that allows
overriding DNS resolution in tests. This is especially useful for integration
tests whose servers can only bind to localhost, an address that is ignored
by certain functionality.
2025-10-02 22:31:08 +00:00
Kubernetes Prow Robot
8ac5701d3a
Merge pull request #134052 from Jefftree/cle-tests-rename
Rename CLE test files
2025-10-02 00:11:03 -07:00
Tim Allclair
4986abe0b8 Automated refactoring to use SetFeatureGatesDuringTest 2025-10-01 21:10:53 -07:00
Kubernetes Prow Robot
6a687c5ddc
Merge pull request #133339 from aojea/vap_servicecidr
ServiceCIDR ValidationAdmissionPolicy for implementing previous behavior
2025-09-30 16:20:15 -07:00
Kubernetes Prow Robot
4a1558c545
Merge pull request #133967 from pohly/dra-allocator-selection
DRA: allocator selection
2025-09-30 08:24:18 -07:00
Patrick Ohly
68205ff40c DRA scheduler_perf: run with specific allocator implementations
For some packages (in particular the core DRA), all allocator implementations
can handle the testcases. Some other packages are for less stable features and
work with fewer implementations. Now all unit tests are run with all suitable
implementations, to increase code coverage.

Benchmarks are fixed to the most mature implementation because they would be
costly to run in more than one. When promoting an allocation implementation we
can do before/after comparisons to detect potential performance regressions.

The downside of this approach is that we need to remember to extend the list
of supported implementations when promoting features, otherwise testing will miss
some new supported implementation.
2025-09-30 18:19:57 +02:00
Patrick Ohly
5832c915ac scheduler_perf: apply feature gates in deterministic, alphabetical order
Without this, the effect of the following feature gate config would be random:

    featureGates:
      AllBeta: false
      SomeBetaFeature: true

That's random because Go randomizes the iteration order of the map and
`AllBeta: false` would disable `SomeBetaFeature` if (and only if) applied last.

Now by sorting alphabetically, AllAlpha/Beta come first in practice. It's not a
complete solution because some future feature gate name might sort before them.
2025-09-30 16:53:39 +02:00
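A minimal sketch of the idea, assuming a simplified model of the feature gate machinery: `sortedGateNames` and `applyGates` are hypothetical helpers, and `AllBeta` is modeled as simply overwriting every known beta gate. Only the sorting step reflects what the commit message describes.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedGateNames returns the feature gate names in alphabetical order so
// that they can be applied in a reproducible sequence. Ranging over the map
// directly would yield a different order from run to run.
func sortedGateNames(gates map[string]bool) []string {
	names := make([]string, 0, len(gates))
	for name := range gates {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// applyGates is a hypothetical stand-in for the real feature gate code.
// It applies the gates one by one, with "AllBeta" setting all beta gates.
func applyGates(gates map[string]bool, betaGates []string) map[string]bool {
	effective := map[string]bool{}
	for _, name := range sortedGateNames(gates) {
		if name == "AllBeta" {
			for _, beta := range betaGates {
				effective[beta] = gates[name]
			}
			continue
		}
		effective[name] = gates[name]
	}
	return effective
}

func main() {
	gates := map[string]bool{"SomeBetaFeature": true, "AllBeta": false}
	effective := applyGates(gates, []string{"SomeBetaFeature", "OtherBetaFeature"})
	// "AllBeta" sorts before "SomeBetaFeature", so the specific gate wins.
	fmt.Println(effective["SomeBetaFeature"], effective["OtherBetaFeature"]) // prints "true false"
}
```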
Maciej Skoczeń
d2e6be440c Revert "Merge pull request #133213 from sanposhiho/second-trial-conor"
This reverts commit a2bf45b081, reversing
changes made to 2b2ea27250.
2025-09-24 11:05:16 +00:00
Kubernetes Prow Robot
d3cb6b539d
Merge pull request #133178 from liggitt/psa-emulation
make admission and pod-security-admission checks be informed by emulation version
2025-09-17 17:22:07 -07:00
Kubernetes Prow Robot
2c20282928
Merge pull request #133715 from cici37/MAPStorageVersionUpdate
Update MAP storage version to use v1beta1
2025-09-17 12:50:07 -07:00
Jordan Liggitt
55419eca7a
Plumb effective version into admission initializer 2025-09-17 15:23:31 -04:00
Jefftree
872981a205 Rename CLE test directories 2025-09-12 21:17:06 +00:00
Kubernetes Prow Robot
08bc37c755
Merge pull request #134021 from pohly/scheduler-perf-gc
scheduler_perf: run garbage collection before measurement
2025-09-12 03:18:15 -07:00
Kubernetes Prow Robot
601068b889
Merge pull request #134018 from tico88612/cleanup/new-indexer-informer-watcher-replace
Replace NewIndexerInformerWatcher with NewIndexerInformerWatcherWithLogger
2025-09-12 03:18:08 -07:00
Kubernetes Prow Robot
69e637f24c
Merge pull request #131755 from jpbetz/openapi-type-name-gen
Allow OpenAPI model package names to be declared by APIs
2025-09-11 12:26:08 -07:00
Kubernetes Prow Robot
ca78eafa24
Merge pull request #134010 from pohly/scheduler-perf-docs
scheduler_perf: KUBE_CACHE_MUTATION_DETECTOR=false in docs
2025-09-11 11:36:14 -07:00
ChengHao Yang
029d314e15
Replace NewIndexerInformerWatcher with NewIndexerInformerWatcherWithLogger
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-09-12 01:44:02 +08:00
Patrick Ohly
16fa150182 scheduler_perf: run garbage collection before measurement
The startup phase may have allocated memory that can be garbage-collected.
Forcing GC to run before measurements avoids noise if the garbage collection
kicks in during the measurement and potentially reduces the heap size reported
by metrics.

The exact effect has not been measured; it just seems useful.
2025-09-11 19:25:20 +02:00
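Forcing a collection before reading heap metrics can be sketched with the Go runtime package. The helper name `heapAfterGC` and the simulated startup phase are assumptions made for illustration; how scheduler_perf wires this in is not shown here.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapAfterGC forces a garbage collection cycle and then returns the number
// of heap bytes still allocated, so that garbage left over from an earlier
// phase does not inflate the reading.
func heapAfterGC() uint64 {
	runtime.GC()
	var stats runtime.MemStats
	runtime.ReadMemStats(&stats)
	return stats.HeapAlloc
}

func main() {
	// Simulate a startup phase that leaves garbage behind.
	for i := 0; i < 1000; i++ {
		_ = make([]byte, 1<<16)
	}
	fmt.Printf("heap at start of measurement: %d bytes\n", heapAfterGC())
}
```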
Kubernetes Prow Robot
e7d7d8984e
Merge pull request #133996 from macsko/disable_too_short_scheduler_perf_workloads
Disable too short scheduler_perf workloads
2025-09-11 09:28:07 -07:00
Maciej Skoczeń
0c0acbc535 Disable too short scheduler_perf workloads 2025-09-11 11:48:45 +00:00
Patrick Ohly
9f31b00908 scheduler_perf: KUBE_CACHE_MUTATION_DETECTOR=false in docs
When looking at a CPU profile, the cache mutation detection stood out. "make
test-integration" enables it by default. We try to benchmark "real" production
setups, therefore we have to prevent that by setting it to false ourselves.
2025-09-11 12:37:51 +02:00
Kubernetes Prow Robot
d433db0782
Merge pull request #133978 from aramase/aramase/i/fix_kms_test_133945
kmsv2: run `TestKMSv2ProviderKeyIDStaleness` tests in parallel
2025-09-10 15:05:55 -07:00
Anish Ramasekar
480fad996d
kmsv2: run TestKMSv2ProviderKeyIDStaleness in parallel
This change updates the NowFunc to be per KMS provider instead of global
to the API server. This allows integration tests that use distinct
provider names to run in parallel when simulating key expiry.

Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2025-09-10 14:15:43 -07:00
Joe Betz
fc091d93d5 Update tests that depend on internal model names
Signed-off-by: Joe Betz <jpbetz@google.com>
2025-09-10 15:52:59 -04:00
Kubernetes Prow Robot
5cef241d82
Merge pull request #133218 from nmn3m/kube-controller-manager-statuz
adds a list of available HTTP endpoints for the kube-controller-manag…
2025-09-10 11:48:22 -07:00
Kubernetes Prow Robot
2854e946c3
Merge pull request #131430 from carlory/follow-up-127017
remove v1beta3 flowcontrol from rest storage
2025-09-10 11:48:05 -07:00
Kubernetes Prow Robot
bbd859808d
Merge pull request #133921 from dims/update-prometheus-client-golang-and-common-packages
update prometheus' client_golang and common packages
2025-09-10 08:40:06 -07:00
carlory
7e6aafe157
fix integration test
Signed-off-by: carlory <baofa.fan@daocloud.io>
2025-09-10 22:51:41 +08:00
Kubernetes Prow Robot
a9fe67a1dd
Merge pull request #133990 from macsko/minor_fixes_in_scheduler
Fix minor inconsistencies in scheduler
2025-09-10 06:58:11 -07:00
Patrick Ohly
83273e21b9 DRA scheduler_perf: clean up usage of steady-state pod scheduling
The steady-state pod scheduling is less suitable for integration tests because
the duration is either short (making the test potentially flaky if nothing gets
scheduled yet due to the time constraint) or long (making the test run too
long). It is more useful for benchmark testcases because of the bounded runtime.

Now a single workload definition can be used in both modes with a configuration
parameter for "steadyState".

Workload definitions get updated accordingly. While at it, their names get
simplified and some (in the case of the main DRA config) redundant testcases
get removed.
2025-09-10 13:47:08 +02:00
Patrick Ohly
9af3e86810 scheduler_perf: detect testcases with no pods scheduled
Some of the DRA testcases schedule pods in a steady state for a certain
duration. They pass even if no pods got scheduled at all because in contrast to
the non-steady-state variants they don't wait for a fixed number of pods to be
scheduled. This made them unsuitable for integration testing because a real
problem is not flagged as a test failure. Now "zero pods scheduled" is detected
for them.

However, they are still not good integration tests (either run quickly and then
risk being flaky or run for a longer time period and then are slow). Revisiting
how they are used in configurations will be done separately.
2025-09-10 13:47:08 +02:00
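The added check can be pictured as a simple post-condition on the number of scheduled pods. This is only a sketch under assumed names (`checkScheduled`, `steadyState`, `wantPods`); the real scheduler_perf code waits for pods rather than checking a final count in the non-steady-state case.

```go
package main

import (
	"errors"
	"fmt"
)

// checkScheduled validates the outcome of a workload. Steady-state workloads
// have no fixed target, so the only failure that can be detected for them is
// that nothing was scheduled at all.
func checkScheduled(steadyState bool, wantPods, scheduledPods int) error {
	if steadyState {
		if scheduledPods == 0 {
			return errors.New("no pods scheduled during steady-state phase")
		}
		return nil
	}
	if scheduledPods < wantPods {
		return fmt.Errorf("scheduled %d pods, want %d", scheduledPods, wantPods)
	}
	return nil
}

func main() {
	fmt.Println(checkScheduled(true, 0, 0))  // prints "no pods scheduled during steady-state phase"
	fmt.Println(checkScheduled(true, 0, 17)) // prints "<nil>"
}
```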
Maciej Skoczeń
3dfcda9afd Fix minor inconsistencies in scheduler 2025-09-10 11:40:10 +00:00
Kubernetes Prow Robot
2e13e70d5f
Merge pull request #133983 from sunya-ch/simplified-consumable-capacity-integration-test
DRA: Fix ConsumableCapacity scheduler perf test (simplified)
2025-09-10 02:16:14 -07:00
Sunyanan Choochotkaew
5483c52e10
DRA: Fix ConsumableCapacity scheduler perf test (simplified)
Signed-off-by: Sunyanan Choochotkaew <sunyanan.choochotkaew1@ibm.com>
2025-09-10 13:41:37 +09:00
Keisuke Ishigami
6d0138d3f1 modify etcd data for integration test 2025-09-10 09:48:21 +09:00
Morten Torkildsen
81cb5b7df2 DRA: Fix PrioritizedList scheduler perf test 2025-09-09 22:13:32 +00:00
Davanum Srinivas
e2e7fa1799 switch our usage of expfmt.TextParser 2025-09-09 15:53:48 -04:00
Kubernetes Prow Robot
8548f32a46
Merge pull request #133941 from pohly/scheduler-perf-dra-resourceslices
scheduler_perf: treat ResourceSlice publishing as workload startup
2025-09-09 11:57:56 -07:00
Patrick Ohly
8b50c77eb6 scheduler_perf: measure DRA setup time
The time required for pulling ResourceSlices into the scheduler is relevant in
two cases:
- The scheduler was (re)started and waits for informers to sync.
- A driver got deployed and needs to inform the scheduler about its devices.

The new workload measures the second scenario. It's indirectly relevant for
the first one because it allows drawing conclusions about the code which is also
involved in the first one.
2025-09-09 15:15:29 +02:00
Kubernetes Prow Robot
1bec132e1e
Merge pull request #133939 from pohly/scheduler-perf-testing-B-metrics
scheduler_perf: reset and stop testing.B metrics
2025-09-09 02:59:31 -07:00
Patrick Ohly
8ff5cec261 scheduler_perf: block after creating ResourceSlices
After creating ResourceSlices, the workload was allowed to proceed even while
the scheduler was still busy receiving those new ResourceSlices. This blurred
the line between "setup" and "measurement" phase of DRA workloads. It's not
immediately clear how much that affected results, but it is cleaner to block.

This is done by returning the scheduler instance to the main scheduler_perf
loop and then passing the SharedDRAManager into the driver setup operation. There
it can be used to poll until that manager has processed all ResourceSlices.
2025-09-08 19:36:32 +02:00
Patrick Ohly
af6da561dd scheduler_perf: reset and stop testing.B metrics
Before, metrics gathered by testing.B (runtime_seconds,
-benchmem's B/op and allocs/op) covered the entire test case, including
starting the apiserver and the initialization steps of a workload. Now those
metrics are also limited to the period where the workload is configured to
collect metrics.
2025-09-08 19:17:24 +02:00
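In plain Go benchmarks, the same effect is achieved with `b.ResetTimer` and `b.StopTimer`. The sketch below is a generic illustration and not the scheduler_perf implementation: `setup` and `workload` are hypothetical placeholders for starting the apiserver and for the measured part of a workload.

```go
package main

import (
	"fmt"
	"testing"
	"time"
)

var sink int

// setup stands in for expensive initialization that should not be measured.
func setup() { time.Sleep(10 * time.Millisecond) }

// workload stands in for the part of the test case that should be measured.
func workload() {
	for i := 0; i < 1000; i++ {
		sink += i
	}
}

func benchmarkWorkload(b *testing.B) {
	setup()

	// Discard the time and allocations accumulated during setup.
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		workload()
	}
	// Stop measuring before any cleanup work runs.
	b.StopTimer()

	setup() // Stands in for teardown; not included in the results.
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	result := testing.Benchmark(benchmarkWorkload)
	fmt.Println(result.N > 0, result.NsPerOp() < (10 * time.Millisecond).Nanoseconds())
}
```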