kubernetes/test/e2e_node/jenkins
Danielle Lancashire 0cc8af82a1 e2e_node: use upstream gpu installer
The current GPU installer was built in 2017, from source that no longer
exists in Kubernetes ([adding commit][1]). The image was built on 2017-06-13.

Unfortunately, this installer no longer appears to work. When debugging
on the same node type as used by test-infra, it failed to build the
driver as the kernel sha was no longer available.

This led to needing to find a new way to install the GPU driver. The
smallest logical change was switching to [cos-gpu-installer][2]. There
is a newer version of this available on [googlesource][3] that I have
not yet tested, as the state of the project is not clear: I couldn't
find docs outside of the source itself.

We install things to the same location as before to avoid needing
extra downstream changes. There are a couple of odd issues here,
however, such as needing to run the container twice to correctly update
the LD cache.

[1]: 1e77594958/cluster/gce/gci/nvidia-gpus/Dockerfile
[2]: https://github.com/GoogleCloudPlatform/cos-gpu-installer
[3]: https://cos.googlesource.com/cos/tools/+/refs/heads/master/src/cmd/cos_gpu_installer/
2021-08-26 14:09:45 +02:00
conformance Merge pull request #74488 from xichengliudui/fixshellcheck19022502 2019-03-01 12:49:08 -08:00
docker_validation clean up *.properties files 2018-04-17 21:44:32 -07:00
copy-e2e-image.sh fix shellcheck in test/e2e_node/jenkins/... 2019-02-26 01:01:48 -05:00
coreos-init.json Make coreos test images sshd not allow password login. 2017-08-25 11:49:34 -07:00
cos-init-disable-live-restore.yaml Add a cloud-init script to disable live-restore 2017-11-14 21:40:13 -08:00
cos-init-docker.yaml fix all the typos across the project 2018-02-11 11:04:14 +08:00
cos-init-live-restore.yaml [test/e2e_node]Redirect dl.k8s.io to the kubernetes-release GCS bucket 2017-11-02 12:18:50 +08:00
e2e-node-jenkins.sh opt out of module mode for builds 2019-11-06 17:39:05 -05:00
gci-init-gpu.yaml e2e_node: use upstream gpu installer 2021-08-26 14:09:45 +02:00
gci-init.yaml [test/e2e_node]Redirect dl.k8s.io to the kubernetes-release GCS bucket 2017-11-02 12:18:50 +08:00
OWNERS Add mrunalp as node approver 2020-11-04 15:48:30 -08:00
README.md Add a notice for node e2e config files 2017-10-20 16:10:13 -07:00
ubuntu-14.04-nvidia-install.sh fix shellcheck in test/e2e_node/jenkins/... 2019-02-26 01:01:48 -05:00
ubuntu-init-docker.yaml fix all the typos across the project 2018-02-11 11:04:14 +08:00

Node e2e job migration notice:

Sig-testing is actively migrating node e2e jobs from Jenkins to Prow, and we are moving *.properties and image-config.yaml files to test-infra.

If you want to update those files, please also update them in test-infra.

If you have any questions, please contact @krzyzacy or #sig-testing.

Here's where the existing node e2e job configs live:

Image config files

Node test job args (.properties equivalent)