Bug #4839
closed
docker.io sometimes returns EOF, breaking our builds
100%
Description
We have plenty of situations where docker.io seemingly returns EOF (i.e. nothing) when pulling a base image like debian:stretch. The failure to pull will cause our jenkins job (e.g. a TTCN3 test) to fail, despite no failure on our side.
This has appeared even before docker introduced rate limiting today, so it is unrelated to that.
Related issues
Updated by laforge over 3 years ago
- Status changed from New to In Progress
- % Done changed from 0 to 10
I tried to use our docker/registry instance at registry.sysmocom.de as a 'pull-through cache' as documented at https://docs.docker.com/registry/recipes/mirror/
This is broken: it has been a known bug in docker since 2016, see https://github.com/docker/distribution/issues/1486 and many other reports like https://www.reddit.com/r/docker/comments/bek6yv/how_do_you_do_registrymirror_with_auth/
So what we are moving towards is a setup where:
- one jenkins job does a daily pull of all our base images from docker.io, and pushes them to the private registry
- our jenkins jobs will then always pull directly from that private registry instead of the public one
If the pull from docker.io then fails occasionally, it will fail that re-sync jenkins job, but the (ttcn3 and other) jobs that verify osmocom software will not fail, and simply use the 1..N days old base image.
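A minimal sketch of what such a re-sync step could look like. The actual job script is not shown in this ticket, so the function names (mirror_name, sync_images) and the DOCKER override variable are hypothetical. Unlike a script run with `sh -e`, this sketch keeps going after a failed image and only reports the failure at the end.

```shell
#!/bin/sh
# Sketch only: helper names and the DOCKER override are assumptions,
# not the contents of the real jenkins job.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo for a dry run

# Map a public image name to its name in the private registry,
# e.g. debian:stretch -> <registry>/debian:stretch
mirror_name() {
    printf '%s/%s\n' "$1" "$2"
}

# Pull each image from docker.io, re-tag it and push it to the private
# registry. Returns non-zero if any image failed, but still tries the rest.
sync_images() {
    registry="$1"
    shift
    rc=0
    for src in "$@"; do
        dst="$(mirror_name "$registry" "$src")"
        echo "======= $src => $dst ======="
        if "$DOCKER" pull "$src" &&
           "$DOCKER" tag "$src" "$dst" &&
           "$DOCKER" push "$dst"; then
            :
        else
            echo "sync of $src failed" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

It could be invoked as `sync_images registry.example.org debian:stretch debian:buster`, where registry.example.org stands in for the private registry. A non-zero exit status then fails only the re-sync job, while the other jobs keep using the images already in the private registry.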
Updated by laforge over 3 years ago
https://gerrit.osmocom.org/c/docker-playground/+/21019 prepares our Dockerfiles with a way to override the registry when building images.
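One common way to make the registry overridable is a build argument. The sketch below assumes (this is not taken from the patch itself) that the Dockerfile starts with `ARG REGISTRY=docker.io` followed by `FROM ${REGISTRY}/debian:stretch`. The helper name build_with_registry and the DOCKER override are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes a Dockerfile of the form
#   ARG REGISTRY=docker.io
#   FROM ${REGISTRY}/debian:stretch
# which may differ from what the referenced patch actually does.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print the command instead

# Build an image, passing the registry for the base image as a build
# argument. Falls back to docker.io when REGISTRY is unset or empty.
build_with_registry() {
    image="$1"
    context="$2"
    registry="${REGISTRY:-docker.io}"
    "$DOCKER" build --build-arg "REGISTRY=$registry" -t "$image" "$context"
}
```

With `REGISTRY=registry.osmocom.org` set in the job environment, the base image would then be fetched from the private registry. Leaving the variable unset keeps the default of docker.io.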
Updated by laforge over 3 years ago
- Status changed from In Progress to Resolved
- % Done changed from 10 to 100
Related patches all merged, hopefully those problems are now gone.
- https://gerrit.osmocom.org/c/docker-playground/+/21019
- https://gerrit.osmocom.org/c/osmo-ci/+/21021
- https://gerrit.osmocom.org/c/osmo-ci/+/21023
I've manually verified that the registry-update-base-images job works, and also executed ttcn3-stp-test once to see if it actually pulls from registry.osmocom.org now.
Updated by laforge over 3 years ago
- Related to Feature #4840: migrate osmo-gsm-tester docker images to registry.osmocom.org added
Updated by laforge over 3 years ago
And of course, on day 1 of this new mechanism, we see:
- the docker image update job failing:
[registry-update-base-images] $ /bin/sh -xe /tmp/jenkins5987388568045535390.sh
+ REGISTRY=registry.osmocom.org
+ IMAGES=debian:stretch debian:buster debian:jessie debian:sid ubuntu:zesty centos:centos8
+ src=debian:stretch
+ dst=registry.osmocom.org/debian:stretch
+ echo
+ echo ======= debian:stretch ======= debian:stretch
+ docker pull debian:stretch
Error response from daemon: Get https://registry-1.docker.io/v2/library/debian/manifests/stretch: EOF
Build step 'Execute shell' marked build as failure
while all other builds succeed, using base images from registry.osmocom.org.
yay.
Updated by laforge over 3 years ago
- Related to Bug #4850: ttcn3-gbproxy-test* are not generated by jenkins-job-builder added