Bug #5453

closed

debian 10 x32: apt is broken in debian bullseye docker images

Added by osmith 10 months ago. Updated 20 days ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
Start date: 02/11/2022
Due date:
% Done: 100%
Spec Reference:

Description

apt is failing in debian bullseye images on the raspberry pis:

W: GPG error: http://security.debian.org/debian-security bullseye-security InRelease: At least one invalid signature was encountered

To fix this, we need to upgrade libseccomp to 2.4.2 or newer and Docker to 19.03.9 or newer.

See https://github.com/debuerreotype/docker-debian-artifacts/issues/116#issuecomment-776132682 and referenced issues.

The raspberries still run a raspbian based on debian 10, where these newer versions are not available. They are available in debian 11.
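A minimal sketch of how a host could be checked against the two minimum versions named above (libseccomp 2.4.2, Docker 19.03.9). The helper names `parse_version` and `meets_minimum` are hypothetical, and the "installed" version strings in the usage part are illustrative placeholders, not values read from the actual machines. How the version strings are obtained (e.g. from the package manager) is left out on purpose.

```python
import re

# Minimum versions taken from the issue description.
MIN_LIBSECCOMP = "2.4.2"
MIN_DOCKER = "19.03.9"


def parse_version(text):
    """Return the leading dotted numeric part of a version string as a tuple.

    A Debian-style epoch prefix ("5:") is dropped first, and everything
    after the dotted number (e.g. "-1" or "~3-0~debian-buster") is ignored.
    """
    without_epoch = text.strip().split(":", 1)[-1]
    match = re.match(r"\d+(?:\.\d+)*", without_epoch)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # Hypothetical installed versions, for illustration only.
    checks = [
        ("libseccomp", "2.3.3-4", MIN_LIBSECCOMP),
        ("docker", "5:19.03.9~3-0~debian-buster", MIN_DOCKER),
    ]
    for name, installed, minimum in checks:
        status = "ok" if meets_minimum(installed, minimum) else "too old"
        print(f"{name}: {installed} (need >= {minimum}): {status}")
```

Comparing tuples of integers rather than raw strings avoids the usual pitfall where "19.03.10" would sort before "19.03.9" as text.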


Checklist

  • upgrade raspberry pis to raspbian 11
  • skip building debian 11 based docker image in gtp0-deb10build32 as workaround
  • replace gtp0-deb10build32 lxc with a debian 11 based lxc
Actions #2

Updated by osmith 10 months ago

  • Checklist item upgrade raspberry pis to raspbian 11 added
  • Checklist item skip building debian 11 based docker image in gtp0-deb10build32 as workaround added
  • Checklist item replace gtp0-deb10build32 lxc with a debian 11 based lxc added
  • Subject changed from raspberry pis: apt is broken in debian bullseye docker images to debian 10 x32: apt is broken in debian bullseye docker images
  • Status changed from New to In Progress
  • % Done changed from 0 to 30

Since this recently merged patch, we are building a debian 11 based debian-bullseye-erlang image as part of the update-osmo-ci-on-slaves jenkins job.

As a result, this job is now failing for all raspberry pis, and also for gtp0-deb10build32.

I'm upgrading the raspberries from raspbian 10 to 11 to fix it there.

For gtp0-deb10build32 we should probably create a new lxc container with deb11 in the name and retire the old one, but I can't do that as I don't have root permissions on gtp0. As a workaround, here is a patch that skips generating that docker image on debian 10 x86:
https://gerrit.osmocom.org/c/osmo-ci/+/27265

After this is merged, and I'm finished with upgrading the raspberry pis, the job should not fail anymore.
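The workaround described above (skipping the debian 11 based image on debian 10 hosts) could be sketched roughly as follows. This is not the content of the linked gerrit patch; it only illustrates the idea by reading the `ID` and `VERSION_ID` fields of `/etc/os-release`. The function names and the choice to treat unknown systems as "do not skip" are assumptions.

```python
def parse_os_release(text):
    """Parse KEY=value lines in os-release format into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def should_skip_bullseye_image(os_release_text, min_major=11):
    """True if the host is debian/raspbian with a release older than min_major.

    Hosts that are not debian-like, or that have no numeric VERSION_ID,
    are not skipped (an assumption made for this sketch).
    """
    fields = parse_os_release(os_release_text)
    if fields.get("ID") not in ("debian", "raspbian"):
        return False
    try:
        major = int(fields.get("VERSION_ID", "").split(".")[0])
    except ValueError:
        return False
    return major < min_major


if __name__ == "__main__":
    with open("/etc/os-release") as handle:
        if should_skip_bullseye_image(handle.read()):
            print("skipping debian 11 based docker image on this host")
        else:
            print("building debian 11 based docker image")
```

Keying the decision on the host release is only a proxy: the real constraint is the libseccomp and Docker version, so a host with backported packages would be skipped unnecessarily.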

Actions #3

Updated by osmith 10 months ago

  • Checklist item upgrade raspberry pis to raspbian 11 set to Done
  • Checklist item skip building debian 11 based docker image in gtp0-deb10build32 as workaround set to Done
  • Status changed from In Progress to Stalled
  • Assignee changed from osmith to laforge
  • % Done changed from 30 to 70

Raspberries upgraded, workaround for gtp0-deb10build32 is merged. The jenkins job should be passing again.

Assigning to Harald, as I currently can't replace the gtp0 lxc due to lack of permissions.

Actions #4

Updated by laforge 7 months ago

  • Status changed from Stalled to In Progress
  • Assignee changed from laforge to osmith
  • % Done changed from 70 to 80

I've created a new gtp0-deb11build-i586 lxc container and assigned it to 10.34.2.104. Your SSH root key is installed. The lxc config is copied from the debian11 container.

I'm running the ansible playbook for jenkins build slaves against it right now, and after fixing one bug (https://gerrit.osmocom.org/c/osmo-ci/+/28145) it seems to run further (it's not complete yet).

Please let me know if the deb9build32-i687 and/or deb10build32-i586 lxc containers on gtp0 are still needed or if they can be destroyed.

Actions #5

Updated by laforge 7 months ago

it failed now with

TASK [osmocom-jenkins-slave : install build utilities] ****************************************************
fatal: [gtp0-deb11build]: FAILED! => {
    "changed": false
}

MSG:

No package matching 'dh-systemd' is available

leaving it to you to fix the ansible playbook and add it as slave to jenkins etc.

Actions #6

Updated by osmith 7 months ago

  • Checklist item replace gtp0-deb10build32 lxc with a debian 11 based lxc set to Done
  • Assignee changed from osmith to laforge
  • % Done changed from 80 to 90

With these additional patches, ansible now runs through:
https://gerrit.osmocom.org/q/topic:ansible-jenkins

I have added gtp0-deb11build32 as jenkins node and removed the previous gtp0-deb10build32 jenkins node. The configuration is based on the old one.

One thing I've noticed in the node configuration: "Usage" was set to "Only build jobs with label expressions matching this node". But the label was not used anywhere. I guess we should change it to "Use this node as much as possible"?

Please let me know if the deb9build32-i687 and/or deb10build32-i586 lxc containers on gtp0 are still needed or if they can be destroyed.

They are not needed anymore.

Actions #7

Updated by laforge 29 days ago

  • Assignee changed from laforge to osmith

osmith wrote in #note-6:

One thing I've noticed in the node configuration: "Usage" was set to "Only build jobs with label expressions matching this node". But the label was not used anywhere. I guess we should change it to "Use this node as much as possible"?

I think in general we should favor the build*.osmocom.org nodes over the gtp0 node. But to be honest, with our CI growth I don't really have the kind of overview that you have.

Please let me know if the deb9build32-i687 and/or deb10build32-i586 lxc containers on gtp0 are still needed or if they can be destroyed.

They are not needed anymore.

ok, stopped + destroyed them.

Actions #8

Updated by osmith 28 days ago

  • Status changed from In Progress to Feedback
  • Assignee changed from osmith to laforge

laforge wrote in #note-7:

osmith wrote in #note-6:

One thing I've noticed in the node configuration: "Usage" was set to "Only build jobs with label expressions matching this node". But the label was not used anywhere. I guess we should change it to "Use this node as much as possible"?

I think in general we should favor the build*.osmocom.org nodes over the gtp0 node. But to be honest, with our CI growth I don't really have the kind of overview that you have.

Well I'm wondering what the purpose of the node was/is/should be.

When looking at the jobs it runs, right now it doesn't seem to do anything useful. It spends most of the time building the debian-buster-jenkins docker image (update-osmo-ci-on-slaves). Sometimes it now also runs the pylint part of master-pysim (not intentionally, just because the label osmocom-master is used there, including this node), but I wouldn't expect this lint job to ever have a different output on x86 and x86_64.

Was the purpose to build the osmocom master jobs with external/vty tests on x86 from time to time to detect build errors on that arch? In that case I'd change the master jobs to include that node with its labels and thereby let jenkins randomly pick it from time to time.

Or maybe if there is no purpose and you would rather not have additional load on gtp0, maybe it makes sense to remove this node?

Actions #9

Updated by laforge 20 days ago

  • Assignee changed from laforge to osmith

osmith wrote in #note-8:

laforge wrote in #note-7:

osmith wrote in #note-6:

One thing I've noticed in the node configuration: "Usage" was set to "Only build jobs with label expressions matching this node". But the label was not used anywhere. I guess we should change it to "Use this node as much as possible"?

I think in general we should favor the build*.osmocom.org nodes over the gtp0 node. But to be honest, with our CI growth I don't really have the kind of overview that you have.

Well I'm wondering what the purpose of the node was/is/should be.

I don't know or at least don't remember.

Or maybe if there is no purpose and you would rather not have additional load on gtp0, maybe it makes sense to remove this node?

that's probably the best. if you don't know why it exists, and I don't know it :)

Actions #10

Updated by osmith 20 days ago

  • Status changed from Feedback to Resolved
  • % Done changed from 90 to 100

ack, removed it from jenkins
