Bug #5687

closed

Document and implement realtime scheduling hierarchy

Added by msuraev 2 months ago. Updated 14 days ago.

Status: Closed
Priority: Normal
Assignee:
Target version: -
Start date: 09/20/2022
Due date:
% Done: 100%
Spec Reference: https://man7.org/linux/man-pages/man7/sched.7.html

Description

When running all the components as a NITB setup, it's not obvious which of them should get realtime scheduling or how to set their priorities.

Let's document this in https://osmocom.org/projects/cellular-infrastructure/wiki/Osmocom_Network_In_The_Box#Realtime-scheduling-hierarchy and implement it as default priority/policy values in the corresponding systemd units.


Related issues

Related to Cellular Network Infrastructure - Feature #4107: Start systemd services as non-root user (In Progress, msuraev, 07/15/2019)

Actions #1

Updated by msuraev 2 months ago

  • Related to Feature #4107: Start systemd services as non-root user added
Actions #2

Updated by msuraev 2 months ago

Note: according to https://man7.org/linux/man-pages/man7/sched.7.html, POSIX.1 requires support for only a minimum of 32 distinct priority levels, so it's probably better to stick to 32 levels instead of using the full Linux range of 1 (low) to 99 (high).
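
To see what a given host actually offers, the range can be queried at runtime. This is a minimal sketch, assuming Linux and the scheduling wrappers in Python's `os` module; the helper names `rt_priority_range` and `within_portable_levels` are made up for illustration and are not part of any Osmocom code.

```python
import os

# POSIX.1 only guarantees this many distinct realtime priority levels.
POSIX_MIN_LEVELS = 32


def rt_priority_range(policy=os.SCHED_RR):
    """Return (lowest, highest) static priority the kernel accepts for policy."""
    return os.sched_get_priority_min(policy), os.sched_get_priority_max(policy)


def within_portable_levels(priority, lowest):
    """True if priority is one of the first 32 levels counted from lowest."""
    return lowest <= priority < lowest + POSIX_MIN_LEVELS


if __name__ == "__main__":
    lo, hi = rt_priority_range()
    print(f"SCHED_RR priorities on this host: {lo}..{hi}")
    print("priority 21 within the portable levels:", within_portable_levels(21, lo))
```

Counting the 32 levels upwards from the lowest priority is an assumption of this sketch; POSIX.1 only states how many levels must exist, not which numbers they use.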

Actions #3

Updated by msuraev about 2 months ago

  • Spec Reference set to https://man7.org/linux/man-pages/man7/sched.7.html

Based on the discussion in #4107, it seems we have four distinct scheduling classes, in order of importance:

1. OsmoTRX
2. OsmoBTS (and OsmoPCU?)
3. OsmoMGW
4. Everything else - no realtime requirements.

pespin, could you clarify which change in OsmoPCU you were referring to that made it time-sensitive?

Based on the above, I propose the following default realtime priorities:

1. OsmoTRX: 21
2. OsmoBTS: 14, OsmoPCU: 11
3. OsmoMGW: 1

Scheduling policy: round-robin (SCHED_RR), as was used before.

Note: a higher number means a higher scheduling priority, and even the lowest realtime priority takes precedence over non-realtime processes.
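
As an illustration of how this proposal could map onto systemd units, here is a small sketch that renders a `[Service]` drop-in per component. `CPUSchedulingPolicy=` and `CPUSchedulingPriority=` are the systemd.exec settings for this; the `render_dropin` helper and the dictionary are hypothetical, and the numbers are the ones proposed in this note, not necessarily what ended up in the merged patches.

```python
# Priorities exactly as proposed in this note; not necessarily what was merged.
PROPOSED_RT_PRIORITIES = {
    "OsmoTRX": 21,
    "OsmoBTS": 14,
    "OsmoPCU": 11,
    "OsmoMGW": 1,
}


def render_dropin(priority, policy="rr"):
    """Render a [Service] drop-in setting the CPU scheduling policy and priority."""
    if not 1 <= priority <= 99:
        raise ValueError("Linux realtime priorities range from 1 to 99")
    return (
        "[Service]\n"
        f"CPUSchedulingPolicy={policy}\n"
        f"CPUSchedulingPriority={priority}\n"
    )


if __name__ == "__main__":
    for component, prio in PROPOSED_RT_PRIORITIES.items():
        print(f"# {component}")
        print(render_dropin(prio))
```

Everything not listed (class 4) simply gets no such drop-in and stays on the default non-realtime policy.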

Actions #4

Updated by pespin about 2 months ago

msuraev, there's no "specific change" which made OsmoPCU time-sensitive; it always has been, AFAICT. It's time-sensitive because it acts as part of osmo-bts-trx (or the direct PHY in the case of sysmo/oc2g/lc15): it is requested to send DL blocks at specific points in time (RTS) with a specific deadline (the time advance). If it misses that deadline, the DL block/burst is not prepared in time to be sent over the air interface (specific FN+TN).

I usually use the same rt prio for osmo-bts-trx and osmo-pcu (and did most tests that way while heavily refactoring osmo-pcu). Having a lower prio in osmo-pcu than in osmo-bts-trx makes no sense, since it means that osmo-bts-trx could starve osmo-pcu, and hence PDCH blocks would never be sent at the right time but later than expected.

Actions #5

Updated by fixeria about 2 months ago

pespin wrote in #note-4:

Having a lower prio in osmo-pcu than in osmo-bts-trx makes no sense, since it means that osmo-bts-trx could starve osmo-pcu, and hence PDCH blocks would never be sent at the right time but later than expected.

I don't have a strong opinion here (I'm running everything with the default prio myself), but it could actually make sense to give osmo-bts-trx a higher priority. The main difference between osmo-bts-trx and osmo-pcu is that the former operates on bursts, while the latter operates on blocks of bursts. Because of that, osmo-bts-trx unblocks the select() loop at least 4 times more often than osmo-pcu.
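
A rough back-of-the-envelope check of the "at least 4 times" figure, per timeslot. It relies on standard GSM timing (26 TDMA frames last 120 ms, and a 52-multiframe on a PDCH carries 12 radio blocks of 4 bursts each) and assumes, purely for illustration, one wake-up per burst and one per block; the real select() loops are not modelled.

```python
# GSM timing: 26 TDMA frames last 120 ms; a 52-multiframe on a PDCH
# carries 12 radio blocks, each spanning 4 bursts.
FRAME_SECONDS = 0.120 / 26
FRAMES_PER_MULTIFRAME = 52
BLOCKS_PER_MULTIFRAME = 12


def bursts_per_second():
    """Bursts per second on one timeslot (one burst per TDMA frame)."""
    return 1.0 / FRAME_SECONDS


def blocks_per_second():
    """Radio blocks per second on one PDCH timeslot."""
    return BLOCKS_PER_MULTIFRAME / (FRAMES_PER_MULTIFRAME * FRAME_SECONDS)


def burst_to_block_ratio():
    """How many bursts are handled for every radio block."""
    return bursts_per_second() / blocks_per_second()


if __name__ == "__main__":
    print(f"bursts/s: {bursts_per_second():.1f}")
    print(f"blocks/s: {blocks_per_second():.1f}")
    print(f"ratio:    {burst_to_block_ratio():.2f}")
```

The ratio comes out slightly above 4 because the 52-multiframe also contains frames that carry no radio block.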

Actions #6

Updated by pespin about 2 months ago

Why do you think the number of times the select loop runs matters? It doesn't matter whether it works on bursts or blocks; it will send nothing correctly if the block from osmo-pcu doesn't arrive in time.
So yeah, you'd have a higher prio to send stuff that never arrived because whoever was expected to provide it is not being scheduled ASAP.
As a reminder, rts-advance exists for a reason (BTS<->PCU), and by lowering the osmo-pcu prio you are potentially asking for a higher rts-advance.
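
The trade-off between scheduling delay and rts-advance can be sketched with simple arithmetic. This is only a toy model: it assumes rts-advance is counted in TDMA frames, and the helper names and the delay and processing figures are invented for illustration, not measured values.

```python
import math

FRAME_MS = 120.0 / 26  # duration of one GSM TDMA frame, about 4.615 ms


def block_in_time(rts_advance_frames, sched_delay_ms, processing_ms):
    """True if the PCU can answer an RTS before the advance window closes.

    sched_delay_ms: how long the PCU waits before it gets the CPU (hypothetical).
    processing_ms:  how long it then needs to build the DL block (hypothetical).
    """
    budget_ms = rts_advance_frames * FRAME_MS
    return sched_delay_ms + processing_ms <= budget_ms


def min_rts_advance_frames(sched_delay_ms, processing_ms):
    """Smallest whole number of frames of advance that still meets the deadline."""
    return math.ceil((sched_delay_ms + processing_ms) / FRAME_MS)


if __name__ == "__main__":
    # A longer wait for the CPU pushes the required advance up.
    for delay in (0.5, 5.0, 12.0):
        print(delay, "ms delay ->", min_rts_advance_frames(delay, 1.0), "frame(s)")
```

In this model, anything that lets another process delay the PCU (such as a lower realtime priority) shows up directly as a larger required advance.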

Actions #7

Updated by msuraev 22 days ago

  • Status changed from New to In Progress
  • Assignee set to msuraev
  • % Done changed from 0 to 10
Actions #8

Updated by msuraev 22 days ago

  • % Done changed from 10 to 30

The proposal is implemented under https://gerrit.osmocom.org/q/topic:rtprio

Actions #9

Updated by msuraev 14 days ago

  • Status changed from In Progress to Closed
  • % Done changed from 30 to 100

All patches merged, wiki updated.
