Bug #6256

closed

osmo-bsc running out of file descriptors in production setups

Added by laforge 4 months ago. Updated 3 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 11/11/2023
Due date:
% Done: 100%
Spec Reference:

Description

We have a production setup with >= 200 BTSs, each with up to 4 TRX. In that setup, the OS default limit on open file descriptors (1024) is hit, resulting in call drops.

Given that there are N+1 sockets for each N-TRX BTS (e.g. 5 for a 4-TRX BTS), plus a handful for the AoIP interface, for speaking MGCP to the MGW, CTRL/VTY, etc., it is entirely plausible for such a setup to exceed 1024.
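To make the arithmetic concrete, here is a rough sketch of the estimate. The function name and the `other_fds` value of 32 are assumptions for illustration only (the "handful" of non-BTS sockets is not quantified in this ticket); only the N+1 sockets per N-TRX BTS figure comes from the description.

```python
DEFAULT_NOFILE = 1024  # common OS default soft limit mentioned above


def estimate_bsc_fds(num_bts: int, trx_per_bts: int, other_fds: int = 32) -> int:
    """Rough upper bound on file descriptors needed by osmo-bsc.

    Each N-TRX BTS needs N+1 sockets (from the description).
    other_fds is an ASSUMED placeholder for AoIP, MGCP, CTRL/VTY, etc.
    """
    return num_bts * (trx_per_bts + 1) + other_fds


if __name__ == "__main__":
    needed = estimate_bsc_fds(num_bts=200, trx_per_bts=4)
    print(f"estimated fds: {needed}, exceeds default: {needed > DEFAULT_NOFILE}")
```

With 200 BTSs of 4 TRX each, the BTS sockets alone come to 1000, so even a small number of additional descriptors pushes the total past 1024.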

Now the 1024 is not a limit imposed by osmo-bsc. However, we might want to either
  • add a note to the user manual informing users that they should tune their system accordingly if they expect a high number of BTSs, and/or
  • consider tuning the limit for osmo-bsc using LimitNOFILE in the systemd unit we're shipping.
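As a rough aid for the tuning mentioned above, the sketch below reads the open-file limit of the process it runs in and checks whether a given descriptor count fits. It uses Python's standard `resource` module (Unix only); the helper names are hypothetical, and it reports the limit of the Python process itself, not of a running osmo-bsc.

```python
import resource
from typing import Optional, Tuple


def nofile_limits() -> Tuple[int, int]:
    """Return the (soft, hard) RLIMIT_NOFILE of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_headroom(needed_fds: int, soft: Optional[int] = None) -> bool:
    """True if needed_fds fits under the soft limit (current process if not given)."""
    if soft is None:
        soft, _hard = nofile_limits()
    if soft == resource.RLIM_INFINITY:
        return True
    return needed_fds <= soft


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft={soft} hard={hard}")
    print("1032 fds fit:", has_headroom(1032))
```

When started from the same environment as the service (same user, same systemd settings), this gives a quick indication of whether the configured limit is large enough for the expected number of BTSs.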
We might also want to look at other osmo-* programs that might encounter similar problems. AFAICT it's probably only
  • osmo-hnbgw (only one Iuh socket for each HNB, though)
  • osmo-mgw (2 sockets per endpoint (RTP+RTCP), resulting in 4 sockets for each call, but a single-threaded mgw might not handle 256 calls anyway?)
  • osmo-gbproxy (number of sockets depends on the number of NS-VCs configured. But at least common setups probably only have a few UDP sockets. Unlike TCP/SCTP, we don't need new sockets for each peer/connection). So probably not required
  • bts, trx, msc, hlr, sgsn, ggsn all have only a low number of sockets, from what I can tell.
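The osmo-mgw figure above (4 sockets per call, hence roughly 256 calls at a limit of 1024) can be sketched the same way. The function name and the `reserved_fds` parameter are assumptions for illustration; how many descriptors osmo-mgw uses outside of RTP/RTCP is not stated in this ticket.

```python
def max_mgw_calls(nofile_limit: int, sockets_per_call: int = 4, reserved_fds: int = 0) -> int:
    """Upper bound on concurrent calls before the fd limit is reached.

    sockets_per_call=4 follows the description: 2 endpoints per call,
    each with one RTP and one RTCP socket.
    reserved_fds is an ASSUMED allowance for non-RTP descriptors.
    """
    usable = nofile_limit - reserved_fds
    return max(0, usable // sockets_per_call)


if __name__ == "__main__":
    for limit in (1024, 4096, 65536):
        print(limit, max_mgw_calls(limit))
```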
Actions #1

Updated by laforge 3 months ago

In case anyone else on the watcher list wants to take over this task, feel free to.

Actions #2

Updated by neels 3 months ago

  • Assignee changed from dexter to neels
Actions #3

Updated by neels 3 months ago

  • consider tuning the limit for osmo-bsc using LimitNOFILE in the systemd unit we're shipping.

But tune it to what number?

I think 200 4-TRX BTSs is already pretty high, so it actually seems sane to ship with a default of 1024.
Having that value in the systemd unit file may already help admins to adjust it.
Of course that means a hassle of having a modified unit file == a failure mode for each new upgrade.

So, if there isn't much harm done, we may as well crank it up to the largest number we currently know is in use.
What number would that be; is LimitNOFILE=4096 sufficiently huge? (65536?)

Actions #5

Updated by neels 3 months ago

  • % Done changed from 0 to 90
Actions #6

Updated by neels 3 months ago

similar for hnbgw: https://gerrit.osmocom.org/c/osmo-hnbgw/+/35177
(though it does seem uncommon to have >1000 HNB per HNBGW)

Actions #8

Updated by neels 3 months ago

  • Status changed from New to Resolved
  • % Done changed from 90 to 100