Feature #4107
Start systemd services as non-root user
40%
Description
Ideally, as far as possible, we should start them as a non-root user
(which may require changes to our systemd service files, etc. in the
individual git repos - but that is fine!). Starting them as non-root
will also mean that any writes to unintended directories like '/' will
be discovered, as they would then make the program fail to start.
Related issues
Updated by osmith over 4 years ago
- Related to Bug #3369: no automatic testing of Debian/Ubuntu packages added
Updated by laforge about 3 years ago
Programs like osmo-msc, osmo-sgsn, osmo-cbc, osmo-smlc, osmo-hlr have no real time requirements or special needs in terms of raw network sockets or tun devices. All of those should be executed as a normal, non-privileged user from the start. This could be done via the systemd unit files, or explicitly inside the osmocom programs via a privilege dropping approach.
The only processes that need special privileges are (AFAICT):
- osmo-gbproxy requires CAP_NET_RAW if IPPROTO_GTP sockets are required for FR/GRE/IP
- osmo-trx, osmo-bts, osmo-pcu require CAP_SYS_NICE if SCHED_RR is to be used per command line argument (and is not done by e.g. systemd before starting it)
- osmo-ggsn requires CAP_NET_ADMIN for setting up the gtp0/tun0 devices (unless this is done externally before starting it)
- any program requires CAP_SYS_NICE if it uses the relatively new libosmocore/src/vty/cpu_sched_vty.c code to have user-configured scheduling
- at least drop all privileges except those we ever really need in the specific program (CAP_NET_RAW / CAP_NET_ADMIN / CAP_SYS_NICE). We can first constrain the permitted capabilities using cap_set_flag, then use prctl(PR_SET_KEEPCAPS, 1L) to keep capabilities while changing from root to non-root, and then change the user ID / group ID. https://stackoverflow.com/a/13186076 has a nice example
- if it is sufficient to perform those privileged operations once on start-up, we could even drop those capabilities after performing the operations like creating netdev, binding socket, changing scheduler policy. This would mean that no subsequent changes can be made later on.
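Whether a process really ended up with only the intended capabilities can be checked from the outside by decoding the CapEff/CapPrm masks in /proc/<pid>/status. The following is a minimal shell sketch of such a check, not part of any osmocom code; the helper name cap_in_mask is made up for illustration, and the bit numbers are those defined in linux/capability.h.

```shell
#!/bin/sh
# cap_in_mask MASK BIT (hypothetical helper): succeed if capability number
# BIT is set in the hex MASK, given as printed in the CapEff/CapPrm lines of
# /proc/<pid>/status (no 0x prefix).
cap_in_mask() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Example: report the three capabilities discussed above for the current
# process. Bit numbers: CAP_NET_ADMIN=12, CAP_NET_RAW=13, CAP_SYS_NICE=23.
status=/proc/self/status
if [ -r "$status" ]; then
    mask=$(awk '/^CapEff:/ { print $2 }' "$status")
    mask=${mask:-0}
    for entry in CAP_NET_ADMIN:12 CAP_NET_RAW:13 CAP_SYS_NICE:23; do
        name=${entry%%:*}
        bit=${entry##*:}
        if cap_in_mask "$mask" "$bit"; then
            echo "$name: present"
        else
            echo "$name: absent"
        fi
    done
fi
```

For a running service, the same decoding can be applied to the mask read from /proc/<pid>/status of that service instead of /proc/self/status.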
Updated by laforge about 3 years ago
- Assignee deleted (osmith)
- Priority changed from Low to High
Updated by keith about 3 years ago
- Related to Bug #4821: Update working dir in systemd unit files added
Updated by laforge almost 3 years ago
- Assignee set to osmith
IMHO, we should start by
- create an osmocom user during package installation (if it doesn't exist yet)
- alternatively call it osmo-cni if osmocom is deemed too generic
- modify the systemd.service files to run the processes as that user
- modify /etc/osmocom and its contents to be owned by that user
- modify /var/lib/osmocom (HLR + SMS databases) to be owned by that user
For some programs, this is a no-brainer (e.g. BSC, MSC, SGSN)
For some others (TRX, BTS but possibly also MGW: SCHED_RR; GGSN: tun devices) we should work with capabilities, as described above.
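As an illustration of the two cases above, the sketch below prints a systemd drop-in that runs a service as the proposed user, optionally granting selected capabilities. It only uses the standard User=, Group=, AmbientCapabilities= and CapabilityBoundingSet= directives; the function name print_dropin is hypothetical, and the actual change discussed here would go into the unit files shipped in the individual repos rather than into a drop-in.

```shell
#!/bin/sh
# print_dropin USER [CAPABILITY...] (hypothetical helper)
# Print a systemd drop-in that runs a service as USER and the group of the
# same name, granting only the listed capabilities.
print_dropin() {
    user=$1
    shift
    printf '[Service]\nUser=%s\nGroup=%s\n' "$user" "$user"
    if [ $# -gt 0 ]; then
        # Ambient capabilities are kept when systemd switches to the
        # non-root user; the bounding set prevents gaining any others.
        printf 'AmbientCapabilities=%s\n' "$*"
        printf 'CapabilityBoundingSet=%s\n' "$*"
    else
        # Assigning the empty string resets the bounding set to empty.
        printf 'CapabilityBoundingSet=\n'
    fi
}

print_dropin osmocom                 # e.g. BSC, MSC, SGSN
print_dropin osmocom CAP_SYS_NICE    # e.g. TRX, BTS (SCHED_RR)
print_dropin osmocom CAP_NET_ADMIN   # e.g. GGSN (tun devices)
```

For local experiments the output could be saved under the service's .d directory in /etc/systemd/system, followed by systemctl daemon-reload.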
Updated by laforge almost 3 years ago
- Related to Bug #2250: OpenGGSN requires to run as root for no apparent reason added
Updated by osmith almost 3 years ago
laforge wrote:
IMHO, we should start by
- modify /etc/osmocom and its contents to be owned by that user
That's untypical - do we want the programs to be able to change their own configs?
Updated by pespin almost 3 years ago
osmith wrote:
laforge wrote:
IMHO, we should start by
- modify /etc/osmocom and its contents to be owned by that user
That's untypical - do we want the programs to be able to change their own configs?
We should, otherwise the user cannot store back the running-config to the .cfg file through VTY command.
Updated by laforge over 1 year ago
- Assignee changed from osmith to msuraev
Re-assigning to msuraev as this has been without progress for too long.
Updated by msuraev over 1 year ago
osmith wrote in #note-8:
That's untypical - do we want the programs to be able to change their own configs?
Some configs in /etc have a non-root group. Assuming we also create an osmocom group, we can have /etc/osmocom owned by root:osmocom while /etc/osmocom/osmo-bsc.cfg is owned by osmocom:osmocom - that's similar to how transmission-daemon handles its config files.
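A rough sketch of that ownership layout follows. The function name apply_config_perms and the permission modes are assumptions made for illustration; the comment above only specifies the owners. The script prints the commands by default, since changing ownership for real requires root.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Invoke with RUN= (empty) as root to execute them.
RUN=${RUN-echo}

# apply_config_perms DIR USER GROUP (hypothetical helper)
apply_config_perms() {
    dir=$1
    user=$2
    group=$3
    # Directory: owned by root, group may enter and list it (assumed mode).
    $RUN chown "root:$group" "$dir"
    $RUN chmod 0750 "$dir"
    # Config files: owned by the service user so it can write them back.
    for cfg in "$dir"/*.cfg; do
        [ -e "$cfg" ] || continue
        $RUN chown "$user:$group" "$cfg"
        $RUN chmod 0660 "$cfg"
    done
}

apply_config_perms /etc/osmocom osmocom osmocom
```

One caveat with this layout: the service user can only overwrite existing files in place. If a program saves its config by creating a new file in the directory and renaming it, the directory itself would also need to be writable by the group.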
Updated by msuraev about 1 year ago
- Related to Bug #5669: Test .deb packages built by our OBS added
Updated by msuraev about 1 year ago
To make sure no project is left behind let's summarize the current state
Have I missed anything?
Updated by fixeria about 1 year ago
We may also want to run osmo-pcu with SCHED_RR.
Updated by pespin about 1 year ago
This may be of use to list the projects: https://osmocom.org/projects/cellular-infrastructure/wiki/Make_a_new_release#Dependency-graph
Updated by msuraev about 1 year ago
The example code adding user/group is available in https://gerrit.osmocom.org/c/osmo-hlr/+/29311
The following tests were made:
- clean install
- upgrade from previous ("root") version
- upgrade from previous ("user") version
- writing config file via telnet
- package uninstall
- piuparts:
sudo piuparts osmo-hlr_1.5.0_amd64.deb libosmo-gsup-client0_1.5.0_amd64.deb libosmo-mslookup0_1.5.0_amd64.deb libosmocore19_1.7.0_amd64.deb libosmogsm18_1.7.0_amd64.deb
...
PASS: Installation, upgrade and purging tests.
In general, a possible source of problems is mixing "root" and "user" packages: if a "root" package is installed after a "user" one, it overrides the permissions and disables read/write access to the config files. I'm not sure it's worth investing time into dealing with that - coordinating the release so that the root->user transition happens simultaneously seems easier.
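The user creation step that such a package script performs can be sketched as below. This is not the code from the gerrit change referenced above, only an illustration of an idempotent approach that behaves the same on clean install and on upgrade; the function name ensure_system_user is hypothetical, and the useradd options are one plausible choice among several.

```shell
#!/bin/sh
# Dry run by default: the useradd command is printed, not executed.
# Invoke with RUN= (empty) as root to execute it.
RUN=${RUN-echo}

# ensure_system_user NAME (hypothetical helper): create a system user and a
# group of the same name unless the user already exists.
ensure_system_user() {
    if id -u "$1" >/dev/null 2>&1; then
        echo "user $1 already exists, leaving it untouched"
    else
        $RUN useradd --system --user-group --no-create-home \
            --shell /usr/sbin/nologin "$1"
    fi
}

ensure_system_user osmocom
```

Because an existing user is left untouched, running the script again on upgrade from a previous "user" version is harmless.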
Updated by msuraev about 1 year ago
laforge wrote in #note-3:
- osmo-ggsn requires CAP_NET_ADMIN for setting up the gtp0/tun0 devices (unless this is done externally before starting it)
At least for the tun0 device we can install a corresponding .network file in addition to the .service, with proper User/Group settings.
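A sketch of what such a unit could look like, assuming systemd-networkd is in use: there, a tun device is declared in a .netdev unit whose [Tun] section accepts User= and Group=, with a separate .network unit configuring addresses. The function name print_tun_netdev is hypothetical, and whether this matches the setup intended in the comment above is an assumption.

```shell
#!/bin/sh
# print_tun_netdev NAME USER GROUP (hypothetical helper)
# Print a systemd-networkd .netdev unit that creates tun device NAME and
# grants USER/GROUP access to it.
print_tun_netdev() {
    printf '[NetDev]\nName=%s\nKind=tun\n\n' "$1"
    printf '[Tun]\nUser=%s\nGroup=%s\n' "$2" "$3"
}

print_tun_netdev tun0 osmocom osmocom
```

With the device created ahead of time and owned by the service user, the daemon itself would no longer need CAP_NET_ADMIN just to open it.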
Updated by msuraev about 1 year ago
How should we deal with .spec files? Shall I update those as well?
Creating user during package install is a distro-specific thing. Are there some other distros we care about?
What about OE?
Updated by osmith about 1 year ago
msuraev wrote in #note-20:
How should we deal with .spec files? Shall I update those as well?
Creating user during package install is a distro-specific thing. Are there some other distros we care about?
What about OE?
As I understand, the systemd files get adjusted to expect the user to exist, and these systemd files are used in the rpms and on OE too. So we would need to make sure that the user exists there as well or else the systemd services wouldn't work there anymore.
Updated by msuraev about 1 year ago
- Related to Bug #5685: Dropping debian 10 (buster) added
Updated by msuraev about 1 year ago
Do we have some kind of hierarchy with regards to realtime scheduling? Like "osmo-pcu should have higher priority than osmo-trx" and such?
Updated by pespin about 1 year ago
msuraev I personally use:
osmo-trx-uhd.cfg: "policy rr 18"
osmo-bts-trx.cfg: "policy rr 1"
osmo-pcu.cfg: "policy rr 1"
Updated by laforge about 1 year ago
On Tue, Sep 20, 2022 at 09:11:35AM +0000, msuraev wrote:
Do we have some kind of hierarchy with regards to realtime scheduling? Like "osmo-pcu should have higher priority than osmo-trx" and such?
No, but I think for CNI it's relatively "obvious" to me:
- osmo-trx should be higher than anything else
- osmo-bts-* below osmo-trx
- osmo-mgw below osmo-bts-*
- osmo-pcu below osmo-bts-*
- everything else isn't really timing critical.
Updated by msuraev about 1 year ago
- Related to Bug #5687: Document and implement realtime scheduling hierarchy added
Updated by pespin about 1 year ago
laforge osmo-pcu now depends on getting FNs on time to calculate when to send stuff regarding scheduling, that's why I use the same prio for osmo-bts and osmo-pcu.
Updated by msuraev about 1 year ago
Seems like it's not that obvious, so the topic deserves a ticket of its own - see #5687.
Updated by msuraev about 1 year ago
Tested osmo-hlr.rpm built via https://obs.osmocom.org/project/show/home:msuraev:rpmtest on OpenSUSE Tumbleweed. User and group are created as expected, and the permissions are properly set at install time. The .rpm support matches that of the .deb.
Tested with:
zypper ar https://people.osmocom.org/packages/home:/msuraev:/rpmtest/openSUSE_Tumbleweed/ osmo
zypper in osmo-hlr
getent passwd osmocom
getent group osmocom
ls -alh /etc/osmocom
ls -alh /etc/ | grep osmo
Updated by msuraev about 1 year ago
laforge wrote in #note-3:
- osmo-gbproxy requires CAP_NET_RAW if IPPROTO_GTP sockets are required for FR/GRE/IP
Looking through the code I couldn't find where this is used. It's also unclear why gbproxy would require it but OsmoSGSN and OsmoGGSN wouldn't. Could you please clarify?
Updated by laforge about 1 year ago
msuraev wrote in #note-30:
laforge wrote in #note-3:
- osmo-gbproxy requires CAP_NET_RAW if IPPROTO_GTP sockets are required for FR/GRE/IP
Looking through the code I couldn't find where this is used. It's also unclear why gbproxy would require it but OsmoSGSN and OsmoGGSN wouldn't. Could you please clarify?
osmo-gbproxy is the only network element that "officially" supports Gb over frame relay over E1. We use it to convert from legacy RAN/BSS Gb/FR/E1 to Gb/UDP/IP, so that the SGSN can use normal Gb/UDP/IP.
Updated by msuraev about 1 year ago
laforge wrote in #note-31:
osmo-gbproxy is the only network element that "officially" supports Gb over frame relay over E1. We use it to convert from legacy RAN/BSS Gb/FR/E1 to Gb/UDP/IP, so that the SGSN can use normal Gb/UDP/IP.
What does that look like to Linux? Some specific network interface?
And how do we test it?
Would be nice to try and ensure it works instead of simply slapping CAP_NET_RAW on it and hoping it's enough.
Updated by laforge about 1 year ago
hdlcX net device. There's a wiki page documenting this, including how to set up a virtual loop back device like we use in Jenkins testing. libosmogb simply uses AF_PACKET sockets, so CAP_NET_RAW should do the trick.
Updated by msuraev about 1 year ago
- Related to Feature #5722: Migrate jenkins build slaves from docker to podman added
Updated by msuraev about 1 year ago
- % Done changed from 10 to 20
Testing OsmoGGSN in user mode (non GTP mode) https://gerrit.osmocom.org/c/osmo-ggsn/+/29412/ revealed the following:
pass GGSN_Tests.TC_pdp4_act_deact
pass GGSN_Tests.TC_pdp4_act_deact_ipcp
pass GGSN_Tests.TC_pdp4_act_deact_ipcp_pap_broken
pass GGSN_Tests.TC_pdp4_act_deact_pcodns
pass GGSN_Tests.TC_pdp4_act_deact_gtpu_access
pass->FAIL GGSN_Tests.TC_pdp4_clients_interact_with_txseq
pass->FAIL GGSN_Tests.TC_pdp4_clients_interact_without_txseq
pass GGSN_Tests.TC_pdp4_act_deact_with_single_dns
pass GGSN_Tests.TC_pdp4_act_deact_with_separate_dns
pass GGSN_Tests.TC_pdp6_act_deact
pass GGSN_Tests.TC_pdp6_act_deact_pcodns
pass GGSN_Tests.TC_pdp6_act_deact_icmp6
pass->FAIL GGSN_Tests.TC_pdp6_act_deact_gtpu_access
pass GGSN_Tests.TC_pdp6_clients_interact
pass GGSN_Tests.TC_pdp46_act_deact
pass GGSN_Tests.TC_pdp46_act_deact_ipcp
pass GGSN_Tests.TC_pdp46_act_deact_icmp6
pass GGSN_Tests.TC_pdp46_act_deact_pcodns4
pass GGSN_Tests.TC_pdp46_act_deact_pcodns6
pass GGSN_Tests.TC_pdp46_act_deact_gtpu_access
pass GGSN_Tests.TC_pdp46_clients_interact
pass GGSN_Tests.TC_pdp46_act_deact_apn4
pass GGSN_Tests.TC_echo_req_resp
pass GGSN_Tests.TC_pdp_act2_recovery
pass GGSN_Tests.TC_act_deact_retrans_duplicate
pass GGSN_Tests.TC_pdp_act_restart_ctr_echo
NEW: PASS GGSN_Tests.TC_pdp4_act_deact_gtpu_access_wrong_saddr
NEW: PASS GGSN_Tests.TC_pdp4_act_deact_gtpu_access_ipv6_apn4
NEW: PASS GGSN_Tests.TC_pdp4_act_update_teic
NEW: PASS GGSN_Tests.TC_pdp4_act_update_teid
NEW: PASS GGSN_Tests.TC_pdp6_act_deact_gtpu_access_wrong_ll_saddr
NEW: PASS GGSN_Tests.TC_pdp6_act_deact_gtpu_access_wrong_global_saddr
NEW: PASS GGSN_Tests.TC_pdp6_act_deact_gtpu_access_ipv4_apn6
NEW: PASS GGSN_Tests.TC_pdp46_act_deact_gtpu_access_wrong_saddr_ipv4
NEW: PASS GGSN_Tests.TC_pdp46_act_deact_gtpu_access_wrong_ll_saddr_ipv6
NEW: PASS GGSN_Tests.TC_pdp46_act_deact_gtpu_access_wrong_global_saddr_ipv6
NEW: PASS GGSN_Tests.TC_echo_req_resp_gtpu
NEW: FAIL GGSN_Tests.TC_lots_of_concurrent_pdp_ctx
NEW: FAIL GGSN_Tests.TC_addr_pool_exhaustion
Summary:
pass->FAIL: 3
NEW: FAIL: 2
pass: 23
NEW: PASS: 11
Updated by msuraev about 1 year ago
- % Done changed from 20 to 40
Note: realtime scheduling can be verified for a service with tuna --show_threads | grep RR
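Where tuna is not installed, the same information can be read from the kernel directly: per proc(5), fields 40 and 41 of /proc/<pid>/task/<tid>/stat hold the realtime priority and the scheduling policy. A minimal sketch follows; the function name sched_from_stat is made up for illustration.

```shell
#!/bin/sh
# sched_from_stat LINE (hypothetical helper)
# LINE is one line in /proc/<pid>/stat format. Prints the scheduling policy
# (field 41) and realtime priority (field 40).
sched_from_stat() {
    rest=${1##*") "}      # drop "pid (comm) "; comm may contain spaces
    set -- $rest          # split remaining fields; $1 is now field 3
    [ $# -ge 39 ] || return 1
    shift 37              # $1 is field 40 (rt_priority), $2 is field 41
    case $2 in
        0) name=SCHED_OTHER ;;
        1) name=SCHED_FIFO ;;
        2) name=SCHED_RR ;;
        *) name="policy $2" ;;
    esac
    echo "$name rtprio=$1"
}

# Example: show the policy of the process reading the file.
if [ -r /proc/self/stat ]; then
    sched_from_stat "$(cat /proc/self/stat)"
fi

# For all threads of a service (PID is a placeholder):
# for t in /proc/"$PID"/task/*; do sched_from_stat "$(cat "$t/stat")"; done
```

A thread configured with "policy rr 18" would be expected to show up as SCHED_RR with rtprio=18.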