Bug #5249

osmo-bts: Avoid activating channels on TS not in NM ENABLED state (fix crash)

Added by pespin 12 days ago. Updated 12 days ago.

Status: Feedback
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 10/06/2021
Due date:
% Done: 50%
Spec Reference:

Description

Crash triggered by a TTCN3 test in the nightly ttcn3-bts-test suite:

20211006080515904 DOML <0001> l1_if.c:590 NM_BBTRANSC_OP(bts0-trx0)[0x55a8808e5e50]{DISABLED_OFFLINE}: Received Event OPSTART_ACK
20211006080515904 DOML <0001> nm_bb_transc_fsm.c:170 NM_BBTRANSC_OP(bts0-trx0)[0x55a8808e5e50]{DISABLED_OFFLINE}: Delay switch to operative state Enabled, wait for: rsl phy
20211006080515904 DOML <0001> oml.c:1428 (bts=0,trx=0): Rx IPA RSL CONNECT IP=172.18.9.10 PORT=3003 STREAM=0x00
...
20211006080515905 DLINP <0012> input/ipaccess.c:898 received ID_GET for unit ID 1234/0/0
20211006080515906 DOML <0001> bts_trx.c:213 NM_RCARRIER_OP(bts0-trx0)[0x55a8808e5b70]{DISABLED_OFFLINE}: Received Event RSL_UP
20211006080515906 DOML <0001> nm_radio_carrier_fsm.c:150 NM_RCARRIER_OP(bts0-trx0)[0x55a8808e5b70]{DISABLED_OFFLINE}: Delay switch to operative state Enabled, wait for: phy
20211006080515906 DOML <0001> bts_trx.c:214 NM_BBTRANSC_OP(bts0-trx0)[0x55a8808e5e50]{DISABLED_OFFLINE}: Received Event RSL_UP
20211006080515906 DOML <0001> nm_bb_transc_fsm.c:170 NM_BBTRANSC_OP(bts0-trx0)[0x55a8808e5e50]{DISABLED_OFFLINE}: Delay switch to operative state Enabled, wait for: phy
...
20211006080517284 DL1C <0006> trx_provision_fsm.c:66 TRX_PROV(phy0-0)[0x55a8808f9da0]{OPEN_POWEROFF}: Received Event TRX_PROV_EV_RXTUNE_CNF
20211006080517284 DL1C <0006> trx_provision_fsm.c:514 TRX_PROV(phy0-0)[0x55a8808f9da0]{OPEN_POWEROFF}: Delay poweron, wait for: tsc-ack txtune-ack nomtxpower-ack setformat-ack other-trx
...
20211006080517501 DRSL <0000> rsl.c:3744 (bts=0,trx=0,ts=1,pchan=TCH/F) ss=0 Rx RSL CHAN_ACTIV
20211006080517501 DRSL <0000> rsl.c:1839 (bts=0,trx=0,ts=1,ss=0) chan_nr=TCH/F on TS1 type=0x00=INITIAL mode=SIGNALLING
20211006080517501 DL1C <0006> l1sap.c:2008 (bts=0,trx=0,ts=1,ss=0) Activating channel TCH/F on TS1
20211006080517501 DL1C <0006> scheduler.c:1101 (bts=0,trx=0,ts=1,ss=0) Activating TCH/F
20211006080517501 DL1C <0006> scheduler.c:1101 (bts=0,trx=0,ts=1,ss=0) Activating SACCH/TF
20211006080517501 DL1C <0006> scheduler.c:1164 (bts=0,trx=0,ts=1) Set mode for TCH/F (rsl_cmode=3, tch_mode=0, handover=0)
20211006080517501 DTRX <000b> trx_if.c:256 phy0.0: Enqueuing TRX control command 'CMD NOHANDOVER 1 0'
20211006080517501 DL1C <0006> scheduler.c:1230 (bts=0,trx=0,ts=1,ss=0) Set A5/0 uplink for TCH/F
20211006080517501 DL1C <0006> scheduler.c:1230 (bts=0,trx=0,ts=1,ss=0) Set A5/0 uplink for SACCH/TF
20211006080517501 DL1C <0006> scheduler.c:1230 (bts=0,trx=0,ts=1,ss=0) Set A5/0 downlink for TCH/F
20211006080517501 DL1C <0006> scheduler.c:1230 (bts=0,trx=0,ts=1,ss=0) Set A5/0 downlink for SACCH/TF
20211006080517501 DL1C <0006> l1sap.c:809 (bts=0,trx=0,ts=1,ss=0) activate confirm chan_nr=TCH/F on TS1 trx=0
20211006080517501 DRSL <0000> rsl.c:1322 (bts=0,trx=0,ts=1,pchan=TCH/F) (ss=0) TCH_F Tx CHAN ACT ACK
20211006080517685 DTRX <000b> trx_if.c:672 phy0.0: Response message: 'RSP TXTUNE 0 1877000'
20211006080517685 DL1C <0006> trx_provision_fsm.c:71 TRX_PROV(phy0-0)[0x55a8808f9da0]{OPEN_POWEROFF}: Received Event TRX_PROV_EV_TXTUNE_CNF
20211006080517685 DTRX <000b> trx_if.c:256 phy0.0: Enqueuing TRX control command 'CMD NOMTXPOWER'
20211006080517685 DL1C <0006> trx_provision_fsm.c:514 TRX_PROV(phy0-0)[0x55a8808f9da0]{OPEN_POWEROFF}: Delay poweron, wait for: tsc-ack nomtxpower-ack setformat-ack other-trx
20211006080517685 DTRX <000b> trx_if.c:672 phy0.1: Response message: 'RSP NOMTXPOWER 0 50'
20211006080517685 DL1C <0006> trx_provision_fsm.c:88 TRX_PROV(phy0-1)[0x55a8808fb150]{OPEN_POWEROFF}: Received Event TRX_PROV_EV_NOMTXPOWER_CNF
20211006080517685 DL1C <0006> trx_provision_fsm.c:514 TRX_PROV(phy0-1)[0x55a8808fb150]{OPEN_POWEROFF}: Delay poweron, wait for: setformat-ack
20211006080517685 DTRX <000b> trx_if.c:672 phy0.3: Response message: 'RSP RFMUTE 0 0'
0: stopped pid 8 with status 139

Basically, the TTCN3 test is too quick: it activates a channel before the BTS and BSC have finished configuring.

In any case, the crash can be fixed by having osmo-bts NACK any CHAN ACT received for a disabled TS. The crash probably occurs because the PHY is not yet ready and a NULL pointer gets dereferenced.

Exit status 139 corresponds to SIGSEGV (128 + 11).
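
For reference, a minimal sketch of how status 139 maps to SIGSEGV, assuming the "stopped pid 8 with status 139" line reports a shell-style exit status (128 + signal number). The helper names are hypothetical and not part of osmo-bts:

```c
#include <signal.h>

/* Hypothetical helper (not part of osmo-bts): recover the signal number
 * from a shell-style exit status, where death by signal N is reported
 * as 128 + N. Returns 0 if the status does not encode a signal. */
static int signal_from_exit_status(int status)
{
	return status > 128 ? status - 128 : 0;
}

/* Returns 1 if the status denotes a segmentation fault, 0 otherwise. */
static int exit_status_is_segv(int status)
{
	return signal_from_exit_status(status) == SIGSEGV;
}
```

With status 139 this yields signal 11, which is SIGSEGV on Linux.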

Associated revisions

Revision c97a7f51 (diff)
Added by pespin 11 days ago

rsl: NACK Chan Activation for lchans on disabled TS

A broken BSC could send a Chan Activation on a TS which has not yet been
enabled (or even configured). This is the case with our TTCN3 tests,
where OML side is currently handled in parallel by an osmo-bsc while
TTCN3 takes care of the RSL side. This can actually be seen as a
malfunctioning BSC, but it was spotted that given this sequence of
events osmo-bts can crash (see ticket below).

Hence, let's NACK any attempt from a BSC to activate an lchan on a
disabled TS.

Related: OS#5249
Change-Id: I9c3b68487c12efc412a057728a561e061560c544
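
The idea of this change can be sketched roughly as below. This is not the actual patch: the types and function names are simplified, hypothetical stand-ins, and treating "usable" as operational state Enabled plus availability OK is an assumption made for illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the OML NM state of a timeslot. */
enum ts_op_state { TS_OP_DISABLED, TS_OP_ENABLED };
enum ts_avail_state { TS_AVAIL_OFFLINE, TS_AVAIL_OK };

struct ts_nm_state {
	enum ts_op_state operational;
	enum ts_avail_state availability;
};

enum chan_act_result { CHAN_ACT_ACK, CHAN_ACT_NACK };

/* Assumption: a TS may carry lchans only once it is Enabled and OK. */
static bool ts_is_usable(const struct ts_nm_state *st)
{
	return st->operational == TS_OP_ENABLED &&
	       st->availability == TS_AVAIL_OK;
}

/* Reject early, before any PHY or scheduler state is touched. */
static enum chan_act_result handle_chan_activ(const struct ts_nm_state *st)
{
	if (!ts_is_usable(st))
		return CHAN_ACT_NACK;
	/* ... the normal activation path would run here ... */
	return CHAN_ACT_ACK;
}
```

Checking the state at the very start of the handler means a premature Chan Activation never reaches code that depends on a ready PHY.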

Revision 3e2b7fae (diff)
Added by pespin 7 days ago

rsl: Fix all shadow TS being Chan Act NACKed

The OML NM Channel FSM state only applies to primary timeslots, hence we
need to make sure we pick the primary TS (the non-shadow one).

Due to this bug, all channels on shadow TS were NACKed because the
related state was never "Enabled Ok".

Fixes: c97a7f51e1b15d40e39df4b7d07b3c6534540186
Related: OS#5249
Related: OS#5251
Change-Id: If47e4bdd45a05ed1b5709b6e3d541f2830723e37
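
A rough sketch of the primary-TS lookup described in this fix, using a simplified, hypothetical timeslot struct (field and function names are illustrative and do not claim to match the osmo-bts data structures):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified timeslot: a shadow TS points at its primary
 * peer, and only the primary carries a meaningful NM state. */
struct ts {
	bool is_shadow;
	struct ts *peer;     /* primary TS if is_shadow, unused otherwise */
	bool nm_enabled_ok;  /* only maintained on primary timeslots */
};

/* Resolve a (possibly shadow) TS to the primary TS holding the NM state. */
static const struct ts *ts_primary(const struct ts *ts)
{
	assert(ts != NULL);
	if (ts->is_shadow) {
		assert(ts->peer != NULL);
		return ts->peer;
	}
	return ts;
}

/* The Chan Act check must consult the primary TS, never the shadow. */
static bool ts_chan_act_allowed(const struct ts *ts)
{
	return ts_primary(ts)->nm_enabled_ok;
}
```

Without the lookup, a shadow TS would be judged by its own never-updated state and every activation on it would be NACKed, which matches the regression described above.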

History

#1 Updated by pespin 12 days ago

  • Status changed from New to Feedback
  • Assignee set to pespin
  • % Done changed from 0 to 50

Should be fixed by:
https://gerrit.osmocom.org/c/osmo-bts/+/25702 rsl: NACK Chan Activation for lchans on disabled TS

Next step is to see which TTCN3 tests start failing now due to receiving a CHAN ACT NACK, and to fix those tests to delay the Chan Act.
