Bug #4694

closed

Radio link timeout would never expire after RF-lock (resource leak)

Added by fixeria over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: osmo-bts-trx
Target version: -
Start date: 08/06/2020
Due date:
% Done: 100%
Spec Reference:

Description

I noticed that "RF-locking" a transceiver with active connections (e.g. voice calls) causes a resource leak: the BSC continues to consider the associated logical channels occupied, so the MSC also considers any CS connections active. Even more annoying, phones that were connected and then lost the signal cannot attach again, because the MSC thinks they are still connected.

23:34 < fixeria> interestingly, both BSC and MSC still consider the connection (silent call) as established
23:37 <@LaF0rge> fixeria: The BTS radio link timeout should still be counting down, which in turn should result in a channel release after some seconds
23:37 <@LaF0rge> fixeria: if that doesn't happen, it's a bug (please file one)
23:38 <@LaF0rge> fixeria: so basically to the higher layers it doesn't matter why the RF connection is gone (no more signal, bad antenna/cable, broken power amplifier, 
                 or TRX locked) - the radio link is gone and it must detect that and handle it identical
23:39 <@LaF0rge> once the BTS reports radio link failure to the BSC, the BSC will start the release procedure for the A interface SCCP connection and the MSC will 
                 release all related resources
23:39 < fixeria> LaF0rge: yep, looks like a resource leak
23:42 < fixeria> lol, after that osmo-msc refuses to accept a Location Update
23:43 < fixeria> "Cannot associate with VLR subscr, another connection is already active at IMSI-262423403******:MSISDN-******:TMSI-0x7CB52857:GERAN-A-2:PAGING_RESP" 
23:45 < fixeria> LaF0rge: I guess the problem is that rf-locking causes osmo-bts to reset the scheduler, from what I see in the logs
23:47 < fixeria> "DL1C NOTICE scheduler.c:616 Exit scheduler for trx=0" then "DL1C NOTICE scheduler.c:591 Init scheduler for trx=0" 
23:48 < fixeria> this is basically a result of trx_sched_reset() that calls trx_sched_exit() and then trx_sched_init()
23:49 < fixeria> so if we reset the scheduler, the BTS would simply "forget" all active connections and never report "radio link failure(s)" 
23:59 <@LaF0rge> fixeria: I think the radio link timeout etc. are implemented above L1SAP
23:59 <@LaF0rge> fixeria: but maybe if no more events are coming up from the bts-model part, those are never triggered.
00:00 < fixeria> LaF0rge: ok, I'll open a ticket with all the findings

I think we can fix this by sending the RSL Radio Link Failure for all (still) active DCCHs as soon as the ramping down is completed.
And this seems to be what the other BTS models are doing when a transceiver gets RF-locked.


Related issues

Related to OsmoBTS - Bug #4696: osmo-bts-trx keeps sending dummy bursts after being RF-locked (Status: Resolved, Assignee: fixeria, Start date: 08/06/2020)

Actions #1

Updated by fixeria over 3 years ago

  • % Done changed from 0 to 40

I think we can fix this by sending the RSL Radio Link Failure for all (still) active DCCHs as soon as the ramping down is completed.

https://gerrit.osmocom.org/c/osmo-bts/+/19536 rsl: constify the 'lchan' argument of rsl_tx_conn_fail()
https://gerrit.osmocom.org/c/osmo-bts/+/19537 osmo-bts-trx: fix resource leak in bts_model_trx_deact_rf()

This change still needs to be tested, and should ideally be accompanied by a TTCN-3 test case. Leaving this up to somebody else.

Actions #2

Updated by laforge over 3 years ago

  • Subject changed from Radio link temeout would never expire after RF-lock (resource leak) to Radio link timeout would never expire after RF-lock (resource leak)
Actions #3

Updated by laforge over 3 years ago

I'm not sure if we should introduce more "special purpose" code paths just for ramping down.

The fundamental problem seems to be that somehow the scheduler stops running, when all we want
is actually the radio transmitter (and possibly receiver) to stop running.

If the scheduler would continue to run, but not transmit or receive any
valid uplink bursts, the radio link timeout should happen automatically
without any special-case code paths.

Even if we only disabled transmission (or replaced all transmit bursts
with idle bursts), the UE would at some point stop transmitting, which
in turn means a radio link timeout happens.

Regards,
Harald

Actions #4

Updated by fixeria over 3 years ago

  • Related to Bug #4696: osmo-bts-trx keeps sending dummy bursts after being RF-locked added
Actions #5

Updated by pespin over 3 years ago

As far as I remember from reading the OML specs, switching the Administrative State to Locked shouldn't automatically result in calls being dropped or in specific messages being sent from the BTS to the BSC. The spec simply says that the BTS physically stops transmitting on the air interface, that's all. I agree with Harald that we shouldn't send any RSL Radio Link Failure or the like, but simply let the usual mechanism trigger it.

Actions #6

Updated by fixeria over 3 years ago

  • Status changed from New to Resolved
  • % Done changed from 40 to 100

Fixed by pespin; the radio link timeout is now triggered as expected:

DMEAS NOTICE l1sap.c:1192 (bts=0,trx=0,ts=2,ss=0) radio link timeout counter S reached zero, dropping connection
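
For readers unfamiliar with the "counter S" mentioned in that log line, here is a minimal sketch of how such a radio link timeout counter can work. It follows the criterion described for the MS in 3GPP TS 45.008: S starts at the configured RADIO_LINK_TIMEOUT value, drops by 1 for every SACCH block that cannot be decoded, rises by 2 (capped at the initial value) for every block that can, and the link is declared failed when S reaches zero. The names `rlt_state`, `rlt_init`, `rlt_sacch_block` and `rlt_blocks_until_timeout` are hypothetical and are not taken from osmo-bts; that the code in l1sap.c applies exactly this arithmetic is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical illustration only; not the osmo-bts data structures. */
struct rlt_state {
	int max;	/* configured RADIO_LINK_TIMEOUT value */
	int s;		/* current value of counter S */
};

static void rlt_init(struct rlt_state *st, int radio_link_timeout)
{
	assert(radio_link_timeout > 0);
	st->max = radio_link_timeout;
	st->s = radio_link_timeout;
}

/* Feed the result of one SACCH block.
 * Returns true once S has reached zero (radio link failure). */
static bool rlt_sacch_block(struct rlt_state *st, bool decoded_ok)
{
	if (st->s <= 0)
		return true;	/* already expired, nothing left to count */

	if (decoded_ok) {
		/* good block: S += 2, but never above the initial value */
		st->s += 2;
		if (st->s > st->max)
			st->s = st->max;
	} else {
		/* bad (or missing) block: S -= 1 */
		st->s -= 1;
	}
	return st->s == 0;
}

/* Number of consecutive undecodable SACCH blocks needed to expire. */
static int rlt_blocks_until_timeout(int radio_link_timeout)
{
	struct rlt_state st;
	int n = 0;

	rlt_init(&st, radio_link_timeout);
	while (!rlt_sacch_block(&st, false))
		n++;
	return n + 1;	/* include the block that made S hit zero */
}
```

The sketch also shows why the counter only helps if something keeps feeding it: S changes only when a SACCH block result (good or bad) is delivered. With an example value of 32 and one SACCH block every 480 ms, a silent uplink would expire after 32 bad blocks, i.e. roughly 15 seconds. If no results are delivered at all, S never moves, which is consistent with the leak described in this ticket.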
