Project

General

Profile

Bug #3513

unsolicited channel activation acknowledge

Added by stsp 2 months ago. Updated 23 days ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
osmo-bts-sysmo
Target version:
-
Start date:
08/31/2018
Due date:
% Done:

100%

Spec Reference:

Description

While running TTCN3 tests with a sysmobts and osmocom-bb, I ran into test failure "RSL for unknown Dchan"

Specifically, while the BTS is being configured, this code in RSL Emulation is triggered by an
unsolicited channel activation acknowledge RSL message:

                [not dchan_suspended] IPA_PT.receive(tr_RSL(tr_RSL_MsgTypeDR(?))) -> value rx_rsl {
                        /* dispatch to channel based on ChanId */
                        cid := f_cid_by_chan_nr(f_trx_by_streamId(rx_rsl.streamId),
                                                rx_rsl.rsl.ies[0].body.chan_nr);
                        if (cid != -1) {
                                CLIENT_PT.send(rx_rsl.rsl) to ConnectionTable[cid].comp_ref;
                        } else {
                                setverdict(fail, "RSL for unknown Dchan");
                                mtc.stop;
                        }
                }
Frame 396: 76 bytes on wire (608 bits), 76 bytes captured (608 bits) on interface 0
Ethernet II, Src: Sysmocom_00:00:0a (24:62:78:00:00:0a), Dst: WistronI_9e:0e:57 (f0:de:f1:9e:0e:57)
Internet Protocol Version 4, Src: 192.168.100.3, Dst: 192.168.100.11
Transmission Control Protocol, Src Port: 47256, Dst Port: 3003, Seq: 46, Ack: 11, Len: 10
IPA protocol ip.access, type: RSL
    DataLen: 7
    Protocol: RSL (0x00)
Radio Signalling Link (RSL)
    0000 100. = Message discriminator: Dedicated Channel Management messages (4)
    .... ...0 = T bit: Not considered transparent by BTS
    .010 0010 = Message type: CHANnel ACTIVation ACKnowledge (0x22)
    Channel number IE 
        Element identifier: Channel Number (0x01)
        0010 0... = C-bits: SDCCH/4 + ACCH (sub-chan 0) (4)
        .... .000 = Time slot number (TN): 0
    Frame Number IE
        Element identifier: Frame Number (0x08)
        0000 0... = T1': 0
        .... .011 001. .... = T3: 25
        ...1 1000 = T2: 24

Here is another example after changing the control channel 0 to BCCH:

Frame 53997: 76 bytes on wire (608 bits), 76 bytes captured (608 bits) on interface 0
Ethernet II, Src: Sysmocom_00:00:0a (24:62:78:00:00:0a), Dst: WistronI_9e:0e:57 (f0:de:f1:9e:0e:57)
Internet Protocol Version 4, Src: 192.168.100.3, Dst: 192.168.100.11
Transmission Control Protocol, Src Port: 47660, Dst Port: 3003, Seq: 46, Ack: 11, Len: 10
IPA protocol ip.access, type: RSL
    DataLen: 7
    Protocol: RSL (0x00)
Radio Signalling Link (RSL)
    0000 100. = Message discriminator: Dedicated Channel Management messages (4)
    .... ...0 = T bit: Not considered transparent by BTS
    .010 0010 = Message type: CHANnel ACTIVation ACKnowledge (0x22)
    Channel number IE 
        Element identifier: Channel Number (0x01)
        1000 0... = C-bits: BCCH (16)
        .... .000 = Time slot number (TN): 0
    Frame Number IE
        Element identifier: Frame Number (0x08)
        0000 0... = T1': 0
        .... .100 001. .... = T3: 33
        ...0 0110 = T2: 6

This unsolicited message should not be sent.

run.tar.bz run.tar.bz 63.6 KB pespin, 10/16/2018 05:45 PM

History

#1 Updated by stsp 2 months ago

  • Assignee changed from sysmocom to stsp

#2 Updated by neels 2 months ago

That rings a bell, I've a number of times wondered about Chan Activ Ack right upon BTS startup (sysmoBTS). I kind of assumed it was some bootstrapping necessity and didn't pay closer attention, but of course a Chan Activ Ack without a request shouldn't happen, right?

#3 Updated by laforge 2 months ago

On Sat, Sep 01, 2018 at 09:20:55PM +0000, neels [REDMINE] wrote:

That rings a bell, I've a number of times wondered about Chan Activ Ack right upon BTS startup (sysmoBTS). I kind of assumed it was some bootstrapping necessity and didn't pay closer attention, but of course a Chan Activ Ack without a request shouldn't happen, right?

It shouldn't happen, but it's a known sysmoBTS oddity for many years (maybe forever).

#4 Updated by pespin 30 days ago

I started adding support for osmo-bts-symo in the osmo-gsm-tester+TTCN3 setup, and I'm running into this issue when running most tests.

Here's related osmo-bts-sysmo log output:

20180823154258527 DL1C <0006> oml.c:1083 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.req (hL2=0x000004bb, FCCH TxDL)
20180823154258528 DLINP <0012> input/ipaccess.c:693 received ID get from 1/0/0
20180823154258529 DABIS <000d> abis.c:122 RSL Signalling link for TRX0 up
20180823154258529 DRSL <0000> rsl.c:271 Tx RSL RF RESource INDication
20180823154258530 DL1C <0006> oml.c:816 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.conf (FCCH TxDL)
20180823154258531 DL1C <0006> oml.c:1083 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.req (hL2=0x000004bb, SCH TxDL)
20180823154258532 DL1C <0006> oml.c:816 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.conf (SCH TxDL)
20180823154258532 DL1C <0006> oml.c:1083 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.req (hL2=0x000004bb, BCCH TxDL)
20180823154258533 DL1C <0006> oml.c:816 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.conf (BCCH TxDL)
20180823154258533 DL1C <0006> oml.c:1083 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.req (hL2=0x000004bb, AGCH TxDL)
20180823154258534 DL1C <0006> oml.c:816 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.conf (AGCH TxDL)
20180823154258534 DL1C <0006> oml.c:1083 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.req (hL2=0x000004bb, PCH TxDL)
20180823154258535 DL1C <0006> oml.c:816 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.conf (PCH TxDL)
20180823154258536 DL1C <0006> oml.c:1083 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.req (hL2=0x000004bb, RACH RxUL)
20180823154258536 DL1C <0006> oml.c:816 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.conf (RACH RxUL)
20180823154258537 DL1C <0006> l1sap.c:622 activate confirm chan_nr=0x40 trx=0
20180823154258537 DRSL <0000> rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK
20180823154258537 DL1C <0006> oml.c:1424 (bts=0,trx=0,ts=0,ss=4) SET_CIPHERING (ALG=0 TxDL)
20180823154258538 DL1C <0006> oml.c:1269 (bts=0,trx=0,ts=0,ss=4) MPH-CONFIG.conf (Configure ciphering params)
20180823154258538 DL1C <0006> oml.c:1424 (bts=0,trx=0,ts=0,ss=4) SET_CIPHERING (ALG=0 RxUL)
20180823154258539 DL1C <0006> oml.c:1269 (bts=0,trx=0,ts=0,ss=4) MPH-CONFIG.conf (Configure ciphering params)
20180823154258555 DLINP <0012> input/ipa.c:65 10.42.42.9:3003 connection closed with server
20180823154258555 DABIS <000d> abis.c:144 Signalling link down

The message is sent in l1sap.c l1sap_info_act_cnf, and it's triggered by this code:

/* any L1 MPH_INFO confirm prim recevied from bts model */
static int l1sap_mph_info_cnf(struct gsm_bts_trx *trx,
     struct osmo_phsap_prim *l1sap, struct mph_info_param *info)
{
    int rc = 0;

    switch (info->type) {
    case PRIM_INFO_ACTIVATE:
        rc = l1sap_info_act_cnf(trx, l1sap, &info->u.act_cnf);
        break;
    case PRIM_INFO_DEACTIVATE:
        rc = l1sap_info_rel_cnf(trx, l1sap, &info->u.act_cnf);
        break;
    default:
        LOGP(DL1C, LOGL_NOTICE, "unknown MPH_INFO cnf type %d\n",
            info->type);
        break;
    }

    return rc;
}

so my guess is that during initialization we get same notification from lower layer than when we receive a chan act, but in the first case we should avoid sending that.

#5 Updated by laforge 30 days ago

On Mon, Oct 15, 2018 at 06:00:57PM +0000, pespin [REDMINE] wrote:

so my guess is that during initialization we get same notification from lower layer than when we receive a chan act, but in the first case we should avoid sending that.

yes, it's very well undrstood what's causing this. We are "activating" the BCCH inside
osmo-bts using code that is normally used for activating dedicated channels via rsl.

I think we abuse ts->lchan4 for the BCCH, as a BCCH-carrying PCHAN can only ever have
up to four dedicated channels (SDCCH 0..3) and 4 is the first free index.

And once the PHY acknowledges the activation, we unconditionally send the ACK, as we don't
have any state that tells us whether it was an internal activation, or one triggered by RSL.

But given that it's not really creating any problem,
it really should be at the lowest of priorities.

#6 Updated by pespin 29 days ago

Well it makes almost all TTCN3 BTS_tests fail when running against a sysmobts. So either we fix this or we account for this in TTCN3.

#7 Updated by laforge 29 days ago

Hi Pau,

On Tue, Oct 16, 2018 at 10:55:56AM +0000, pespin [REDMINE] wrote:

Well it makes almost all TTCN3 BTS_tests fail when running against a sysmobts. So either we fix this or we account for this in TTCN3.

ok, then that's a new reason to fix it. So far, in real-world
installations/uses, it never has been a problem.

The related code is in osmo-bts/sysmo/oml.c around line 284:

/* ugly hack to auto-activate all SAPIs for the BCCH/CCCH on TS0 */

where we
1) set lchan->rel_act_kind = LCHAN_REL_ACT_OML

in the response path, we have rsl_tx_chan_act_acknack() being called, and
it checks if lchan->rel_act_kind != LCHAN_REL_ACT_RSL as a condition to
suppress the transmission of the RSL CHAN ACTIV ACK message. But
somehow that doesn't seem to work?

#8 Updated by pespin 29 days ago

That's the path which seems to be followed by the activation originating the transmission of the Chan Act ACK:

opstart_compl (mo->bts->c0->ts[0].lchan[CCCH_LCHAN].rel_act_kind = LCHAN_REL_ACT_OML)
    lchan_activate
        enqueue_sapi_act_cmd
            queue_sapi_command  (sapi_activate_cb)
                sapi_queue_send
                    sapi_queue_exeute
                        mph_send_activate_req
                            l1if_gsm_req_compl (lchan_act_compl_cb)
                                _l1if_req_compl

[lchan_act_compl_cb]
    sapi_queue_dispatch
        [sapi_activate_cb]
            lchan_set_state ?
            mph_info_chan_confirm (cause=0) !!!!
                l1sap_up (type=PRIM_INFO_ACTIVATE     PRIM_MPH_INFO, PRIM_OP_CONFIR)
                    l1sap_mph_info_cnf
                        l1sap_info_act_cnf
                            rsl_tx_chan_act_acknack(lchan, info_act_cnf->cause);
                                if (lchan->rel_act_kind != LCHAN_REL_ACT_RSL) {

It seems only the last SAPI activated in opstart_compl is provoking the transmission, see lchan_activate using sapis_for_lchan[lchan->type].

static const struct sapi_dir ccch_sapis[] = {
    { GsmL1_Sapi_Fcch,    GsmL1_Dir_TxDownlink },
    { GsmL1_Sapi_Sch,     GsmL1_Dir_TxDownlink },
    { GsmL1_Sapi_Bcch,    GsmL1_Dir_TxDownlink },
    { GsmL1_Sapi_Agch,    GsmL1_Dir_TxDownlink },
    { GsmL1_Sapi_Pch,    GsmL1_Dir_TxDownlink },
    { GsmL1_Sapi_Rach,    GsmL1_Dir_RxUplink },
};

However, it seems they don't follow exactly the same path, otherwise if rel_act_kind = LCHAN_REL_ACT_OML was checked at the end in rsl_tx_chan_act_acknack, it would print following message, and we don't see it:

        LOGP(DRSL, LOGL_NOTICE, "%s not sending CHAN ACT %s\n",
            gsm_lchan_name(lchan), cause ? "NACK" : "ACK");

They don't even reach l1sap_info_act_cnf, as otherwise we'd see "1sap.c:622 activate confirm chan_nr=..." as we do for UL RACH.

That's because this check in oml.c:sapi_activate_cb:

    if (!llist_empty(&lchan->sapi_cmds))
        return 0;

so only last one is pushed to upper layers through later call to mph_info_chan_confirm.

I'll have a deeper look at how lchan->rel_act_kind is managed and also do some tests with L1P and L1C with more verbose debug levels.

#9 Updated by pespin 29 days ago

Attaching run.tar.bz with related issue appearing when running a TTCN3 test.

#10 Updated by pespin 23 days ago

  • Assignee changed from stsp to pespin

#11 Updated by pespin 23 days ago

After adding extra logging, the issue seems to be how we convert lchan to chan_nr and back to lchan when passing it from sysmobts specific lower layers to l1sap. See how subslot is converted from 4 to 0 and lchan pointer thus changes:

20180830081800820 DL1C <0006> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-sysmo/oml.c:291 ====> trying to activate lchans of BCCH
20180830081800820 DL1C <0006> /home/pespin/dev/sysmocom/git/osmo-bts/src/common/lchan.c:31 (bts=0,trx=0,ts=0,ss=4) state NONE -> ACTIVATION REQUESTED
...
20180830081800832 DL1C <0006> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-sysmo/oml.c:819 (bts=0,trx=0,ts=0,ss=4) MPH-ACTIVATE.conf (RACH RxUL)
20180830081800832 DL1C <0006> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-sysmo/oml.c:825 Successful activation of L1 SAPI RACH on TS 0
20180830081800833 DL1C <0006> /home/pespin/dev/sysmocom/git/osmo-bts/src/common/lchan.c:31 (bts=0,trx=0,ts=0,ss=4) state ACTIVATION REQUESTED -> ACTIVE
20180830081800833 DL1C <0006> /home/pespin/dev/sysmocom/git/osmo-bts/src/osmo-bts-sysmo/oml.c:60 Preparing L1 primitive chan_nr=SDCCH/8(0) on TS0 (0x40) lchan=(bts=0,trx=0,ts=0,ss=4) (0xb6d188c0)
20180830081800833 DL1C <0006> /home/pespin/dev/sysmocom/git/osmo-bts/src/common/l1sap.c:625 activate confirm chan_nr=SDCCH/8(0) on TS0 (0x40) trx=0 lchan=(bts=0,trx=0,ts=0,ss=0) (0xb6d0b1e0)
20180830081800833 DRSL <0000> /home/pespin/dev/sysmocom/git/osmo-bts/src/common/rsl.c:741 (bts=0,trx=0,ts=0,pchan=CCCH+SDCCH4) (ss=0) SDCCH Tx CHAN ACT ACK

So when checking for lchan->rel_act_kind != LCHAN_REL_ACT_RSL it checks for a different channel for which LCHAN_REL_ACT_OML was set in opstart_compl.

#12 Updated by pespin 23 days ago

  • Status changed from New to Feedback
  • % Done changed from 0 to 90

I submitted following commit which fixes the issue:
https://gerrit.osmocom.org/#/c/osmo-bts/+/11427 Convert lchan CCCH_LCHAN to chan_nr BCCH and viceversa

#13 Updated by pespin 23 days ago

  • Status changed from Feedback to Resolved
  • % Done changed from 90 to 100

Merged, closing.

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)