Project

General

Profile

Bug #4078

osmo-bsc: Sends duplicated BSSMAP Clear Request upon HO failure (TC_ho_out_fail_no_ho_detect)

Added by pespin about 2 months ago. Updated about 1 month ago.

Status:
In Progress
Priority:
Normal
Assignee:
Category:
Handover
Target version:
-
Start date:
06/26/2019
Due date:
% Done:

90%

Spec Reference:

Description

In BSC_Tests.TC_ho_out_fail_no_ho_detect(), after adding osmo-ttcn3-hacks commit Ic398896147a0b6b04ffeae56a23d25783b2b17fe, which improves tear down sequence for that HO scenario, the test fails because osmo-bsc sends BSSMAP ClearRequest twice.

It seems to be an issue coming from some message order and FSM implementation.

Time chart:
  • T8 triggers (HO timeout)
  • BSC sends DLCX (BTS-side), BSC->BTS RR Channel Release
  • BSC->MSC BSSSMAP Clear Request
  • BSC->BTS DEACIVATE SACCH
  • BSC->BTS RF CHANnel RELease
  • BSC->MSC BSSMAP Clear Request (second one!!!!)
  • BTS->BSC RF Channel Release ACK
  • MSC->BSC Clear Command
  • BSC->MSC Clear Complete

related osmo-bsc log:

20190626151044593 DRSL <0003> fsm.c:284 handover(interBSCout_conn1_IMSI001019876543210)[0x5594eae4ee90]{inter-BSC-OUT:WAIT_CLEAR}: Timeout of T8
20190626151044593 DRSL <0003> handover_fsm.c:1332 handover(interBSCout_conn1_IMSI001019876543210)[0x5594eae4ee90]{inter-BSC-OUT:WAIT_CLEAR}: (0-0-1-TCH_F-0-SPEECH_V1) --HO-> (BTS 0 to ARFCN 123 (any BSIC)) (subscr IMSI:001019876543210) HO-interBSC-Out: Handover failed in state HANDOVER, Failure (timeout): Timeout
20190626151044593 DCHAN <0010> lchan_fsm.c:1350 lchan_rtp(0-0-1-TCH_F-0)[0x5594eae4b930]{ESTABLISHED}: Received Event LCHAN_RTP_EV_ROLLBACK
20190626151044593 DCHAN <0010> lchan_rtp_fsm.c:574 lchan_rtp(0-0-1-TCH_F-0)[0x5594eae4b930]{ESTABLISHED}: Terminating (cause = OSMO_FSM_TERM_REGULAR)
20190626151044593 DCHAN <0010> lchan_rtp_fsm.c:574 lchan_rtp(0-0-1-TCH_F-0)[0x5594eae4b930]{ESTABLISHED}: Removing from parent lchan(0-0-1-TCH_F-0)[0x5594eae2e900]
20190626151044593 DLMGCP <0023> mgcp_client_endpoint_fsm.c:576 mgw-endp(conn1_IMSI001019876543210)[0x5594eae4ff20]{IN_USE}: rtpbridge/1@mgw CI[0] to-BTS CI=2D0A1020: DLCX: notify=NULL
20190626151044593 DLMGCP <0023> mgcp_client_endpoint_fsm.c:623 mgw-endp(conn1_IMSI001019876543210)[0x5594eae4ff20]{IN_USE}: rtpbridge/1@mgw CI[0] to-BTS CI=2D0A1020: DLCX: Scheduling
20190626151044593 DLMGCP <0023> mgcp_client_endpoint_fsm.c:626 mgw-endp(conn1_IMSI001019876543210)[0x5594eae4ff20]{IN_USE}: state_chg to WAIT_MGW_RESPONSE
20190626151044593 DLMGCP <0023> mgcp_client_endpoint_fsm.c:673 mgw-endp(conn1_IMSI001019876543210)[0x5594eae4ff20]{WAIT_MGW_RESPONSE}: rtpbridge/1@mgw CI[0] to-BTS CI=2D0A1020: Sending MGCP: DLCX 2D0A1020
20190626151044593 DRLL <0000> mgcp_client_fsm.c:722 MGCP_CONN(to-BTS)[0x5594eae564a0]{ST_READY}: Received Event EV_DLCX
20190626151044594 DLMGCP <0023> mgcp_client.c:963 Queued 52 bytes for MGCP GW
20190626151044594 DRLL <0000> mgcp_client_fsm.c:377 MGCP_CONN(to-BTS)[0x5594eae564a0]{ST_READY}: state_chg to ST_DLCX_RESP
20190626151044594 DLMGCP <0023> mgcp_client_endpoint_fsm.c:798 mgw-endp(conn1_IMSI001019876543210)[0x5594eae4ff20]{WAIT_MGW_RESPONSE}: rtpbridge/1@mgw Sent messages: 1
20190626151044594 DLMGCP <0023> mgcp_client_endpoint_fsm.c:724 mgw-endp(conn1_IMSI001019876543210)[0x5594eae4ff20]{WAIT_MGW_RESPONSE}: rtpbridge/1@mgw CI[1] to-MSC CI=21F9EDF7: CRCX 1.2.3.4:2342: done (rtpbridge/1@mgw:127.0.0.3:11002)
20190626151044594 DLMGCP <0023> mgcp_client_endpoint_fsm.c:742 mgw-endp(conn1_IMSI001019876543210)[0x5594eae4ff20]{WAIT_MGW_RESPONSE}: rtpbridge/1@mgw CI in use: 1, waiting for response: 0
20190626151044594 DLMGCP <0023> mgcp_client_endpoint_fsm.c:753 mgw-endp(conn1_IMSI001019876543210)[0x5594eae4ff20]{WAIT_MGW_RESPONSE}: state_chg to IN_USE
20190626151044594 DLMGCP <0023> mgcp_client_endpoint_fsm.c:724 mgw-endp(conn1_IMSI001019876543210)[0x5594eae4ff20]{IN_USE}: rtpbridge/1@mgw CI[1] to-MSC CI=21F9EDF7: CRCX 1.2.3.4:2342: done (rtpbridge/1@mgw:127.0.0.3:11002)
20190626151044594 DCHAN <0010> lchan_rtp_fsm.c:574 lchan_rtp(0-0-1-TCH_F-0)[0x5594eae4b930]{ESTABLISHED}: Freeing instance
20190626151044594 DCHAN <0010> fsm.c:535 lchan_rtp(0-0-1-TCH_F-0)[0x5594eae4b930]{ESTABLISHED}: Deallocated
20190626151044594 DCHAN <0010> lchan_rtp_fsm.c:574 lchan(0-0-1-TCH_F-0)[0x5594eae2e900]{ESTABLISHED}: Received Event LCHAN_EV_RTP_RELEASED
20190626151044594 DRR <0002> gsm_04_08_rr.c:316 Sending Channel Release: Chan: Number: 0 Type: 2
20190626151044594 DRSL <0003> abis_rsl.c:633 (bts=0,trx=0,ts=1,ss=0) DEACTivate SACCH CMD
20190626151044594 DCHAN <0010> lchan_fsm.c:1361 lchan(0-0-1-TCH_F-0)[0x5594eae2e900]{ESTABLISHED}: state_chg to WAIT_RF_RELEASE_ACK
20190626151044594 DMSC <0007> bsc_subscr_conn_fsm.c:694 SUBSCR_CONN(conn1_IMSI001019876543210)[0x5594eae4baa0]{HANDOVER}: conn detaches lchan lchan(0-0-1-TCH_F-0)[0x5594eae2e900]
20190626151044594 DCHAN <0010> bsc_subscr_conn_fsm.c:698 lchan(0-0-1-TCH_F-0)[0x5594eae2e900]{WAIT_RF_RELEASE_ACK}: conn SUBSCR_CONN(conn1_IMSI001019876543210)[0x5594eae4baa0] detaches lchan (primary lchan)
20190626151044594 DMSC <0007> bsc_subscr_conn_fsm.c:140 SUBSCR_CONN(conn1_IMSI001019876543210)[0x5594eae4baa0]{HANDOVER}: Tx BSSMAP CLEAR REQUEST(EQUIPMENT FAILURE) to MSC
20190626151044594 DMSC <0007> osmo_bsc_sigtran.c:386 Tx MSC: BSSMAP: CLEAR REQUEST
20190626151044594 DMSC <0007> osmo_bsc_sigtran.c:409 Sending connection (id=1) oriented data to MSC: RI=SSN_PC,PC=0.23.1,SSN=BSSAP (00 04 22 04 01 20 )
20190626151044594 DLSCCP <0020> sccp_scoc.c:1711 Received SCCP User Primitive (N-DATA.request)
20190626151044594 DLSCCP <0020> sccp_scoc.c:1751 SCCP-SCOC(1)[0x5594eae4e8d0]{ACTIVE}: Received Event N-DATA.req
20190626151044594 DLSS7 <001f> sccp_scrc.c:398 sccp_scrc_rx_scoc_conn_msg:  HDR=(CO:CODT,V=0,LEN=0),
        PART(T=Routing Context,L=4,D=00000000),
        PART(T=Destination Reference,L=4,D=0078ba02),
        PART(T=Data,L=6,D=000422040120)
20190626151044594 DLSS7 <001f> osmo_ss7_hmrt.c:278 m3ua_hmdc_rx_from_l2(): dpc=185=0.23.1 not local, message is for routing
20190626151044594 DLSS7 <001f> osmo_ss7_hmrt.c:227 Found route for dpc=185=0.23.1: pc=0=0.0.0 mask=0x0=0.0.0 via AS as-clnt-msc-0 proto=m3ua
20190626151044594 DLSS7 <001f> osmo_ss7_hmrt.c:233 rt->dest.as proto is M3UA for dpc=185=0.23.1
20190626151044594 DLSS7 <001f> m3ua.c:507 XUA_AS(as-clnt-msc-0)[0x5594eae20ce0]{AS_ACTIVE}: Received Event AS-TRANSFER.req
20190626151044594 DCHAN <0010> lchan_fsm.c:1406 lchan(0-0-1-TCH_F-0)[0x5594eae2e900]{WAIT_RF_RELEASE_ACK}: lchan detaches from conn SUBSCR_CONN(conn1_IMSI001019876543210)[0x5594eae4baa0]
20190626151044594 DMSC <0007> lchan_fsm.c:1409 SUBSCR_CONN(conn1_IMSI001019876543210)[0x5594eae4baa0]{HANDOVER}: lchan lchan(0-0-1-TCH_F-0)[0x5594eae2e900] detaches from conn
20190626151044594 DRSL <0003> handover_fsm.c:845 handover(interBSCout_conn1_IMSI001019876543210)[0x5594eae4ee90]{inter-BSC-OUT:WAIT_CLEAR}: (0-0-0-?-0-?) --HO-> (BTS 0 to ARFCN 123 (any BSIC)) (subscr IMSI:001019876543210) HO-interBSC-Out: incrementing rate counter: interbsc_ho_out:timeout Handover timed out.
20190626151044594 DRSL <0003> handover_fsm.c:847 handover(interBSCout_conn1_IMSI001019876543210)[0x5594eae4ee90]{inter-BSC-OUT:WAIT_CLEAR}: (0-0-0-?-0-?) --HO-> (BTS 0 to ARFCN 123 (any BSIC)) (subscr IMSI:001019876543210) HO-interBSC-Out: Result: Failure (timeout)
20190626151044594 DMSC <0007> handover_fsm.c:854 SUBSCR_CONN(conn1_IMSI001019876543210)[0x5594eae4baa0]{HANDOVER}: Received Event HANDOVER_END
20190626151044594 DMSC <0007> bsc_subscr_conn_fsm.c:427 SUBSCR_CONN(conn1_IMSI001019876543210)[0x5594eae4baa0]{HANDOVER}: state_chg to ACTIVE
20190626151044594 DMSC <0007> bsc_subscr_conn_fsm.c:140 SUBSCR_CONN(conn1_IMSI001019876543210)[0x5594eae4baa0]{ACTIVE}: Tx BSSMAP CLEAR REQUEST(EQUIPMENT FAILURE) to MSC
20190626151044594 DMSC <0007> osmo_bsc_sigtran.c:386 Tx MSC: BSSMAP: CLEAR REQUEST

So it seems it sends it first in gscon_fsm_active_onenter(), and later on in gscon_fsm_active_onenter(). Not sure how to properly fix it though.

I added these 2 commits to help debugging:
https://gerrit.osmocom.org/c/osmo-bsc/+/14608 bsc_subscr_conn_fsm: Log Tx of BSSMAP Clear Request with cause
https://gerrit.osmocom.org/c/osmo-bsc/+/14607 bsc_subscr_conn_fsm: Use gscon_bssmap_clear() helper on send failure

History

#1 Updated by pespin about 2 months ago

  • Subject changed from osmo-bsc: Sends duplicated BSSMAP Clear Request uponh HO failure (TC_ho_out_fail_no_ho_detect) to osmo-bsc: Sends duplicated BSSMAP Clear Request upon HO failure (TC_ho_out_fail_no_ho_detect)

#2 Updated by pespin about 2 months ago

  • Assignee changed from pespin to neels

#3 Updated by pespin about 2 months ago

TTCN3 tests were merged, so this issue already shows up in jenkins runs.

#4 Updated by neels about 1 month ago

One underlying aspect to consider is that the BSSMAP Clear Request from the BSC is not imperative.
The BSC can send Clear Requests as much as it likes, the MSC is the one that decides to send a Clear Command when it sees fit.
So theoretically, it is not really an error to send more Clear Requests than strictly necessary.

On the other hand, if the BSC has one cause to want a Clear, it should ideally send only one Request, of course.

In fact we could adjust the ttcn3 to be fine with N Clear Requests from the BSC.
Making the BSC send only one Clear Request would, strictly speaking, be a kind of "cream on top".

In practice though it might be less effort to simply make the BSC send only one Clear Request.
So that's what I am going to investigate now.

#5 Updated by neels about 1 month ago

  • Status changed from New to In Progress

#6 Updated by neels about 1 month ago

...actually, if a handover out of the BSC fails, the lchan should just continue as before.
Need to take a closer look at that test.

#7 Updated by neels about 1 month ago

got a patch coming up to remove the double clear.
Created a new issue for the failure to continue on the old lchan for failed inter-BSC-out HO: #4093

#8 Updated by neels about 1 month ago

  • % Done changed from 0 to 90

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)