Bug #2937

pmaier/fsm: OsmoBSC crash on BSSMAP RESET

Added by laforge 7 days ago. Updated about 7 hours ago.

Status: Feedback
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 02/13/2018
Due date:
% Done: 0%
Spec Reference:

Description

After running BSC_Tests.TC_ho_int once (successfully), executing any other test case afterwards crashes osmo-bsc as follows:

Tue Feb 13 08:12:33 2018 DLCTRL <0018> control_if.c:497 accept()ed new CTRL connection from (r=127.0.0.1:9999<->l=127.0.0.1:4249)
Tue Feb 13 08:12:33 2018 DMSC <0008> osmo_bsc_sigtran.c:167 N-UNITDATA.ind(00 04 30 04 01 00 )
Tue Feb 13 08:12:33 2018 DMSC <0008> osmo_bsc_bssap.c:978 Rx MSC UDT: 00 04 30 04 01 00 
Tue Feb 13 08:12:33 2018 DMSC <0008> osmo_bsc_bssap.c:867 Rx MSC UDT BSSMAP RESET
Tue Feb 13 08:12:33 2018 DMSC <0008> osmo_bsc_bssap.c:216 RESET from MSC: RI=SSN_PC,PC=0.23.1,SSN=BSSAP
Tue Feb 13 08:12:33 2018 DMSC <0008> osmo_bsc_sigtran.c:402 SUBSCR_CONN[0x8adcfe0]{ACTIVE}: Terminating (cause = OSMO_FSM_TERM_REQUEST)
Tue Feb 13 08:12:33 2018 DRLL <0000> osmo_bsc_sigtran.c:402 MGCP_CONN[0x8aefcd0]{ST_READY}: Terminating (cause = OSMO_FSM_TERM_PARENT)
Tue Feb 13 08:12:33 2018 DRLL <0000> osmo_bsc_sigtran.c:402 MGCP_CONN[0x8aefcd0]{ST_READY}: Removing from parent SUBSCR_CONN[0x8adcfe0]
Tue Feb 13 08:12:33 2018 DRLL <0000> mgcp_client_fsm.c:469 MGCP_CONN[0x8aefcd0]{ST_READY}: MGW/DLCX: aprupt FSM termination with connections still present, sending unconditional DLCX...
Tue Feb 13 08:12:33 2018 DRLL <0000> osmo_bsc_sigtran.c:402 MGCP_CONN[0x8aefcd0]{ST_READY}: Freeing instance
Tue Feb 13 08:12:33 2018 DRLL <0000> fsm.c:318 MGCP_CONN[0x8aefcd0]{ST_READY}: Deallocated
Tue Feb 13 08:12:33 2018 DRLL <0000> osmo_bsc_sigtran.c:402 MGCP_CONN[0x8ae5bf0]{ST_READY}: Terminating (cause = OSMO_FSM_TERM_PARENT)
Tue Feb 13 08:12:33 2018 DRLL <0000> osmo_bsc_sigtran.c:402 MGCP_CONN[0x8ae5bf0]{ST_READY}: Removing from parent SUBSCR_CONN[0x8adcfe0]
Tue Feb 13 08:12:33 2018 DRLL <0000> mgcp_client_fsm.c:469 MGCP_CONN[0x8ae5bf0]{ST_READY}: MGW/DLCX: aprupt FSM termination with connections still present, sending unconditional DLCX...
Tue Feb 13 08:12:33 2018 DRLL <0000> osmo_bsc_sigtran.c:402 MGCP_CONN[0x8ae5bf0]{ST_READY}: Freeing instance
Tue Feb 13 08:12:33 2018 DRLL <0000> fsm.c:318 MGCP_CONN[0x8ae5bf0]{ST_READY}: Deallocated
Tue Feb 13 08:12:33 2018 DMSC <0008> bsc_subscr_conn_fsm.c:783 SUBSCR_CONN[0x8adcfe0]{ACTIVE}: Disconnecting SCCP
Tue Feb 13 08:12:33 2018 DMSC <0008> bsc_subscr_conn_fsm.c:212 SUBSCR_CONN[0x8adcfe0]{ACTIVE}: tossing all MGCP connections...
==1779== Invalid read of size 8
==1779==    at 0x5D4173D: mgcp_conn_delete (mgcp_client_fsm.c:640)
==1779==    by 0x11995F: toss_mgcp_conn (bsc_subscr_conn_fsm.c:215)
==1779==    by 0x119B78: gscon_cleanup (bsc_subscr_conn_fsm.c:801)
==1779==    by 0x56D43B4: _osmo_fsm_inst_term (fsm.c:525)
==1779==    by 0x120007: osmo_bsc_sigtran_reset (osmo_bsc_sigtran.c:402)
==1779==    by 0x122E0B: bssmap_handle_reset (osmo_bsc_bssap.c:220)
==1779==    by 0x122E0B: bssmap_rcvmsg_udt (osmo_bsc_bssap.c:874)
==1779==    by 0x122E0B: bsc_handle_udt (osmo_bsc_bssap.c:992)
==1779==    by 0x11F148: handle_unitdata_from_msc (osmo_bsc_sigtran.c:146)
==1779==    by 0x11F148: sccp_sap_up (osmo_bsc_sigtran.c:168)
==1779==    by 0x5B15CF0: sclc_rx_cldt (sccp_sclc.c:195)
==1779==    by 0x5B15CF0: sccp_sclc_rx_from_scrc (sccp_sclc.c:260)
==1779==    by 0x5B14FEC: scrc_node_6.isra.6 (sccp_scrc.c:337)
==1779==    by 0x5B156B1: scrc_rx_mtp_xfer_ind_xua (sccp_scrc.c:459)
==1779==    by 0x5B185A4: mtp_user_prim_cb (sccp_user.c:176)
==1779==    by 0x5B103E2: m3ua_rx_xfer (m3ua.c:586)
==1779==    by 0x5B103E2: m3ua_rx_msg (m3ua.c:738)
==1779==  Address 0x8ae5c18 is 136 bytes inside a block of size 288 free'd
==1779==    at 0x4C2DDBB: free (vg_replace_malloc.c:530)
==1779==    by 0x505BE82: _talloc_free (in /usr/lib/x86_64-linux-gnu/libtalloc.so.2.1.10)
==1779==    by 0x56D441C: _osmo_fsm_inst_term (fsm.c:530)
==1779==    by 0x56D41C2: _osmo_fsm_inst_term_children (fsm.c:576)
==1779==    by 0x56D4351: _osmo_fsm_inst_term (fsm.c:512)
==1779==    by 0x120007: osmo_bsc_sigtran_reset (osmo_bsc_sigtran.c:402)
==1779==    by 0x122E0B: bssmap_handle_reset (osmo_bsc_bssap.c:220)
==1779==    by 0x122E0B: bssmap_rcvmsg_udt (osmo_bsc_bssap.c:874)
==1779==    by 0x122E0B: bsc_handle_udt (osmo_bsc_bssap.c:992)
==1779==    by 0x11F148: handle_unitdata_from_msc (osmo_bsc_sigtran.c:146)
==1779==    by 0x11F148: sccp_sap_up (osmo_bsc_sigtran.c:168)
==1779==    by 0x5B15CF0: sclc_rx_cldt (sccp_sclc.c:195)
==1779==    by 0x5B15CF0: sccp_sclc_rx_from_scrc (sccp_sclc.c:260)
==1779==    by 0x5B14FEC: scrc_node_6.isra.6 (sccp_scrc.c:337)
==1779==    by 0x5B156B1: scrc_rx_mtp_xfer_ind_xua (sccp_scrc.c:459)
==1779==    by 0x5B185A4: mtp_user_prim_cb (sccp_user.c:176)
==1779==  Block was alloc'd at
==1779==    at 0x4C2CB8F: malloc (vg_replace_malloc.c:299)
==1779==    by 0x505E150: _talloc_zero (in /usr/lib/x86_64-linux-gnu/libtalloc.so.2.1.10)
==1779==    by 0x56D3847: osmo_fsm_inst_alloc (fsm.c:210)
==1779==    by 0x56D40AB: osmo_fsm_inst_alloc_child (fsm.c:265)
==1779==    by 0x5D41408: mgcp_conn_create (mgcp_client_fsm.c:568)
==1779==    by 0x11AC46: gscon_fsm_active (bsc_subscr_conn_fsm.c:310)
==1779==    by 0x56D3E7E: _osmo_fsm_inst_dispatch (fsm.c:481)
==1779==    by 0x1224E8: bssmap_handle_assignm_req.isra.12 (osmo_bsc_bssap.c:840)
==1779==    by 0x12335D: bssmap_rcvmsg_dt1 (osmo_bsc_bssap.c:909)
==1779==    by 0x12335D: bsc_handle_dt (osmo_bsc_bssap.c:1012)
==1779==    by 0x11F334: handle_data_from_msc (osmo_bsc_sigtran.c:133)
==1779==    by 0x11F334: sccp_sap_up (osmo_bsc_sigtran.c:209)
==1779==    by 0x56D3E7E: _osmo_fsm_inst_dispatch (fsm.c:481)
==1779==    by 0x5B179D4: sccp_scoc_rx_from_scrc (sccp_scoc.c:1581)
==1779== 

History

#1 Updated by dexter about 7 hours ago

  • Status changed from New to Feedback
  • Assignee changed from dexter to laforge

The crash happens while the MGCP connections are tossed. It looks like it is caused by the child FSM instances of the GSCON FSM being freed early: the MGCP connections are children as well, so they have already been deallocated by the time cleanup_cb tries to unlink them, and that access crashes.
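The ordering problem described above can be illustrated with a small toy model. This is only a sketch: it does not use libosmocore, every name in it (toy_parent, toy_child, toy_child_term, toy_parent_cleanup, toy_parent_term) is hypothetical, and it is not necessarily how the issue was fixed in #2915. It shows one generic way to make a cleanup step safe when children are terminated first: each child clears the parent's reference to itself before it is freed, so the cleanup step only touches connections that still exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_CONNS 2

struct toy_parent;

/* Hypothetical stand-in for a child instance such as MGCP_CONN. */
struct toy_child {
	struct toy_parent *parent;
	int slot; /* index into parent->conn[] */
};

/* Hypothetical stand-in for the parent instance such as SUBSCR_CONN. */
struct toy_parent {
	struct toy_child *conn[TOY_MAX_CONNS]; /* references used by cleanup */
	int children_freed;                    /* bookkeeping for the example */
};

struct toy_parent *toy_parent_alloc(void)
{
	return calloc(1, sizeof(struct toy_parent));
}

struct toy_child *toy_child_alloc(struct toy_parent *parent, int slot)
{
	struct toy_child *child;

	if (!parent || slot < 0 || slot >= TOY_MAX_CONNS || parent->conn[slot])
		return NULL;
	child = calloc(1, sizeof(*child));
	if (!child)
		return NULL;
	child->parent = parent;
	child->slot = slot;
	parent->conn[slot] = child;
	return child;
}

/* Free a child, clearing the parent's reference first so nothing dangles. */
void toy_child_term(struct toy_child *child)
{
	if (!child)
		return;
	assert(child->parent->conn[child->slot] == child);
	child->parent->conn[child->slot] = NULL;
	child->parent->children_freed++;
	free(child);
}

/* Cleanup step: delete only connections that are still referenced.
 * Returns how many connections it had to delete itself. */
int toy_parent_cleanup(struct toy_parent *parent)
{
	int i, tossed = 0;

	for (i = 0; i < TOY_MAX_CONNS; i++) {
		if (parent->conn[i]) {
			toy_child_term(parent->conn[i]);
			tossed++;
		}
	}
	return tossed;
}

/* Terminate the parent in the order seen in the trace: children first,
 * then the cleanup step. Returns the cleanup step's result. */
int toy_parent_term(struct toy_parent *parent)
{
	int i, tossed;

	for (i = 0; i < TOY_MAX_CONNS; i++)
		toy_child_term(parent->conn[i]); /* NULL-safe */
	tossed = toy_parent_cleanup(parent);
	free(parent);
	return tossed;
}
```

In this model, toy_parent_term() on a parent with two live children makes the cleanup step find nothing left to delete, instead of reading freed memory as in the valgrind report.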

Since the problem has now been resolved (see #2915), I suggest trying again with the current state of pmaier/fsm3.
