Open Source Mobile Communications: Issues https://projects.osmocom.org/ 2020-06-07T02:40:29Z Open Source Mobile Communications Redmine
libosmocore - Feature #4591 (New): Export FSM states and statistics with counters https://projects.osmocom.org/issues/4591 2020-06-07T02:40:29Z ipse
<p>While working on the broken stats export and thinking about adding monitoring to the OsmoSTP and MGCP code, I realized that a lot of useful information could be obtained from FSMs if they supported generic statistics export.</p>
<p>This export can be made configurable through VTY on a per-FSM class level to tweak it to specific monitoring needs.</p>
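<p>As a rough illustration of the idea (this is not existing libosmocore API), per-FSM-class statistics could be little more than a gauge and a few counters updated on allocation, release and state change. All <code>demo_*</code> names and the fixed state limit below are hypothetical:</p>

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_STATES 8 /* assumed upper bound, for illustration only */

/* Hypothetical per-FSM-class statistics; not part of libosmocore. */
struct demo_fsm_class_stats {
    int64_t instances;                 /* gauge: instances currently alive */
    uint64_t allocated;                /* counter: total allocations */
    uint64_t released;                 /* counter: total deallocations */
    uint64_t entered[DEMO_MAX_STATES]; /* counter: transitions into each state */
    uint64_t left[DEMO_MAX_STATES];    /* counter: transitions out of each state */
};

/* Account for a newly allocated FSM instance of this class. */
void demo_fsm_stats_alloc(struct demo_fsm_class_stats *st)
{
    st->instances++;
    st->allocated++;
}

/* Account for a released FSM instance of this class. */
void demo_fsm_stats_free(struct demo_fsm_class_stats *st)
{
    st->instances--;
    st->released++;
}

/* Account for a state change; returns -1 if a state index is out of range. */
int demo_fsm_stats_transition(struct demo_fsm_class_stats *st,
                              unsigned int from, unsigned int to)
{
    if (from >= DEMO_MAX_STATES || to >= DEMO_MAX_STATES)
        return -1;
    st->left[from]++;
    st->entered[to]++;
    return 0;
}
```

<p>A real implementation would presumably hook such updates into the generic FSM allocation and state-change paths, so that individual FSM users would not need to change.</p>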
Examples of generic counters/gauges I can think of immediately:
<ol>
<li>Number of instances of a given FSM (gauge for the current number and counters for allocation/deallocation). Useful for short-lived FSMs like SCCP connections.</li>
<li>Current FSM instance state (gauge). Useful for long-lived FSMs like A-interface connections.</li>
<li>Counters for transitions to/from states (per state) and received events (per event). Useful for long-lived FSMs like lchans.</li>
</ol>
libosmocore - Feature #4590 (New): Export talloc data as counters for better memory leakage monit... https://projects.osmocom.org/issues/4590 2020-06-07T02:28:10Z ipse
<p><code>talloc</code> has a quite well-structured way of allocating memory with text descriptions, so it should be possible to reasonably export its statistics as gauges in a universal way.</p>
<p>E.g. we can export the number of allocated elements per context and their total size.</p>
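<p>A minimal sketch of the two gauges per context, assuming a hypothetical <code>demo_ctx_gauges</code> structure and a made-up <code>talloc.&lt;name&gt;.blocks</code> / <code>talloc.&lt;name&gt;.bytes</code> naming scheme. It deliberately does not link against talloc; a real exporter would more likely read the figures from talloc itself (for example via <code>talloc_total_blocks()</code> and <code>talloc_total_size()</code>) than track them by hand:</p>

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-context gauges; names are illustrative only. */
struct demo_ctx_gauges {
    const char *name; /* textual context name, as talloc reports it */
    size_t blocks;    /* gauge: number of allocated elements */
    size_t bytes;     /* gauge: total size of those elements */
};

/* Account for one allocation of 'size' bytes in this context. */
void demo_ctx_account_alloc(struct demo_ctx_gauges *g, size_t size)
{
    g->blocks++;
    g->bytes += size;
}

/* Account for one free; returns -1 if the gauges would underflow. */
int demo_ctx_account_free(struct demo_ctx_gauges *g, size_t size)
{
    if (g->blocks == 0 || g->bytes < size)
        return -1;
    g->blocks--;
    g->bytes -= size;
    return 0;
}

/* Render both gauges as one text line; returns snprintf()'s result. */
int demo_ctx_format(const struct demo_ctx_gauges *g, char *buf, size_t len)
{
    return snprintf(buf, len, "talloc.%s.blocks=%zu talloc.%s.bytes=%zu",
                    g->name, g->blocks, g->name, g->bytes);
}
```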
<p>This feature might need to be enabled/disabled from VTY.</p>
OsmoBSC - Bug #4589 (New): osmo-bsc crashes with "lchan allocation failed in state WAIT_RF_RELEAS... https://projects.osmocom.org/issues/4589 2020-06-06T22:40:16Z ipse
<p>The crash is caused by dereferencing <code>for_conn</code> while it is NULL in the following part of the <code>_lchan_on_activation_failure()</code> code:</p>
<pre><code class="c syntaxhl"> <span class="k">case</span> <span class="n">FOR_ASSIGNMENT</span><span class="p">:</span>
<span class="n">LOG_LCHAN</span><span class="p">(</span><span class="n">lchan</span><span class="p">,</span> <span class="n">LOGL_NOTICE</span><span class="p">,</span> <span class="s">"Signalling Assignment FSM of error (%s)</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span>
<span class="n">lchan</span><span class="o">-></span><span class="n">last_error</span> <span class="o">?</span> <span class="o">:</span> <span class="s">"unknown error"</span><span class="p">);</span>
<span class="n">_osmo_fsm_inst_dispatch</span><span class="p">(</span><span class="n">for_conn</span><span class="o">-></span><span class="n">assignment</span><span class="p">.</span><span class="n">fi</span><span class="p">,</span> <span class="n">ASSIGNMENT_EV_LCHAN_ERROR</span><span class="p">,</span> <span class="n">lchan</span><span class="p">,</span>
<span class="n">file</span><span class="p">,</span> <span class="n">line</span><span class="p">);</span>
<span class="k">return</span><span class="p">;</span>
</code></pre>
<p>I think the crash happens because the SCCP connection is already closed (or was never established?) while we're waiting in WAIT_RF_RELEASE_ACK and the timer expires.</p>
<p>I tried to understand the logic of this code, but I'm still not sure what the right course of action is in this case.</p>
<p>Plus, I'm not sure why we are even getting into the <code>_lchan_on_activation_failure()</code> function while we're not activating the timeslot - we're waiting for its release instead.</p>
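<p>Purely as an illustration of the failure mode, and not as a proposed fix, a defensive check in front of the dispatch would avoid the NULL dereference. The <code>demo_*</code> types below are simplified stand-ins for the real osmo-bsc structures, and whether this code path should be reached at all in WAIT_RF_RELEASE_ACK remains an open question:</p>

```c
#include <stddef.h>

/* Stand-in types for illustration only; the real structs live in osmo-bsc. */
struct demo_conn {
    int assignment_errors; /* stands in for dispatching ASSIGNMENT_EV_LCHAN_ERROR */
};

struct demo_lchan {
    struct demo_conn *for_conn; /* may be NULL once the conn is gone */
    const char *last_error;
};

/* Returns 1 if the error was signalled to the conn, 0 if there was no conn. */
int demo_signal_assignment_error(struct demo_lchan *lchan)
{
    if (!lchan->for_conn) {
        /* Conn already released (or never set): nothing to notify,
         * and dereferencing it here is what crashes. */
        return 0;
    }
    lchan->for_conn->assignment_errors++;
    return 1;
}
```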
<p>The crash backtrace:<br /><pre>
(gdb) bt
#0 _lchan_on_activation_failure (lchan=lchan@entry=0x7f80d51b6e28, activ_for=<optimized out>, for_conn=0x0, line=line@entry=1354, file=0x565535fcfe19 "lchan_fsm.c") at lchan_fsm.c:116
#1 0x0000565535f8cb07 in _lchan_on_activation_failure (line=1354, file=0x565535fcfe19 "lchan_fsm.c", for_conn=<optimized out>, activ_for=<optimized out>, lchan=<optimized out>) at lchan_fsm.c:1354
#2 lchan_fsm_timer_cb (fi=0x565538060930) at lchan_fsm.c:1354
#3 0x00007f80d464d84a in fsm_tmr_cb (data=0x565538060930) at fsm.c:325
#4 0x00007f80d4647926 in osmo_timers_update () at timer.c:257
#5 0x00007f80d4647cda in _osmo_select_main (polling=0) at select.c:260
#6 0x00007f80d4648526 in osmo_select_main_ctx (polling=<optimized out>) at select.c:291
#7 0x0000565535f353ff in main (argc=<optimized out>, argv=<optimized out>) at osmo_bsc_main.c:940
</pre></p>
<p>And the <code>lchan</code> data:<br /><pre>
(gdb) p *lchan
$1 = {ts = 0x7f80d51b5cf8, nr = 0 '\000', name = 0x565537f28670 "(bts=5,trx=0,ts=7,ss=0)", last_error = 0x565537fae9c0 "lchan allocation failed in state WAIT_RF_RELEASE_ACK: Timeout",
fi = 0x565538060930, fi_rtp = 0x0, mgw_endpoint_ci_bts = 0x0, activate = {info = {activ_for = FOR_ASSIGNMENT, for_conn = 0x0, chan_mode = GSM48_CMODE_SPEECH_EFR, encr = {alg_id = 0 '\000',
key_len = 0 '\000', key = '\000' <repeats 15 times>}, s15_s0 = 0, requires_voice_stream = true, wait_before_switching_rtp = false, msc_assigned_cic = 0, re_use_mgw_endpoint_from_lchan = 0x0},
activ_ack = true, immediate_assignment_sent = false, concluded = true, gsm0808_error_cause = GSM0808_CAUSE_RADIO_INTERFACE_MESSAGE_FAILURE}, release = {requested = true, do_rr_release = false,
in_error = true, rsl_error_cause = 127 '\177', in_release_handler = false, is_csfb = false}, type = GSM_LCHAN_TCH_F, rsl_cmode = RSL_CMOD_SPD_SPEECH, tch_mode = GSM48_CMODE_SPEECH_EFR,
csd_mode = LCHAN_CSD_M_NT, bs_power = 0 '\000', ms_power = 0 '\000', encr = {alg_id = 0 '\000', key_len = 0 '\000', key = '\000' <repeats 15 times>}, mr_ms_lv = "\000\000\000\000\000\000",
mr_bts_lv = "\000\000\000\000\000\000", s15_s0 = 0, sapis = "\000\000\000\000\000\000\000", abis_ip = {bound_ip = 169494614, bound_port = 16682, connect_ip = 168234092, connect_port = 7066,
conn_id = 0, rtp_payload = 97 'a', rtp_payload2 = 0 '\000', speech_mode = 1 '\001', ass_compl = {rr_cause = 0 '\000', valid = false}}, rqd_ta = 0 '\000', neigh_meas = {{arfcn = 0, bsic = 0 '\000',
rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0,
last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000',
rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0,
last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000',
rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0,
last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000', rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}, {arfcn = 0, bsic = 0 '\000',
rxlev = "\000\000\000\000\000\000\000\000\000", rxlev_cnt = 0, last_seen_nr = 0 '\000'}}, meas_rep = {{lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'},
sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0,
ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000',
neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000',
arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {
lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'},
sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000',
neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000',
arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {
rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0,
ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000',
neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000',
arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {
lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'},
sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000',
neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000',
arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {
rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0,
ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000',
neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000',
arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {
lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'},
sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000',
neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000',
arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {
rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0,
ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000',
neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000',
arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {
lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'},
sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000',
neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000',
arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {
rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0,
ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000',
neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000',
arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}, {
lchan = 0x0, nr = 0 '\000', flags = 0, ul = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'}, sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, dl = {full = {rx_lev = 0 '\000', rx_qual = 0 '\000'},
sub = {rx_lev = 0 '\000', rx_qual = 0 '\000'}}, bs_power = 0 '\000', ms_timing_offset = 0, ms_l1 = {pwr = 0 '\000', ta = 0 '\000'}, num_cell = 0, cell = {{rxlev = 0 '\000', bsic = 0 '\000',
neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000',
arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}, {
rxlev = 0 '\000', bsic = 0 '\000', neigh_idx = 0 '\000', arfcn = 0, flags = 0}}}}, meas_rep_idx = 0, meas_rep_count = 0, meas_rep_last_seen_nr = 255 '\377', rqd_ref = 0x0, conn = 0x0,
ch_mode_rate = {chan_mode = GSM48_CMODE_SIGN, chan_rate = CH_RATE_SDCCH, s15_s0 = 0}}
</pre></p>
OsmoBTS - Bug #4586 (Resolved): osmo-bts-trx leaks memory https://projects.osmocom.org/issues/4586 2020-06-06T09:40:39Z ipse
<p>After some operations, <code>osmo-bts-trx</code> has an ever-growing list of records like this when you do <code>show talloc-context application brief</code>:<br /><pre>
struct trx_ctrl_msg contains 168 bytes in 1 blocks (ref 0) 0x55b7420379a0
</pre></p>
<p>I'm not sure what causes this at the moment, but it might be related to the use of dynamic <code>TCH/F_PDCH</code> channels.</p>
OsmoMGW - Bug #4554 (New): Clean up stale endpoints after OsmoBSC crash https://projects.osmocom.org/issues/4554 2020-05-16T20:52:51Z ipse
<p>When OsmoBSC crashes, OsmoMGW doesn't know about it and keeps the endpoints allocated by the old OsmoBSC process forever.</p>
<p>I see two possible ways to resolve this:</p>
<p>1) RTP activity timeout (a standard feature of many VoIP servers). If we don't see RTP activity on a certain endpoint, assume it is dead, raise an alarm to O&M and deallocate its resources. The catch is that this can happen for various reasons even when OsmoBSC is alive and kicking, so we should somehow let OsmoBSC know that the endpoint has failed. Apart from handling a crashed OsmoBSC, this is an important alarm in its own right.</p>
<p>2) Somehow maintain a connection between OsmoBSC and OsmoMGW, maybe some form of ping/pong carrying a random OsmoBSC ID. If OsmoBSC crashes and is restarted, it sends a different ID and OsmoMGW knows it's time to deallocate the endpoints allocated by the previous OsmoBSC instance.</p>
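<p>The instance-ID idea in option 2 can be sketched as follows. This is a self-contained model under stated assumptions, not OsmoMGW code and not an MGCP feature: the <code>demo_*</code> names, the 32-bit ID and the fixed endpoint array are all hypothetical.</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_ENDPOINTS 4 /* arbitrary size for the sketch */

/* Hypothetical MGW-side view of a single BSC peer. */
struct demo_mgw {
    bool have_peer_id;               /* has any ping been seen yet? */
    uint32_t peer_id;                /* instance ID from the last ping */
    bool in_use[DEMO_NUM_ENDPOINTS]; /* endpoints allocated by that peer */
};

/* Handle a ping carrying the BSC's random instance ID.
 * Returns the number of stale endpoints that were released. */
unsigned int demo_mgw_handle_ping(struct demo_mgw *mgw, uint32_t bsc_id)
{
    unsigned int released = 0;
    size_t i;

    if (mgw->have_peer_id && mgw->peer_id != bsc_id) {
        /* A different ID means the BSC was restarted: everything the
         * previous instance allocated is stale by definition. */
        for (i = 0; i < DEMO_NUM_ENDPOINTS; i++) {
            if (mgw->in_use[i]) {
                mgw->in_use[i] = false;
                released++;
            }
        }
    }
    mgw->have_peer_id = true;
    mgw->peer_id = bsc_id;
    return released;
}
```

<p>Note that this only covers a BSC that comes back. A BSC that crashes and stays down would still need something like the timeout from option 1.</p>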
<p>PS I don't know MGCP but I would think it should have ways to handle such situations?</p>
<p>PPS A workaround for the case of a single OsmoMGW serving a single OsmoBSC is to always restart them together, so that if one crashes, the other one is forced to restart as well.</p>
OsmoBSC - Bug #4553 (New): OsmoBSC doesn't handle RESET IP RESOURCE BSSMAP message https://projects.osmocom.org/issues/4553 2020-05-16T19:44:20Z ipse
<p>According to TS 148 008, section 3.1.4.3.2 "Reset IP Resource procedure initiated by the MSC":<br /><pre>
On reception of this message the BSS shall release locally the resources and references associated to the specific Call
Identifiers indicated in the received message. The BSS shall always return the RESET IP RESOURCE
ACKNOWLEDGE message to the MSC after all Call Identifier related resources and references have been released and
the BSS shall include the list of Call Identifiers. The list of Call Identifiers within the RESET IP RESOURCE
ACKNOWLEDGE message shall be in the same order as received in the RESET IP RESOURCE message. Unknown
Call Identifiers shall be reported as released.
</pre></p>
<p>Right now OsmoBSC doesn't handle this message and thus doesn't send RESET IP RESOURCE ACKNOWLEDGE to the MSC, which looks like a violation of the standard.</p>
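<p>The behaviour quoted from the specification can be sketched as below. This is a simplified model, not OsmoBSC code: the <code>demo_*</code> names are hypothetical, Call Identifiers are modelled as plain 32-bit integers, and "releasing" a call is reduced to clearing a flag.</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local record of a call known to the BSS. */
struct demo_call {
    uint32_t call_id;
    bool active;
};

/* Process the Call Identifier list of a RESET IP RESOURCE message.
 * Releases every matching local call and fills 'ack_ids' (which must hold
 * at least 'num_req' entries) for the ACKNOWLEDGE message.
 * Returns the number of identifiers written to 'ack_ids'. */
size_t demo_reset_ip_resource(struct demo_call *calls, size_t num_calls,
                              const uint32_t *req_ids, size_t num_req,
                              uint32_t *ack_ids)
{
    size_t i, j;

    for (i = 0; i < num_req; i++) {
        for (j = 0; j < num_calls; j++) {
            if (calls[j].active && calls[j].call_id == req_ids[i])
                calls[j].active = false; /* release locally */
        }
        /* Same order as received; unknown identifiers are reported
         * as released as well. */
        ack_ids[i] = req_ids[i];
    }
    return num_req;
}
```

<p>With an empty call table, as after a restart, this degenerates to echoing the received list back.</p>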
<p>In the attached trace you can see that the MSC sends RESET IP RESOURCE right after the connection RESET, because the BSC was power cycled during active operation. When OsmoBSC doesn't respond, the MSC retries twice and gives up.</p>
<p>At a minimum, it would be great to respond to those RESET IP RESOURCE messages for which we don't have known calls - this would cover the case above with a restarted OsmoBSC. We might want to blindly send DLCX to OsmoMGW in this case, to make sure it releases the relevant endpoints. Though I'm not sure this is a reliable way to reset stale OsmoMGW endpoints after an OsmoBSC crash.</p>
OsmoMGW - Bug #4552 (New): ip-probing issue with BTS connection - no rebinding to a correct IP ad... https://projects.osmocom.org/issues/4552 2020-05-14T23:18:26Z ipse
<p>Try configuring osmo-mgw to use ip-probing without configuring a default rtp binding address.</p>
<p>See the attached pcap file for the RSL+MGW trace of a new call setup initiation.</p>
<p>You can see in this trace that:<br />1) MGW first responds with the IP address 127.0.0.1, because the initial CRCX doesn't carry any address information that MGW could use to infer which IP it should bind to.<br />2) Then MGW receives an MDCX with the other side's IP address and responds with the correctly selected IP address on its side.</p>
<p>The issue is that while responding with the correct bind address, it doesn't actually bind to it, so the RTP stream is completely lost.</p>
<p>Possible solutions:</p>
<p>1) Only send CRCX to MGW when we know BTS's IP address (i.e. after exchanging "ip.access CRCX/MDCX" with the BTS). In this case MGW will bind to the correct IP address on its side from the beginning and there will be no issue.</p>
<p>2) Implement RTP stream re-binding on MDCX, and centralize local RTP bind address management. Right now the mgcp_get_local_addr() function guesses the IP address and returns it, even if the RTP stream is not bound to it. So when we create the MDCX response and call this function to get an address for our side of the RTP stream, it essentially misleads us.</p>
OsmoMGW - Feature #4551 (New): Dynamic/runtime allocation of new endpoints https://projects.osmocom.org/issues/4551 2020-05-14T22:56:41Z ipse
<p>Right now endpoints are pre-allocated at osmo-mgw startup and their number can't be changed without restarting the process.</p>
<p>Reasoning: When osmo-mgw is hit with more load than expected, it can run out of available endpoints with no way to add more. The only solution right now is to change the configuration and restart the BSC+MGW processes, which leads to a service disruption in the middle of a peak hour.</p>
<p>Decreasing the number of endpoints might be nice to have too, but it might be more difficult to implement, given that MGW would need to wait for the endpoints to become idle before deallocating them. And it looks like decreasing is less important.</p>
<p>Increasing the number of endpoints can be either (1) manual from VTY or (2) automatic/dynamic.</p>
<p>In the dynamic case, we configure a maximum number of endpoints in the configuration file. osmo-mgw then allocates a new endpoint whenever it can't find an available one, but only up to the configured maximum. This way we can default to a fairly large maximum and not worry about it until we hit the limits of the physical server.</p>
OsmoMGW - Bug #4530 (Closed): Failure to find a free endpoint is not reflected in stats counters https://projects.osmocom.org/issues/4530 2020-05-05T13:26:28Z ipse
<p>In spite of having a lot of messages like this in the log file:</p>
<pre>
DLMGCP DEBUG mgcp_msg.c:65 Received message: line #00: CRCX 1296 rtpbridge/*@mgw MGCP 1.0
DLMGCP DEBUG mgcp_msg.c:65 Received message: line #01: C: 1418
DLMGCP DEBUG mgcp_msg.c:65 Received message: line #02: L: p:20, a:GSM-EFR, nt:IN
DLMGCP DEBUG mgcp_msg.c:65 Received message: line #03: M: recvonly
DLMGCP ERROR mgcp_msg.c:202 Not able to find a free endpoint
DLMGCP ERROR mgcp_msg.c:322 Unable to find Endpoint `rtpbridge/*@mgw'
DLMGCP NOTICE mgcp_protocol.c:402 CRCX 1296: failed to find the endpoint
DLMGCP DEBUG mgcp_protocol.c:232 endpoint:0xffffffff Generated response: code=403
DLMGCP DEBUG mgcp_msg.c:65 Generated response: line #00: 403 1296 FAIL
</pre>
<p>Stats are still showing zeros for all crcx-related errors:</p>
<pre>
crxc statistics:
CRCX command processed successfully.: 21468 (0/s 10/m 1004/h 21325/d)
bad action in CRCX command.: 0 (0/s 0/m 0/h 0/d)
unhandled parameter in CRCX command.: 0 (0/s 0/m 0/h 0/d)
missing CallId in CRCX command.: 0 (0/s 0/m 0/h 0/d)
invalid connection mode in CRCX command.: 0 (0/s 0/m 0/h 0/d)
limit of concurrent connections was reached.: 0 (0/s 0/m 0/h 0/d)
unknown CallId in CRCX command.: 0 (0/s 0/m 0/h 0/d)
connection allocation failure.: 0 (0/s 0/m 0/h 0/d)
no opposite end specified for connection.: 0 (0/s 0/m 0/h 0/d)
failure to start RTP processing.: 0 (0/s 0/m 0/h 0/d)
connection rejected by policy.: 0 (0/s 0/m 0/h 0/d)
no osmux offered by peer.: 0 (0/s 0/m 0/h 0/d)
connection options invalid.: 0 (0/s 0/m 0/h 0/d)
codec negotiation failure.: 0 (0/s 0/m 0/h 0/d)
port bind failure.: 0 (0/s 0/m 0/h 0/d)
</pre>
OsmoPCU - Bug #3395 (Resolved): Uplink CS/MCS control is broken osmo-pcu is used with osmo-bts-tr... https://projects.osmocom.org/issues/3395 2018-07-14T13:48:28Z ipse
<p>Not sure which project to assign this to.</p>
<p>The <code>GprsMs::update_cs_ul()</code> function (osmo-pcu/src/gprs_ms.cpp) is used to control the UL CS/MCS mode, and in its current implementation it relies on the Quality "Q" value reported by L1. Osmo-trx does not really calculate the SNR of the link, and thus osmo-bts-trx always reports Q as 0. This misleads the current algorithm, which always drops the CS/MCS to the lowest possible mode (CS-1/MCS-1).</p>
<p>Possible solutions I see:<br />1) Use another metric, such as BER, which is reported by osmo-bts-trx. A similar method is already used by the DL CS/MCS control.<br />2) Implement SNR calculation in osmo-trx. A better way, but much more involved.</p>
OsmoMGW - Feature #2411 (New): No support for RTP dynamic payload types - AMR/HR/EFR payload type... https://projects.osmocom.org/issues/2411 2017-07-30T22:22:45Z ipse
<p>We haven't looked at the details, but I think it's worthwhile to file this bug to keep a note.</p>
<p>It seems OsmoNITB and OsmoBTS don't support dynamic RTP payload types (aka PT). Supporting dynamic PTs is a requirement for interfacing OsmoNITB/OsmoBTS with external SIP/RTP clients via the MNCC socket if we want to use anything except GSM-FR.</p>
<p>Calls between OsmoBTS instances work fine, because they always allocate the same PT - e.g. 98 for AMR. But when we deal with external clients, we can't control the PT allocation of the other side, so calls will fail or you will hear sound in only one direction (external party to OsmoBTS). This is because the other party will ask to receive AMR RTP with a PT it selects (say, 102 for the sake of an example), but OsmoBTS will send it with PT 98.</p>
OsmoBTS - Bug #78 (Rejected): osmo-bts: Infinite loop in the MS power control https://projects.osmocom.org/issues/78 2016-02-19T22:47:34Z ipse
<p><strong>Report</strong></p>
<p>All the phones were offline, so I logged in. Basically, osmo-nitb was not getting anything from the BTS: there was nothing in the osmo-nitb log. Then I checked the BTS and could see only:<br /><pre>
2014-03-19_03:32:06.46033 <000b> loops.c:142 LOST SACCH frame of trx=0 chan_nr=0x0a, so we raise MS power
2014-03-19_03:32:06.47744 <000b> loops.c:142 LOST SACCH frame of trx=0 chan_nr=0x0a, so we raise MS power
2014-03-19_03:32:06.49496 <000b> loops.c:142 LOST SACCH frame of trx=0 chan_nr=0x0a, so we raise MS power
</pre></p>
<p>I restarted osmo-bts and it started to work again. For some reason it got into a loop, because all I could see was that one message.</p>
<p><strong>Conclusion</strong></p>
<p>Needs further research. The environment where this happened has lots of phones at the very edge of coverage, which may trigger various bugs in the lost frames handling.</p>