Bug #3296

TCH lchan allocation is non-modular and also riddled with holes

Added by neels 6 months ago. Updated 3 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
Start date: 05/28/2018
Due date:
% Done: 100%
Spec Reference:

Description

To be able to add inter-BSC Handover, I need to allocate an lchan. Since common lchan allocation steps are currently duplicated in Assignment and intra-BSC Handover, I did a review of the current code state to identify the best way to continue.

A review of the current lchan allocation procedures during BSSMAP Assignment and intra-BSC Handover has shown that, besides being non-modular, the lchan allocation and release procedures have numerous "holes" where message exchanges are not safeguarded by timeouts. In addition, the sequence of events is not ideal, and at least one wrong action is taken during handover error handling.

I created message sequence charts of Assignment and Handover, and marked the numerous problems with red (needs a fix) and orange (could be improved) notes.
See https://gerrit.osmocom.org/9350
In osmo-bsc/doc/, do 'make msc' to generate PNGs from the message sequence charts, or look for "red" in the .msc files.

I have also made a plan for a separate lchan allocation FSM that should fix most of the problems identified here.
The above review has convinced me that there is no good quick way around a proper FSM that can be re-used in a modular way.
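
To illustrate what "safeguarding message communication with timeouts" could look like, here is a minimal sketch of a state machine in which every state that waits for a reply carries a deadline. It is written in Python for brevity and is not the osmo-bsc implementation: the state names, event names, timeout values and the recovery policy on timeout are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional


class State(Enum):
    # Hypothetical state names, loosely modelled on an lchan life cycle.
    UNUSED = auto()
    WAIT_ACTIV_ACK = auto()
    WAIT_RLL_ESTABLISH = auto()
    ESTABLISHED = auto()
    WAIT_RF_RELEASE_ACK = auto()


# Every state that waits for a message gets a timeout (seconds, made-up values).
TIMEOUTS = {
    State.WAIT_ACTIV_ACK: 5.0,
    State.WAIT_RLL_ESTABLISH: 5.0,
    State.WAIT_RF_RELEASE_ACK: 5.0,
}

# Allowed transitions, keyed by (current state, event). Event names are hypothetical.
TRANSITIONS = {
    (State.UNUSED, "activate"): State.WAIT_ACTIV_ACK,
    (State.WAIT_ACTIV_ACK, "chan_activ_ack"): State.WAIT_RLL_ESTABLISH,
    (State.WAIT_RLL_ESTABLISH, "rll_establish_ind"): State.ESTABLISHED,
    (State.ESTABLISHED, "release"): State.WAIT_RF_RELEASE_ACK,
    (State.WAIT_RF_RELEASE_ACK, "rf_chan_rel_ack"): State.UNUSED,
}


@dataclass
class LchanFsm:
    state: State = State.UNUSED
    deadline: Optional[float] = None
    log: list = field(default_factory=list)

    def _enter(self, new_state: State, now: float) -> None:
        # Entering a state always (re)arms or clears the timer, so no
        # waiting state can be left without a deadline.
        self.state = new_state
        timeout = TIMEOUTS.get(new_state)
        self.deadline = None if timeout is None else now + timeout

    def dispatch(self, event: str, now: float) -> None:
        key = (self.state, event)
        if key not in TRANSITIONS:
            self.log.append(f"ignored {event} in {self.state.name}")
            return
        self._enter(TRANSITIONS[key], now)

    def tick(self, now: float) -> None:
        # Call periodically; handles an expired deadline, if any.
        if self.deadline is None or now < self.deadline:
            return
        self.log.append(f"timeout in {self.state.name}")
        if self.state is State.WAIT_RF_RELEASE_ACK:
            # Assumed policy: give up waiting and mark the lchan unused.
            self._enter(State.UNUSED, now)
        else:
            # Assumed policy: any other timeout triggers a release attempt.
            self._enter(State.WAIT_RF_RELEASE_ACK, now)
```

Because the timer is armed in one place (`_enter`), adding a new waiting state without a timeout requires deliberately leaving it out of `TIMEOUTS`, which is the kind of "hole" the review above is about.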

lchan_fsm.c (7.18 KB) laforge, 05/28/2018 02:44 PM

Related issues

Related to OsmoBSC - Feature #3479: test "every single failure" in osmo-bsc FSMs (New, 2018-08-20)

Blocks OsmoBSC - Bug #2283: Inter-BSC hand-over is missing (BSC side) (Resolved, 2017-05-22)

History

#1 Updated by neels 6 months ago

  • Blocks Bug #2283: Inter-BSC hand-over is missing (BSC side) added

#2 Updated by neels 6 months ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 20

#3 Updated by neels 6 months ago

  • Priority changed from Normal to Urgent

#4 Updated by laforge 6 months ago

FYI, some time ago I also did some brainstorming about an lchan_fsm, see the attached code
snippet. It was not intended to be used as-is, but just as some kind of "written notes".

Not sure if it's of any use.

#5 Updated by neels 5 months ago

  • Checklist item specify new lchan allocation FSM by message sequence chart set to Done
  • Checklist item implement new lchan allocation FSM set to Done
  • Checklist item implement TTCN3 tests to verify successful use for Assignment set to Done
  • Checklist item implement TTCN3 tests to verify successful use for intra-BSC Handover set to Done
  • % Done changed from 20 to 90

(the initial Assignment and intra-BSC handover ttcn3 tests already exist)

#6 Updated by neels 4 months ago

Status update...

The ttcn3-bsc-tests pass for all of AoIP, SCCPlite and LCLS.
Pau has sent me a tar of an osmo-gsm-tester run containing errors to be fixed.

Some code review items have been fixed, but various cosmetic review items are TBD.

When all is ready, before merging, make sure to tag a release.

#7 Updated by neels 4 months ago

  - so far relied on ttcn3 and osmo-gsm-tester test suites, now
    tested in detail with actual phones and BTSes;
  - there were still scores of problems. Fixed on branch neels/inter-bsc-ho;
    not submitted to gerrit yet.
    Test suite coverage doesn't catch these errors:
    - in reality, messages arrive in a different order than in the ttcn3 tests.
    - I created RTP reflection loops instead of forwarding, test suite doesn't
      catch that. (we should probably verify MGCP messages' port information)
    - Osmocom style dyn TS failed to switch PCHAN mode after PDCH deactivation.
    - HO Failure message caused old lchan's RTP to be DLCX'd
  - Also identified a couple errors in ttcn3-bsc-tests. Patches on gerrit.
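
One possible form of the "verify MGCP messages' port information" check mentioned above, as a sketch only: it assumes that a "reflection loop" means a connection whose configured remote RTP address is one of the MGW's own local RTP addresses, so the MGW sends the stream back to itself instead of forwarding it. The `RtpAddr` and `MgcpConn` types and the `find_reflection_loops` helper are hypothetical and not part of any Osmocom API.

```python
from typing import List, NamedTuple


class RtpAddr(NamedTuple):
    ip: str
    port: int


class MgcpConn(NamedTuple):
    name: str
    local: RtpAddr   # address the MGW reported for this connection
    remote: RtpAddr  # address the MGW was told to send RTP to


def find_reflection_loops(conns: List[MgcpConn]) -> List[str]:
    """Return names of connections whose remote is one of the MGW's own local addresses."""
    mgw_locals = {c.local for c in conns}
    return [c.name for c in conns if c.remote in mgw_locals]
```

A test suite could collect the addresses seen in the MGCP exchange for one call, build the `MgcpConn` list from them, and fail if `find_reflection_loops` returns anything.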

  - during handover, noticed a large audio gap (several seconds) with AMR.
    With FR1, only a short gap. (I dimly remember some talk about shortening a
    timeout/sync? I was actually using a slightly old osmo-bts-sysmo.)
  - Re-refactored the lchan FSM to start connecting RTP earlier, so that (unlike
    the old code) we switch RTP to the new lchan upon HO Detect (and roll back
    in case of a later error). Didn't help that much with the FR1 handover audio
    gap though. Could make sense to look at a pcap and analyse timing in detail.
  - Handover often fails with RSL Handover Failed message, even though all
    BTSes and phones are in excellent reception conditions. Maybe there needs
    to be a little wait between Lchan Activ Ack of new lchan and Handover
    Command???

  - various cosmetic code review items still not resolved. Focusing on
    functional testing first.

  - Verified/fixed again that ttcn3-bsc-tests pass
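
The "switch RTP upon HO Detect, roll back on later error" behaviour described in the notes above can be sketched as follows. This is a simplified model with hypothetical names, not the osmo-bsc code. It also encodes the fix for the bug listed earlier: a Handover Failure must release only the new lchan's RTP, never the old one's.

```python
class HandoverRtp:
    """Tracks which lchan the RTP stream points at during one handover."""

    def __init__(self, old_lchan):
        self.old_lchan = old_lchan
        self.new_lchan = None
        self.rtp_target = old_lchan
        self.released = []  # lchans whose RTP connection was torn down

    def start(self, new_lchan):
        # New lchan is allocated, but RTP still flows via the old lchan.
        self.new_lchan = new_lchan

    def on_ho_detect(self):
        # Switch RTP early, before the handover is confirmed complete.
        self.rtp_target = self.new_lchan

    def on_ho_complete(self):
        # Handover succeeded: only now is the old lchan's RTP released.
        self.released.append(self.old_lchan)
        self.old_lchan, self.new_lchan = self.new_lchan, None
        self.rtp_target = self.old_lchan

    def on_ho_failure(self):
        # Roll back: RTP returns to the old lchan, which must stay intact.
        self.rtp_target = self.old_lchan
        if self.new_lchan is not None:
            self.released.append(self.new_lchan)
            self.new_lchan = None
```

For example, after `start("lchan-B")`, `on_ho_detect()` and then `on_ho_failure()`, the RTP target is the old lchan again and only "lchan-B" appears in `released`.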

#8 Updated by neels 3 months ago

  • Checklist item deleted (implement TTCN3 tests to verify proper timeout actions for each and every asynchronous messaging during Assignment)
  • Checklist item deleted (implement TTCN3 tests to verify proper timeout actions for each and every asynchronous messaging during Handover)
  • Status changed from In Progress to Resolved
  • % Done changed from 90 to 100

All changes have been merged to osmo-bsc master. To close this issue I would "unfortunately" also require very detailed ttcn3 tests for osmo-bsc; moving that part to #3479.

#9 Updated by neels 3 months ago

  • Related to Feature #3479: test "every single failure" in osmo-bsc FSMs added
