Project

General

Profile

Bug #3854

OsmoPCU uses wrong CellID in BSSGP

Added by laforge 6 months ago. Updated 21 days ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
-
Start date:
03/21/2019
Due date:
% Done:

100%

Spec Reference:

Description

While writing some unrelated PCU tests, I discovered that somehow the CellID on the BSSGP side is wrong.

Whatever the BTS passes in on the PCU interface side ends uo endian-swapped in the BVC-RESET on BSSGP.

This is really odd, as all of PCUIF is in host byte order.

Looking at the commitlog, it appears that the following patch is to be blamed:

commit bdc55fad62e333951a0eb1fc7b96aaaec16dd6ff
Author: Neels Hofmeyr <neels@hofmeyr.de>
Date:   Wed Feb 21 00:39:07 2018 +0100

    implement support for 3-digit MNC with leading zeros

    Receive the mnc_3_digits flag from the PCU interface.

    Bump the PCU interface to 9.
    This is one part of the three identical pcuif_proto.h patches:
    - I49cd762c3c9d7ee6a82451bdf3ffa2a060767947 (osmo-bts)
    - I787fed84a7b613158a5618dd5cffafe4e4927234 (osmo-pcu)
    - I78f30aef7aa224b2e9db54c3a844d8f520b3aee0 (osmo-bsc)

    Add 3-digit flags and use the new RAI and LAI API from libosmocore throughout
    the code base to be able to handle an MNC < 100 that has three digits (leading
    zeros).

    Depends: Id2240f7f518494c9df6c8bda52c0d5092f90f221 (libosmocore),
             Ib7176b1d65a03b76f41f94bc9d3293a8a07d24c6 (libosmocore)

    Change-Id: I787fed84a7b613158a5618dd5cffafe4e4927234

And with all due respect, it's a prime example of how we fail

  1. a completely unrelated change (changing the endinanness of cell_id) is made as part of implementing 3 digits MNC. This should clearly have been unrelated patches
  2. the reviewer (myself in this case) didn't spot or at least not reject this "making two unrelated changes in one commit"
  3. not having test coverage for OsmoPCU

So for more than one year, we've been reporting the wrong CellID to the SGSN:/

We should learn from this that we absolutely never accept unrelated changes in the same patch. In order to introduce 3 digit MNC support, there was no need to shift the htons() around.


Related issues

Related to OsmoPCU - Bug #3925: Missing PCU_Tests.ttcn UL TBF testsNew04/15/2019

Related to OsmoBTS - Bug #4179: Race condition: OsmoBTS sends empty INFO_ind to PCU socket, if not all SI arrived from BSC via RSLNew08/29/2019

History

#1 Updated by laforge 6 months ago

  • Status changed from New to In Progress

#2 Updated by laforge 6 months ago

  • % Done changed from 0 to 80

#3 Updated by laforge 6 months ago

  • Status changed from In Progress to Resolved

#4 Updated by osmith 25 days ago

  • Status changed from Resolved to In Progress
  • Assignee changed from laforge to osmith
  • % Done changed from 80 to 50

Unfortunately, this issue is appearing again. I can see in both wireshark, and in the ttcn3 testsuite logs, that the CellID is endian-swapped again in BSSGP. That is, with osmo-pcu master, and strangely, after reverting your patch that fixed the issue earlier, the CellID is not swapped anymore.

I read through both your fix and bdc55f (Neels' patch linked above), and it is obvious that the swap was introduced into osmo-pcu in bdc55f and the fix does what it should.

This means, that some other component in the "osmo-bsc -- osmo-bts -- osmo-pcu" chain is mixing up the CellID. Looking at the osmo-pcu.log confirms, that osmo-pcu is already receiving the wrong cell ID from the BTS socket. I've looked at the RSL traces between osmo-bsc and osmo-bts, and there the right CellID is present. So it seems that the error happens in osmo-bts.

Then I searched through the code for mentions of cell_id in osmo-bts, but found only two places, and with debug prints added, they always showed the wrong ID.

My current guess is, that this is a bug in libosmocore, possibly related to the "struct osmo_cell_global_id", that internally combines mcc, mnc, lac and cell_id to pass all of them at once to various functions.

I can't simply try old versions or bisect libosmocore for this problem, because the various components won't build against much older libosmocore versions.

#5 Updated by osmith 25 days ago

  • Related to Bug #3925: Missing PCU_Tests.ttcn UL TBF tests added

#6 Updated by osmith 22 days ago

  • Related to Bug #4179: Race condition: OsmoBTS sends empty INFO_ind to PCU socket, if not all SI arrived from BSC via RSL added

#7 Updated by osmith 22 days ago

  • % Done changed from 50 to 90

Fix and TTCN3 test submitted: https://gerrit.osmocom.org/q/topic:cellid-fix

#8 Updated by osmith 21 days ago

  • Status changed from In Progress to Resolved
  • % Done changed from 90 to 100

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)