Bug #5990

closed

osmo-bsc fails to start after osmo-sgsn: "Received RKM_REG_RSP with negative result"

Added by fixeria about 1 year ago. Updated about 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Target version: -
Start date: 03/31/2023
Due date:
% Done: 0%
Spec Reference:

Description

If I start osmo-stp, osmo-msc, osmo-bsc, and then osmo-sgsn, everything works fine. However, if I start osmo-sgsn before osmo-bsc, the latter fails to register with osmo-stp:

DRESET INFO osmo_bsc_sigtran.c:64 Sending RESET to MSC: RI=SSN_PC,PC=0.23.1,SSN=BSSAP
DLSS7 ERROR m3ua.c:508 XUA_AS(as-clnt-OsmoBSC-A)[0x561c69637920]{AS_INACTIVE}: Event AS-TRANSFER.req not permitted
DLSS7 INFO xua_rkm.c:439 0: asp-asp-clnt-OsmoBSC-A: Received RKM REG RES rctx=0 status=Invalid Routing Key
DLSS7 NOTICE xua_default_lm_fsm.c:246 xua_default_lm(asp-clnt-OsmoBSC-A)[0x561c69652ad0]{RKM_REG}: Received RKM_REG_RSP with negative result
DLSS7 INFO osmo_ss7.c:1608 0: asp-asp-clnt-OsmoBSC-A: Restarting ASP asp-clnt-OsmoBSC-A, r=127.0.0.1:2905<->l=0.0.0.0:0
DLSS7 INFO osmo_ss7.c:1841 0: asp-asp-clnt-OsmoBSC-A: Client connected (r=127.0.0.1:2905<->l=127.0.0.1:38305)
DLSS7 INFO xua_rkm.c:439 0: asp-asp-clnt-OsmoBSC-A: Received RKM REG RES rctx=0 status=Invalid Routing Key
DLSS7 NOTICE xua_default_lm_fsm.c:246 xua_default_lm(asp-clnt-OsmoBSC-A)[0x561c696548f0]{RKM_REG}: Received RKM_REG_RSP with negative result
DLSS7 INFO osmo_ss7.c:1608 0: asp-asp-clnt-OsmoBSC-A: Restarting ASP asp-clnt-OsmoBSC-A, r=127.0.0.1:2905<->l=0.0.0.0:0
DLSS7 INFO osmo_ss7.c:1841 0: asp-asp-clnt-OsmoBSC-A: Client connected (r=127.0.0.1:2905<->l=127.0.0.1:56621)

Interestingly enough, the routing context is 0 in the logging messages, while I have it set to 2 in the config file.

Below are the configuration snippets for osmo-stp, osmo-msc, osmo-bsc and osmo-sgsn, preceded by a summary of the point codes and routing keys:

Component   point-code   routing-key
osmo-msc    0.23.1       1 0.23.1
osmo-bsc    0.23.3       2 0.23.3
osmo-sgsn   0.23.4       0 0.23.4

/etc/osmocom/osmo-stp.cfg

cs7 instance 0
 xua rkm routing-key-allocation dynamic-permitted
 listen m3ua 2905
  accept-asp-connections dynamic-permitted
  local-ip 127.0.0.1
  local-ip ::1

/etc/osmocom/osmo-msc.cfg

cs7 instance 0
 point-code 0.23.1
 asp asp-clnt-OsmoMSC-A 2905 0 m3ua
  remote-ip 127.0.0.1
 as as-clnt-OsmoMSC-A m3ua
  asp asp-clnt-OsmoMSC-A
  routing-key 1 0.23.1

/etc/osmocom/osmo-bsc.cfg

cs7 instance 0
 point-code 0.23.3
 asp asp-clnt-OsmoBSC-A 2905 0 m3ua
  remote-ip 127.0.0.1
 as as-clnt-OsmoBSC-A m3ua
  asp asp-clnt-OsmoBSC-A
  routing-key 2 0.23.3

/etc/osmocom/osmo-sgsn.cfg

cs7 instance 0
 point-code 0.23.4
 asp asp-clnt-OsmoSGSN 2905 0 m3ua
  local-ip 127.0.0.1
  remote-ip 127.0.0.1
  sctp-role client
 as as-clnt-OsmoSGSN m3ua
  asp asp-clnt-OsmoSGSN
  routing-key 0 0.23.4

Software versions

Component          Version
libosmocore.git    37dc995234c50cc5f3caf325598e7fddd52cb878
libosmo-sccp.git   79c64ab5556ca0d709a56f0d4f61e2f9044c4550
osmo-bsc.git       6d369665dda4160068e4f66851c3c30dcbd6133b
osmo-sgsn.git      d5dca3a67f6c028b870315bc5139f39d938d1610

Files

osmo_stp_reg.pcapng.gz (1.23 KB) fixeria, 03/31/2023 12:29 PM
Actions #1

Updated by fixeria about 1 year ago

The attached PCAP contains M3UA REG REQ/RSP messages:

Frame #   Routing Context   Comment
1         1                 osmo-msc sends REG REQ
2         1                 osmo-stp sends REG RSP (success)
3         0                 osmo-sgsn sends REG REQ
4         2 (why?)          osmo-stp sends REG RSP (success)
5         2                 osmo-bsc sends REG REQ
6         0 (why?)          osmo-stp sends REG RSP (error: Invalid routing key)
7         2                 osmo-bsc sends REG REQ
8         0 (why?)          osmo-stp sends REG RSP (error: Invalid routing key)
Actions #2

Updated by fixeria about 1 year ago

  • Status changed from New to In Progress
  • Assignee set to fixeria
  • % Done changed from 0 to 20

The source code of the REG REQ handler in libosmo-sccp.git explains why we see a mismatching Routing Context:

/* SG: handle a single registration request IE (nested IEs in 'innner' */
static int handle_rkey_reg(struct osmo_ss7_asp *asp, struct xua_msg *inner,
                           struct msgb *resp, struct osmo_ss7_as **newly_assigned_as,
                           unsigned int max_nas_idx, unsigned int *nas_idx)
{
        uint32_t rk_id, rctx, _tmode, dpc;

        // ...

        /* ASP may already include a routing context value here */
        rctx = xua_msg_get_u32(inner, M3UA_IEI_ROUTE_CTX);

        // ...

        /* if the ASP did not include a routing context number, allocate
         * one locally (will be part of response) */
        if (!rctx)
                rctx = osmo_ss7_find_free_rctx(asp->inst);

        LOGPASP(asp, DLSS7, LOGL_INFO, "RKM: Registering routing key %u for DPC %s\n",
                rctx, osmo_ss7_pointcode_print(asp->inst, dpc));

As can be seen, handle_rkey_reg() calls xua_msg_get_u32() to obtain the value of the optional Routing Context IE.

uint32_t xua_msg_get_u32(const struct xua_msg *xua, uint16_t iei)
{
        struct xua_msg_part *part = xua_msg_find_tag(xua, iei);
        if (!part)
                return 0;
        return xua_msg_part_get_u32(part);
}

This function returns 0 if xua_msg_find_tag() fails to find the given IEI (M3UA_IEI_ROUTE_CTX in our case).
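The ambiguity can be reproduced with a small toy model. The sketch below is not libosmo-sccp code: `toy_ie`, `toy_msg`, `toy_get_u32()` and `toy_get_u32_opt()` are hypothetical names, and a parsed message is reduced to a flat array of tag/value pairs. It only illustrates that a getter returning 0 for a missing IE cannot be told apart from an IE carrying the value 0, whereas a getter that reports presence separately can.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_IEI_ROUTE_CTX 0x0006 /* Routing Context parameter tag (RFC 4666) */

/* Hypothetical stand-in for one parsed information element. */
struct toy_ie {
	uint16_t tag;
	uint32_t value;
};

/* Hypothetical stand-in for a parsed message: a flat list of IEs. */
struct toy_msg {
	const struct toy_ie *ies;
	size_t num_ies;
};

/* Return the first IE with the given tag, or NULL if it is absent. */
const struct toy_ie *toy_find_tag(const struct toy_msg *msg, uint16_t tag)
{
	for (size_t i = 0; i < msg->num_ies; i++) {
		if (msg->ies[i].tag == tag)
			return &msg->ies[i];
	}
	return NULL;
}

/* Same contract as the getter quoted above: 0 doubles as "not found". */
uint32_t toy_get_u32(const struct toy_msg *msg, uint16_t tag)
{
	const struct toy_ie *ie = toy_find_tag(msg, tag);
	return ie ? ie->value : 0;
}

/* Variant that reports presence separately from the value.
 * *out is only written when the IE is present. */
bool toy_get_u32_opt(const struct toy_msg *msg, uint16_t tag, uint32_t *out)
{
	const struct toy_ie *ie = toy_find_tag(msg, tag);
	if (!ie)
		return false;
	*out = ie->value;
	return true;
}
```

With a message that carries Routing Context 0 and a message that carries no Routing Context IE at all, `toy_get_u32()` returns 0 for both, while `toy_get_u32_opt()` returns true for the first and false for the second.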

osmo-sgsn connects (frames 3 and 4)

In this case the M3UA_IEI_ROUTE_CTX is present, and osmo-sgsn is indicating 0 as the Routing Context value. Function handle_rkey_reg() treats value 0 as if the M3UA_IEI_ROUTE_CTX was not present, and allocates a new Routing Context by calling osmo_ss7_find_free_rctx(). This is why we see Routing Context 2 in frame 4: it is a newly allocated routing context value.

osmo-bsc connects (frames 5 and 6)

In this case the M3UA_IEI_ROUTE_CTX is also present, and osmo-bsc is indicating 2 as the Routing Context value. However, this value was already allocated to osmo-sgsn, so osmo-stp responds negatively ("Invalid routing key"). The Routing Context value in REG RSP messages is set to 0 because RFC 4666 requires this in case of an error.
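The start-order dependency can be replayed with a toy model of the registration logic. This is a sketch under stated assumptions, not the libosmo-sccp implementation: `toy_stp`, `toy_find_free_rctx()` and `toy_handle_reg_req()` are hypothetical names, the allocator is assumed to hand out the lowest unused non-zero value (which matches the value 2 seen in frame 4), and every peer is assumed to register a different point code, so any reuse of a routing context counts as a conflict.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_AS 8

/* Hypothetical routing-context table of a toy signalling gateway. */
struct toy_stp {
	uint32_t rctx[TOY_MAX_AS]; /* routing contexts currently in use */
	size_t num_as;
};

bool toy_rctx_in_use(const struct toy_stp *stp, uint32_t rctx)
{
	for (size_t i = 0; i < stp->num_as; i++) {
		if (stp->rctx[i] == rctx)
			return true;
	}
	return false;
}

/* Assumption: the allocator returns the lowest unused non-zero value. */
uint32_t toy_find_free_rctx(const struct toy_stp *stp)
{
	uint32_t rctx = 1;
	while (toy_rctx_in_use(stp, rctx))
		rctx++;
	return rctx;
}

/* Handle one REG REQ. Returns the routing context placed in the REG RSP:
 * the registered value on success, 0 on rejection. */
uint32_t toy_handle_reg_req(struct toy_stp *stp, uint32_t requested)
{
	uint32_t rctx = requested;

	if (stp->num_as >= TOY_MAX_AS)
		return 0; /* table full */
	if (!rctx)
		rctx = toy_find_free_rctx(stp); /* 0 is read as "none given" */
	else if (toy_rctx_in_use(stp, rctx))
		return 0; /* already taken: "Invalid Routing Key" */

	stp->rctx[stp->num_as++] = rctx;
	return rctx;
}
```

Feeding the requests in the failing order (1 for osmo-msc, 0 for osmo-sgsn, 2 for osmo-bsc) yields 1, 2 and a rejection, as in the PCAP. In the working order (1, 2, then 0) the model yields 1, 2 and 3, so nobody collides.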

I checked RFC 4666 and could not find anything limiting the value range of the Routing Context IE. Value 0 appears to be a legal value, which needs to be handled properly. laforge, does that make sense to you?

Actions #3

Updated by fixeria about 1 year ago

  • Status changed from In Progress to Feedback

This patch fixes the problem for me:

https://gerrit.osmocom.org/c/libosmo-sccp/+/32185 xua_rkm: handle_rkey_reg(): properly handle Routing Context IE [NEW]

Actions #4

Updated by laforge about 1 year ago

On Sun, Apr 02, 2023 at 12:34:49PM +0000, fixeria wrote:

I checked RFC 4666 and could not find anything limiting the value range of the Routing Context IE. Value 0 appears to be a legal value, which needs to be handled properly. laforge does that make sense to you?

The general "problem" is that there are two scenarios:
  • M3UA where no routing context is used at all
  • M3UA with a routing context

Conceptually, it's a bit like operating Ethernet with or without VLAN tags.

So far, libosmo-sigtran has - throughout the code - treated a routing_context == 0 as a special case for
"no routing context shall be used". Yes, this is not covered by the M3UA spec, but I thought it's a
reasonable compromise to not introduce yet another config setting (and associated variable) everywhere.

So if routing context 0 is configured by the user (or e.g. used implicitly since no VTY command is set
to use a routing context), then no routing context IE should be used in all M3UA messaging. This is
required for compatibility with peers that don't implement routing contexts.
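As an illustration of that convention, the following sketch encodes a Routing Context parameter only when the configured value is non-zero. It is a minimal example and not the libosmo-sigtran encoder: `toy_put_rctx_ie()` is a hypothetical helper. The parameter tag 0x0006 and the tag/length/value layout (16-bit tag, 16-bit length that includes the 4-byte header, then the value) follow RFC 4666.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_IEI_ROUTE_CTX 0x0006 /* Routing Context parameter tag (RFC 4666) */

/* Append a single-value Routing Context parameter to buf unless rctx is 0.
 * Returns the number of bytes written (0 or 8), or -1 if buf is too small. */
int toy_put_rctx_ie(uint8_t *buf, size_t buf_len, uint32_t rctx)
{
	if (rctx == 0)
		return 0; /* convention: 0 means "send no Routing Context IE" */
	if (buf_len < 8)
		return -1;

	/* 16-bit tag, network byte order */
	buf[0] = (uint8_t)(TOY_IEI_ROUTE_CTX >> 8);
	buf[1] = (uint8_t)(TOY_IEI_ROUTE_CTX & 0xff);
	/* 16-bit length: 4-byte header plus 4-byte value */
	buf[2] = 0;
	buf[3] = 8;
	/* 32-bit routing context, network byte order */
	buf[4] = (uint8_t)(rctx >> 24);
	buf[5] = (uint8_t)(rctx >> 16);
	buf[6] = (uint8_t)(rctx >> 8);
	buf[7] = (uint8_t)(rctx & 0xff);
	return 8;
}
```

A peer configured with routing context 0 therefore sends messages without the IE, which is what keeps it compatible with implementations that do not know about routing contexts.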

Whatever changes/fixes you propose, we must make sure that we can interoperate both with
  • peers that use (one or multiple) routing contexts on a given M3UA link, as well as
  • peers that do not use a routing context at all

It is valid to assume that any deployment using the "dynamic RKM registration" will use
non-zero routing contexts, if that makes our life simpler.

Regards,
Harald

Actions #5

Updated by fixeria about 1 year ago

Hi Harald,

thanks for the detailed explanation.

laforge wrote in #note-4:

So far, libosmo-sigtran has - throughout the code - treated a routing_context == 0 as a special case for
"no routing context shall be used". Yes, this is not covered by the M3UA spec, but I thought it's a
reasonable compromise to not introduce yet another config setting (and associated variable) everywhere.

I must admit it was not obvious to me that this special "no routing context shall be used" case exists.
I checked https://downloads.osmocom.org/docs/osmo-stp/master/osmostp-usermanual.pdf, but could not find anything about the special case.

The interactive VTY help does not indicate that value 0 is special, and it does not even restrict the value range:

OsmoMSC(config-cs7-as)# routing-key?
  routing-key  Define a routing key
OsmoMSC(config-cs7-as)# routing-key 
  RCONTEXT  Routing context number
OsmoMSC(config-cs7-as)# routing-key foo ?
  DPC  Destination Point Code

I believe this should be clarified in both the VTY and the user manual.

So if routing context 0 is configured by the user (or e.g. used implicitly since no VTY command is set
to use a routing context), then no routing context IE should be used in all M3UA messaging. This is
required for compatibility with peers that don't implement routing contexts.

Given that the Routing Context IE is optional, I am still not certain about the expected behavior:

  • Routing Context IE is present and contains value > 0: use routing context in all M3UA messages,
  • Routing Context IE is present and contains value 0 (special case): no routing context shall be used,
  • Routing Context IE is not present: handle_rkey_reg() allocates a routing context dynamically?

Is my understanding correct?

Speaking of the peers that don't implement routing contexts, are they expected to omit the Routing Context IE in the REG REQ message?

fixeria wrote in #note-3:

This patch fixes the problem for me:

https://gerrit.osmocom.org/c/libosmo-sccp/+/32185 xua_rkm: handle_rkey_reg(): properly handle Routing Context IE [NEW]

I set this patch Work-in-Progress for now. I would like to have a better understanding before merging it.

Actions #6

Updated by laforge about 1 year ago

fixeria wrote in #note-5:

I must admit it was not obvious to me that this special "no routing context shall be used" case exists.

See Section 87.7.1 of the OsmoBSC user manual, titled M3UA without Routing Context IE / Routing Context 0

I checked https://downloads.osmocom.org/docs/osmo-stp/master/osmostp-usermanual.pdf, but could not find anything about the special case.

It appears that it was documented only for the BSC, as that is where the feature was needed?

Given that the Routing Context IE is optional, I am still not certain about the expected behavior:

  • Routing Context IE is present and contains value > 0: use routing context in all M3UA messages,

ACK

  • Routing Context IE is present and contains value 0 (special case): no routing context shall be used,

I think we should reject such scenarios, as we don't support it?

  • Routing Context IE is not present: handle_rkey_reg() allocates a routing context dynamically?

Is my understanding correct?

I don't know, and I don't have the time to dig into this.

As I also mentioned earlier here or in gerrit, the "no routing context" situation is only important in situations where no RKM, or at least no "dynamic RKM registration" is used. The latter is a strictly osmocom specific extension of M3UA, and in such scenarios we can enforce a routing context always to be used.

Speaking of the peers that don't implement routing contexts, are they expected to omit the Routing Context IE in the REG REQ message?

The entire point of RKM is to manage routing contexts. I really don't see how RKM would ever make sense without a routing context.

AFAICT, the M3UA spec conceptually only ever works in the three cases below:
  • you are not using routing contexts (and related IEs are never present), or
  • you are using routing contexts and have traditional, manual configuration on both sides without using RKM, or
  • you are using routing contexts with RKM (with or without osmocom specific dynamic registration/creation of them)
Actions #7

Updated by fixeria about 1 year ago

  • Status changed from Feedback to Closed
  • % Done changed from 20 to 0

Hi Harald,

laforge wrote in #note-6:

fixeria wrote in #note-5:

I must admit it was not obvious to me that this special "no routing context shall be used" case exists.

See Section 87.7.1 of the OsmoBSC user manual, titled M3UA without Routing Context IE / Routing Context 0

thanks, indeed it's there.

I checked https://downloads.osmocom.org/docs/osmo-stp/master/osmostp-usermanual.pdf, but could not find anything about the special case.

It appears that it was documented only for the BSC, as that is where the feature was needed?

AFAICS, not only for osmo-bsc but also for osmo-{msc,sgsn,smlc} (grep for include of common/chapters/cs7-config.adoc). I would expect osmo-stp's user manual to shed some light on this, because osmo-stp does implement special treatment of rctx 0 too. Just sharing a bit of my user experience.

Given that the Routing Context IE is optional, I am still not certain about the expected behavior:

  • Routing Context IE is present and contains value > 0: use routing context in all M3UA messages,

ACK

  • Routing Context IE is present and contains value 0 (special case): no routing context shall be used,

I think we should reject such scenarios, as we don't support it?

  • Routing Context IE is not present: handle_rkey_reg() allocates a routing context dynamically?

Is my understanding correct?

I don't know, and I don't have the time to dig into this.

There is a bit of a misunderstanding here, sorry. The three cases I was talking about are all about the handling of the M3UA REG REQ message, for which the Routing Context IE is also defined as optional. After reading the code and playing with the configs a bit more, I found out that the last two cases, b) Routing Context IE being present and carrying value 0 and c) Routing Context IE not being present, are treated equally, meaning that the Routing Context must not be used.

It would probably be more logical to interpret case b) as a valid explicit routing context, and to interpret case c) as "no Routing Context must be used", but we cannot change the existing behavior without breaking stuff for people relying on it. This is what my patch is doing, and indeed it's breaking the special meaning of Routing Context 0, so I abandoned it.

For those facing this issue too, I solved the race condition problem by changing the routing context value in osmo-sgsn.cfg to a unique non-zero value. I am guessing setting routing context to 0 for all SS7-speaking components would also be an option in my case, but I have not tested it. Closing this ticket.
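For illustration, the workaround amounts to a one-line change in /etc/osmocom/osmo-sgsn.cfg. The value 3 below is an assumption chosen only because it does not collide with the routing contexts 1 and 2 used by osmo-msc and osmo-bsc above; any unique non-zero value should serve the same purpose.

```
cs7 instance 0
 point-code 0.23.4
 asp asp-clnt-OsmoSGSN 2905 0 m3ua
  local-ip 127.0.0.1
  remote-ip 127.0.0.1
  sctp-role client
 as as-clnt-OsmoSGSN m3ua
  asp asp-clnt-OsmoSGSN
  ! 3 is an illustrative value (was 0); pick any unique non-zero routing context
  routing-key 3 0.23.4
```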
