Feature #6076

closed

Configure SCTP primary path explicitly (local-ip A.B.C.D [primary] && remote-ip A.B.C.D [primary])

Added by pespin 10 months ago. Updated 8 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
-
Start date:
06/28/2023
Due date:
% Done:

100%

Spec Reference:

Description

The SCTP specs have the concept of a "primary path", which is the path expected to carry all data in general, as long as it is considered active. Once it is considered inactive, other paths are selected until the primary path is considered active again (i.e. HEARTBEAT is working again on that path).

Right now, in libosmo-sccp we are not explicitly configuring the "primary path" in any way, hence letting the linux kernel SCTP implementation decide the best primary path. That may be good in some cases, but some specific setups may have specific requirements about having a given primary path (well known and deterministic).

This ticket consists of 2 tasks:

[A] Investigate "implicit" primary path selection

Finding out, understanding and documenting the "implicit" primary path selection used by the linux kernel SCTP implementation when no "explicit" primary path is requested by userspace (the app, such as libosmo-sccp/osmo-bsc VTY config).

[B] Allow (optionally) configuring an explicit "primary path" in the libosmo-sccp VTY config

Extend the existing "local-ip" and "remote-ip" VTY commands under the "asp" node to have an optional parameter "[primary]" ("local-ip A.B.C.D [primary]" && "remote-ip A.B.C.D [primary]"), which, if set, will instruct the kernel SCTP implementation to use those local and remote IP addresses as primary.
The last local-ip with "primary" set on it will replace the ones configured as "primary" before it, so that there's only 1 primary configured at any time. Same goes for the "remote-ip" set.
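
The "last primary wins" rule described above can be sketched as follows. This is only an illustration under stated assumptions: `struct addr_set`, `addr_set_init()`, `addr_set_add()`, `MAX_ADDRS` and `NO_PRIMARY` are hypothetical names invented for this sketch, not libosmo-sccp API. The ticket does not say what happens when an address that is already primary is re-entered without the keyword; the sketch leaves the flag untouched in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ADDRS 8      /* hypothetical limit, chosen for this sketch only */
#define NO_PRIMARY (-1)

/* Hypothetical container for one "local-ip" or "remote-ip" set of an ASP. */
struct addr_set {
	const char *addr[MAX_ADDRS];
	size_t count;
	int primary_idx;     /* index into addr[], or NO_PRIMARY */
};

static void addr_set_init(struct addr_set *set)
{
	memset(set, 0, sizeof(*set));
	set->primary_idx = NO_PRIMARY;
}

/* Add an address, or look up an existing one. If is_primary is set, this
 * entry takes over the primary flag from any earlier entry, so at most one
 * address per set is primary. Returns the entry index, or -1 if the set is
 * full. */
static int addr_set_add(struct addr_set *set, const char *ip, bool is_primary)
{
	size_t i;

	assert(set && ip);
	for (i = 0; i < set->count; i++) {
		if (strcmp(set->addr[i], ip) == 0)
			break;
	}
	if (i == set->count) {
		if (set->count == MAX_ADDRS)
			return -1;
		set->addr[set->count++] = ip;
	}
	if (is_primary)
		set->primary_idx = (int)i;
	return (int)i;
}
```

Keeping a single index per set, rather than a flag per address, makes the "only 1 primary at any time" property hold by construction.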

This feature is exposed in the socket API through setsockopt() SCTP_PRIMARY_ADDR and SCTP_SET_PEER_PRIMARY_ADDR.

Under the hood, this will:
- local-ip: Adds the "Set Primary IP Address" TLV (https://www.rfc-editor.org/rfc/rfc5061#section-4.2.4) to INIT/INIT-ACK. "It requests the receiver to mark the specified address as the primary address to send data to (see Section 5.1.2 of [RFC4960]). The receiver MAY mark this as its primary address upon receiving this request."
- remote-ip: Tells the kernel to select that IP address as primary if found in the INIT/INIT-ACK/ASCONF.

In theory this should all be feasible by calling setsockopt() after sctp_bindx() & sctp_connectx().
Dynamic SCTP association reconfiguration of addresses and primary flag through VTY will be described in a separate ticket.

Related RFCs:
https://datatracker.ietf.org/doc/html/rfc4960
https://www.rfc-editor.org/rfc/rfc5061


Related issues

Related to libosmo-sccp + libosmo-sigtran - Feature #6077: Dynamic reconfiguration of SCTP IP addresses & primary flag in an active SCTP association through VTY (Resolved, pespin, 06/28/2023)
Related to libosmo-sccp + libosmo-sigtran - Bug #4607: unable to remove a remote IP address of an ASP (Resolved, pespin, 06/10/2020)

Actions #1

Updated by pespin 10 months ago

  • Description updated (diff)
Actions #2

Updated by pespin 10 months ago

  • Related to Feature #6077: Dynamic reconfiguration of SCTP IP addresses & primary flag in an active SCTP association through VTY added
Actions #4

Updated by pespin 10 months ago

[A] Investigate "implicit" primary path selection in SCTP kernel stack

  • Primary local address : Doesn't seem to be using/indicating local primary in INIT/INIT_ACK. The information about which local IP address is the primary one doesn't even seem to be stored in the sctp stack data model (which matches getsockopt(SCTP_SET_PEER_PRIMARY_ADDR) not existing). The only use case is setsockopt(SCTP_SET_PEER_PRIMARY_ADDR), which triggers transmission of ASCONF on the wire to notify the peer that it should use that one. Hence, no "implicit" local primary address is set/announced by default.
    The local IP address list is generated in:
    sctp_make_init() (sctp_make_init_ack() for server)
        sctp_bind_addrs_to_raw()
        retval->param_hdr.v = sctp_addto_chunk(retval, addrs_len, addrs.v);
    
  • Primary remote address :
    • CLIENT mode: Primary is assigned to the address the INIT is transmitted to (if INIT fails, SCTP_CMD_INIT_CHOOSE_TRANSPORT (sctp_assoc_choose_alter_transport()) will take the next address, set it as primary and retry):
      SCTP_CMD_INIT_CHOOSE_TRANSPORT
          sctp_assoc_set_primary.
      

      INIT_ACK happens in sctp_sf_do_5_1C_ack, and since it comes from the source of INIT, primary doesn't need to be updated.
    • SERVER mode: source address of rx INIT is used as Primary by default:
      sctp_process_init() /* Handle init: */
          /* We must include the address that the INIT packet came from.
           * This is the only address that matters for an INIT packet.
           * When processing a COOKIE ECHO, we retrieve the from address
           * of the INIT from the cookie.
           */
          /* This implementation defaults to making the first transport
           * added as the primary transport.  The source address seems to
           * be a better choice than any of the embedded addresses.
           */
          sctp_assoc_add_peer(sctp_source(chunk))
              /* If we do not yet have a primary path, set one.  */
              if (!asoc->peer.primary_path) {
                  sctp_assoc_set_primary(asoc, peer);
                  asoc->peer.retran_path = peer;
              }
          sctp_process_param()
              sctp_assoc_add_peer()
              sctp_assoc_set_primary()
      
Other comments:
  • When the primary path is considered to have become inactive, a new path is marked as primary. This means the primary is overwritten! We then need to listen for ADDR_ADDED and set it back when available:
    sctp_do_8_2_transport_strike()
        if (transport->error_count > transport->ps_retrans &&
            asoc->peer.primary_path == transport &&
            asoc->peer.active_path != transport)
            sctp_assoc_set_primary(asoc, asoc->peer.active_path);
    
  • When a remote address marked as primary is removed, then the next in the list is marked as primary.

Summary:

The kernel takes as remote primary the path used to do the initial handshake (INIT, INIT_ACK). If this path becomes inactive, the primary may be changed to another path. If the address used as the primary goes down, the primary is changed to the next in the list. If the peer asks to change the primary through ASCONF, it will change (as long as the requested one is valid, of course).

Actions #5

Updated by pespin 9 months ago

  • Status changed from New to In Progress

Started work on this. WIP can be found at:
libosmo-netif.git branch "pespin/primary"
libosmo-sccp.git branch "pespin/primary"

Actions #6

Updated by pespin 9 months ago

  • % Done changed from 0 to 30

I submitted a bunch of patches to libosmo-netif fixing and improving several issues around existing osmo_stream code.

On top I have a patch which right now applies the Primary addresses configured in libosmo-sccp's VTY (osmo-bsc) using setsockopt (I still need to decide whether that code will belong to osmo_stream or will be handled by the osmo_stream user):

cs7 instance 0
 point-code 0.0.2
 asp asp0 2905 0 m3ua
  local-ip 127.0.0.2
  local-ip 127.123.0.2 primary
  local-ip ::1
  remote-ip 127.0.0.1
  remote-ip 127.123.0.1 primary
  remote-ip ::1
  role asp
  sctp-role client

ASP name now prints the configured primary address (marked with a "*"):

20230804203810956 DLSS7 osmo_ss7.c:1696 0: asp-asp0: Restarting ASP asp0, r=(127.0.0.1|127.123.0.1*|::1):2905<->l=(127.0.0.2|127.123.0.2*|::1):0
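
The "(a|b*|c):port" rendering seen in the log line above can be reproduced with a small formatter. This is a minimal sketch, not the code used in osmo_ss7.c: `format_addr_list()` and `append()` are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append s to buf at *off. Returns -1 if it does not fit (including NUL). */
static int append(char *buf, size_t buf_len, size_t *off, const char *s)
{
	size_t len = strlen(s);

	if (*off + len + 1 > buf_len)
		return -1;
	memcpy(buf + *off, s, len + 1);
	*off += len;
	return 0;
}

/* Render an address list as "(a|b*|c):port", appending '*' to the entry at
 * primary_idx (pass -1 for "no primary configured"). Returns 0 on success,
 * -1 if buf is too small. */
static int format_addr_list(char *buf, size_t buf_len, const char *const *addrs,
			    size_t n, int primary_idx, unsigned int port)
{
	char port_str[16];
	size_t off = 0, i;

	assert(buf && addrs);
	snprintf(port_str, sizeof(port_str), "):%u", port);
	if (append(buf, buf_len, &off, "(") < 0)
		return -1;
	for (i = 0; i < n; i++) {
		if (i > 0 && append(buf, buf_len, &off, "|") < 0)
			return -1;
		if (append(buf, buf_len, &off, addrs[i]) < 0)
			return -1;
		if ((int)i == primary_idx && append(buf, buf_len, &off, "*") < 0)
			return -1;
	}
	return append(buf, buf_len, &off, port_str);
}
```

Called with the three remote addresses from the config above, primary index 1 and port 2905, it produces the same string as the `r=` part of the log line.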

The setsockopt() calls are still failing for some reason I need to find out:

20230804203810957 DLINP stream_cli.c:353 CLICONN(asp0,r=::1:2905<->l=::ffff:127.0.0.2:40825){CONNECTING} connection established
20230804203810957 DLINP stream.c:106 sizes of 'struct sctp_event_subscribe': compile-time 14, kernel: 14
20230804203810957 DLINP stream_cli.c:368 CLICONN(asp0,r=::1:2905<->l=::ffff:127.0.0.2:40825){CONNECTED} Set Peer's Primary Address 127.123.0.2
20230804203810957 DLINP stream.c:228 setsockopt(SCTP_SET_PEER_PRIMARY_ADDR, 127.123.0.2) failed: Operation not permitted
20230804203810957 DLINP stream_cli.c:373 CLICONN(asp0,r=::1:2905<->l=::ffff:127.0.0.2:40825){CONNECTED} Set Primary Address 127.123.0.1
20230804203810957 DLINP stream.c:250 setsockopt(SCTP_PRIMARY_ADDR, 127.123.0.1) failed: Invalid argument

Related code is here:

+int stream_setsockopt_peer_primary_addr(int fd, const struct osmo_sockaddr *saddr)
+{
+       int rc;
+
+       struct sctp_setpeerprim so_sctp_setpeerprim = {0};
+
+       /* rfc6458 sec 8: "For the one-to-one style sockets and branched-off one-to-many
+        * style sockets (see Section 9.2), this association ID parameter is ignored" 
+        */
+
+       so_sctp_setpeerprim.sspp_addr = saddr->u.sas;
+       rc = setsockopt(fd, IPPROTO_SCTP, SCTP_SET_PEER_PRIMARY_ADDR,
+                       &so_sctp_setpeerprim, sizeof(so_sctp_setpeerprim));
+       if (rc < 0) {
+               char buf[128];
+               strerror_r(errno, (char *)buf, sizeof(buf));
+               LOGP(DLINP, LOGL_ERROR, "setsockopt(SCTP_SET_PEER_PRIMARY_ADDR, %s) failed: %s\n",
+                    osmo_sockaddr_to_str(saddr), buf);
+       }
+       return rc;
+}
+
+int stream_setsockopt_primary_addr(int fd, const struct osmo_sockaddr *saddr)
+{
+       int rc;
+
+       struct sctp_prim so_sctp_prim = {0};
+
+       /* rfc6458 sec 8: "For the one-to-one style sockets and branched-off one-to-many
+        * style sockets (see Section 9.2), this association ID parameter is ignored" 
+        */
+
+       so_sctp_prim.ssp_addr = saddr->u.sas;
+       rc = setsockopt(fd, IPPROTO_SCTP, SCTP_PRIMARY_ADDR,
+                       &so_sctp_prim, sizeof(so_sctp_prim));
+       if (rc < 0) {
+               char buf[128];
+               strerror_r(errno, (char *)buf, sizeof(buf));
+               LOGP(DLINP, LOGL_ERROR, "setsockopt(SCTP_PRIMARY_ADDR, %s) failed: %s\n",
+                    osmo_sockaddr_to_str(saddr), buf);
+       }
+       return rc;
+}

Actions #7

Updated by pespin 9 months ago

After adding the ports to the sockaddress:

osmo_sockaddr_set_port(&cli->local_primary_addr.u.sa, cli->local_port);
osmo_sockaddr_set_port(&cli->rem_primary_addr.u.sa, cli->port);

"Set Primary Address" (our Tx one) is working fine.

However, setting the peer's one is not working yet:

20230804205146112 DLINP stream_cli.c:369 CLICONN(asp0,r=::1:2905<->l=::ffff:127.0.0.2:51340){CONNECTED} Set Peer's Primary Address 127.123.0.2
20230804205146112 DLINP stream.c:228 setsockopt(SCTP_SET_PEER_PRIMARY_ADDR, 127.123.0.2) failed: Operation not permitted

Actions #8

Updated by pespin 9 months ago

I wasn't applying the local port correctly, because cli->local_port=0 if picked by the kernel. I used the following:

                if (cli->local_port == 0) {
                    osmo_sock_get_local_ip_port(fd, port_buf, sizeof(port_buf));
                    osmo_sockaddr_set_port(&cli->local_primary_addr.u.sa, atoi(port_buf));
                } else {
                    osmo_sockaddr_set_port(&cli->local_primary_addr.u.sa, cli->local_port);
                }

Now the port shows up correctly (checked with wireshark the SCTP assoc), but still same error:

20230804210105062 DLINP stream_cli.c:375 CLICONN(asp0,r=::1:2905<->l=::ffff:127.0.0.2:53005){CONNECTED} Set Peer's Primary Address 127.123.0.2:53005
20230804210105062 DLINP stream.c:228 setsockopt(SCTP_SET_PEER_PRIMARY_ADDR, 127.123.0.2:53005) failed: Operation not permitted
20230804210105062 DLINP stream_cli.c:381 CLICONN(asp0,r=::1:2905<->l=::ffff:127.0.0.2:53005){CONNECTED} Set Primary Address 127.123.0.1:2905

I need to check whether this is actually implemented in my kernel version (6.4.7).
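
The port handling from the last two comments (the address passed to setsockopt() must carry the association's port, and a local port of 0 has to be resolved to the one the kernel picked) can be sketched with plain POSIX calls. This is an illustration only: `sock_local_port()` and `sockaddr_apply_port()` are hypothetical helpers standing in for osmo_sock_get_local_ip_port() and osmo_sockaddr_set_port(), and they work on any bound socket, not just SCTP ones.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/* Return the local port (host byte order) that fd is bound to, or -1 on
 * error. Useful when the configured local port is 0 and the kernel chose
 * an ephemeral one. */
static int sock_local_port(int fd)
{
	struct sockaddr_storage ss;
	socklen_t len = sizeof(ss);

	if (getsockname(fd, (struct sockaddr *)&ss, &len) < 0)
		return -1;
	switch (ss.ss_family) {
	case AF_INET:
		return ntohs(((struct sockaddr_in *)&ss)->sin_port);
	case AF_INET6:
		return ntohs(((struct sockaddr_in6 *)&ss)->sin6_port);
	default:
		return -1;
	}
}

/* Store port (host byte order) in an IPv4/IPv6 socket address. Returns -1
 * for any other address family. */
static int sockaddr_apply_port(struct sockaddr_storage *ss, uint16_t port)
{
	assert(ss);
	switch (ss->ss_family) {
	case AF_INET:
		((struct sockaddr_in *)ss)->sin_port = htons(port);
		return 0;
	case AF_INET6:
		((struct sockaddr_in6 *)ss)->sin6_port = htons(port);
		return 0;
	default:
		return -1;
	}
}
```

The idea is to call `sock_local_port()` once the connection is established and copy the result into the configured primary address before handing it to setsockopt().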

Actions #9

Updated by pespin 9 months ago

The "Operation not permitted" during "setsockopt(SCTP_SET_PEER_PRIMARY_ADDR)" failed due to sysctl "net.sctp.addip_enable" being set to 0 (the default, meaning ASCONF is disabled by default). Setting it to "1" allows applying the setsockopt and one can see ASCONF being sent to the peer at that time.

I also had to enable sysctl "net.sctp.auth_enable=1" together with net.sctp.addip_enable, otherwise the stack sends INIT with SupportedExtensions=ASCONF,ASCONF_ACK, but the spec says that in order to support the feature it MUST also have "AUTH" extension in the list. If that's not added, the server side answers with an ABORT (unless sysctl net.sctp.addip_noauth_enable is set to 1 afaik).

All this is specified in RFC 5061 section 4.2.7 "Supported Extensions Parameter", and also mentioned in the linux kernel at sctp_make_init():
https://github.com/torvalds/linux/blob/master/net/sctp/sm_make_chunk.c#L254
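
A quick way to detect this situation before attempting the setsockopt() is to read the sysctls through /proc. The sketch below assumes the usual mapping of "net.sctp.X" to "/proc/sys/net/sctp/X"; `read_sysctl_int()` and `asconf_usable()` are hypothetical helpers written for this illustration, and the handling of addip_noauth_enable follows the tentative statement above rather than verified kernel behaviour.

```c
#include <assert.h>
#include <stdio.h>

/* Read an integer sysctl through its /proc/sys file. Returns 0 and stores
 * the value in *val on success, -1 if the file is missing or unparsable
 * (for instance when the sctp kernel module is not loaded). */
static int read_sysctl_int(const char *path, int *val)
{
	FILE *f;
	int rc;

	assert(path && val);
	f = fopen(path, "r");
	if (!f)
		return -1;
	rc = fscanf(f, "%d", val);
	fclose(f);
	return rc == 1 ? 0 : -1;
}

/* Combine the three sysctl values discussed above: ASCONF must be enabled,
 * and either AUTH is enabled too or ASCONF without AUTH is allowed. */
static int asconf_usable(int addip_enable, int auth_enable, int addip_noauth_enable)
{
	if (!addip_enable)
		return 0;
	return (auth_enable || addip_noauth_enable) ? 1 : 0;
}
```

For example, the values could be read from "/proc/sys/net/sctp/addip_enable", "/proc/sys/net/sctp/auth_enable" and "/proc/sys/net/sctp/addip_noauth_enable" and passed to `asconf_usable()`, logging a hint instead of a bare "Operation not permitted" when the result is 0.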

I attach a pcap showing the feature in action (configured through the "primary" keyword in the VTY configs shared previously in this ticket).

One can see that the primary addresses are not used from the moment they are requested, but only after they have been confirmed through HEARTBEAT ping-pong. Both sides initially use the first address in the VTY list (implementation detail: if the first failed during INIT, the second would be taken, and so on). Then, after applying the primary address and also applying the peer's primary addr (ASCONF), and once HEARTBEATs have occurred, both sides start using those primary addresses (127.123.0.{1,2}).

Actions #10

Updated by pespin 9 months ago

When the peer's remote addr goes down (after some HEARTBEAT failure timer, once I removed the IP address from the interface):

479    17:53:24.098415804    127.0.0.1    35791    127.0.0.1    4729    GSMTAP    257    0: asp-asp-dyn-3: xUA SRV SCTP_PEER_ADDR_CHANGE: ADDR_UNREACHABLE [::ffff:192.168.123.2]:3000 err=FAILED_THRESHOLD

When the remote addr goes up (when I re-add the IP address on the interface and the next HEARTBEAT round happens):

1460    17:56:26.711580068    127.0.0.1    35791    127.0.0.1    4729    GSMTAP    243    0: asp-asp-dyn-3: xUA SRV SCTP_PEER_ADDR_CHANGE: ADDR_AVAILABLE [::ffff:192.168.123.2]:3000 err=None

Actions #11

Updated by pespin 9 months ago

I have implemented the initial applying of (Peer) Primary Address here:
https://gerrit.osmocom.org/c/libosmo-sccp/+/34111 asp: Allow setting IP address as SCTP primary upon conn establishment

Also, another patch which monitors state of the addresses and re-applies Primary Address once it is available again, or if the peer attempts to change it through ASCONF:
https://gerrit.osmocom.org/c/libosmo-sccp/+/34112 asp: Monitor SCTP_PEER_ADDR_CHANGE events to re-apply configured Primary Address

I attach a new pcap showing the behavior, using similar configs as shared above in this ticket (now with the primary address being 192.168.123.{1,2}). What I do in this test is have those 192.168.123.{1,2} addresses assigned to some given interface in my system, and after a while, I remove 192.168.123.1 from the interface. Eventually, the client (which has the primary set to it) finds out and logs about it. Then I re-add the IP address and it can be seen how the client finds out about it becoming available again and hence re-applies it as Primary Address.
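
The re-apply decision described above could look roughly like the sketch below. It is not the logic of the merged patch: the `enum paddr_state` values are local stand-ins for the kernel's SCTP_ADDR_* states reported in SCTP_PEER_ADDR_CHANGE, and `ss_from_str()`, `same_ip()` and `should_reapply_primary()` are hypothetical helpers. Addresses are compared without the port and IPv4-mapped IPv6 addresses are normalised, since the events in the logs above report addresses such as ::ffff:192.168.123.2.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Stand-ins for SCTP_ADDR_AVAILABLE, SCTP_ADDR_UNREACHABLE, ... */
enum paddr_state { PADDR_AVAILABLE, PADDR_UNREACHABLE, PADDR_REMOVED,
		   PADDR_ADDED, PADDR_MADE_PRIM, PADDR_CONFIRMED };

/* Parse an IPv4 or IPv6 literal into ss (port left at 0). */
static bool ss_from_str(struct sockaddr_storage *ss, const char *ip)
{
	memset(ss, 0, sizeof(*ss));
	if (inet_pton(AF_INET, ip, &((struct sockaddr_in *)ss)->sin_addr) == 1) {
		ss->ss_family = AF_INET;
		return true;
	}
	if (inet_pton(AF_INET6, ip, &((struct sockaddr_in6 *)ss)->sin6_addr) == 1) {
		ss->ss_family = AF_INET6;
		return true;
	}
	return false;
}

/* Extract the IPv4 address if ss is AF_INET or an IPv4-mapped AF_INET6. */
static bool as_ipv4(const struct sockaddr_storage *ss, struct in_addr *out)
{
	if (ss->ss_family == AF_INET) {
		*out = ((const struct sockaddr_in *)ss)->sin_addr;
		return true;
	}
	if (ss->ss_family == AF_INET6) {
		const struct in6_addr *a6 = &((const struct sockaddr_in6 *)ss)->sin6_addr;
		if (IN6_IS_ADDR_V4MAPPED(a6)) {
			memcpy(out, &a6->s6_addr[12], sizeof(*out));
			return true;
		}
	}
	return false;
}

/* Compare two addresses by IP only, ignoring the port. */
static bool same_ip(const struct sockaddr_storage *a, const struct sockaddr_storage *b)
{
	struct in_addr a4, b4;
	bool a_is4 = as_ipv4(a, &a4), b_is4 = as_ipv4(b, &b4);

	if (a_is4 || b_is4)
		return a_is4 && b_is4 && a4.s_addr == b4.s_addr;
	if (a->ss_family != AF_INET6 || b->ss_family != AF_INET6)
		return false;
	return memcmp(&((const struct sockaddr_in6 *)a)->sin6_addr,
		      &((const struct sockaddr_in6 *)b)->sin6_addr,
		      sizeof(struct in6_addr)) == 0;
}

/* Decide whether the configured primary should be applied again. */
static bool should_reapply_primary(enum paddr_state state,
				   const struct sockaddr_storage *ev_addr,
				   const struct sockaddr_storage *cfg_primary)
{
	switch (state) {
	case PADDR_AVAILABLE:
	case PADDR_ADDED:
	case PADDR_CONFIRMED:
		/* The configured primary became usable (again). */
		return same_ip(ev_addr, cfg_primary);
	case PADDR_MADE_PRIM:
		/* Some other address was made primary: override it. */
		return !same_ip(ev_addr, cfg_primary);
	default:
		return false;
	}
}
```

In the test from this comment, the event reporting 192.168.123.1 as available again would make `should_reapply_primary()` return true for a client configured with that address as primary.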

Actions #12

Updated by pespin 9 months ago

  • % Done changed from 30 to 70

I am now enabling the ASCONF feature (and AUTH, which it depends on) per socket using these patches:
https://gerrit.osmocom.org/c/libosmocore/+/34113 socket: Add osmo_sock_init flag to enable SCTP ASCONF features
https://gerrit.osmocom.org/c/libosmo-netif/+/34114 stream: Use new flag OSMO_SOCK_F_SCTP_ASCONF_SUPPORTED for SCTP sockets

Actions #13

Updated by pespin 8 months ago

  • % Done changed from 70 to 80

All patches implementing the feature have been merged.

Only missing part: Document the new "primary" parameter in the user manual, and provide a small description of what it means for local and remote IP addresses.

Actions #14

Updated by pespin 8 months ago

  • Status changed from In Progress to Resolved
  • % Done changed from 80 to 100

Documentation available in OsmoBSC/MSC/STP User Manual, eg https://ftp.osmocom.org/docs/osmo-bsc/master/osmobsc-usermanual.pdf section "8.5.6 SCTP Primary Address"

Actions #15

Updated by pespin 7 months ago

  • Related to Bug #4607: unable to remove a remote IP address of an ASP added