Support #6434

open

D-channel does not recover after underrun

Added by pfassberg 25 days ago. Updated 2 days ago.

Status: New
Priority: Normal
Assignee: -
Category: firmware
Target version: -
Start date: 04/04/2024
Due date:
% Done: 0%
Spec Reference:

Description

The two units I use are in locations where there is no GNSS coverage.

After some hours of operation, 7500 over/underruns are reported and the server side restarts the connection (at least it starts the ECHO sequence numbers over).

It seems that no alarms are sent to the PBX, so it doesn't restart the D-channel. After a manual restart of osmo-e1d, the D-channel is activated again.

What is the best way of overcoming this?

Log from client (PBX) side:

Thu Apr  4 18:36:05 2024 DLINP octoi_clnt_fsm.c:242 OCTOI_CLIENT(N11MD)[0x555613be18e0]{ACCEPTED}: Rx OCTOI ECHO_RESP (seq=1952, rtt=258)
Thu Apr  4 18:36:10 2024 DLINP octoi_clnt_fsm.c:306 OCTOI_CLIENT(N11MD)[0x555613be18e0]{ACCEPTED}: More than 7500 RIFO underruns per second: Your clock appears to be too fast. Disconnecting.
Thu Apr  4 18:36:10 2024 DLINP octoi_fsm.c:186 OCTOI_CLIENT(N11MD)[0x555613be18e0]{WAIT_RECONNECT}: Event RX_TDM_DATA not permitted
Thu Apr  4 18:36:10 2024 DLINP octoi_fsm.c:186 OCTOI_CLIENT(N11MD)[0x555613be18e0]{WAIT_RECONNECT}: Event RX_TDM_DATA not permitted
.
.
.
Thu Apr  4 18:36:13 2024 DLINP octoi_fsm.c:186 OCTOI_CLIENT(N11MD)[0x555613be18e0]{WAIT_RECONNECT}: Event RX_TDM_DATA not permitted
Thu Apr  4 18:36:13 2024 DLINP octoi_fsm.c:186 OCTOI_CLIENT(N11MD)[0x555613be18e0]{WAIT_RECONNECT}: Event RX_TDM_DATA not permitted
Thu Apr  4 18:36:20 2024 DLINP octoi_clnt_fsm.c:268 OCTOI_CLIENT(N11MD)[0x555613be18e0]{WAIT_RECONNECT}: Re-starting connection
Thu Apr  4 18:36:20 2024 DLINP octoi_sock.c:169 192.71.31.42:4271: Tx SERVICE_REQ
Thu Apr  4 18:36:20 2024 DLINP octoi_clnt_fsm.c:100 OCTOI_CLIENT(N11MD)[0x555613be18e0]{SVC_REQ_SENT}: Rx SERVICE_ACK (service=1, server_id='TODO-SRV', software_id='osmo-e1d', software_version='0.6.0.14-ff2c7'
Thu Apr  4 18:36:30 2024 DLINP octoi_clnt_fsm.c:242 OCTOI_CLIENT(N11MD)[0x555613be18e0]{ACCEPTED}: Rx OCTOI ECHO_RESP (seq=1953, rtt=288)

Log from server (MGW) side:

Thu Apr  4 18:36:05 2024 DLINP octoi_srv_fsm.c:309 OCTOI_SERVER(N11MD)[0x5555fae6d8e0]{ACCEPTED}: Rx OCTOI ECHO_RESP (seq=886, rtt=300)
Thu Apr  4 18:36:13 2024 DLINP octoi_srv_fsm.c:383 OCTOI_SERVER(N11MD)[0x5555fae6d8e0]{ACCEPTED}: More than 7500 RIFO underruns per second: Peer clock is too slow. Disconnecting.
Thu Apr  4 18:36:13 2024 DE1D e1oip.c:181 (I0:L0) Peer disconnected
Thu Apr  4 18:36:20 2024 DLINP octoi_sock.c:375 192.71.31.51:4270: peer created
Thu Apr  4 18:36:20 2024 DLINP octoi_srv_fsm.c:78 OCTOI_SERVER[0x5555fae6d8e0]{INIT}: Rx SERVICE REQ (service=1, subscriber='N11MD', software='osmo-e1d'/'0.6.0.14-ff2c7', capabilities=0x00000000)
Thu Apr  4 18:36:20 2024 DE1D e1oip.c:136 (I0:L0) New OCTOI client connection for N11MD
Thu Apr  4 18:36:20 2024 DLINP octoi_sock.c:237 N11MD: Tx SERVICE_ACK
Thu Apr  4 18:36:30 2024 DLINP octoi_srv_fsm.c:309 OCTOI_SERVER(N11MD)[0x5555fae6d8e0]{ACCEPTED}: Rx OCTOI ECHO_RESP (seq=1, rtt=293)

Actions #1

Updated by laforge 22 days ago

  • Assignee deleted (laforge)

Actions #2

Updated by laforge 22 days ago

On Thu, Apr 04, 2024 at 06:50:48PM +0000, pfassberg wrote:

> The two units I use are in locations where there is no GNSS coverage.

This is currently not really intended/supported, I'm sorry.

In theory it should be possible for one side to run "free-running" and for the other end to recover timing from UDP packet arrival times/intervals, and then adjust its built-in VCTCXO to track that timing. I think the firmware/gateware exposes the required commands to adjust the coarse and fine tuning values. Nobody has implemented this so far. I believe cquirin wanted to put some of his students up to doing this as a project/thesis?

> What is the best way of overcoming this?

The best way would be to use inter-packet arrival timing to tune the local oscillator on one side, as described above.
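
For illustration only, the timing-recovery idea could be sketched roughly as below. This is not code from osmo-e1d. It assumes that every packet carries a fixed number of E1 frames (`FRAMES_PER_PACKET` is a made-up value), that packets carry a sequence number, and that the receiver timestamps each arrival with its own local clock. A least-squares fit of arrival time against sequence number averages out network jitter, and the slope is the mean inter-packet interval as seen by the local clock.

```python
from typing import Sequence

E1_FRAME_RATE_HZ = 8000   # an E1 line carries 8000 frames per second
FRAMES_PER_PACKET = 8     # hypothetical packetization, not taken from osmo-e1d


def estimate_remote_offset_ppm(seq: Sequence[int],
                               arrival_s: Sequence[float]) -> float:
    """Estimate the remote clock offset relative to the local clock, in ppm.

    seq       -- packet sequence numbers (lost packets simply leave gaps)
    arrival_s -- local arrival timestamps in seconds, one per packet

    A positive result means the remote side runs faster than the local side.
    """
    n = len(seq)
    if n < 2 or n != len(arrival_s):
        raise ValueError("need at least two (seq, timestamp) pairs")
    mean_x = sum(seq) / n
    mean_y = sum(arrival_s) / n
    sxx = sum((x - mean_x) ** 2 for x in seq)
    if sxx == 0:
        raise ValueError("sequence numbers must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(seq, arrival_s))
    measured_interval = sxy / sxx                    # seconds per packet, local clock
    nominal_interval = FRAMES_PER_PACKET / E1_FRAME_RATE_HZ
    return (nominal_interval / measured_interval - 1.0) * 1e6
```

The resulting estimate would then have to be translated into coarse/fine tuning values for the VCTCXO. That mapping depends on the hardware and is not shown here.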

A crude hack could be the signaling of alarms for a certain amount of time. I don't know the detailed behaviour off the top of my head right now, but I would expect that when osmo-e1d is stopped, the icE1usb would send some alarm to the local E1 interface? If that's the case, then maybe we could do something similar in this situation. Either by closing the USB endpoints, or by manually transmitting blue (all timeslots 0xff) until the connection is re-established?
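
The "transmit blue until re-established" variant could look roughly like the sketch below. It is only an illustration of the idea, not how osmo-e1d or the icE1usb firmware handle this today. The function name and the `connected` flag are hypothetical. The only facts used are that an E1 frame has 32 timeslots of one byte each and that blue, as described above, means all timeslots set to 0xff.

```python
E1_TIMESLOTS = 32                            # one byte per timeslot per frame
AIS_FRAME = bytes([0xFF]) * E1_TIMESLOTS     # "blue": every timeslot is 0xff


def frames_toward_e1(connected: bool, received: bytes, n_frames: int) -> bytes:
    """Return n_frames E1 frames to send to the local E1 line (hypothetical helper).

    While the OCTOI connection is up and a full block of data is available,
    the received data is passed through unchanged. Otherwise the line is fed
    with all-ones frames so the attached equipment sees an alarm.
    """
    if connected and len(received) == n_frames * E1_TIMESLOTS:
        return received
    return AIS_FRAME * n_frames
```

Whether the PBX then restarts the D-channel on its own once real data returns would have to be verified with the actual equipment.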

For many weeks to come (sysmocom company move, then OsmoDevCon) I will not have any time to look at any issues, sorry.

Actions #3

Updated by tnt 2 days ago

I have a very ugly hack that attempts that. It was just a "viability" test to see how well this would work.
It's not actually based on the packet timing, but just uses the FIFO level ...

It might need some adaptation for newer code.
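
As a rough illustration of what FIFO-level-based tuning can look like in general (this is not the hack mentioned above, whose code is not shown in this ticket), a simple proportional-integral loop can steer an oscillator tuning value from the receive FIFO fill level. All names, gains and the 0..4095 tuning range are assumptions made for the sketch, as is the convention that a larger tuning value means a higher local frequency.

```python
from dataclasses import dataclass


@dataclass
class FifoLevelTuner:
    """PI controller mapping FIFO fill level to a tuning value (hypothetical)."""
    target_level: int          # desired FIFO fill, in frames
    kp: float = 2.0            # proportional gain (assumed value)
    ki: float = 0.1            # integral gain (assumed value)
    center: int = 2048         # assumed mid-scale tuning value
    tune_min: int = 0          # assumed lower limit of the tuning range
    tune_max: int = 4095       # assumed upper limit of the tuning range
    integral: float = 0.0      # accumulated error

    def update(self, fifo_level: int) -> int:
        # A FIFO that fills up means the local clock drains it too slowly,
        # so a positive error pushes the local frequency up.
        error = fifo_level - self.target_level
        raw = self.center + self.kp * error + self.ki * (self.integral + error)
        tune = max(self.tune_min, min(self.tune_max, round(raw)))
        # Anti-windup: only accumulate while the output is not saturated.
        if tune == round(raw):
            self.integral += error
        return tune
```

Calling `update()` periodically with the current fill level and writing the returned value to the oscillator would close the loop. Real gains would need tuning against the actual jitter and oscillator characteristics.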
