Project

General

Profile

Bug #3502

measurement.c:563 (...) No measurements for SUB!!!

Added by fixeria about 1 year ago. Updated about 1 year ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Start date:
08/27/2018
Due date:
% Done:

100%

Spec Reference:

Description

I just rebased to the recent version of OsmoBTS, and noticed the following output during a voice call:

<0000> rsl.c:741 (bts=0,trx=0,ts=2,pchan=TCH/F) (ss=0) TCH_F Tx CHAN ACT ACK
<0011> lapd_core.c:920 Store content res. (dl=0x7fdce7d24478)
<0007> scheduler.c:926 058538/44/12/41/14 Substituting all-zero burst (current_fn=58539, elapsed=2
<0007> scheduler.c:926 058564/44/12/16/40 Substituting all-zero burst (current_fn=58565, elapsed=2
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<000e> rsl.c:1954 (bts=0,trx=0,ts=2,ss=0) IPAC set RTP socket parameters: 1
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!
<000e> l1sap.c:96 RTP clock out of sync with lower layer: 480 vs 160 (61039->61052)
<0000> rsl.c:1656 (bts=0,trx=0,ts=2,ss=0) Sending RTP delete indication: cause = Normal event, unspecified
<000e> trau/osmo_ortp.c:632 Closing RTP socket on Channel Release RTP Tx(642 pkts, 28890 bytes) Rx(505 pkts, 22725 bytes, 0 late, 0 loss, 0 qmax)
<0000> rsl.c:720 (bts=0,trx=0,ts=2,pchan=TCH/F) (ss=0) TCH_F Tx CHAN REL ACK

Note: this was observed while using OsmocomBB, which has some problems with Measurement Reports,
please see: http://lists.osmocom.org/pipermail/baseband-devel/2018-August/005572.html

meas.pcapng.gz meas.pcapng.gz 4.16 KB fixeria, 08/27/2018 06:02 AM
osmo_bts_trx.log osmo_bts_trx.log 128 KB fixeria, 08/29/2018 07:35 PM
osmo-bts-sysmo.log osmo-bts-sysmo.log 14.2 KB dexter, 09/05/2018 01:05 PM

History

#1 Updated by laforge about 1 year ago

  • Assignee set to dexter

#2 Updated by laforge about 1 year ago

On Mon, Aug 27, 2018 at 06:02:29AM +0000, fixeria [REDMINE] wrote:

I just rebased to the recent version of OsmoBTS, and noticed the following output during a voice call:

Using which osmo-bts backend/hardware?

<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!

This message means that the BTS (common part above L1SAP) didn't reeceive any
uplink measurements (i.e. measurements the BTS PHY takes itself) for those
frames that are used for the -SUB mesaurements. This is typically (at least)
the uplink SACCH. So if the message is correct, then no uplink SACCH was received
during one measurement period (SACCH multiframe, which is 102 TDMA frames long)

<0004> measurement.c:563 (bts=0,trx=0,ts=2,ss=0) No measurements for SUB!!!

[...]

and this behavior seems to continue again and again.

Note: this was observed while using OsmocomBB, which has some problems with Measurement Reports,
please see: http://lists.osmocom.org/pipermail/baseband-devel/2018-August/005572.html

This discussion you're linking to is about downlink measuremnt reports which the MS sends to the BTS.

I don't [yet?] see any connection between the two issues.

I'm assigning this to dexter, as he's been working on fixing various issues
in the uplink measurement processing of OsmoBTS recently.

#3 Updated by dexter about 1 year ago

Hello fixeria. It would be very interesting to know which codec was in use? Did you use AMR here?

#4 Updated by fixeria about 1 year ago

Hi,

sorry for so brief issue description, had no time to describe in details...

Using which osmo-bts backend/hardware?

  • FakeTRX with osmo-bts-trx
  • OsmocomBB via trxcon

This message means that the BTS (common part above L1SAP) didn't
reeceive any uplink measurements [...]

This discussion you're linking to is about downlink measuremnt
reports which the MS sends to the BTS.

Ok, thanks for the explanation!

I don't [yet?] see any connection between the two issues.

No connection, I thought it's related to the DL measurements.

It would be very interesting to know which codec was in use?
Did you use AMR here?

The problem was observed on a TCH/F channel with both FR and EFR
codecs in use. I didn't (yet) test AMR as it isn't supported by
OsmocomBB/trxcon...

Feel free to check the attached capture ;)

#5 Updated by fixeria about 1 year ago

The problem is also observed with a regular phone and OsmoTRX.

#6 Updated by dexter about 1 year ago

I am a bit puzzled here. I have now checked through measurement.c and increased the unit-test coverage. There were indeed some bugs, but none of them explains the behavior we see here. When you see the message "No measurements for SUB!!!" it means that there are incoming measurements, otherwise the calculation routine would not trigger. New is that we now can detect if we missed a SACCH interval. This means also a random measurement can trigger the computation algorithm and the computation then runs with whatever is in the buffer.

I have a few questions:
Are you sure that you have good reception so that we can assume that all frames are received and the problem is indeed a software problem? (I assume yes)
Can you increase the log level (DMEAS to debug) so that we can see at which frame numbers measurements are added?
Do you have DTX enabled?

I also tried to reproduce the problem And I think this is probably similar or equal to yours:

Wed Aug 29 15:24:10 2018 <0004> measurement.c:418 1302743/982/13/50/07 (bts=0,trx=0,ts=2,ss=0) adding measurement (is_sub=0), num_ul_meas=0
Wed Aug 29 15:24:10 2018 <0004> l1_if.c:515 1302743/982/13/50/07 RX L1 frame (bts=0,trx=0,ts=3,ss=0) fn=1302743 chan_nr=0x0b MS pwr=16dBm rssi=0.0 dBFS ber=0.00% (0/378 bits) L1_ta=0 rqd_ta=0 toa256=153
Wed Aug 29 15:24:10 2018 <0004> measurement.c:418 1302743/982/13/50/07 (bts=0,trx=0,ts=3,ss=0) adding measurement (is_sub=0), num_ul_meas=3
Wed Aug 29 15:24:10 2018 <0004> l1_if.c:515 1302747/982/17/03/11 RX L1 frame (bts=0,trx=0,ts=2,ss=0) fn=1302747 chan_nr=0x0a MS pwr=16dBm rssi=0.0 dBFS ber=0.00% (0/378 bits) L1_ta=0 rqd_ta=0 toa256=58
Wed Aug 29 15:24:10 2018 <0004> measurement.c:418 1302747/982/17/03/11 (bts=0,trx=0,ts=2,ss=0) adding measurement (is_sub=0), num_ul_meas=1
Wed Aug 29 15:24:10 2018 <0004> l1_if.c:515 1302747/982/17/03/11 RX L1 frame (bts=0,trx=0,ts=3,ss=0) fn=1302747 chan_nr=0x0b MS pwr=16dBm rssi=0.0 dBFS ber=0.00% (0/378 bits) L1_ta=0 rqd_ta=0 toa256=161
Wed Aug 29 15:24:10 2018 <0004> measurement.c:418 1302747/982/17/03/11 (bts=0,trx=0,ts=3,ss=0) adding measurement (is_sub=0), num_ul_meas=4
Wed Aug 29 15:24:10 2018 <0004> l1_if.c:515 1302751/982/21/07/15 RX L1 frame (bts=0,trx=0,ts=2,ss=0) fn=1302751 chan_nr=0x0a MS pwr=16dBm rssi=0.0 dBFS ber=0.00% (0/378 bits) L1_ta=0 rqd_ta=0 toa256=58
Wed Aug 29 15:24:10 2018 <0004> measurement.c:418 1302751/982/21/07/15 (bts=0,trx=0,ts=2,ss=0) adding measurement (is_sub=0), num_ul_meas=2
Wed Aug 29 15:24:10 2018 <0004> l1_if.c:515 1302751/982/21/07/15 RX L1 frame (bts=0,trx=0,ts=3,ss=0) fn=1302751 chan_nr=0x0b MS pwr=16dBm rssi=1.0 dBFS ber=0.00% (0/378 bits) L1_ta=0 rqd_ta=0 toa256=157
Wed Aug 29 15:24:10 2018 <0004> measurement.c:418 1302751/982/21/07/15 (bts=0,trx=0,ts=3,ss=0) adding measurement (is_sub=0), num_ul_meas=5
Wed Aug 29 15:24:10 2018 <0004> l1_if.c:515 1302756/982/00/12/20 RX L1 frame (bts=0,trx=0,ts=2,ss=0) fn=1302756 chan_nr=0x0a MS pwr=16dBm rssi=0.0 dBFS ber=0.00% (0/456 bits) L1_ta=0 rqd_ta=0 toa256=56
Wed Aug 29 15:24:10 2018 <0004> measurement.c:418 1302756/982/00/12/20 (bts=0,trx=0,ts=2,ss=0) adding measurement (is_sub=1), num_ul_meas=3
Wed Aug 29 15:24:10 2018 <0011> lapd_core.c:1556 N(S) sequence error: N(S)=1, V(R)=2 (dl=0x7fd4f2094478 state LAPD_STATE_MF_EST)
Wed Aug 29 15:24:10 2018 <0004> l1_if.c:515 1302756/982/00/12/20 RX L1 frame (bts=0,trx=0,ts=3,ss=0) fn=1302756 chan_nr=0x0b MS pwr=16dBm rssi=1.0 dBFS ber=0.00% (0/378 bits) L1_ta=0 rqd_ta=0 toa256=157
Wed Aug 29 15:24:10 2018 <0004> measurement.c:623 1302756/982/00/12/20 (bts=0,trx=0,ts=3,ss=0) measurement interval ending missed, catching up...
Wed Aug 29 15:24:10 2018 <0004> measurement.c:297 (bts=0,trx=0,ts=3,ss=0) meas period end fn:1302755, fn_mod:25, status:1, pchan:TCH/F
Wed Aug 29 15:24:10 2018 <0004> measurement.c:565 (bts=0,trx=0,ts=3,ss=0) No measurements for SUB!!!

There are two calls, FR on TS2 and TS3. I do DTX, so I expect to only a very few measurements when the voice is silent. I am also ok with the double frame numbers for the TCH data, as it is on two different timeslots, but in parallel, so it is on the same frame. What I am not ok with are the totally offset frame numbers and the SACCH measurement I get for the Idle channel. (I have identified the SUB measurements by the fact that they are 4 frame numbers apart form each other.)

== For TS2: ==
Fn:            Fn%104
1302743        39        SACCH TS2 ?
1302747        43        DTX/SUB, but in wrong position ?
1302751        47        DTX/SUB, but in wrong position ?
1302756        52        SACCH TS3, shouldn't this be Idle ?

== For TS3: ==
Fn:            Fn%104
1302743        39        SACCH TS2, shouldn't this be Idle ?
1302747        43        DTX/SUB, but in wrong position ?
1302751        47        DTX/SUB, but in wrong position ?
1302756        52        SACCH TS3

I need to investigate this further, If you have any hints for me, please let me know.

#7 Updated by fixeria about 1 year ago

Hi,

Are you sure that you have good reception

Yep. Especially in case of FakeTRX/trxcon. When I have been testing with
a regular phone, I was pretty close to the TRX, and there were no
missing bursts observed.

Do you have DTX enabled?

DTX is not supported by trxcon at the moment. Regarding to a regular
phone, I have no idea how to check it. Is it indicated anywhere?

Can you increase the log level (DMEAS to debug) so that we can see
at which frame numbers measurements are added?

Sure. Please see attached logs with DMEAS of a TCH/F call (signalling only).

P.S. Please note that this problem doesn't appear on SDCCH channels.
P.P.S. I didn't manage to grab DMEAS messages only as the logging is broken :/

#8 Updated by fixeria about 1 year ago

Hmm, BTW, regarding to the first line with ~50% BER:

l1_if.c:515 002574/01/00/24/14 RX L1 frame (bts=0,trx=0,ts=2,ss=0)
fn=2574 chan_nr=0x0a MS pwr=5dBm rssi=-70.0 dBFS ber=50.26% (190/378 bits)
L1_ta=0 rqd_ta=0 toa256=0

This is a bit unrelated, but very interesting detail, actually related
to the TCH burst interleaving. In case of TCH/F, a L2 frame (456 bits)
is interleaved over 8 bursts by taking even bits of the first 4 bursts
and odd bits of the last 4 bursts. So, each burst on TCH/F basically
carries 57 bits of one frame, and 57 bits of another.

As soon as a TCH/F logical channel is activated, the first L2 frame
is encoded and interleaved over 8 bursts, but unlike the usual case,
where each burst carries 57 bits of one frame, and 57 bits of another,
there is only one frame available, so the first 4/8 bursts will carry
the coded bits on even bits, and dummy zeros on odd bits.

This behaviour is specific to block diagonal interleaving only, which
is used for both TCH/F and TCH/H channels. And this is why we see
this ~50% BER! GSM 05.03 doesn't contain any recommendations how
to avoid this, but I have an idea.

On TCH/F a burst decoding attempt is triggered every 4 bursts, on
TCH/H - every 2 bursts (a bit more complex interleaving). I think
we can somehow skip the first (after lchan activation) decoding
attempts on TCH/F, and two first attempts on TCH/H. This would
allow to avoid such fake BER results and save some CPU resource ;)

#9 Updated by laforge about 1 year ago

On Wed, Aug 29, 2018 at 07:35:20PM +0000, fixeria [REDMINE] wrote:

DTX is not supported by trxcon at the moment. Regarding to a regular
phone, I have no idea how to check it. Is it indicated anywhere?

you will see it in the channel activation (RR for the phone, RSL for the BTS), there
are separate flags whether DTXd and DTXu are active or not.

#10 Updated by laforge about 1 year ago

On Wed, Aug 29, 2018 at 08:02:52PM +0000, fixeria [REDMINE] wrote:

On TCH/F a burst decoding attempt is triggered every 4 bursts, on
TCH/H - every 2 bursts (a bit more complex interleaving). I think
we can somehow skip the first (after lchan activation) decoding
attempts on TCH/F, and two first attempts on TCH/H. This would
allow to avoid such fake BER results and save some CPU resource ;)

The problem is that the activation on BTS side and on MS side are asynchronous
and you have no idea when the phone actually transmits the first valid burst
after the BTS has activated the channel.

#11 Updated by dexter about 1 year ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 90

I found out what went wrong. The logic that checks if the measurement of the SACCH block is causing the problem by detecting a missed interval end at the wrong position. What happens is that the measurement computation is triggered early. This generates the measurement result, sets the flags and resets the array that holds the individual measurements. Unfortunately due to another bug no RSL measurement report is triggered. Which means the results from this spurious computation run are just tossed.

Since the official end of the measurement reporting period is not reached new measurement reports are added. When the end is reached the computation runs with the data that had been collected between the spurious end and the correct end. This interval may lack SUB-Measurements and thats why we see "No measurements for SUB!!!". We will still get measurement reports in the correct intervals but with wrong results.

See also:
https://gerrit.osmocom.org/#/c/osmo-bts/+/10760 measurement: fix SACCH interval remapping in test_is_meas_overdue()
https://gerrit.osmocom.org/#/c/osmo-bts/+/10761 measurement: send measurement reports for missed interval ends
https://gerrit.osmocom.org/#/c/osmo-bts/+/10762 measurement: display fn_mod when measuremnet is added

Unfortunately gerrit #10761 causes problems with TTCN3 tests, thats why I marked it as WIP for now. The following tests do fail:
BTS_Tests.TC_rll_rel_ind_DCCH_0
BTS_Tests.TC_rll_rel_ind_DCCH_3
skipped BTS_Tests.TC_rll_unit_data_req_DCCH
skipped BTS_Tests.TC_rll_unit_data_req_ACCH
skipped BTS_Tests.TC_rll_unit_data_ind_DCCH
skipped BTS_Tests.TC_rll_unit_data_ind_ACCH

The other two should be fine, the docker tests show no regressions.

#12 Updated by dexter about 1 year ago

As suggested I have tested the patches on osmo-bts-sysmo. The results are not very promising. It looks like the timing between the voice measurements and the SACCH measurements is shifted differently here. I wonder if and how we can resolve this. Presumably every phy vendor does its own timing here. The sacch measurement timing in osmo-bts-sysmo is offset by one compared to the timing of osmo-bts-trx.

          1         2         3         4         5         6         7         8         9         0
01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
            v            v            v            v            v            v            v            v

MEASUREMENT REPORTING TCH/F TS2
             interval end  interval start
-------------------------||-----------------------------------------------------------------------------
TTTTTTTTTTTTSTTTTTTTTTTTTITTTTTTTTTTTTSTTTTTTTTTTTTITTTTTTTTTTTTSTTTTTTTTTTTTITTTTTTTTTTTTSTTTTTTTTTTTTI
M   M   M    M   M   M    M   M   M   MM   M   M    M   M   M    M   M   M    M   M   M    M   M   M
     ---    A                         .
      |     |______remaps to__________|
      |                               |
      |                               |
      |______measurement received_____|
                somewhere here?

Expected sequence: 0, 4, 8, 38, 13, 17
Sysmobts sequence: 0, 4, 38, 8, 13, 17

(see also attached logfile from osmo-bts-sysmo)

#13 Updated by dexter about 1 year ago

It looks like the currently existing approach to check for lost SACCH blocks is mismatching the reality. There is an update concerning this in #2975

#14 Updated by dexter about 1 year ago

  • Status changed from In Progress to Resolved
  • % Done changed from 90 to 100

I think we can set this to resolved. The problem particular problem has been fixed. For further information on the underlaying problem see: #2975

#15 Updated by fixeria about 1 year ago

Yep, I am happy to confirm that I don't see the error messages anymore.

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)