Bug #3728

Problem using compressed ACK/NACK bitmaps or uncompressed bitmaps without length

Added by laforge 10 months ago. Updated 3 months ago.

Status:
Closed
Priority:
Urgent
Assignee:
Target version:
-
Start date:
12/13/2018
Due date:
% Done:

100%

Spec Reference:
TS 44.060 (9.1.8|12.3)

Description

The issue is with uplink ACK/NACK encoding when using compressed bitmaps or uncompressed bitmaps without length.

This issue is reproducible: kluchnikov ran tests with FTP uploads, and after some time the uplink stalled until the transfer timed out. As a workaround they forced uncompressed bitmaps with length, and uplink worked in this mode, but the other modes should be fixed.


Checklist

  • check if RLC tests are possible in our TTCN3

Related issues

Related to OsmoPCU - Bug #4052: Test uncompressed bitmaps without length (Closed, 06/07/2019)

History

#1 Updated by msuraev 10 months ago

kluchnikov please share as many details as possible on how to reproduce this:
- config files
- how processes are started
- what has to be changed to enable the workaround
- any particular reason for choosing FTP upload (vs. HTTP, SCP etc.) for the test case?
- how long do we have to wait, and how do we know the issue is triggered (a particular pattern in the logs?)
etc.

#2 Updated by laforge 6 months ago

  • Assignee set to lynxis
  • Priority changed from Normal to Urgent

#3 Updated by lynxis 6 months ago

  • Status changed from New to In Progress

#4 Updated by lynxis 6 months ago

  • Subject changed from Problem using compressed ACK/NACK bitmaps or uncompressed bitmaps witout length to Problem using compressed ACK/NACK bitmaps or uncompressed bitmaps without length

#5 Updated by lynxis 6 months ago

  • Spec Reference set to TS 44.060

#6 Updated by lynxis 6 months ago

  • Checklist item check if RLC tests are possible in our TTCN3 added
  • Checklist item create a TTCN3 testplan added

#7 Updated by lynxis 6 months ago

  • Spec Reference changed from TS 44.060 to TS 44.060 (9.1.8|12.3)

#8 Updated by lynxis 5 months ago

  • Checklist item deleted (create a TTCN3 testplan)

I've stopped implementing TTCN3 tests and switched over to an experiment-based approach.

So far, uploading a 2 MB file to an FTP server works. However, in my lab environment no packets were lost, and because of this there were no compressed ACKs, even when activated.
While testing, I noticed a high retransmission statistics counter. Looking closer, the PCU is sending a single RSL package multiple times without using the window size.

#9 Updated by laforge 5 months ago

Hi Lynxis,

On Sat, May 25, 2019 at 09:49:58AM +0000, lynxis [REDMINE] wrote:

however in my lab environment, there was no packet lost and because of
this, there were no compressed ack, even when activated.

use a wired setup with attenuators or otherwise get close to the
sensitivity limit? We have plenty of attenuators including rotary step
attenuators that can be manually controlled.

While testing, I've noticed a high retransmission statistics counter, while looking closer, the PCU is sending a single RSL package multiple times without using the window size.

are you sure that's not simply the "blind pre-emptive retransmission if nothing else is
to be transmitted"? See https://osmocom.org/issues/2408 - if it is, it might be worth
first implementing a VTY option to disable that to avoid confusing you in the traces.
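
To illustrate why such a retransmission counter can climb even without any packet loss, here is a minimal sketch of a "resend when idle" policy. It is not osmo-pcu code: `pick_next_block`, the queue names and the `stats` dictionary are all hypothetical and only model the idea described above.

```python
from collections import deque

def pick_next_block(new_blocks, unacked, stats):
    """Choose the next block to send (hypothetical scheduler, not osmo-pcu code)."""
    if new_blocks:
        # Fresh data is available: send it and remember it as unacknowledged.
        bsn = new_blocks.popleft()
        unacked.append(bsn)
        return bsn
    if unacked:
        # Nothing new to send: blindly resend the oldest unacknowledged block
        # and move it to the back so the blocks are cycled through.
        bsn = unacked[0]
        unacked.rotate(-1)
        stats["retransmissions"] += 1
        return bsn
    return None

new_blocks = deque([0, 1])
unacked = deque()
stats = {"retransmissions": 0}
sent = [pick_next_block(new_blocks, unacked, stats) for _ in range(5)]
print(sent, stats)
```

In this model two new blocks are followed by three resends, so the counter rises although nothing was lost. A switch that disables the second branch would keep such resends out of the traces.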

#10 Updated by ipse 5 months ago

laforge wrote:

Hi Lynxis,

On Sat, May 25, 2019 at 09:49:58AM +0000, lynxis [REDMINE] wrote:

however in my lab environment, there was no packet lost and because of
this, there were no compressed ack, even when activated.

use a wired setup with attenuators or otherwise get close to the
sensitivity limit? We have plenty of attenuators including rotary step
attenuators that can be manually controlled.

I'm not sure which BTS you're using for the tests but if it's osmo-bts-trx then it defaults to MCS-1 due to lack of the C/I data coming to the PCU. Try forcing it to MCS-9 to have a higher probability of packet loss - this really helped in our testing.

#11 Updated by lynxis 5 months ago

laforge I'll check whether this is a "blind pre-emptive retransmission" and add a VTY option for it.

Try forcing it to MCS-9 to have a higher probability of packet loss - this really helped in our testing.

ipse Thanks, good idea.

#13 Updated by lynxis 5 months ago

  • % Done changed from 0 to 10

#14 Updated by ipse 5 months ago

lynxis wrote:

kluchnikov can you try this patch? https://gerrit.osmocom.org/#/c/osmo-pcu/+/14302/

Were you able to reproduce the issue on your side and test this patch?

#15 Updated by laforge 5 months ago

I'm sorry for the lack of updates to this ticket. We had a discussion at the office
today and it seems the results haven't been stated here yet.

  • lynxis still had problems provoking the bug, even when artificially introducing
    a block erasure rate by dropping MAC blocks
  • he did however discover "a" problem with CRBB that is related to CRBB masks
    whose bitmask length is not an integral multiple of 8
  • this problem could be reproduced and it has been confirmed fixed by his patch
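
The ticket does not spell out the exact defect, so the following is only a generic sketch of the kind of pitfall that bitmask lengths that are not a multiple of 8 invite: the last, partially filled byte must be kept and the true bit count must travel with the data. It does not reproduce the actual patch; `pack_bits` and `unpack_bits` are hypothetical helpers.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values MSB-first into bytes, zero-padding the last byte."""
    # Round the byte count up; len(bits) // 8 would silently drop the tail bits.
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out), len(bits)

def unpack_bits(data, nbits):
    """Recover exactly nbits values, ignoring the padding in the last byte."""
    return [(data[i // 8] >> (7 - i % 8)) & 1 for i in range(nbits)]

bitmap = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1]  # 11 bits, not a multiple of 8
packed, nbits = pack_bits(bitmap)
print(packed.hex(), nbits)
```

Round-tripping through `unpack_bits(packed, nbits)` returns the original 11 bits. Code that assumes whole bytes would either lose the last three bits or treat the padding as real acknowledgement state.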

So with some luck, it is the same issue as yours. However, we cannot be 100% certain.

What also came up is that we are lacking some context information, such as
  • how exactly the problem could be reproduced
  • specific osmo-pcu version used
  • specific phone used
  • any particular behavior that could trigger it
  • does it appear with one MS already, or might there be contention between multiple MS
    that plays into it?

If it was the same issue that was fixed: Great. If not, we have to continue digging.
However, even if it's a separate issue: The now fixed issue was a serious problem and
for sure it would also have caused problems in exactly the same area.

Regards,
Harald

#16 Updated by lynxis 5 months ago

I now have a good way to reproduce the bug.
My test system now does the following:

  • lose every 13th packet
  • force compressed RBB
  • force MCS9
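
The deterministic loss pattern from the list above can be sketched as follows. The comment does not say where in the stack the packets are dropped, so `drop_every_nth` is a hypothetical helper that only models the pattern; it is not the code used in the test system.

```python
def drop_every_nth(blocks, n=13):
    """Yield blocks, dropping every n-th one (counting from 1)."""
    for count, block in enumerate(blocks, start=1):
        if count % n == 0:
            continue  # simulate a lost block
        yield block

received = list(drop_every_nth(range(1, 40)))
lost = sorted(set(range(1, 40)) - set(received))
print(lost)
```

Dropping one block in 13 corresponds to a loss rate of roughly 7.7 %, which is enough to make the receiver report gaps in its bitmap regularly.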

Without the patch

  • the traffic breaks up and gets stuck indefinitely
  • or, if lucky, the MS does a RACH (every 10-20 sec) and manages to send some packets
  • ftp transfer aborts after some time (most likely tcp timeout)

With the patch

  • The MS still stalls for seconds, but continues the traffic.
  • Some RACH (every 60 sec) are still happening.
  • ftp transfer completes (3.6 kb/s in 460 sec (7:40))

#17 Updated by lynxis 5 months ago

  • % Done changed from 10 to 70

kluchnikov did you have time to test the patch?

#18 Updated by ipse 5 months ago

We haven't tested this yet but since you tested and it works for you, it's already a good sign.

Were you able to test uncompressed bitmap without length as well? It wasn't working for us.

#19 Updated by lynxis 5 months ago

  • Related to Bug #4052: Test uncompressed bitmaps without length added

#20 Updated by lynxis 5 months ago

Were you able to test uncompressed bitmap without length as well? It wasn't working for us.

No. I've created another ticket for that issue (#4052). I'll test it and write my results into the new issue.

#21 Updated by lynxis 3 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 70 to 100

The compressed bitmap handling has been fixed in https://gerrit.osmocom.org/c/osmo-pcu/+/14302
