Bug #3728
closed
Problem using compressed ACK/NACK bitmaps or uncompressed bitmaps without length
Added by laforge over 5 years ago. Updated almost 5 years ago.
100%
Description
The issue is with uplink ACK/NACK coding in case of using compressed bitmaps or uncompressed bitmaps without length.
This issue is reproducible: kluchnikov ran tests with FTP uploads, and after some time the uplink simply stalled until it timed out. As a workaround they forced uncompressed bitmaps with length, and uplink worked in this mode, but the other modes should be fixed.
Updated by msuraev over 5 years ago
kluchnikov, please share as many details as possible on how to reproduce this:
- config files
- how processes are started
- what has to be changed to enable the workaround
- any particular reason for choosing FTP upload (vs. HTTP, SCP, etc.) for the test case?
- how long we have to wait, and how we know the issue has been triggered (a particular pattern in the logs?)
etc.
Updated by laforge about 5 years ago
- Assignee set to lynxis
- Priority changed from Normal to Urgent
Updated by lynxis about 5 years ago
- Subject changed from Problem using compressed ACK/NACK bitmaps or uncompressed bitmaps witout length to Problem using compressed ACK/NACK bitmaps or uncompressed bitmaps without length
Updated by lynxis about 5 years ago
- Checklist item check if RLC tests are possible in our TTCN3 added
- Checklist item create a TTCN3 testplan added
Updated by lynxis about 5 years ago
- Spec Reference changed from TS 44.060 to TS 44.060 (9.1.8|12.3)
Updated by lynxis almost 5 years ago
- Checklist item deleted (create a TTCN3 testplan)
I've stopped implementing TTCN3 tests and switched over to an experiment-based approach.
So far, uploading a 2 MB file to an FTP server works. However, in my lab environment there was no packet loss, and because of this there were no compressed ACKs, even when activated.
While testing, I noticed a high retransmission statistics counter. Looking closer, the PCU is sending a single RSL package multiple times without using the window size.
Updated by laforge almost 5 years ago
Hi Lynxis,
On Sat, May 25, 2019 at 09:49:58AM +0000, lynxis [REDMINE] wrote:
> however in my lab environment, there was no packet lost and because of
> this, there were no compressed ack, even when activated.
Use a wired setup with attenuators or otherwise get close to the
sensitivity limit? We have plenty of attenuators including rotary step
attenuators that can be manually controlled.
> While testing, I've noticed a high retransmission statistics counter, while looking closer, the PCU is sending a single RSL package multiple times without using the window size.
Are you sure that's not simply the "blind pre-emptive retransmission if nothing else is
to be transmitted"? See https://osmocom.org/issues/2408 - if it is, it might be worth
first implementing a VTY option to disable that to avoid confusing you in the traces.
Updated by ipse almost 5 years ago
laforge wrote:
> Hi Lynxis,
> On Sat, May 25, 2019 at 09:49:58AM +0000, lynxis [REDMINE] wrote:
> > however in my lab environment, there was no packet lost and because of
> > this, there were no compressed ack, even when activated.
> use a wired setup with attenuators or otherwise get close to the
> sensitivity limit? We have plenty of attenuators including rotary step
> attenuators that can be manually controlled.
I'm not sure which BTS you're using for the tests but if it's osmo-bts-trx then it defaults to MCS-1 due to lack of the C/I data coming to the PCU. Try forcing it to MCS-9 to have a higher probability of packet loss - this really helped in our testing.
Updated by lynxis almost 5 years ago
kluchnikov can you try this patch? https://gerrit.osmocom.org/#/c/osmo-pcu/+/14302/
Updated by ipse almost 5 years ago
lynxis wrote:
> kluchnikov can you try this patch? https://gerrit.osmocom.org/#/c/osmo-pcu/+/14302/
Were you able to reproduce the issue on your side and test this patch?
Updated by laforge almost 5 years ago
I'm sorry for the lack of updates to this ticket. We had a discussion at the office
today and it seems the results haven't been stated here yet.
- lynxis still had problems provoking the bug, even with artificially introducing
block erasure rate by dropping mac blocks
- he did however discover "a" problem with CRRB that is related to CRRB masks
of non-integral-multiple of 8 bitmask lengths
- this problem could be reproduced and it has been confirmed fixed by his patch
So with some luck, it is the same issue as yours. However, we cannot be 100% certain.
What also came up is that we are lacking some context information, such as
- how exactly the problem could be reproduced
- specific osmo-pcu version used
- specific phone used
- any particular behavior that could trigger it
- does it appear with one MS already, or might there be contention between multiple MS
that plays into it?
If it was the same issue that was fixed: Great. If not, we have to continue digging.
However, even if it's a separate issue: The now fixed issue was a serious problem and
for sure it would also have caused problems in exactly the same area.
Regards,
Harald
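For readers unfamiliar with the class of problem mentioned above (bitmask lengths that are not an integral multiple of 8), here is a minimal, generic sketch in Python. It is not the osmo-pcu code and not the content of the patch; the function names and the MSB-first bit order are assumptions chosen for illustration. It only shows why the byte count has to be rounded up and why the padding bits in the last byte must be ignored when reading the bitmap back.

```python
def bytes_needed(num_bits):
    """Whole bytes required to hold num_bits bits (rounds up)."""
    return (num_bits + 7) // 8


def pack_bits_msb_first(bits):
    """Pack a list of 0/1 values into bytes, most significant bit first.

    When len(bits) is not a multiple of 8, the last byte is zero-padded.
    """
    out = bytearray(bytes_needed(len(bits)))
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


def unpack_bits_msb_first(data, num_bits):
    """Recover exactly num_bits bits, ignoring padding in the last byte."""
    return [(data[i // 8] >> (7 - i % 8)) & 1 for i in range(num_bits)]


# Hypothetical 11-bit bitmap: 11 // 8 == 1 would silently drop 3 bits.
bitmap = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1]
packed = pack_bits_msb_first(bitmap)
restored = unpack_bits_msb_first(packed, len(bitmap))
```

A truncating division (`num_bits // 8`) would allocate one byte here and lose the last three bits, which is the kind of error that only shows up for lengths that are not a multiple of 8.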
Updated by lynxis almost 5 years ago
By now I have a good way to reproduce the bug.
My test system now:
- drops every 13th packet
- forces compressed RBB
- forces MCS9
Without the patch
- the traffic breaks up and gets stuck indefinitely
- or, if lucky, the MS does a RACH (every 10-20 sec) and manages to send some packets
- FTP transfer aborts after some time (most likely a TCP timeout)
With the patch
- The MS still stalls for seconds, but traffic continues.
- Some RACHs (every 60 sec) still happen.
- FTP transfer completes (3.6 kb/s in 460 sec (7:40))
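The deterministic loss injection described in the test setup above (dropping every 13th packet, a loss rate of roughly 7.7%) can be sketched as follows. This is only an illustration in Python; the function name `drop_every_nth` is hypothetical, and the thread does not say where or how the drop was actually implemented in the test system.

```python
def drop_every_nth(packets, n=13):
    """Yield packets in order, discarding every n-th one (13th, 26th, ...)."""
    for index, packet in enumerate(packets, start=1):
        if index % n == 0:
            continue  # simulated loss: this packet never reaches the receiver
        yield packet


# Hypothetical stream of 39 numbered packets; 13, 26 and 39 are dropped.
sent = list(range(1, 40))
received = list(drop_every_nth(sent))
```

A fixed drop pattern like this makes runs comparable with and without a patch, which random loss from attenuators does not guarantee.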
Updated by lynxis almost 5 years ago
- % Done changed from 10 to 70
kluchnikov, did you have time to test the patch?
Updated by ipse almost 5 years ago
We haven't tested this yet but since you tested and it works for you, it's already a good sign.
Were you able to test uncompressed bitmap without length as well? It wasn't working for us.
Updated by lynxis almost 5 years ago
- Related to Bug #4052: Test uncompressed bitmaps without length added
Updated by lynxis almost 5 years ago
> Were you able to test uncompressed bitmap without length as well? It wasn't working for us.
No. I've created another ticket for that issue (#4052). I'll test it and write my results into the new issue.
Updated by lynxis almost 5 years ago
- Status changed from In Progress to Closed
- % Done changed from 70 to 100
The compressed bitmaps have been fixed in https://gerrit.osmocom.org/c/osmo-pcu/+/14302