Bug #4681

closed

>= 100 BTS_Tests.ttcn failures / regressions since July 23rd

Added by laforge over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Immediate
Assignee:
Category: -
Target version: -
Start date: 07/26/2020
Due date:
% Done: 100%
Spec Reference:

Description

Since July 23rd, almost all of our tests have been failing due to a regression (100 new failures from July 22nd to 23rd): https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bts-test/963/#showFailuresLink

Regressions of this scale are not acceptable at all, particularly if they remain unresolved for several days in a row.


Files

junit-xml-19.log (21.9 KB) fixeria, 07/26/2020 09:00 PM
Actions #1

Updated by laforge over 3 years ago

  • Assignee changed from 4368 to pespin

Test failures are quite different from case to case:
  • "BTS_Tests.ttcn:2388 : No MEAS RES received at all"
  • "BTS_Tests.ttcn:233 : Timeout waiting for RSL bring up"
  • Timeout waiting for { rsp := { verb := "FAKE_TOA", status := ?, params := * } } on port BTS_TRXC

I currently only see patches from pespin merged around this timeframe. Please investigate ASAP and revert the changes if the issues cannot be resolved quickly.

Actions #2

Updated by fixeria over 3 years ago

Hi all,

It could potentially be related to [1], but after looking at [2]:

Traceback (most recent call last):
  File "/tmp/osmocom-bb/src/target/trx_toolkit/fake_trx.py", line 543, in <module>
    app.run()
  File "/tmp/osmocom-bb/src/target/trx_toolkit/fake_trx.py", line 461, in run
    self.burst_fwd.forward_msg(trx, msg)
  File "/tmp/osmocom-bb/src/target/trx_toolkit/burst_fwd.py", line 70, in forward_msg
    trx.handle_data_msg(src_trx, rx_msg, tx_msg)
  File "/tmp/osmocom-bb/src/target/trx_toolkit/fake_trx.py", line 255, in handle_data_msg
    Transceiver.handle_data_msg(self, msg)
  File "/tmp/osmocom-bb/src/target/trx_toolkit/transceiver.py", line 281, in handle_data_msg
    self.data_if.send_msg(msg, legacy = True)
  File "/tmp/osmocom-bb/src/target/trx_toolkit/data_if.py", line 109, in send_msg
    msg.validate()
  File "/tmp/osmocom-bb/src/target/trx_toolkit/data_msg.py", line 597, in validate
    raise ValueError("RSSI %d is out of range" % self.rssi)
ValueError: RSSI -122 is out of range

I would not think so. Moreover, I've tested [1] locally before submitting.

ValueError: RSSI -122 is out of range

So fake_trx.py crashes due to an out-of-range RSSI value. RSSI has nothing to do with my recent refactoring changes, so I am pretty sure it would have crashed before them as well. I'll prepare a patch to handle such errors properly. It would still be good to investigate where this RSSI value is coming from, though.

[1] https://git.osmocom.org/osmocom-bb/commit/?id=d4ed09df57b3461470af501e9687ddd80eb78838
[2] https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bts-test/966/artifact/logs/fake_trx/

Actions #3

Updated by fixeria over 3 years ago

  • Status changed from New to In Progress
  • Assignee changed from pespin to fixeria
  • % Done changed from 0 to 10

I've managed to reproduce the crash on my machine, and it seems to be related to power ramping. The container with fake_trx.py dies after BTS_Tests.TC_tx_power_ramp_adm_state_change has finished and BTS_Tests.TC_rsl_bs_pwr_static_ass has started. Ramping is a relatively new feature, and some new changes were merged to osmo-bts recently, so I assume that's why we did not hit this problem before.

Actions #4

Updated by fixeria over 3 years ago

  • Status changed from In Progress to Feedback
  • % Done changed from 10 to 80

The crash should be fixed now, waiting for code review:

https://gerrit.osmocom.org/c/osmocom-bb/+/19400 trx_toolkit/data_if.py: do not validate TRXD message twice
https://gerrit.osmocom.org/c/osmocom-bb/+/19401 trx_toolkit/data_if.py: fix: handle encoding exceptions
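
For readers unfamiliar with the failure mode, here is a minimal sketch of the idea behind the two patches above: validate a message once while encoding it, and catch the resulting exception in the send path instead of letting it terminate the process. This is not the actual trx_toolkit code. The class `SketchMsg`, the function `send_msg`, the one-byte encoding, and the `RSSI_MIN` / `RSSI_MAX` limits are all assumptions made for illustration.

```python
import logging

log = logging.getLogger("data_if_sketch")

# Assumed limits, for illustration only; the real constants are
# defined in trx_toolkit/data_msg.py and may differ.
RSSI_MIN = -120
RSSI_MAX = -50


class SketchMsg:
    """Hypothetical stand-in for a TRXD message carrying an RSSI value."""

    def __init__(self, rssi):
        self.rssi = rssi

    def validate(self):
        # Same error text as in the traceback from comment #2
        if not RSSI_MIN <= self.rssi <= RSSI_MAX:
            raise ValueError("RSSI %d is out of range" % self.rssi)

    def encode(self):
        # Validate exactly once, as part of encoding
        self.validate()
        # Toy encoding: RSSI magnitude as a single byte
        return bytes([-self.rssi])


def send_msg(msg, sink):
    """Encode msg and append it to sink; drop it if encoding fails."""
    try:
        payload = msg.encode()
    except ValueError as e:
        # Log and drop the message instead of crashing the whole process
        log.error("Failed to encode a message, dropping it: %s", e)
        return False
    sink.append(payload)
    return True
```

With this structure, a message with RSSI -122 is logged and dropped, while in-range messages are still forwarded.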

Actions #5

Updated by fixeria over 3 years ago

  • % Done changed from 80 to 90

The crash should be fixed now, waiting for code review: [...]

I tested the fix on my machine (in Docker), and fake_trx.py now survives power ramping (yay!).
I've just merged it upstream; let's wait for a new build on Jenkins (next morning).

Actions #7

Updated by fixeria over 3 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 90 to 100