Bug #4927
Closed
paging related osmo-pcu ttcn3 tests have plenty of sporadic failures
Updated by fixeria over 3 years ago
Jenkins is not always telling us why a given test case did not pass, so I did some analysis.
TTCN3-centos / build 218
https://jenkins.osmocom.org/jenkins/view/TTCN3-centos/job/TTCN3-centos-pcu-test/lastBuild/
TC_paging_ps_from_sgsn_sign_ptmsi
05:04:12.675654 263 BSSGP_Emulation.ttcnpp:1132 Dynamic test case error: Sending data on the connection of port BVC to 261:BVC failed. (Broken pipe)
05:04:12.675663 260 - Final verdict of PTC: none
05:04:12.675691 263 BSSGP_Emulation.ttcnpp:1132 setverdict(error): none -> error
05:04:12.675716 263 BSSGP_Emulation.ttcnpp:1132 Performing error recovery.
TC_paging_ps_from_sgsn_ptp
05:04:22.755280 283 - Terminating component type PCUIF_Components.RAW_PCUIF_CT.
05:04:22.755288 mtc GPRS_Components.ttcn:220 Connection of port BSSGP_GLOBAL[0] to 279:GLOBAL was closed unexpectedly by the peer.
05:04:22.755298 281 BSSGP_Emulation.ttcnpp:1132 Dynamic test case error: Sending data on the connection of port BVC to 279:BVC failed. (Broken pipe)
05:04:22.755305 283 - Removing unterminated mapping between port PCU and system:PCU.
05:04:22.755312 282 - Port NSE was stopped.
05:04:22.755314 279 - Disconnected from MC.
05:04:22.755318 mtc GPRS_Components.ttcn:220 Port BSSGP_GLOBAL[0] was disconnected from 279:GLOBAL.
05:04:22.755325 280 - Terminating component type NS_Emulation.NSVC_CT.
05:04:22.755325 282 - Removing unterminated mapping between port IPL4 and system:IPL4.
05:04:22.755327 281 BSSGP_Emulation.ttcnpp:1132 setverdict(error): none -> error
05:04:22.755332 279 - TTCN-3 Parallel Test Component finished.
05:04:22.755345 281 BSSGP_Emulation.ttcnpp:1132 Performing error recovery.
TTCN3-debian / builds 698, 699, 700
https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-pcu-test/698/testReport/(root)/PCU_Tests/TC_paging_cs_from_sgsn_ptp/
https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-pcu-test/699/testReport/(root)/PCU_Tests/TC_paging_cs_from_sgsn_ptp/
https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-pcu-test/700/testReport/(root)/PCU_Tests/TC_paging_cs_from_sgsn_sign/
Failed to match Packet Paging Request:
{ ctrl := { mac_hdr := { payload_type := MAC_PT_RLCMAC_NO_OPT (1), rrbp := RRBP_Nplus13_mod_2715648 (0), rrbp_valid := false, usf := 0 }, opt := omit, payload := { msg_type := PACKET_DL_DUMMY_CTRL (37), u := { dl_dummy := { page_mode := PAGE_MODE_NORMAL (0), persistence_levels_present := '0'B, persistence_levels := omit } } } } }
vs
{ ctrl := { mac_hdr := { payload_type := MAC_PT_RLCMAC_NO_OPT (1), rrbp := ?, rrbp_valid := ?, usf := ? }, opt := *, payload := { msg_type := PACKET_PAGING_REQUEST (34), u := { paging := { page_mode := ?, persistence_levels_present := ?, persistence_levels := *, nln_present := ?, nln := *, repeated_pageinfo := *, repeated_pageinfo_term := '0'B } } } } }
PCU_Tests.ttcn:3527 PCU_Tests control part
PCU_Tests.ttcn:2391 TC_paging_cs_from_sgsn_sign testcase
This is caused by a race condition problem that I tried to fix in:
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/20461 pcu/GPRS_Components: work around a race condition in f_rx_rlcmac_dl_block()
but never had time to finish. This change makes the situation even worse :/
Updated by pespin over 3 years ago
The tests have looked more stable over the last few days; probably something was taking up resources and disrupting normal operation. Until now we were lucky that the tests were mostly passing fine on the Jenkins slave. I agree this needs to be fixed, though, but it will require some dev time.
That's indeed a known problem, which I think we agreed should be solved by moving the PCU_Tests infrastructure to use altsteps instead of functions, to be able to cope better with this kind of timing issue.
I started to play with some ideas regarding that in osmo-ttcn3-hacks.git branch "pespin/pcu-altstep" but nothing really usable yet. I'll probably need to discuss ideas with fixeria too since he's also more used to using altstep features right now.
Updated by pespin over 3 years ago
- Status changed from New to In Progress
This patch should hopefully fix most of the issues we see, which usually happen in tests when expecting an Assignment Request on PACH, because the PCUIF RTS is sent too early, before the PCU has actually received the BSSGP message we sent to it.
https://gerrit.osmocom.org/c/osmo-ttcn3-hacks/+/22313
The main problem with moving the current tests to altsteps is that BTS.receive() provides us with a PCUIF_Message matching tr_PCUIF_DATA_REQ() for all RLCMAC blocks, and then in a second step we need to call dec_RlcmacDlBlock(pcu_msg.u.data_req.data). We should instead be able to decode pcu_msg.u.data_req.data automatically and match the decoded blocks through altsteps that take templates.
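To illustrate the idea of decoding first and matching afterwards, here is a minimal sketch in Python rather than TTCN-3. It is only a model, not how osmo-ttcn3-hacks implements anything: decode_dl_block() is a toy stand-in for dec_RlcmacDlBlock() that assumes the first byte carries the message type (the real RLC/MAC encoding is different), and rx_matching_block() is a hypothetical helper. The two message type values are taken from the log above.

```python
from collections import deque

# Message type values as they appear in the failure log above.
PACKET_PAGING_REQUEST = 34
PACKET_DL_DUMMY_CTRL = 37


def decode_dl_block(data: bytes) -> dict:
    """Toy stand-in for dec_RlcmacDlBlock(); NOT the real RLC/MAC decoder.

    Assumption for illustration only: byte 0 is the message type and the
    rest is an opaque payload.
    """
    return {"msg_type": data[0], "payload": data[1:]}


def rx_matching_block(queue: deque, expected_msg_type: int, max_blocks: int = 10):
    """Decode every received block first, then match on the decoded value.

    Blocks that do not match (for example dummy control blocks) are skipped
    instead of failing the test on the first mismatch. Returns the decoded
    block, or None if nothing matched within max_blocks.
    """
    for _ in range(max_blocks):
        if not queue:
            return None
        block = decode_dl_block(queue.popleft())
        if block["msg_type"] == expected_msg_type:
            return block
    return None


# Two dummy control blocks arrive before the paging request.
rx_queue = deque([
    bytes([PACKET_DL_DUMMY_CTRL]),
    bytes([PACKET_DL_DUMMY_CTRL]),
    bytes([PACKET_PAGING_REQUEST, 0x01, 0x02]),
])
result = rx_matching_block(rx_queue, PACKET_PAGING_REQUEST)
```

With this shape, a dummy block arriving ahead of the expected Packet Paging Request no longer counts as a failure by itself; the test only fails if no matching block shows up within the allowed number of blocks.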
Updated by pespin almost 3 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
These instabilities in the paging tests are no longer showing up, closing the ticket.