Bug #4463

closed

osmo-pcu crash after re-enabling MS RA capabilities parsing from SGSN messages

Added by pespin about 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
Start date: 03/20/2020
Due date:
% Done: 100%
Spec Reference:

Description

Today I was running a network setup with osmo-pcu on my laptop with 2 mobile phones registering, and osmo-pcu crashed.

It seems related to the RA Cap messages coming from osmo-sgsn, whose parsing we recently enabled in osmo-pcu.

<000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:321 NSVCI=65534 Creating NS-VC with Signal weight 1, Data weight 1
20200320204116517 DLGLOBAL <000e> /home/pespin/dev/sysmocom/git/libosmocore/src/vty/telnet_interface.c:104 Available via telnet 127.0.0.1 4240
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/osmobts_sock.cpp:211 Opening OsmoPCU L1 interface to OsmoBTS
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/osmobts_sock.cpp:229 osmo-bts PCU socket /tmp/pcu_bts has been connected
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:136 Sending 0.8.0.81-570f TXT as PCU_VERSION to BTS
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:501 BTS available
20200320204116517 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:2070 Listening for nsip packets from 192.168.30.1:23000 on 0.0.0.0:23020
20200320204116517 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:2094 NS UDP socket at 0.0.0.0:23020
20200320204116517 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:321 NSVCI=1800 Creating NS-VC with Signal weight 1, Data weight 1
20200320204116517 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:2113 NSEI=1800 RESET procedure based on API request
20200320204116517 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:559 NSEI=1800 Tx NS RESET (NSVCI=1800, cause=O&M intervention)
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:148 Sending activate request: trx=0 ts=6
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:627 PDCH: trx=0 ts=6
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:148 Sending activate request: trx=0 ts=7
20200320204116517 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:627 PDCH: trx=0 ts=7
20200320204116518 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:1354 NSVCI=1800 Rx NS RESET ACK (NSEI=1800, NSVCI=1800)
20200320204116518 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:704 NSEI=1800 Tx NS UNBLOCK (NSVCI=1800)
20200320204116518 DNS <000b> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_ns.c:1806 NSEI=1800 Rx NS UNBLOCK ACK
20200320204116518 DPCU <000d> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:576 NS-VC 1800 is unblocked.
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:857 Sending reset on BVCI 0
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_bssgp_bss.c:300 BSSGP (BVCI=0) Tx BVC-RESET CAUSE=O&M intervention
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:323 Rx BSSGP BVCI=0 (SIGN) BVC_RESET_ACK
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:865 Sending reset on BVCI 1800
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_bssgp_bss.c:300 BSSGP (BVCI=1800) Tx BVC-RESET CAUSE=O&M intervention
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:323 Rx BSSGP BVCI=0 (SIGN) BVC_RESET_ACK
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:874 Sending unblock on BVCI 1800
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/libosmocore/src/gb/gprs_bssgp_bss.c:281 BSSGP (BVCI=1800) Tx BVC-UNBLOCK
20200320204116518 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:337 Rx BSSGP BVCI=0 (SIGN) BVC_UNBLOCK_ACK
20200320204531628 DL1IF <0001> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pcu_l1_if.cpp:442 RACH request received: sapi=1 qta=-1, ra=118, fn=1307419, cur_fn=1307423, is_11bit=0
20200320204532025 DCSN1 <0000> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gsm_rlcmac.cpp:5026 csnStreamDecoder (type=5):
20200320204532025 DRLCMAC <0002> /home/pespin/dev/sysmocom/git/osmo-pcu/src/pdch.cpp:609 MS supports EGPRS multislot class 12.
20200320204532025 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:992 Allocating UL TBF: MS_CLASS=12/12
20200320204532026 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:541 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL) Setting Control TS 6
20200320204532026 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:948 TBF(TFI=0 TLLI=0x00000000 DIR=UL STATE=NULL) Allocated: trx = 0, ul_slots = 40, dl_slots = 00
20200320204532048 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:1374 TBF(TFI=0 TLLI=0x8faaadbd DIR=UL STATE=ASSIGN) start Packet Uplink Assignment (PACCH)
20200320204532048 DCSN1 <0000> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gsm_rlcmac.cpp:5185 csnStreamDecoder (type=10):
20200320204532048 DTBFDL <0009> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:782 TBF(TFI=0 TLLI=0x8faaadbd DIR=UL STATE=ASSIGN) Scheduled UL Assignment polling on PACCH (FN=1307553, TS=7)
20200320204532264 DCSN1 <0000> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gsm_rlcmac.cpp:5026 csnStreamDecoder (type=1):
20200320204532264 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:544 TBF(TFI=0 TLLI=0x8faaadbd DIR=UL STATE=FLOW) Changing Control TS 6
20200320204532481 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf_ul.cpp:404 LLC [PCU -> SGSN] TBF(TFI=0 TLLI=0x8faaadbd DIR=UL STATE=FLOW) len=52
20200320204532482 DCSN1 <0000> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gsm_rlcmac.cpp:5792 csnStreamDecoder (RAcap):
20200320204532482 DRLCMACDATA <0003> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gsm_rlcmac.cpp:5800 Got 7 remaining bits unhandled by decoder at the end of bitvec
20200320204532482 DBSSGP <000c> /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:163 LLC [SGSN -> PCU] = TLLI: 0x8faaadbd IMSI: 000 len: 9
20200320204532482 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:1071 Allocating DL TBF: MS_CLASS=12/12
20200320204532482 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:541 TBF(TFI=0 TLLI=0x00000000 DIR=DL STATE=NULL) Setting Control TS 6
20200320204532482 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/tbf.cpp:948 TBF(TFI=0 TLLI=0x8faaadbd DIR=DL STATE=NULL) Allocated: trx = 0, ul_slots = 40, dl_slots = 40
20200320204532482 DTBF <0008> /home/pespin/dev/sysmocom/git/osmo-pcu/src/bts.cpp:898 TBF(TFI=0 TLLI=0x8faaadbd DIR=DL STATE=ASSIGN) TX: START Immediate Assignment Downlink (PCH)
*** stack smashing detected ***: terminated

Program received signal SIGABRT, Aborted.
0x00007ffff77b7ce5 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff77b7ce5 in raise () from /usr/lib/libc.so.6
#1  0x00007ffff77a1857 in abort () from /usr/lib/libc.so.6
#2  0x00007ffff77fb2b0 in __libc_message () from /usr/lib/libc.so.6
#3  0x00007ffff788b06a in __fortify_fail () from /usr/lib/libc.so.6
#4  0x00007ffff788b034 in __stack_chk_fail () from /usr/lib/libc.so.6
#5  0x0000555555581e4f in gprs_bssgp_pcu_rx_dl_ud (msg=0x55555572fce0,
    tp=0x7fffffffbc80)
    at /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:167
#6  0x0000555500000000 in ?? ()
#7  0x00007ffff7f6cf40 in ?? ()
   from /home/pespin/dev/sysmocom/build/new/out/lib/libosmogsm.so.13
#8  0x000055555572e6d0 in ?? ()
#9  0x00007fffffffbc80 in ?? ()
#10 0x000055555572fce0 in ?? ()
#11 0x00000000ffffbc30 in ?? ()
#12 0x0000070800000000 in ?? ()
#13 0x000055555572fd80 in ?? ()
#14 0x460dab82121f6200 in ?? ()
#15 0x000055555565d380 in ?? ()
#16 0x00005555556aced0 in ?? ()
#17 0x00007fffffffcca0 in ?? ()
#18 0x000055555558303c in gprs_bssgp_pcu_rcvmsg (
    msg=<error reading variable: Cannot access memory at address 0xabd8>)
    at /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:465
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) l
173                     quit = 1;
174                     break;
175             case SIGABRT:
176                     /* in case of abort, we want to obtain a talloc report
177                      * and then return to the caller, who will abort the process
178                      */
179             case SIGUSR1:
180             case SIGUSR2:
181                     talloc_report_full(tall_pcu_ctx, stderr);
182                     break;
(gdb) frame 5
#5  0x0000555555581e4f in gprs_bssgp_pcu_rx_dl_ud (msg=0x55555572fce0,
    tp=0x7fffffffbc80)
    at /home/pespin/dev/sysmocom/git/osmo-pcu/src/gprs_bssgp_pcu.cpp:167
167     }
(gdb) l
162
163             LOGP(DBSSGP, LOGL_INFO, "LLC [SGSN -> PCU] = TLLI: 0x%08x IMSI: %s len: %d\n", tlli, imsi, len);
164
165             return gprs_rlcmac_dl_tbf::handle(the_pcu.bts, tlli, tlli_old, imsi,
166                             ms_class, egprs_ms_class, delay_csec, data, len);
167     }
168
169     static int gprs_bssgp_pcu_rx_paging_cs(struct msgb *msg, struct tlv_parsed *tp)
170     {
171             const uint8_t *mi;

Files

crashing_packets.pcapng (1000 Bytes) pespin, 03/20/2020 07:59 PM
Actions #1

Updated by pespin about 4 years ago

Actions #2

Updated by pespin about 4 years ago

  • Subject changed from osmo-pcu to osmo-pcu crash after re-enabling MS RA capabilities parsing from SGSN messages
Actions #3

Updated by pespin about 4 years ago

Copied the content of the RA Cap field to a unit test and I can reproduce in there the same stack smashing seen in osmo-pcu:
https://gerrit.osmocom.org/c/osmo-pcu/+/17548 RLCMACTest: Reproduce stack smashing bug

Actions #4

Updated by fixeria about 4 years ago

==908769==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdccbd3654 at pc 0x55acc4386ee3 bp 0x7ffdccbd2e60 sp 0x7ffdccbd2e50                              
WRITE of size 1 at 0x7ffdccbd3654 thread T0                                                                                                                              
    #0 0x55acc4386ee2 in csnStreamDecoder /home/wmn/wmn/osmocom/osmo-pcu/src/csn1.c:511                                                                                  
    #1 0x55acc4390264 in csnStreamDecoder /home/wmn/wmn/osmocom/osmo-pcu/src/csn1.c:1361                                                                                 
    #2 0x55acc43679f5 in decode_gsm_ra_cap(bitvec*, MS_Radio_Access_capability_t*) /home/wmn/wmn/osmocom/osmo-pcu/src/gsm_rlcmac.cpp:5793                                
    #3 0x55acc435da46 in testRAcap2(void*) rlcmac/RLCMACTest.cpp:409                                                                                                     
    #4 0x55acc435dd8b in main rlcmac/RLCMACTest.cpp:439                                                                                                                  
    #5 0x7f80c0a88022 in __libc_start_main (/usr/lib/libc.so.6+0x27022)                                                                                                  
    #6 0x55acc43535ed in _start (/home/wmn/wmn/osmocom/osmo-pcu/tests/rlcmac/RLCMACTest+0xa45ed)                                                                         

Address 0x7ffdccbd3654 is located in stack of thread T0 at offset 180 in frame                                                                                           
    #0 0x55acc435d928 in testRAcap2(void*) rlcmac/RLCMACTest.cpp:284                                                                                                     

  This frame has 1 object(s):
    [32, 180) 'data' (line 286) <== Memory access at offset 180 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/wmn/wmn/osmocom/osmo-pcu/src/csn1.c:511 in csnStreamDecoder
Shadow bytes around the buggy address:
  0x100039972670: f2 f2 00 04 f2 f2 00 04 f2 f2 00 00 00 00 00 00
  0x100039972680: 00 00 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00
  0x100039972690: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1
  0x1000399726a0: 04 f2 00 04 f3 f3 00 00 00 00 00 00 00 00 00 00
  0x1000399726b0: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00
=>0x1000399726c0: 00 00 00 00 00 00 00 00 00 00[04]f3 f3 f3 f3 f3
  0x1000399726d0: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000399726e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000399726f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100039972700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100039972710: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==908769==ABORTING
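
The shadow-byte dump above can be cross-checked by hand. The following is a minimal sketch, assuming the default ASan mapping on x86_64 Linux (shadow = (addr >> 3) + 0x7fff8000); the helper names are hypothetical and are not part of osmo-pcu or of the sanitizer runtime. It shows why the 148-byte 'data' object ([32, 180) in the frame) ends in a shadow byte of 04, and why a 1-byte write at 0x7ffdccbd3654 is flagged.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default ASan shadow mapping on x86_64 Linux. */
#define ASAN_SHADOW_OFFSET 0x7fff8000ULL
#define ASAN_GRANULE 8u

/* Hypothetical helper: shadow address covering application address addr. */
uint64_t asan_shadow_addr(uint64_t addr)
{
	return (addr >> 3) + ASAN_SHADOW_OFFSET;
}

/* Hypothetical helper: shadow byte of the last granule of an object that
 * starts on an 8-byte boundary. 0 means the whole granule is addressable,
 * k in 1..7 means only the first k bytes are. */
uint8_t asan_last_shadow_byte(unsigned obj_size)
{
	return (uint8_t)(obj_size % ASAN_GRANULE);
}

/* Hypothetical helper: returns 1 if an access of size bytes at addr would
 * be reported, given the value of the shadow byte covering it. */
int asan_access_is_bad(uint64_t addr, unsigned size, uint8_t shadow)
{
	assert(size >= 1 && size <= ASAN_GRANULE);
	if (shadow == 0)
		return 0;	/* all 8 bytes addressable */
	if (shadow >= 0x80)
		return 1;	/* redzone markers such as f1, f2, f3 */
	/* partially addressable granule: only the first `shadow` bytes are valid */
	return (addr & 7) + size - 1 >= shadow;
}
```

With the values from the report, asan_shadow_addr(0x7ffdccbd3654) gives 0x1000399726ca, which is the bracketed [04] byte in the row starting at 0x1000399726c0. The object size 180 - 32 = 148 leaves 4 valid bytes in its last granule, and the faulting address sits at byte 4 of that granule, one past the end of 'data'.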
Actions #5

Updated by pespin about 4 years ago

I updated the patch with the fix for it.

TODO:
  • test it with osmo-pcu and the same real phone
  • send a similar patch for the wireshark rlcmac part.
Actions #6

Updated by pespin about 4 years ago

fixeria it's now fixed, but I'm wondering why you get clear output from ASan while I don't. Perhaps because you use clang?

Actions #7

Updated by fixeria about 4 years ago

> fixeria it's now fixed

I came up with a similar fix, but you were faster :D

> I'm wondering why you get clear output from ASan while I don't. Perhaps because you use clang?

Nope, I am using GCC. Here is my build configuration:

$ gcc -v
gcc version 9.3.0 (Arch Linux 9.3.0-1)

$ ./configure --enable-sanitize CFLAGS="-O0 -g" CXXFLAGS="-O0 -g" 
Actions #8

Updated by pespin about 4 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

Fixed by commits 81b40cbaf3070f70954663f68375100128bdc77e..e50ce6e45c4509805807d599cadf1a1b23d37f63.

Actions #9

Updated by pespin about 4 years ago

  • Status changed from Resolved to In Progress
  • % Done changed from 100 to 90

Actually, keeping it open since I need to port those patches to wireshark.

Actions #10

Updated by pespin about 4 years ago

Ports to wireshark.git submitted here:
remote: https://code.wireshark.org/review/36571 rlcmac: Don't pass array element to CSN1 descriptors
remote: https://code.wireshark.org/review/36572 csn1: Validate recursive array max size during decoding
remote: https://code.wireshark.org/review/36573 rlcmac: Fix bug receiving RA cap
remote: https://code.wireshark.org/review/36574 rlcmac: Introduce MS Radio Access Capabilities 2 to fix related spare bits
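
The patch title "csn1: Validate recursive array max size during decoding" names a general technique, which can be sketched as follows. This is an illustration only, not the code from csn1.c or from the wireshark patches: the bit reader, the function names, the error codes and the 4-bit element width are all hypothetical. It decodes a list encoded as repeated "1 <element>" groups terminated by a "0" bit, and refuses to store more elements than the destination array can hold, instead of trusting the sender.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader; NOT the libosmocore bitvec API. */
struct bit_reader {
	const uint8_t *buf;
	size_t len_bits;	/* number of valid bits in buf */
	size_t pos;		/* next bit to read */
};

#define ERR_TRUNCATED (-1)	/* stream ended in the middle of a field */
#define ERR_TOO_MANY  (-2)	/* more elements than the destination holds */

/* Read n bits (n <= 8), MSB first. Returns -1 if the stream is too short. */
static int read_bits(struct bit_reader *br, unsigned n)
{
	int val = 0;

	assert(n <= 8);
	if (n > br->len_bits - br->pos)
		return -1;
	while (n--) {
		int bit = (br->buf[br->pos / 8] >> (7 - br->pos % 8)) & 1;
		val = (val << 1) | bit;
		br->pos++;
	}
	return val;
}

/* Decode { 1 <4-bit element> } ** 0 into out[0..max_elems-1].
 * Returns the number of elements stored, or a negative error code. */
int decode_recursive_array(struct bit_reader *br, uint8_t *out, size_t max_elems)
{
	size_t count = 0;

	for (;;) {
		int more = read_bits(br, 1);
		if (more < 0)
			return ERR_TRUNCATED;
		if (more == 0)
			return (int)count;	/* terminator bit: list complete */
		/* The bound check happens before the write, so a message that
		 * announces too many elements cannot run past the array. */
		if (count >= max_elems)
			return ERR_TOO_MANY;
		int val = read_bits(br, 4);
		if (val < 0)
			return ERR_TRUNCATED;
		out[count++] = (uint8_t)val;
	}
}
```

For example, the bytes 0x9D 0x40 encode the two elements 3 and 5; decoding them into a one-element array returns ERR_TOO_MANY rather than writing the second element.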

Actions #11

Updated by pespin about 4 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 90 to 100

Wireshark commits merged, closing the ticket.
