Bug #6299
closedlogging: segmentation fault in _output_buf()
100%
Description
ttcn3-bts-test is broken for a few days. The Console Output indicates that something is wrong with the virtphy container:
https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bts-test/2240/console
Error response from daemon: Cannot kill container: jenkins-ttcn3-bts-test-2240-virtphy: No such container: jenkins-ttcn3-bts-test-2240-virtphy
The artifacts contain a core file:
https://jenkins.osmocom.org/jenkins/view/TTCN3/job/ttcn3-bts-test/2240/artifact/logs/virtphy/
I can reproduce the problem locally, virtphy crashes immediately after being started:
(gdb) r Starting program: /home/fixeria/projects/osmocom/bb/src/host/virt_phy/src/virtphy [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/lib/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 0x00007ffff7d31468 in ?? () from /usr/lib/libc.so.6 (gdb) bt #0 0x00007ffff7d31468 in ?? () from /usr/lib/libc.so.6 #1 0x00007ffff7d0be16 in snprintf () from /usr/lib/libc.so.6 #2 0x00007ffff7d799c7 in ?? () from /usr/lib/libc.so.6 #3 0x00007ffff7d79b2f in ctime_r () from /usr/lib/libc.so.6 #4 0x00007ffff7f67037 in _output_buf (buf=buf@entry=0x7fffffffcc90 "", buf_len=buf_len@entry=4096, target=target@entry=0x555555588140, subsys=subsys@entry=2, level=level@entry=3, file=file@entry=0x5555555643ce "virtphy.c", line=248, cont=0, format=0x5555555643e0 "Virtual physical layer starting up...\n", ap=0x7fffffffdd00) at ../../../../src/libosmocore/src/core/logging.c:520 #5 0x00007ffff7f675de in _output (target=target@entry=0x555555588140, subsys=subsys@entry=2, level=level@entry=3, file=file@entry=0x5555555643ce "virtphy.c", line=line@entry=248, cont=cont@entry=0, format=0x5555555643e0 "Virtual physical layer starting up...\n", ap=0x7fffffffdd00) at ../../../../src/libosmocore/src/core/logging.c:606 #6 0x00007ffff7f678ee in osmo_vlogp (subsys=<optimized out>, level=3, file=0x5555555643ce "virtphy.c", line=248, cont=0, format=0x5555555643e0 "Virtual physical layer starting up...\n", ap=0x7fffffffdd70) at ../../../../src/libosmocore/src/core/logging.c:698 #7 0x00007ffff7f67adf in logp2 (subsys=<optimized out>, level=<optimized out>, file=<optimized out>, line=<optimized out>, cont=<optimized out>, format=<optimized out>) at ../../../../src/libosmocore/src/core/logging.c:734 #8 0x0000555555557bd8 in main (argc=1, argv=0x7fffffffdf78) at virtphy.c:248
Related issues
Updated by fixeria 5 months ago
- Status changed from New to In Progress
- Priority changed from Normal to High
It's a regression in libosmocore, reverting this patch makes the problem go away:
commit d71331bc20dd69ed092240c079bf2ac07463675e (origin/neels/log, gerrit/neels/log) Author: Neels Hofmeyr <nhofmeyr@sysmocom.de> Date: Mon Dec 4 07:51:15 2023 +0100 logging: fix nul octets in log output / use osmo_strbuf
Looks like only specific logging configurations are being affected, the problem also goes away if I:
- either comment out
log_set_print_timestamp(osmo_stderr_target, 1)
, - or comment out
log_set_use_color(osmo_stderr_target, 0)
.
BTW, Coverity CID#336550 "Explicit null dereferenced" is exactly about a problem in _output_buf()
.
Updated by fixeria 5 months ago
- Project changed from OsmocomBB to libosmocore
- Subject changed from virtphy: segmentation fault in ms_log_init() to logging: segmentation fault in _output_buf()
- Category deleted (
virtphy)
Any other application using the logging API would also crash with this logging configuration:
log stderr logging filter all 1 logging timestamp 1 logging color 1
Moving this ticket to libosmocore.
Updated by fixeria 5 months ago
- % Done changed from 0 to 80
The problem is that we're indeed passing NULL to ctime_r()
, here is a patch:
https://gerrit.osmocom.org/c/libosmocore/+/35309 logging: fix NULL pointer dereference in _output_buf()
I also noticed that OSMO_STRBUF_REMAIN
macro is:
- safe against passing a
struct osmo_strbuf
with.buf
being NULL, but - not safe against passing a
struct osmo_strbuf
with.pos
being NULL.
So the following code:
char buf[256];
struct osmo_strbuf sb = { .buf = &buf[0], .len = sizeof(buf) };
printf("Remaining: %zu\n", OSMO_STRBUF_REMAIN(sb));
gives me the following output:
Remaining: 140733300489524
I will also propose a patch while on it.
Updated by fixeria 5 months ago
- Status changed from In Progress to Resolved
- % Done changed from 80 to 100
The fix has been merged by Neels:
https://cgit.osmocom.org/libosmocore/commit/?id=9c603e64bf553ac282af8534a2d7592f719d092f
fixeria wrote in #note-3:
I also noticed that
OSMO_STRBUF_REMAIN
macro is:
- safe against passing a
struct osmo_strbuf
with.buf
being NULL, but- not safe against passing a
struct osmo_strbuf
with.pos
being NULL.
https://gerrit.osmocom.org/c/libosmocore/+/35312 utils: fix OSMO_STRBUF_REMAIN to handle sb.pos == NULL correctly
https://gerrit.osmocom.org/c/libosmocore/+/35313 utils: improve readability of OSMO_STRBUF_CHAR_COUNT
I expect ttcn3-bts-test to recover tomorrow, marking as resolved.