Bug #1733 (closed)
nat: Memory leak in osmo-bsc_nat?
100%
Description
The osmo-bsc_nat process was selected to be killed, but it is not clear that it was the process consuming too much memory (it might just have been the unlucky one asking for memory). Look into a possible memory leak. Right now I can think of only two places that changed: Osmux and my new token-based auth (which is not used, but its memory might not be freed).
It currently sits at 10MB of resident memory. Let's have a look in a bit.
Updated by zecke almost 8 years ago
-full talloc report on 'nat' (total 5366856 bytes in 33246 blocks)
+full talloc report on 'nat' (total 5525325 bytes in 33358 blocks)
     telnet_connection contains 1 bytes in 1 blocks (ref 0) 0x11dcc60
-    struct bsc_nat contains 3851016 bytes in 32875 blocks (ref 0) 0x1143a60
-        struct nat_sccp_connection contains 128 bytes in 2 blocks (ref 0) 0x1a58ad0
-            XXX contains 16 bytes in 1 blocks (ref 0) 0x1a2e820
-        struct nat_sccp_connection contains 128 bytes in 2 blocks (ref 0) 0x1a2e750
-            XXX contains 16 bytes in 1 blocks (ref 0) 0x1a4ea80
-        struct nat_sccp_connection contains 128 bytes in 2 blocks (ref 0) 0x1a69620
-            XXX contains 16 bytes in 1 blocks (ref 0) 0x1913040
-        struct nat_sccp_connection contains 128 bytes in 2 blocks (ref 0) 0x1a2c860
-            XXX contains 16 bytes in 1 blocks (ref 0) 0x1917ef0
-        struct nat_sccp_connection contains 128 bytes in 2 blocks (ref 0) 0x19f4cf0
-            XXX contains 16 bytes in 1 blocks (ref 0) 0x1a3ce10
-        struct nat_sccp_connection contains 128 bytes in 2 blocks (ref 0) 0x1919b20
-            XXX contains 16 bytes in 1 blocks (ref 0) 0x1a2c750
-        struct nat_sccp_connection contains 128 bytes in 2 blocks (ref 0) 0x19b0200
-            XXX contains 16 bytes in 1 blocks (ref 0) 0x1a4ef00
-        struct nat_sccp_connection contains 128 bytes in 2 blocks (ref 0) 0x1a7d5c0
-            XXX contains 16 bytes in 1 blocks (ref 0) 0x1a3cd90
+    struct bsc_nat contains 3853396 bytes in 32953 blocks (ref 0) 0x1143a60
+        struct nat_sccp_connection contains 112 bytes in 1 blocks (ref 0) 0x1a2ccd0
+        struct bsc_connection contains 472 bytes in 1 blocks (ref 0) 0x1909dc0
+        struct nat_sccp_connection contains 128 bytes in 2 blocks (ref 0) 0x1a2cf50
+            XXX contains 16 bytes in 1 blocks (ref 0) 0x1911bc0
These seem to be the IMSIs that are stolen (talloc_steal) here:
con->filter_state.con_type = con_type;
con->filter_state.imsi_checked = filter;
bsc_nat_extract_lac(bsc, con, parsed, msg);
if (imsi)
	con->filter_state.imsi = talloc_steal(con, imsi);
So we need to see how/why the IMSI sometimes remains scoped by the bsc_connection:
+        struct bsc_connection contains 665 bytes in 12 blocks (ref 0) 0x1a7a250
+            XXX contains 16 bytes in 1 blocks (ref 0) 0x1a1cb60
+            XXX contains 16 bytes in 1 blocks (ref 0) 0x198e090
+            XXX contains 16 bytes in 1 blocks (ref 0) 0x1a2c8d0
But as this is scoped by either the nat_sccp_connection or the bsc_connection, the memory is freed once the TCP connection is dead, or freeing can be forced by resetting the connection.
Updated by zecke almost 8 years ago
The other leak seems to be in the Osmux/RTP code.
-    msgb contains 1515551 bytes in 362 blocks (ref 0) 0x11432a0
+    msgb contains 1671640 bytes in 396 blocks (ref 0) 0x11432a0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bccb30
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bcba50
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bca970
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bc9890
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bc65f0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bbac50
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bb79b0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bc1190
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bc3350
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bd84d0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bb8a90
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bc76d0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bc4430
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bbefd0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bb1470
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bc87b0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bc5510
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bc2270
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bbdef0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bbbd30
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bc00b0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bb9b70
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bbce10
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bb0390
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1baf2b0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bad0f0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bac010
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1baaf30
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bb57f0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bb4710
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bae1d0
         RTP contains 4232 bytes in 1 blocks (ref 0) 0x1ba5ad0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bb3630
         RTP contains 4232 bytes in 1 blocks (ref 0) 0x1ba49f0
-        OSMUX test contains 165 bytes in 1 blocks (ref 0) 0x190d210
-        OSMUX test contains 165 bytes in 1 blocks (ref 0) 0x1919cf0
-        OSMUX test contains 165 bytes in 1 blocks (ref 0) 0x1919bf0
         RTP contains 4232 bytes in 1 blocks (ref 0) 0x1ba3910
-        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1ba1750
         RTP contains 4232 bytes in 1 blocks (ref 0) 0x1ba0670
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1ba6bb0
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bb2550
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1ba1750
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1ba8d70
         RTP contains 4232 bytes in 1 blocks (ref 0) 0x1ba9e50
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1ba7c90
+        RTP contains 4232 bytes in 1 blocks (ref 0) 0x1bb68d0
         RTP contains 4232 bytes in 1 blocks (ref 0) 0x1b9f590
         RTP contains 4232 bytes in 1 blocks (ref 0) 0x1b9e4b0
         RTP contains 4232 bytes in 1 blocks (ref 0) 0x1b9d3d0
git grep '"RTP"'
libmgcp/mgcp_network.c: was_rtcp ? "RTCP" : "RTP",
libmgcp/mgcp_osmux.c: msg = msgb_alloc(4096, "RTP");
Not sure how we go from 4096 to 4232 allocated bytes, but I think it is most likely this place. As this is scoped by the msgb context, the bytes will never go away on their own. The two reports were taken 30 minutes apart; no RTP packet should stay in a queue that long (and I don't think the addresses just happen to be recycled, as the total memory usage grows).
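One possible explanation for the 4096 vs. 4232 gap, stated here as an assumption and not verified in this ticket: if msgb_alloc() allocates the message header and the data buffer as a single talloc chunk, the report would show header plus payload, and the header would account for 4232 - 4096 = 136 bytes. The sketch below only illustrates that accounting; struct demo_msg_hdr and both helper functions are hypothetical and do not mirror the real struct msgb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a message header. The real struct msgb has
 * different members; only "header + payload in one chunk" matters here. */
struct demo_msg_hdr {
	void *list_prev, *list_next;
	unsigned char *head, *tail, *data;
	uint16_t data_len, len;
	unsigned char payload[]; /* buffer follows the header in the same chunk */
};

/* Bytes a talloc-style report would list for one message whose payload
 * buffer was requested with 'size' bytes. */
static size_t demo_chunk_size(size_t size)
{
	return sizeof(struct demo_msg_hdr) + size;
}

/* Header overhead implied by a report line: reported minus requested size. */
static size_t demo_implied_header(size_t reported, size_t requested)
{
	assert(reported >= requested);
	return reported - requested;
}
```

Calling demo_implied_header(4232, 4096) gives the per-message overhead seen in the report above; whether that matches sizeof(struct msgb) on the affected build would still need checking.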
Updated by zecke almost 8 years ago
IMSI "leak":
- We analyze access-lists before there is a tracked SCCP connection
- The libfilter/ code is shared by BSC/NAT but:
filter = bsc_nat_filter_sccp_cr(bsc, msg, parsed, &con_type, &imsi, &cause);
... will call bsc_nat_filter.c:bsc_nat_filter_sccp_cr
.. which will fill out the bsc_filter_request and set the req.ctx to bsc (of type struct bsc_connection).
.. then it goes to libfilter into code like this:
*imsi = talloc_strdup(ctx, mi_string);
So this explains why the IMSI is scoped by the bsc_connection. It doesn't explain why it is not freed, so there must be another path as well. I think it comes from later Identity Requests.
Updated by zecke almost 8 years ago
IMSI "leak":
diff --git a/openbsc/src/osmo-bsc_nat/bsc_nat_filter.c b/openbsc/src/osmo-bsc_nat/bsc_nat_filter.c
index 393aea3..e735290 100644
--- a/openbsc/src/osmo-bsc_nat/bsc_nat_filter.c
+++ b/openbsc/src/osmo-bsc_nat/bsc_nat_filter.c
@@ -109,7 +109,7 @@ int bsc_nat_filter_dt(struct bsc_connection *bsc, struct msgb *msg,
 	if (!hdr48)
 		return -1;
 
-	req.ctx = bsc;
+	req.ctx = con;
 	req.black_list = &bsc->nat->imsi_black_list;
 	req.access_lists = &bsc->nat->access_lists;
 	req.local_lst_name = bsc->cfg->acc_lst_name;
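To illustrate why the choice of req.ctx matters, here is a minimal sketch of parent-scoped allocation. It does not use talloc; struct demo_ctx, demo_strdup() and the other helpers are hypothetical stand-ins, and the IMSI string is made up. The point is only that a string allocated under the long-lived BSC context stays around after each SCCP connection ends, while one allocated under the connection goes away with it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_CHILDREN 16

/* Hypothetical, much simplified stand-in for a talloc context. */
struct demo_ctx {
	char *children[DEMO_MAX_CHILDREN];
	int num_children;
};

/* Duplicate 'str' and record the copy as a child of 'ctx'
 * (roughly the role talloc_strdup() plays in the filter code). */
static char *demo_strdup(struct demo_ctx *ctx, const char *str)
{
	char *copy;

	if (ctx->num_children >= DEMO_MAX_CHILDREN)
		return NULL;
	copy = malloc(strlen(str) + 1);
	if (!copy)
		return NULL;
	strcpy(copy, str);
	ctx->children[ctx->num_children++] = copy;
	return copy;
}

/* Release every child of 'ctx' (what freeing the parent would do). */
static void demo_ctx_free_children(struct demo_ctx *ctx)
{
	int i;

	for (i = 0; i < ctx->num_children; i++)
		free(ctx->children[i]);
	ctx->num_children = 0;
}

/* Simulate 'calls' SCCP connections over one BSC link and return how many
 * IMSI strings are still attached to the BSC context afterwards.
 * 'calls' must not exceed DEMO_MAX_CHILDREN. */
static int demo_leftover_on_bsc(int calls, int scope_to_con)
{
	struct demo_ctx bsc = { {0}, 0 };
	int i, leftover;

	assert(calls <= DEMO_MAX_CHILDREN);
	for (i = 0; i < calls; i++) {
		struct demo_ctx con = { {0}, 0 };

		/* made-up IMSI, allocated under either con or bsc */
		demo_strdup(scope_to_con ? &con : &bsc, "001010123456789");
		demo_ctx_free_children(&con); /* SCCP connection ends */
	}
	leftover = bsc.num_children;
	demo_ctx_free_children(&bsc); /* BSC link finally goes down */
	return leftover;
}
```

With scope_to_con set to 0 the strings accumulate until the BSC link itself is torn down, which matches the growing XXX entries under struct bsc_connection in the report; with scope_to_con set to 1 nothing is left behind.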
Updated by daniel almost 8 years ago
- Status changed from New to In Progress
- % Done changed from 0 to 30
osmux: There is an issue when a circuit is deleted while it still has msgs in the buffer. The buffer contains a list of circuits which contain a list of msgs.
When the circuit is deleted the msgs are lost. The proposed fix is to dequeue and free the msgs if a circuit is deleted.
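A minimal sketch of the proposed fix, assuming a buffer that holds a list of circuits, each with its own list of pending msgs. All types and functions here (demo_batch, demo_circuit, demo_msg, demo_circuit_del, ...) are hypothetical and simplified; they are not the actual osmux structures or the code from the gerrit changes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified types; the real osmux structures differ. */
struct demo_msg {
	struct demo_msg *next;
};

struct demo_circuit {
	struct demo_circuit *next;
	int cid;
	struct demo_msg *msgs; /* messages still queued for this circuit */
};

struct demo_batch {
	struct demo_circuit *circuits;
	int live_msgs; /* bookkeeping so a leak would be observable */
};

static struct demo_circuit *demo_circuit_add(struct demo_batch *b, int cid)
{
	struct demo_circuit *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	c->cid = cid;
	c->next = b->circuits;
	b->circuits = c;
	return c;
}

static int demo_enqueue(struct demo_batch *b, struct demo_circuit *c)
{
	struct demo_msg *m = calloc(1, sizeof(*m));

	if (!m)
		return -1;
	m->next = c->msgs;
	c->msgs = m;
	b->live_msgs++;
	return 0;
}

/* Delete a circuit: unlink it, then dequeue and free every pending
 * message before freeing the circuit itself. */
static void demo_circuit_del(struct demo_batch *b, int cid)
{
	struct demo_circuit **pp = &b->circuits;
	struct demo_circuit *c;

	while (*pp && (*pp)->cid != cid)
		pp = &(*pp)->next;
	if (!*pp)
		return;
	c = *pp;
	*pp = c->next;
	while (c->msgs) {
		struct demo_msg *m = c->msgs;

		c->msgs = m->next;
		free(m);
		b->live_msgs--;
	}
	free(c);
}

/* Queue two messages on one circuit, delete the circuit, and report how
 * many messages are still alive afterwards (or -1 on allocation failure). */
static int demo_run(void)
{
	struct demo_batch b = { NULL, 0 };
	struct demo_circuit *c = demo_circuit_add(&b, 1);

	if (!c)
		return -1;
	if (demo_enqueue(&b, c) < 0 || demo_enqueue(&b, c) < 0) {
		demo_circuit_del(&b, 1);
		return -1;
	}
	demo_circuit_del(&b, 1);
	return b.live_msgs;
}
```

Without the inner while loop in demo_circuit_del(), the two queued messages would stay allocated with nothing pointing at them, which is the pattern the growing RTP entries under msgb suggest.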
https://gerrit.osmocom.org/#/c/119
https://gerrit.osmocom.org/#/c/120
Updated by laforge almost 8 years ago
- Status changed from In Progress to Closed
- % Done changed from 30 to 100
daniel wrote:
osmux: There is an issue when a circuit is deleted while it still has msgs in the buffer. The buffer contains a list of circuits which contain a list of msgs.
When the circuit is deleted the msgs are lost. The proposed fix is to dequeue and free the msgs if a circuit is deleted.
https://gerrit.osmocom.org/#/c/119
https://gerrit.osmocom.org/#/c/120
the "proposed" fix has long been merged, resolving the ticket. please don't wait for me to spot such things...