Bug #4430

open

firmware can get in endless out-of-memory loop on OUT EP flood

Added by laforge about 4 years ago. Updated over 1 year ago.

Status:
New
Priority:
Normal
Assignee:
Category:
firmware
Target version:
-
Start date:
03/01/2020
Due date:
% Done:

10%

Spec Reference:

Description

When the OUT EP is flooded with too many messages, the firmware can get into an OOM situation from which it never recovers. All it does is print the following message over and over:

-E- _talloc_zero() out of memory!
-E- _talloc_zero() out of memory!
-E- _talloc_zero() out of memory!
[the same line repeats indefinitely]

I'm currently reproducing this with a test case that sends 1000 bogus OUT EP transfers to the device.

Actions #1

Updated by laforge about 4 years ago

This may actually not be as bad as it sounds.

The main loop continuously tries to allocate USB buffers from the pool in order to hand them to the USB Rx code for the OUT EP. Obviously, if the host PC is sending more data than we can process, we cannot refill those buffers fast enough.

However, it seems that the situation is not recovering even after the host PC stops sending data. That's a bug.

In terms of the buffer refill under overload, we could implement logic that stops/stalls the OUT EP until we have freed the first USB buffer?
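A minimal sketch of that idea, written as a host-side simulation rather than real firmware code. All names (`out_ep_state`, `out_ep_try_refill()`, `POOL_SIZE` and so on) are hypothetical and only model the bookkeeping: the OUT EP is armed only while a pool buffer is free, and it resumes as soon as one buffer is freed.

```c
#include <assert.h>
#include <stdbool.h>

#define POOL_SIZE 4  /* hypothetical number of USB buffers in the pool */

/* Hypothetical bookkeeping for one OUT endpoint (not the real firmware state) */
struct out_ep_state {
    unsigned int free_bufs;  /* buffers currently available in the pool */
    unsigned int queued;     /* received messages waiting to be processed */
    bool rx_armed;           /* a buffer is handed to the USB Rx code */
};

static void out_ep_init(struct out_ep_state *ep)
{
    ep->free_bufs = POOL_SIZE;
    ep->queued = 0;
    ep->rx_armed = false;
}

/* Main loop step: arm the OUT EP only if a buffer is available.
 * Returns false while the endpoint stays paused (the host would see NAKs). */
static bool out_ep_try_refill(struct out_ep_state *ep)
{
    if (ep->rx_armed)
        return true;
    if (ep->free_bufs == 0)
        return false;  /* no allocation attempt, so no OOM message */
    ep->free_bufs--;
    ep->rx_armed = true;
    return true;
}

/* Host sends one OUT transfer. Returns true if accepted, false if NAKed. */
static bool out_ep_host_send(struct out_ep_state *ep)
{
    if (!ep->rx_armed)
        return false;
    ep->rx_armed = false;
    ep->queued++;
    return true;
}

/* Firmware processes one queued message and returns its buffer to the pool. */
static bool out_ep_process_one(struct out_ep_state *ep)
{
    if (ep->queued == 0)
        return false;
    ep->queued--;
    ep->free_bufs++;
    assert(ep->free_bufs <= POOL_SIZE);
    return true;
}
```

In this model a flood can occupy at most `POOL_SIZE` buffers. Once `out_ep_process_one()` frees a buffer, the next `out_ep_try_refill()` succeeds again, which is the recovery behaviour that is currently missing.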

Actions #2

Updated by laforge about 4 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 10

We could try to deactivate the 'slow path' in dispatch_received_usb_msg(). It attempts to re-assemble an OUT EP transfer that's split over several incoming msgbs. Not sure if that feature was ever used. Maybe it's doing harm in this situation. It may be sufficient to support only transfers smaller than the USB buffer / msgb size.
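A rough sketch of what a fast-path-only dispatch could look like. This is not the actual dispatch_received_usb_msg(); the header layout (`struct msg_hdr` with a single `msg_len` field), the buffer size `USB_BUF_SIZE` and the function name `dispatch_fast_path_only()` are assumptions made for illustration. Anything that does not fit completely into one buffer is dropped instead of being re-assembled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define USB_BUF_SIZE 512  /* hypothetical USB buffer / msgb size */

/* Hypothetical message header; only the length matters for this sketch */
struct msg_hdr {
    uint16_t msg_len;  /* total message length, including this header */
};

enum dispatch_result {
    DISPATCH_OK,            /* complete message in a single buffer */
    DISPATCH_DROP_SHORT,    /* too short to even hold a valid header */
    DISPATCH_DROP_OVERSIZE, /* claims more than one buffer can ever hold */
    DISPATCH_DROP_PARTIAL   /* remainder would arrive in a later buffer */
};

/* Accept a received buffer only if it contains one complete message.
 * There is deliberately no re-assembly state kept between calls. */
static enum dispatch_result dispatch_fast_path_only(const uint8_t *buf, size_t len)
{
    struct msg_hdr hdr;

    assert(buf != NULL);
    if (len < sizeof(hdr))
        return DISPATCH_DROP_SHORT;
    memcpy(&hdr, buf, sizeof(hdr));  /* copy to avoid unaligned access */
    if (hdr.msg_len < sizeof(hdr))
        return DISPATCH_DROP_SHORT;
    if (hdr.msg_len > USB_BUF_SIZE)
        return DISPATCH_DROP_OVERSIZE;
    if (hdr.msg_len > len)
        return DISPATCH_DROP_PARTIAL;  /* this is where the slow path used to start */
    return DISPATCH_OK;
}
```

Because every rejected buffer can be freed immediately, bogus transfers no longer pin buffers while the firmware waits for a continuation that never arrives.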

Actions #3

Updated by laforge almost 4 years ago

  • Status changed from In Progress to New
Actions #4

Updated by laforge about 3 years ago

  • Assignee changed from laforge to Hoernchen
Actions #5

Updated by Hoernchen about 3 years ago

Can probably be fixed by (not) touching RX_DATA_BKx so the EP NAKs further communication attempts by the host. There is no reason to allow enqueueing arbitrary amounts of commands anyway.

Actions #6

Updated by laforge over 1 year ago

  • Assignee changed from Hoernchen to laforge
Actions #7

Updated by laforge over 1 year ago

Hoernchen wrote in #5:
"Can probably be fixed by (not) touching RX_DATA_BKx so the EP NAKs further communication attempts by the host. There is no reason to allow enqueueing arbitrary amounts of commands anyway."

For the record: tracing this through UDP, USBD_HAL etc. up to our firmware, this means we'd no longer call usb_refill_from_host() for the OUT EP unless we have successfully processed the previous OUT EP message. This should lead to the device NAKing any OUT transfers until it is able to process the next one.
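The ordering described above can be modelled as follows. This is only a simulation sketch: `fake_usb_refill_from_host()` is a stand-in whose signature is an assumption and does not match the real usb_refill_from_host(), and `struct out_ep`, `host_out_attempt()` and `main_loop_iteration()` are hypothetical names. The point is that the refill happens only after the previous message has been processed, so at most one OUT message is ever in flight.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-endpoint state; not the firmware's real structures */
struct out_ep {
    bool transfer_armed;        /* a read is outstanding, the EP will ACK one transfer */
    bool msg_pending;           /* a received message has not been processed yet */
    unsigned int refill_calls;  /* how often the refill stand-in was called */
};

/* Stand-in for usb_refill_from_host(); the real signature is not shown here */
static void fake_usb_refill_from_host(struct out_ep *ep)
{
    assert(!ep->msg_pending);  /* never re-arm with unprocessed data */
    ep->transfer_armed = true;
    ep->refill_calls++;
}

/* Host tries one OUT transfer. Returns true if ACKed, false if NAKed. */
static bool host_out_attempt(struct out_ep *ep)
{
    if (!ep->transfer_armed)
        return false;          /* nothing armed: the hardware NAKs */
    ep->transfer_armed = false;
    ep->msg_pending = true;    /* transfer completed, message awaits processing */
    return true;
}

/* One main-loop iteration: process the previous message first, then re-arm. */
static void main_loop_iteration(struct out_ep *ep)
{
    if (ep->msg_pending) {
        /* ... handle the received message here ... */
        ep->msg_pending = false;
    }
    if (!ep->transfer_armed)
        fake_usb_refill_from_host(ep);
}
```

With this ordering, a burst of 1000 host attempts between two main-loop iterations results in one accepted transfer and 999 NAKs, and no buffer allocation is attempted while the device is busy.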
