Bug #6274
openyate is eating 260% CPU while doing nothing
0%
Description
Something seems wrong if an idle yate doing nothing consumes about 260% CPU.
Looking at the yate threads, we can see that each Zap Worker
consumes 10..12% of CPU while doing nothing:
This is quite weird, especially considering that osmo-e1d with 23 trunkdevs in parallel only uses 20% CPU - while actually processing packets/frames for all of those trunks.
a strace of such a single Zap Worker thread looks like this:
select(11, [10], NULL, [10], {tv_sec=0, tv_usec=100}) = 0 (Timeout) clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=0}, NULL) = 0 select(11, [10], NULL, [10], {tv_sec=0, tv_usec=100}) = 0 (Timeout) clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=0}, NULL) = 0 select(11, [10], NULL, [10], {tv_sec=0, tv_usec=100}) = 0 (Timeout) clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=0}, NULL) = 0 select(11, [10], NULL, [10], {tv_sec=0, tv_usec=100}) = 0 (Timeout) clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=0}, NULL) = 0 select(11, [10], NULL, [10], {tv_sec=0, tv_usec=100}) = 0 (Timeout) clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=0}, NULL) = 0 select(11, [10], NULL, [10], {tv_sec=0, tv_usec=100}) = 0 (Timeout) clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=0}, NULL) = 0 select(11, [10], NULL, [10], {tv_sec=0, tv_usec=100}) = 0 (Timeout) clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=0}, NULL) = 0 select(11, [10], NULL, [10], {tv_sec=0, tv_usec=100}) = 0 (Timeout) clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=0}, NULL) = 0 select(11, [10], NULL, [10], {tv_sec=0, tv_usec=100}) = 0 (Timeout) clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=0}, NULL) = 0 select(11, [10], NULL, [10], {tv_sec=0, tv_usec=100}) = 0 (Timeout) clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=0}, NULL) = 0 select(11, [10], NULL, [10], {tv_sec=0, tv_usec=100}) = 0 (Timeout) clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=0}, NULL) = 0
So basically we're using select to check if data on a single FD has arrived, time-out after 100 microseconds and then call a zero-nanoseconds sleep and start all-over again.
Looking at the source code, this looks like extremely poor programming style:
// Process incoming data
bool ZapInterface::process()
{
if (!m_device.select(100))
return false;
if (!m_device.canRead()) {
if (m_device.event())
checkEvents();
return false;
}
int r = m_device.recv(m_buffer,m_bufsize + ZAP_CRC_LEN);
if (r == -1) {
if (m_device.event())
checkEvents();
return false;
}
if (r < ZAP_CRC_LEN + 1) {
Debug(this,DebugMild,"Short read %u bytes (with CRC) [%p]",r,this);
return false;
}
s_ifaceNotifyMutex.lock();
m_notify = 0;
s_ifaceNotifyMutex.unlock();
DataBlock packet(m_buffer,r - ZAP_CRC_LEN,false);
#ifdef XDEBUG
String hex;
hex.hexify(packet.data(),packet.length(),' ');
Debug(this,DebugAll,"Received data: %s [%p]",hex.safe(),this);
#endif
receivedPacket(packet);
packet.clear(false);
return true;
}
void ZapWorkerThread::run()
{
if (!m_client)
return;
DDebug(&plugin,DebugAll,"%s is running for client (%p): %s",
s_threadName,m_client,m_address.c_str());
while (true) {
if (m_client->process())
Thread::check(true);
else
Thread::yield(true);
}
}
So basically the thread is continuously calling Thread:yield rather than waiting on some actual I/O to happen.
Files
No data to display