Netburner hanging up for about 256 seconds

vsabino
Posts: 32
Joined: Wed May 14, 2008 8:45 am

Post by vsabino »

Hi all,

I'm posting this question to see if anybody else out there is experiencing the same problem.

I have several Netburner devices in the field (which means I can't look at I/O lines with a scope) that hang up now and then.
My Netburners communicate over serial and TCP (over several sockets) with a host PC; they respond to commands from the PC and also report acquired data.
They all hang up for the same amount of time: around 256 seconds.
During that time the Netburner "vanishes" from autoupdate and does not respond to the serial port or TCP.
After those 256 seconds, the device comes back to life and keeps functioning until the next hang-up, which can be hours, days, or weeks later.

I used Wireshark to capture the TCP traffic while the Netburner is hung. The capture shows that the Netburner retransmits some packets on two or more sockets over and over again, following the TCP retransmission rules (doubling the retransmission time, from 0.5 seconds up to 32 seconds). The PC is acknowledging these packets.
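
For anyone comparing their own capture against this pattern, here is a minimal sketch of the doubling schedule described above. It only models the arithmetic (0.5 s start, doubling, capped at 32 s); the function names and the retry count are made up for illustration and are not taken from the Netburner TCP stack.

```cpp
#include <cassert>
#include <vector>

// Retransmission intervals in seconds: start at `initial`, double each
// time, and stay at `cap` once it is reached. The 0.5 s / 32 s values
// come from the capture described above; `retries` is a hypothetical
// parameter chosen by the caller.
std::vector<double> backoffSchedule(double initial, double cap, int retries) {
    assert(initial > 0.0 && cap >= initial && retries >= 0);
    std::vector<double> intervals;
    double next = initial;
    for (int i = 0; i < retries; ++i) {
        intervals.push_back(next);
        next = (next * 2.0 > cap) ? cap : next * 2.0;
    }
    return intervals;
}

// Total time elapsed once every interval in the schedule has expired.
double totalElapsed(const std::vector<double>& intervals) {
    double sum = 0.0;
    for (double s : intervals) sum += s;
    return sum;
}
```

Calling `backoffSchedule(0.5, 32.0, 7)` gives the intervals 0.5, 1, 2, 4, 8, 16, 32, and `totalElapsed` on that result gives the cumulative time, which can be compared with the timestamps in a trace.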

I have had this problem for a long time now, and so far I have not been able to reproduce it in the lab.
I have observed it with versions 2.4RC2 and 2.6.3.

Thanks,
Victor
rnixon
Posts: 833
Joined: Thu Apr 24, 2008 3:59 pm

Re: Netburner hanging up for about 256 seconds

Post by rnixon »

Hello Victor,

Can you be 100% positive the PC is ack'ing the packets correctly? Sometimes these traces can be hard to read and you need to verify the sequence numbers of the transmit and ack.

You may be running out of buffers. Can you add code to display how many buffers are available when this occurs?
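
Something along these lines could do the logging. This is only a rough sketch: `BufferWatch` and `readFreeCount` are hypothetical names, and the callback would need to wrap whatever call your version of the Netburner library provides for reading the free buffer count (check the headers for your release).

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Tracks the lowest free-buffer count seen and reports when the count
// drops below a threshold. `readFreeCount` is a hypothetical hook, not
// a Netburner API: on the device it would wrap the library's own
// free-buffer query.
class BufferWatch {
public:
    BufferWatch(std::function<int()> readFreeCount, int warnBelow)
        : read_(std::move(readFreeCount)), warnBelow_(warnBelow) {}

    // Sample once; returns a log line, or an empty string if healthy.
    std::string poll() {
        int freeNow = read_();
        if (lowest_ < 0 || freeNow < lowest_) lowest_ = freeNow;
        if (freeNow >= warnBelow_) return "";
        return "LOW BUFFERS: free=" + std::to_string(freeNow) +
               " lowest=" + std::to_string(lowest_);
    }

    int lowest() const { return lowest_; }

private:
    std::function<int()> read_;
    int warnBelow_;
    int lowest_ = -1;  // -1 means "no sample taken yet"
};
```

Calling `poll()` from a low-priority task once a second and sending any non-empty result out the serial port should show whether the count trends toward zero before the device goes quiet.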

What is different between your lab setup, in which it does not happen, and the location(s) where it does?
Ridgeglider
Posts: 513
Joined: Sat Apr 26, 2008 7:14 am

Re: Netburner hanging up for about 256 seconds

Post by Ridgeglider »

Victor: can you shed any light on what happens when your device vanishes? You say it reappears in 256 seconds, but what happens during this time? Does the NB reset? If so, maybe the STACKCHECK tool could provide insight on whether you might be running out of stack space on a particular task. For info on how to use stackcheck, and also on how to detect buffer overflows, see this useful post from Forrest: http://forum.embeddedethernet.com/viewt ... f=5&t=1411
vsabino
Posts: 32
Joined: Wed May 14, 2008 8:45 am

Re: Netburner hanging up for about 256 seconds

Post by vsabino »

Thanks to both for your posts!

Indeed, I was running out of buffers. I was finally able to replicate it here in the lab.
I got a lot of help from Netburner Support.
I forgot to mention that my system consists of two Netburners in different enclosures. The one that is running out of buffers communicates with the PC and with the other Netburner.
Both Netburners exchange some handshaking via UDP. This is what was starving my buffers.
The reason I could never reproduce it here in the lab was that I had never hooked up the second Netburner.

Ridgeglider, thanks for that link, I'll keep it handy for future use.
I have no idea why it recovers in 256 seconds. The Netburner does not actually hang: I monitored some I/O lines to see what was going on, and it keeps cycling between the tasks I created. It just can't receive anything or transmit any new information, apart from the packets it keeps retransmitting.
When it recovers, for some reason 11 buffers get released, so the Netburner keeps going until the next time...

The problem was that I did not drain the buffers to get rid of unwanted UDP packets, and I never unregistered the UDP FIFO.
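
To illustrate the failure mode for anyone who runs into the same thing, here is a toy model of a shared buffer pool and a receive FIFO that nobody reads. It is a sketch under stated assumptions, not Netburner code: `Pool`, `RxFifo`, `onPacket`, and `drain` are made-up names, and the real fix is to use the library's own calls to read or discard queued UDP packets and to unregister the FIFO when it is no longer needed.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Toy model of a shared buffer pool. All names here are hypothetical
// and do not correspond to the Netburner API.
struct Pool {
    int freeCount;
    bool take() {
        if (freeCount == 0) return false;
        --freeCount;
        return true;
    }
    void give() { ++freeCount; }
};

// A registered receive FIFO: each queued packet pins one pool buffer
// until the application reads (drains) it.
class RxFifo {
public:
    explicit RxFifo(Pool& pool) : pool_(pool) {}

    // Called when a packet arrives; fails if the pool is exhausted.
    bool onPacket(int payload) {
        if (!pool_.take()) return false;
        queued_.push(payload);
        return true;
    }

    // Discard everything queued, returning the buffers to the pool.
    std::size_t drain() {
        std::size_t n = 0;
        while (!queued_.empty()) {
            queued_.pop();
            pool_.give();
            ++n;
        }
        return n;
    }

private:
    Pool& pool_;
    std::queue<int> queued_;
};
```

With a pool of 4 buffers, four unread packets leave `freeCount` at 0 and a fifth `onPacket` call fails; in the model, that is the point where nothing else sharing the pool can get a buffer either. After `drain()` the count is back to 4 and traffic can flow again.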

Thanks again,
Victor