Select() errorfds values

Posted: Fri Apr 29, 2016 7:01 am
by vsabino
Hi,

I'm having an issue where my device sometimes closes sockets because the select() function reports an error.
My code more or less follows the "Simple TCP Server Using the select() Function" example in the programmer's manual.

My NetBurner is the client connecting to a PC. The NetBurner uses connect() to start the session.
I have two NetBurners and one PC on a private network via a switch, with static IP addresses. The PC is also connected to the company LAN via Wi-Fi using DHCP.
With Wireshark I can see the socket get opened, and a few seconds later the NetBurner closes it because select() flagged the socket in errorfds (checked with FD_ISSET).

This does not happen continuously, but enough for me to notice it and be able to repeat it. So far, it's always the same NetBurner having the issue. The other seems to be connecting fine all the time. The NetBurner code that handles the communication portion is the same for both.
When it happens, it's always a few seconds after the socket is opened. After a few attempts, the socket opens successfully and remains open until I shut down the application.

Any ideas as to what is happening? Is there any way to see the error codes to troubleshoot further?

Thanks,
Victor

Re: Select() errorfds values

Posted: Wed May 04, 2016 10:33 am
by vsabino
I did a lot of troubleshooting on my own and with NetBurner Support.

One of the last tests was to put a 2-second delay, OSTimeDly(TICKS_PER_SECOND * 2), before the line that closes the sockets after detecting the select() error.
The 2-second delay did not show up in Wireshark, which meant the close was coming from somewhere else. With that and other debugging statements I had placed, I looked closer at my code and found the cause! It's a bug in my own code, in how my separate OS tasks interact.
The NetBurner normally connects to a PC over TCP as a result of an RS-232 handshake. We are switching to doing the handshake over UDP.
There is a keep-alive timeout that I forgot to clear when the TCP connection is started from the UDP packet parsing. When that timeout expires, another task assumes the connection was lost and closes the sockets.
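
A minimal sketch of the kind of fix this calls for, assuming the keep-alive is tracked as a tick timestamp shared between tasks. All names here (keepalive_t, keepalive_touch, and so on) are hypothetical and not taken from my actual code or the NetBurner library. The point is that every path that starts a session, whether RS-232 or UDP, goes through the same reset. In real multi-task firmware, access to this struct would also need a critical section or similar protection, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical keep-alive state shared by the connection and monitor tasks. */
typedef struct {
    uint32_t last_activity;   /* tick of the last handshake or packet */
    uint32_t timeout_ticks;   /* allowed silence before assuming loss */
    bool     armed;           /* false until a session has started    */
} keepalive_t;

void keepalive_init(keepalive_t *ka, uint32_t timeout_ticks)
{
    assert(ka != NULL);
    ka->last_activity = 0;
    ka->timeout_ticks = timeout_ticks;
    ka->armed = false;
}

/* Call from EVERY path that starts or refreshes a session
 * (RS-232 handshake, UDP handshake, received data). */
void keepalive_touch(keepalive_t *ka, uint32_t now)
{
    assert(ka != NULL);
    ka->last_activity = now;
    ka->armed = true;
}

/* Call when the session is closed on purpose. */
void keepalive_disarm(keepalive_t *ka)
{
    assert(ka != NULL);
    ka->armed = false;
}

/* Checked by the monitor task. Unsigned subtraction keeps the
 * comparison correct when the tick counter wraps around. */
bool keepalive_expired(const keepalive_t *ka, uint32_t now)
{
    assert(ka != NULL);
    return ka->armed && (uint32_t)(now - ka->last_activity) > ka->timeout_ticks;
}
```

With this shape, the UDP handshake handler would call keepalive_touch() right before calling connect(), so a stale timestamp left over from an earlier session cannot make the monitor task close a socket that was just opened.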

The support from NetBurner was excellent!