PPP daemon woes
Posted: Wed Jul 16, 2008 12:18 am
by thomastaranowski
I have a problem where I have a remote site, into which a user dials up and connects, and browses some status and control pages. The one issue I have is that periodically the PPP daemon 'hangs', meaning the modem/ppp server will not accept any more dial-in attempts. I was able to reliably repro a similar scenario in my lab where the following sequence of events ALWAYS resulted in a hung PPP daemon:
1.) dial up to the target
2.) do a hard shutdown of the user connection (I unplugged the phone line going into the PC)
3.) The PPP daemon stays in a connected state for as long as I watched it (1 hour)
Shouldn't there be a timeout mechanism for this PPP connection? Is live link detection something I have to specifically enable in the PPP configuration, or is there some special way I need to restart the PPP daemon?
Thanks!
Re: PPP daemon woes
Posted: Thu Jul 17, 2008 9:12 pm
by rnixon
Whether Ethernet or PPP, I don't think one end can tell if the other is gone unless data has been sent and there is no ack. I'm not an expert in this area, but I think you need to add some code on your server side and do one of two things:
1. Add code to override an open server socket (i.e., close it) when a new incoming request is detected.
2. Add a timer and close the socket if no data is detected after a specified amount of time.
If no data is going back and forth, and one side goes away, there is no way for the other end to know one side has died. Some programs use a "heartbeat" that sends/receives a simple query to determine if the other host is still there. This was a problem on a linux server I worked on, and they called them "half open sockets". Eventually the linux server would not readily accept a new connection until an application closed some of the open ones that didn't have any activity for a long time. I saw the timeout and override implementations in the SerialBurner example program.
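The timer idea in option 2 can be sketched roughly as follows. This is only an illustration under assumptions: the class name IdleWatchdog and its methods are hypothetical and are not part of the NetBurner libraries or the SerialBurner example. The application would call touch() whenever data moves on the socket and close the socket once expired() returns true.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not a NetBurner API): remembers when data was last
// seen on a connection and reports when the idle timeout has elapsed.
class IdleWatchdog {
public:
    using Clock = std::chrono::steady_clock;

    IdleWatchdog(std::chrono::seconds timeout, Clock::time_point now)
        : timeout_(timeout), last_activity_(now) {
        assert(timeout.count() > 0);
    }

    // Call whenever data is sent or received on the connection.
    void touch(Clock::time_point now) { last_activity_ = now; }

    // True once no activity has been seen for at least the timeout.
    bool expired(Clock::time_point now) const {
        return now - last_activity_ >= timeout_;
    }

private:
    std::chrono::seconds timeout_;
    Clock::time_point last_activity_;
};
```

Passing the current time in as a parameter, rather than reading the clock inside the class, keeps the logic easy to test and lets it run from whatever tick source the target provides.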
Re: PPP daemon woes
Posted: Thu Jul 17, 2008 10:38 pm
by thomastaranowski
Thanks for the reply, but there are a couple of additional details. I'm certain that the issue isn't with the network stack, as the root cause is that the PPP daemon is failing to detect the connection drop. There are 2 basic scenarios for a dial-up connection to fail. The first: if I unplug the phone line from the netburner (simulating a bad connection or damage on the netburner segment), the modem/PPP daemon correctly detects connection failure, probably because the modem can detect the loss of line. However, if I unplug the phone line from my PC during a connection, the remote side never times out.
From my limited research, there appears to be a keepalive component to the PPP connection protocol. It seems to me that the PPP server needs to detect that the connection is dead, and hang up the call. This could be an invalid assumption, as I'm not a PPP expert, so I was hoping someone more experienced with the protocol could add some insight.
Re: PPP daemon woes
Posted: Sat Jul 19, 2008 10:30 am
by rnixon
I see what you are saying, but my understanding (i'm not an expert) is that the only way to detect if a peer has gone away is to attempt an i/o operation. A simple i/o operation to look for a peer is commonly called a heartbeat. To correct the problem you are describing, I think your application will need to implement its own heartbeat. I looked at the PPP RFC 1661, and could not find any mention of a keepalive. The references I did find seemed to be extensions implemented on linux. Sounds to me like the only way to quickly detect a down peer is to implement your own heartbeat/keepalive.
Re: PPP daemon woes
Posted: Sat Jul 19, 2008 2:35 pm
by thomastaranowski
rnixon, the thing to keep in mind is that there is a physical layer protocol below the TCP layer you are looking at. For example, the modem on one end needs to detect the loss of carrier and hang up. I suspect that the modem could be incorrectly wired up or initialized, as it still thinks it has a carrier when it doesn't. I also found that there is a configurable idle detection as part of the modem AT command set, which can be configured to hang up the modem after 1-10 seconds of idle time.
Now, when considering the TCP connection, a keepalive or session protocol at the application layer needs to be implemented to know when to close the connection. There is also a keepalive feature in the TCP spec (RFC 1122), but not all embedded stacks have it.
Re: PPP daemon woes
Posted: Sat Aug 09, 2008 6:44 pm
by thomastaranowski
I ended up adding an IDLE_TIMEOUT (S30 on my modem) to the modem init string, and that solved part of the issue. As part of a failsafe system, I added a PPP daemon reset algorithm. There were a couple of sneaky things about it that folks should know about. The main thing is that after calling Stop, you can't call Start right away, but first have to wait for the PPP daemon to be in the eClosed state. Once that's done, you can call StartPPPDameon(). I sometimes had problems with the first call to Start, as the modem wouldn't always successfully initialize, and I don't know why. I added a simple while loop that will try starting again if it fails, and that works fine.
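StopPPPDameon();
// Wait for the daemon to reach the eClosed state before restarting
while (GetPPPState() != eClosed) {
    // brief delay here so other tasks can run
}
// Start again; retry in a loop if this call fails
StartPPPDameon();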
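The stop, wait, and retry sequence described above could be wrapped with limits so it cannot spin forever. The sketch below is an assumption-laden illustration, not the code from the post: PppHooks and RestartPpp are hypothetical names, and the real daemon calls are hidden behind callbacks because their exact arguments and return values are not shown in this thread.

```cpp
#include <cassert>
#include <functional>

// Hypothetical hooks standing in for the real daemon calls.
struct PppHooks {
    std::function<void()> stop;       // e.g. wraps StopPPPDameon()
    std::function<bool()> is_closed;  // e.g. GetPPPState() == eClosed
    std::function<bool()> start;      // assumed to return true on success
    std::function<void()> pause;      // short delay between attempts
};

// Stops the daemon, waits for the closed state, then tries to start it.
// Returns true if a start attempt succeeded within the given limits.
bool RestartPpp(const PppHooks& h, int max_polls, int max_start_tries) {
    assert(max_polls > 0 && max_start_tries > 0);
    h.stop();
    int polls = 0;
    while (!h.is_closed()) {
        if (++polls >= max_polls) return false;  // never reached closed
        h.pause();
    }
    for (int attempt = 0; attempt < max_start_tries; ++attempt) {
        if (h.start()) return true;
        h.pause();  // give the modem time before the next attempt
    }
    return false;
}
```

Bounding both loops means a modem that never initializes results in a reported failure instead of a hung task, which the caller can log or escalate.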
Re: PPP daemon woes
Posted: Wed Sep 03, 2008 6:24 am
by kevin_d_mccall
Nice one,
I too am having PPP woes and tried implementing the PPP echo request.
The NB code already responds to PPP echo request with an echo reply (do_ser in ppp.cpp) but had no way to generate a PPP echo request.
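For reference, RFC 1661 section 5.8 defines the LCP Echo-Request (code 9) and Echo-Reply (code 10) packets: a code byte, an identifier byte, a two-byte length, and a four-byte Magic-Number (zero if none was negotiated). A rough sketch of building the request and checking a reply might look like the following. The function names are hypothetical, and how the bytes are handed to the NetBurner PPP transmit path (protocol field 0xC021, framing, FCS) is not covered here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint8_t kLcpEchoRequest = 9;   // RFC 1661 section 5.8
constexpr std::uint8_t kLcpEchoReply = 10;

// Builds the 8-byte LCP Echo-Request body (no PPP protocol field, no
// HDLC framing, no FCS). 'magic' is the sender's negotiated Magic-Number.
std::array<std::uint8_t, 8> BuildLcpEchoRequest(std::uint8_t id,
                                                std::uint32_t magic) {
    std::array<std::uint8_t, 8> pkt{};
    pkt[0] = kLcpEchoRequest;
    pkt[1] = id;
    pkt[2] = 0x00;  // length, high byte
    pkt[3] = 0x08;  // length, low byte: 4-byte header + 4-byte magic
    pkt[4] = static_cast<std::uint8_t>(magic >> 24);
    pkt[5] = static_cast<std::uint8_t>(magic >> 16);
    pkt[6] = static_cast<std::uint8_t>(magic >> 8);
    pkt[7] = static_cast<std::uint8_t>(magic);
    return pkt;
}

// True if 'data' looks like an Echo-Reply answering the given identifier.
bool IsEchoReplyFor(const std::uint8_t* data, std::size_t len,
                    std::uint8_t id) {
    assert(data != nullptr);
    if (len < 8) return false;
    const std::size_t declared =
        static_cast<std::size_t>((data[2] << 8) | data[3]);
    return data[0] == kLcpEchoReply && data[1] == id &&
           declared >= 8 && declared <= len;
}
```

A keepalive built on this would send a request periodically and drop the link after some number of consecutive requests go unanswered.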
Still working on it...
Thanks for the tip on waiting for eClosed though, very useful.