begin_request fails with connection was closed by server

Joe Orton joe at manyfish.co.uk
Sun Aug 2 18:26:34 EDT 2009


On Mon, Aug 03, 2009 at 12:09:47AM +0200, Helge Heß wrote:
> On 02.08.2009, at 23:49, Joe Orton wrote:
>> On Thu, Jul 30, 2009 at 06:17:59PM +0200, Helge Heß wrote:
>>> I've just encountered such:
>>> ---snip---
>>> begin_request failed: 1: Could not read status line: connection was
>>> closed by server
>>> ---snap---
>>>
>>> Q: shouldn't Neon automatically restart the connection? The socket
>>> probably just went down because of a persistent connection timeout?
>>
>> Yes, that is (or should be) what happens already.  Do you have a
>> reproduction case for this failure, and/or a packet trace/debug log?
>
> No, it's on Windoze and it's hard for me to get a useful log on that. And 
> it's also not strictly reproducible, but rather occasional (maybe a 
> global variable, threading issue? - we never use a ne_session from two 
> threads at the same time, but we pool them and reuse them in different 
> threads, would that be a problem?)

It should be fine, there are no global variables in play here.

> I actually wonder whether this is a Windows-specific issue, maybe the socket 
> error codes are slightly different or something like that?

It's possible.  neon maps two specific Windows socket errors to the 
"connection closed" error - 

#define NE_ISCLOSED(e) ((e) == WSAESHUTDOWN || (e) == WSAENOTCONN)

so you're seeing one of those two socket errors here, if you're getting 
the "connection was closed by server" error string back.
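
To make the mapping concrete, here is a minimal sketch of how an application-side debug helper might apply the same test to a raw socket error code. The macro is the one quoted above; the helper name describe_sock_error is hypothetical, and the numeric fallbacks are the values from <winsock2.h>, defined here only so the sketch also compiles off Windows. This is not neon's own code.

```c
/* Winsock error values as found in <winsock2.h>; the fallbacks below are
 * only used when that header has not been included. */
#ifndef WSAENOTCONN
#define WSAENOTCONN 10057
#endif
#ifndef WSAESHUTDOWN
#define WSAESHUTDOWN 10058
#endif
#ifndef WSAECONNRESET
#define WSAECONNRESET 10054
#endif

/* The macro quoted in the message above. */
#define NE_ISCLOSED(e) ((e) == WSAESHUTDOWN || (e) == WSAENOTCONN)

/* Hypothetical helper: label a raw socket error for a debug log. */
static const char *describe_sock_error(int err)
{
    if (NE_ISCLOSED(err))
        return "closed";
    return "other";
}
```

On Windows the code passed in would typically come from WSAGetLastError() right after the failing socket call; logging it is one way to find out which of the two errors is actually occurring.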

> What I'm currently doing is checking for that specific error (by strstr'ing 
> the error text ...), then reopening a ne_session. This seems to work 
> reliably.
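
The string check in that workaround could look roughly like the sketch below. The helper name and the constant are hypothetical; the substring is taken from the error text quoted at the top of this thread. Matching on error text is fragile (the wording can change between versions, and may be translated if the library is built with message translation), so treat this as a stopgap only.

```c
#include <string.h>

/* Substring copied from the error text quoted earlier in this thread. */
#define CLOSED_BY_SERVER_TEXT "connection was closed by server"

/* Hypothetical helper: returns 1 if the error text reports that the
 * server closed the connection, 0 otherwise (including for NULL). */
static int error_is_closed_by_server(const char *errtext)
{
    return errtext != NULL && strstr(errtext, CLOSED_BY_SERVER_TEXT) != NULL;
}
```

In an application the text would come from ne_get_error(sess). On a match, the old session would be released with ne_session_destroy() and a fresh one obtained from ne_session_create() before the request is repeated.
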
>
> I'll see whether I can somehow isolate the issue. I guess not :-/
>
> My guess is that neon gives up too early. As far as I can see the error 
> is a valid result of begin_request if it's the _initial_ request to the 
> server. E.g. a firewall could reject/timeout all attempts. Or the user 
> tried to connect to a server/port which points to a different 
> service/protocol. The server might then immediately close the connection, 
> and Neon should detect that.
> But if the ne_session did issue an HTTP request successfully before, the 
> closedown of the connection is more likely a TCP timeout? So maybe add a 
> flag whether this is the first request after a connect() call? And retry 
> if it's not?

Yup, you've described exactly what neon does already; see use of the 
"retry" flag in ne_request.c:send_request().
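
For readers without the source at hand, the idea can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the code in ne_request.c: the types, the send_with_retry function and the scripted transport are all hypothetical. It shows only the rule discussed above: a "closed" failure is retried once on a fresh connection if, and only if, the failing connection had already carried a successful request.

```c
enum result { RES_OK, RES_CLOSED, RES_ERROR };

/* Hypothetical connection state; not neon's internal structure. */
struct conn {
    int connected;  /* non-zero while a socket is open */
    int persisted;  /* non-zero once a request has succeeded on this socket */
};

/* Hypothetical transport callback: sends one request on the current socket. */
typedef enum result (*send_fn)(struct conn *c, void *userdata);

static enum result send_with_retry(struct conn *c, send_fn send_once, void *ud)
{
    for (;;) {
        int retry;
        enum result res;

        if (!c->connected) {
            c->connected = 1;  /* stands in for a real connect() */
            c->persisted = 0;
        }
        /* Only a reused (persistent) connection is worth retrying. */
        retry = c->persisted;
        res = send_once(c, ud);
        if (res == RES_OK) {
            c->persisted = 1;
            return RES_OK;
        }
        c->connected = 0;  /* drop the socket on any failure */
        if (!(res == RES_CLOSED && retry))
            return res;    /* first request on this socket: failure is final */
        /* Otherwise loop: reconnect and resend. The new socket has
         * persisted == 0, so a second failure is returned to the caller. */
    }
}

/* Scripted stand-in for a transport: replays a fixed list of results. */
struct script { const enum result *steps; int pos; };

static enum result scripted_send(struct conn *c, void *ud)
{
    struct script *s = ud;
    (void)c;
    return s->steps[s->pos++];
}
```

With a script of OK, CLOSED, OK, the first call succeeds, and the second call sees the close on the reused connection, reconnects and succeeds without the caller noticing. With a script that starts with CLOSED, the error is reported immediately, which matches the "initial request" case described in the quoted text.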

How have you determined this is a neon bug rather than a server problem?  
A server process crash/shutdown will often manifest in this way, for 
example.

Regards, Joe



