net/http: Transport needs a limit on total connections (per host) #6785

Closed
@calmh

(go1.2rc4)

We're working on a reverse HTTP proxy that does some "stuff" (not relevant here) and
proxies towards one or more backend HTTP servers. During testing I ran into connection
failures, apparently due to running out of file descriptors. In narrowing this down, I
reduced the problem to the following:

The "null" reverse proxy (doing nothing but proxying, using the
httputil.ReverseProxy):

http://play.golang.org/p/p1g4bpTZ_g

A trivial HTTP server acting as the backend behind the above proxy; it simply counts the
number of connections and responds successfully:

http://play.golang.org/p/F7W-vbXEUt

Running both of these, I have the proxy on :8080 forwarding to the server at :8090. To
test it, I use wrk (https://github.com/wg/wrk):

jb@jborg-mbp:~ $ wrk -c 1 -t 1 http://localhost:8080/
...

Granted, this isn't a super realistic reproduction of the real world since the latency
on localhost is minimal. I can't prove the same thing _can't_ happen in production
though.

With one connection (-c 1) up to about three, this works perfectly. The server side
sees a bunch of requests over one to three connections, i.e. the number of backend
connections from the proxy matches the number of incoming connections. At around -c 4
and upwards, it blows up. The proxy doesn't manage to recycle connections quickly enough
and starts doing regular Dials at a rate of thousands per second, resulting in

...
2013/11/18 20:18:21 http: proxy error: dial tcp 127.0.0.1:8090: can't assign requested address
2013/11/18 20:18:21 http: proxy error: dial tcp 127.0.0.1:8090: can't assign requested address
2013/11/18 20:18:21 http: proxy error: dial tcp 127.0.0.1:8090: can't assign requested address
2013/11/18 20:18:21 http: proxy error: dial tcp 127.0.0.1:8090: can't assign requested address
...

from the proxy code and, of course, HTTP errors as seen by wrk.

My theory, after going through http.Transport, is that when the number of requests/s
to the proxy goes up, the small amount of bookkeeping required to recycle a connection
(putIdleConn etc.) starts taking just long enough that the next request in the pipeline
gets in before a connection is idle. The Transport starts dialing, adding more
connections to take care of, and it explodes chaotically. I would prefer that, at some
point, it blocked and awaited an idle connection instead of dialing.
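
To make the "block instead of dial" idea concrete, here is a minimal sketch of a
RoundTripper wrapper that caps in-flight requests per host with a semaphore. It is not
part of net/http and not a fix inside Transport; hostLimiter, releasingBody and
peakInFlight are hypothetical names. It bounds concurrent requests rather than
connections, so it only approximates a connection limit, and it ignores request
cancellation while waiting for a slot.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// hostLimiter is a hypothetical RoundTripper wrapper (not part of net/http)
// that allows at most limit requests in flight per URL host.
type hostLimiter struct {
	base  http.RoundTripper
	limit int
	slots sync.Map // URL host -> chan struct{} used as a counting semaphore
}

// RoundTrip blocks until a slot for the target host is free. The slot is held
// until the response body is closed, or released at once on error.
func (l *hostLimiter) RoundTrip(req *http.Request) (*http.Response, error) {
	v, _ := l.slots.LoadOrStore(req.URL.Host, make(chan struct{}, l.limit))
	s := v.(chan struct{})
	s <- struct{}{}
	resp, err := l.base.RoundTrip(req)
	if err != nil {
		<-s
		return nil, err
	}
	resp.Body = &releasingBody{ReadCloser: resp.Body, release: func() { <-s }}
	return resp, nil
}

// releasingBody frees its slot the first time Close is called.
type releasingBody struct {
	io.ReadCloser
	once    sync.Once
	release func()
}

func (b *releasingBody) Close() error {
	err := b.ReadCloser.Close()
	b.once.Do(b.release)
	return err
}

// peakInFlight sends n concurrent requests through a limiter and returns the
// highest number of handler invocations the test server saw at one time.
func peakInFlight(limit, n int) int {
	var mu sync.Mutex
	cur, peak := 0, 0
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		cur++
		if cur > peak {
			peak = cur
		}
		mu.Unlock()
		time.Sleep(5 * time.Millisecond) // simulate backend work
		mu.Lock()
		cur--
		mu.Unlock()
		io.WriteString(w, "ok")
	}))
	client := &http.Client{Transport: &hostLimiter{base: http.DefaultTransport, limit: limit}}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := client.Get(srv.URL)
			if err != nil {
				return
			}
			io.Copy(io.Discard, resp.Body)
			resp.Body.Close() // releases the slot
		}()
	}
	wg.Wait()
	srv.Close() // waits for outstanding handlers before peak is read
	return peak
}

func main() {
	fmt.Println("peak concurrent backend requests:", peakInFlight(2, 20))
}
```

A wrapper like this could be set as the Transport of an httputil.ReverseProxy. Because
it replaces the response body, it may interfere with protocol upgrades that need a
writable body, so treat it as a starting point only.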

Line 92, "// TODO: tunable on global max cached connections" seems relevant,
although in my case the tunable should probably be max connections per host instead of
globally.

I'll probably take a stab at implementing something like that to fix this (i.e.
rewriting Transport to limit connections), since I couldn't figure out a way around it
by just using the available APIs. Unless I've misunderstood something obvious...

//jb

Labels: FeatureRequest, FrozenDueToAge, NeedsFix, Suggested, help wanted
