TCP connection limiting in Go: Blocking Accept vs Active Load Shedding
I recently reviewed a Go codebase that implemented connection limiting in a TCP proxy. The goal was to protect the backend from being overwhelmed by too many concurrent connections.
The initial implementation used a semaphore to gatekeep the Accept() call. While this seems intuitive, it relies on the kernel’s backlog behavior in ways that can cause significant operational issues.
TL;DR #
Don’t block before calling Accept(). If you do, the OS continues to complete TCP handshakes and queues connections in the kernel backlog. Clients believe they are connected, but your application is not processing them.
Instead, use Active Load Shedding: Accept() the connection immediately, check your limits, and if full, explicitly Close() the connection. This provides immediate feedback to the client.
The “Blocking Accept” Pattern #
The pattern I encountered looked like this:
func (l *LimitedListener) Accept() (net.Conn, error) {
    // 1. Wait for a permit.
    l.semaphore <- struct{}{}

    // 2. Accept the connection.
    conn, err := l.listener.Accept()
    if err != nil {
        <-l.semaphore // release the permit if Accept failed
        return nil, err
    }
    return &wrappedConn{conn, l.semaphore}, nil
}
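For context, here is a minimal sketch of the supporting types this snippet assumes (my reconstruction, not the original codebase; the real golang.org/x/net/netutil.LimitListener works the same way):

type LimitedListener struct {
    listener  net.Listener
    semaphore chan struct{} // buffered to the concurrency limit
}

// wrappedConn returns its permit to the semaphore when closed.
// Production code should also guard against double Close
// (netutil.LimitListener uses a sync.Once for exactly this).
type wrappedConn struct {
    net.Conn
    semaphore chan struct{}
}

func (c *wrappedConn) Close() error {
    err := c.Conn.Close()
    <-c.semaphore // free a slot for the next Accept
    return err
}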
The logic is straightforward: “Wait until we have capacity, then accept the next client.” It reads like cooperative concurrency, but it ignores the underlying TCP mechanics.
The Kernel Backlog #
The problem lies in what happens when Accept() is not called.
When your application stops calling Accept(), the operating system does not stop receiving connections. The kernel continues to complete the 3-way TCP handshake with clients. Once established, these connections are placed into the accept queue (backlog). This queue grows until it hits the limit defined by the listen(2) backlog argument.
In the blocking model, when the semaphore fills up, the application loop pauses. Meanwhile, new clients are still connecting. The kernel ACKs their SYN packets and queues them, expecting the application to pick them up shortly.
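You can see this without any proxy code at all. A small sketch (assuming port 9000 is free): the server never calls Accept(), yet every Dial succeeds because the kernel completes the handshake on its own.

package main

import (
    "fmt"
    "net"
)

func main() {
    // Listen, but never call Accept: the kernel still completes
    // handshakes and queues connections in the backlog.
    ln, err := net.Listen("tcp", "127.0.0.1:9000")
    if err != nil {
        panic(err)
    }
    defer ln.Close()

    for i := 0; i < 5; i++ {
        conn, err := net.Dial("tcp", "127.0.0.1:9000")
        if err != nil {
            fmt.Println("dial failed:", err)
            continue
        }
        // The handshake succeeded even though the application
        // never accepted anything.
        fmt.Println("connected:", conn.LocalAddr())
        defer conn.Close()
    }
}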
The Client Experience #
From the client’s perspective, the connection appears established.
- The TCP handshake is complete, so the client starts sending data.
- Since the application hasn’t called Accept(), nothing is reading from the socket. The client’s writes fill the TCP receive window and then block.
- Eventually, the client times out. This often takes significantly longer than a connection refusal would.
This behavior masks the overload. Upstream load balancers may even consider the node healthy because the TCP port is open and accepting connections (into the kernel backlog), effectively black-holing traffic.
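A rough way to observe the stall from the client side (a sketch; the exact byte count at which the write blocks depends on kernel buffer sizes):

package main

import (
    "fmt"
    "net"
    "time"
)

func main() {
    // Assumes a listener on :9000 that never calls Accept,
    // such as the snippet above.
    conn, err := net.Dial("tcp", "127.0.0.1:9000")
    if err != nil {
        panic(err)
    }
    defer conn.Close()

    // Writes succeed until the peer's receive window and the local
    // send buffer fill up, then block until the deadline fires.
    conn.SetWriteDeadline(time.Now().Add(2 * time.Second))
    buf := make([]byte, 64*1024)
    var total int
    for {
        n, err := conn.Write(buf)
        total += n
        if err != nil {
            fmt.Printf("wrote %d bytes before stalling: %v\n", total, err)
            return
        }
    }
}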
Active Load Shedding #
An alternative is to move the gatekeeping after the Accept() call.
The logic changes to:
- The application loop spins on listener.Accept() as fast as possible.
- Once you hold the net.Conn, check the concurrency limit (e.g., with a non-blocking send to a buffered channel).
- If over the limit, Close() the connection immediately.
func (l *LimitedListener) Accept() (net.Conn, error) {
    for {
        // 1. Always Accept immediately.
        conn, err := l.listener.Accept()
        if err != nil {
            return nil, err
        }

        // 2. Check limits without blocking.
        select {
        case l.semaphore <- struct{}{}:
            // We have capacity, proceed.
            return &wrappedConn{conn, l.semaphore}, nil
        default:
            // 3. Overload! Shed the load.
            conn.Close()
            metrics.Inc("listener_saturated")
            continue
        }
    }
}
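Wiring it up might look like this (a sketch; NewLimitedListener and the handler body are hypothetical names for illustration, not from the original code):

package main

import (
    "log"
    "net"
)

func NewLimitedListener(inner net.Listener, limit int) *LimitedListener {
    return &LimitedListener{
        listener:  inner,
        semaphore: make(chan struct{}, limit),
    }
}

func main() {
    inner, err := net.Listen("tcp", ":8080")
    if err != nil {
        log.Fatal(err)
    }
    ln := NewLimitedListener(inner, 100) // at most 100 in-flight connections

    for {
        conn, err := ln.Accept()
        if err != nil {
            log.Fatal(err)
        }
        go func(c net.Conn) {
            defer c.Close() // wrappedConn.Close returns the permit
            // ... proxy or otherwise handle the connection ...
        }(conn)
    }
}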
Why this is better
- Immediate feedback: the client receives a FIN (or RST, depending on how you close) right away. There is no hanging.
- The kernel backlog stays empty, ensuring that valid requests are processed with minimal latency when capacity frees up.
- You can count exactly how many connections were rejected. In the blocking model, the rejected connections are invisible to the application until they show up in OS-level packet counters.
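If you want rejected clients to see a hard RST rather than a graceful FIN, Go lets you zero the linger timeout before closing (a sketch; whether an abortive close is the right signal depends on your clients):

// shed closes conn so the peer sees an immediate RST instead of a
// normal FIN: SetLinger(0) discards unsent data when Close is called.
func shed(conn net.Conn) {
    if tcp, ok := conn.(*net.TCPConn); ok {
        tcp.SetLinger(0)
    }
    conn.Close()
}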
Visualization
It’s Gemini 3 release day, so we can even visualize this. Note how the kernel backlog grows and clients eventually time out as you balance the traffic load against the server limit.
How modern proxies handle this #
It is worth noting that mature proxies like Nginx and HAProxy often use both patterns, but for different purposes.
Process protection with Blocking Accept
They use a global limit to protect the proxy process itself from running out of file descriptors or memory. When this limit is reached, they stop calling Accept(), causing the kernel backlog to fill.
- Nginx: worker_connections acts as this hard limit.
- HAProxy: The maxconn setting defines this process-wide limit.
Traffic control with Active Load Shedding
For controlling traffic to backends or limiting specific clients, they use active strategies.
- Nginx: The limit_conn module accepts the connection, checks the limit, and then actively closes it (or returns a 503 for HTTP) if the limit is exceeded.
- Traefik: The InFlightConn middleware functions as a gatekeeper after the connection is accepted, closing it immediately if the limit is reached.
The key takeaway is that “Blocking Accept” should be a last-resort safety valve for the process itself, not the primary mechanism for shaping traffic or protecting backends.
Summary #
In network programming, relying on the kernel queue for backpressure is rarely the right choice for user-facing services. Accept the work, assess capacity, and if full, explicitly reject the connection.
Resources #
- SYN packet handling in the wild (Cloudflare) - A deep dive into the mechanics of Linux TCP queues.
- Using load shedding to avoid overload (AWS Builders Library) - Broader patterns for protecting services from overload.
- Building Blocks of TCP (High Performance Browser Networking) - Essential reading for understanding TCP handshakes and queues.