Prerequisite: Protocols

HTTP was designed for documents, not streams. The model is simple: client sends a request, server sends a response, connection closes. That works well for loading a webpage but breaks down when you need the server to push updates - a live sports score, a new chat message, a progress bar for a background job. This post covers how the web evolved to handle real-time communication.

The Problem with HTTP for Real-Time

The naive solution is polling: every N seconds, the client sends a fresh HTTP request asking “anything new?” The server responds immediately, usually with “nothing yet,” and the cycle repeats. This is easy to implement and works everywhere, but it wastes bandwidth and CPU on empty responses: polling once per second across ten thousand users means ten thousand requests per second, most of which return nothing.
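The wasteful pattern is easy to see in code. A minimal sketch of a polling loop, where fetch stands in for the HTTP round trip (the poll and fetch names are ours):

```python
import time

def poll(fetch, interval_s, max_polls):
    """Call fetch() every interval_s seconds; keep only non-empty results."""
    updates = []
    for _ in range(max_polls):
        data = fetch()          # one full HTTP round trip, usually empty-handed
        if data is not None:
            updates.append(data)
        time.sleep(interval_s)
    return updates
```

If new data arrives once during five polls, four of the five requests were pure overhead.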

Long polling is the first real improvement. The client sends a request, but instead of responding immediately, the server holds the connection open until it has data to send. Once it responds, the client immediately opens a new long-poll request. This cuts down on empty responses but keeps server threads or file descriptors occupied for the duration of each hold. It also introduces complexity around timeouts and reconnection logic.
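The server side of a long poll is essentially "wait for data or time out." A minimal sketch using an asyncio.Event as the data signal (the long_poll name and the placeholder payload are ours):

```python
import asyncio

async def long_poll(new_data: asyncio.Event, timeout_s: float):
    """Hold the 'request' open until data is signalled or the hold times out."""
    try:
        await asyncio.wait_for(new_data.wait(), timeout_s)
        new_data.clear()
        return "payload"        # in practice: whatever was queued for this client
    except asyncio.TimeoutError:
        return None             # empty response; the client re-polls immediately
```

The timeout is the complexity the paragraph above mentions: without it, intermediaries may silently kill the held connection, so servers respond empty after a bounded hold and let the client re-poll.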

Both approaches share a structural flaw: the client is always the initiator. Every piece of data from the server must be a response to a prior request.

Server-Sent Events (SSE)

SSE breaks the request-response symmetry in the simplest possible way. The client makes one HTTP request, and the server responds with Content-Type: text/event-stream. That response never ends. The server writes newline-delimited text events down the open connection whenever it has something to say.

The format is plain text:

data: {"score": 42, "team": "home"}\n\n

Fields include data, event (optional event type), id (for resumption), and retry (reconnect delay in ms). The browser’s built-in EventSource API handles all the plumbing:

const source = new EventSource('/api/scores/live');

source.addEventListener('score-update', (e) => {
  const payload = JSON.parse(e.data);
  document.getElementById('score').textContent = payload.score;
});

source.onerror = () => {
  // EventSource reconnects automatically using the last event id
};
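On the server side, producing the wire format is a few lines of string building. A minimal sketch (the format_sse helper and its parameter names are ours):

```python
def format_sse(data, event=None, event_id=None):
    """Serialize one SSE event; the trailing blank line terminates it."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")       # matches addEventListener's type
    if event_id is not None:
        lines.append(f"id: {event_id}")       # enables Last-Event-ID resumption
    lines.append(f"data: {data}")
    return "\n".join(lines) + "\n\n"
```

For example, format_sse('{"score": 42}', event="score-update", event_id="7") yields an event the browser dispatches to the 'score-update' listener above.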

If the connection drops, EventSource reconnects automatically and sends a Last-Event-ID header so the server can replay missed events. This built-in reconnection with resumption is one of SSE’s best features - you get it for free without client-side retry logic.

SSE is one-directional: server to client only. That is often exactly what you need. Notifications, live dashboards, feed updates, progress indicators - none of these require the client to send data back over the same channel.

WebSockets

WebSockets provide a full-duplex channel over a single persistent TCP connection. The setup begins with a standard HTTP request containing an upgrade header:

GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

The server responds with 101 Switching Protocols and from that point on the connection speaks the WebSocket protocol rather than HTTP. Data flows in frames - small binary units that can carry text, binary data, ping/pong heartbeats, or close signals. Both sides can send frames at any time without waiting for the other.
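The 101 response includes a Sec-WebSocket-Accept header derived from the client's key, proving the server actually speaks WebSocket. The derivation is fixed by RFC 6455: append a constant GUID, SHA-1 hash, base64-encode. It can be checked in a few lines:

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed GUID from RFC 6455

def accept_key(sec_websocket_key: str) -> str:
    """Value the server must send back in Sec-WebSocket-Accept."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode()).digest()
    return base64.b64encode(digest).decode()
```

For the sample key in the handshake above, this yields s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, the worked example in RFC 6455.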

In the browser:

const ws = new WebSocket('wss://example.com/chat');

ws.onopen = () => ws.send(JSON.stringify({ type: 'join', room: 'general' }));
ws.onmessage = (e) => renderMessage(JSON.parse(e.data));
ws.onclose = (e) => scheduleReconnect(e.code);

ws:// is unencrypted; wss:// is TLS-wrapped and required in production. Unlike SSE, WebSockets have no automatic reconnection - your application code must detect closure and re-establish.
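Hand-rolled reconnection usually means exponential backoff with a cap and jitter, so a fleet of disconnected clients doesn't stampede the server the instant it recovers. A minimal sketch of the delay schedule (the function name and constants are ours):

```python
import random

def backoff_delay(attempt: int, base_s: float = 0.5, cap_s: float = 30.0) -> float:
    """Delay before reconnect attempt N: doubled each time, capped, jittered."""
    delay = min(cap_s, base_s * (2 ** attempt))
    return delay * random.uniform(0.5, 1.0)   # jitter spreads out the herd
```

scheduleReconnect in the snippet above would sleep for backoff_delay(attempt) before constructing a new WebSocket, resetting attempt to zero once a connection succeeds.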

When to Use Which

Situation                                                    Recommendation
Server pushes updates, client only reads                     SSE
Bidirectional messages (chat, collaborative editing, games)  WebSocket
Already using HTTP/2 with many streams                       SSE (multiplexes for free)
Need binary frames or sub-protocols                          WebSocket
Simple deployment, works through standard proxies            SSE

SSE is underused. Its HTTP-native design means it works through standard reverse proxies without special configuration. WebSockets, by contrast, require proxy support for the upgrade handshake - older load balancers sometimes strip the headers.

Scaling Persistent Connections

Both SSE and WebSockets maintain long-lived connections, which creates a scaling challenge. Each server process can hold thousands of open connections, but a user's connection lives on exactly one server: if the user is connected to server A while server B receives the event meant for them, server B has no socket to deliver it on.

Two common approaches:

Sticky sessions: The load balancer routes a given user to the same server on every request. Simple, but it produces uneven load distribution and offers no failover if a server dies.

Shared pub/sub (Redis): Each server subscribes to a Redis channel. When any server receives an event, it publishes to the channel. All servers receive it and deliver it to their locally connected clients. This is the standard approach for production fan-out.

Producer → Redis Pub/Sub → Server A → Client 1
                         → Server B → Client 2
                         → Server C → Client 3
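The fan-out pattern can be mimicked in-process to see the mechanics; a minimal sketch where the Hub class is ours, standing in for Redis, and each inbox list stands in for one server's locally connected clients:

```python
from collections import defaultdict

class Hub:
    """Toy stand-in for Redis pub/sub: every subscriber gets every message."""
    def __init__(self):
        self.subscribers = defaultdict(list)   # channel -> per-server inboxes

    def subscribe(self, channel: str) -> list:
        inbox = []                             # one server's delivery queue
        self.subscribers[channel].append(inbox)
        return inbox

    def publish(self, channel: str, message: str) -> None:
        for inbox in self.subscribers[channel]:
            inbox.append(message)
```

Publishing once delivers to every subscribed "server," each of which then writes the message down its own open client connections.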

Examples

Live leaderboard with SSE (FastAPI):

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
import redis.asyncio as aioredis

app = FastAPI()

async def leaderboard_stream():
    r = aioredis.from_url("redis://localhost")
    pubsub = r.pubsub()
    await pubsub.subscribe("leaderboard")
    async for message in pubsub.listen():
        if message["type"] == "message":
            yield f"data: {message['data'].decode()}\n\n"

@app.get("/leaderboard/stream")
async def stream():
    return StreamingResponse(leaderboard_stream(),
                             media_type="text/event-stream")

Any part of the system that writes score updates calls PUBLISH leaderboard <json>. Every connected browser receives the update within milliseconds.

Chat room with WebSockets:

A chat room requires each message from one client to be broadcast to all other clients in the room. The typical architecture:

  1. Each WebSocket server process maintains an in-memory set of active connections per room.
  2. On message receipt, the server publishes the message to a Redis channel named after the room.
  3. Every server subscribed to that channel forwards the message to its locally connected clients.

This means a message sent from a client connected to server A reaches a client connected to server C, with Redis handling the fan-out. The pub/sub round trip adds roughly 1–2ms, imperceptible for chat.
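Steps 1 and 3 amount to a per-room registry plus a broadcast loop. A minimal sketch of the local half, assuming each client object exposes a send method (all names here are ours):

```python
from collections import defaultdict

rooms = defaultdict(set)            # room name -> locally connected clients

def join(room: str, client) -> None:
    rooms[room].add(client)

def broadcast(room: str, message: str, sender=None) -> None:
    """Deliver a message to every local client in the room except the sender."""
    for client in rooms[room]:
        if client is not sender:
            client.send(message)
```

In the full architecture, broadcast is what each server runs when a message arrives on the room's Redis channel; the sender exclusion prevents echoing a message back to its author.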

Connection management matters: track connection state per client, handle ping/pong heartbeats to detect dead connections, and implement backpressure if a slow client’s send buffer fills up.
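Dead-connection detection usually boils down to tracking the last pong per client and sweeping periodically. A minimal sketch (the names and the 30-second threshold are ours):

```python
import time

last_pong = {}                        # client id -> time of last pong frame

def record_pong(client_id: str) -> None:
    last_pong[client_id] = time.monotonic()

def stale_clients(timeout_s: float = 30.0) -> list:
    """Clients that haven't answered a ping within timeout_s; close these."""
    now = time.monotonic()
    return [cid for cid, t in last_pong.items() if now - t > timeout_s]
```

A background task sends pings, calls record_pong on each pong frame, and closes whatever stale_clients returns, freeing the file descriptors that silently dropped peers would otherwise hold forever.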


Read Next: Parallelism