What Is WebSocket? Protocol Explained

WebSocket is a communication protocol that provides full-duplex (two-way) communication between a client and a server over a single, long-lived TCP connection. The name is sometimes written as “web socket” (two words). Unlike HTTP, where the client must send a request to get a response, WebSocket lets both sides send messages independently at any time.

The WebSocket protocol was standardized as RFC 6455 in 2011 and is supported by every modern browser.

Why WebSocket Exists

Before WebSocket, web developers had limited options for real-time communication. All of them were workarounds built on top of HTTP, a protocol designed for request-response interactions.

Polling is the simplest approach. The client sends an HTTP request every few seconds asking “is there new data?” The server responds with either new data or an empty response. This works, but it wastes bandwidth and server resources. If you poll every 2 seconds and data changes once a minute, 29 out of 30 requests return nothing useful.
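
The arithmetic behind that figure can be checked with a small sketch. The helper name `wastedPollFraction` is hypothetical, and it assumes changes arrive at a steady rate with at most one change picked up per poll:

```javascript
// Hypothetical helper: fraction of polls that return no new data,
// assuming evenly spaced changes and at most one change per poll.
function wastedPollFraction(pollIntervalSec, changeIntervalSec) {
  const pollsPerChange = changeIntervalSec / pollIntervalSec;
  if (pollsPerChange <= 1) return 0; // every poll sees new data
  return (pollsPerChange - 1) / pollsPerChange;
}

// Poll every 2 seconds, data changes once every 60 seconds.
const wasted = wastedPollFraction(2, 60);
console.log(wasted.toFixed(3)); // 29 of 30 polls come back empty
```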

Long polling improves on this. The client sends a request and the server holds the connection open until there is new data. Once the server responds, the client immediately sends another request. This reduces wasted requests but still creates overhead from repeated connection setup and HTTP headers.
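
The request-wait-repeat cycle described above might look roughly like this on the client. This is a sketch only: `requestUpdate` is a hypothetical async function (for example, a `fetch()` wrapper against an endpoint that holds the request open), and the one-second pause after an error is an arbitrary choice.

```javascript
// Long-polling loop sketch. `requestUpdate` is a hypothetical async function
// that resolves only once the server has new data to return.
async function longPoll(requestUpdate, onData, shouldContinue) {
  while (shouldContinue()) {
    try {
      const data = await requestUpdate(); // server holds this until data exists
      onData(data);                       // then the loop immediately re-requests
    } catch (err) {
      // On a network error or timeout, pause briefly before trying again.
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}
```

Each pass through the loop is a full HTTP request with its own headers, which is the overhead the paragraph above refers to.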

Server-Sent Events (SSE) let the server push data to the client over a single HTTP connection. This is efficient for server-to-client streaming, but the client cannot send data back over the same connection. SSE also only supports text data, not binary.

WebSocket solved these problems by providing a persistent, bidirectional channel that either side can use at any time.

How the Protocol Works

A WebSocket connection goes through three phases: handshake, data transfer, and closing.

The Opening Handshake

Every WebSocket connection starts as a regular HTTP request. The client sends a GET request with special headers asking the server to upgrade the connection:

GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Origin: https://example.com

The Sec-WebSocket-Key is a randomly generated 16-byte value, Base64-encoded. The server uses it to prove that it understood the upgrade request.
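
Browsers generate this key automatically. For a non-browser client, one way to produce such a value in Node.js is sketched below; the function name `generateWebSocketKey` is made up for illustration.

```javascript
const { randomBytes } = require('node:crypto');

// Generate a Sec-WebSocket-Key: 16 random bytes, Base64-encoded.
function generateWebSocketKey() {
  return randomBytes(16).toString('base64');
}

const key = generateWebSocketKey();
console.log(key, key.length); // 16 bytes always encode to 24 Base64 characters
```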

If the server supports WebSocket, it responds with HTTP 101:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

The Sec-WebSocket-Accept value is computed by concatenating the client’s key with a fixed GUID (258EAFA5-E914-47DA-95CA-C5AB0DC85B11), hashing the result with SHA-1, and Base64-encoding the digest. This prevents misconfigured HTTP servers from accidentally accepting WebSocket connections.
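
The computation is short enough to sketch in Node.js. The function name `computeAcceptValue` is hypothetical; the GUID is the fixed value from RFC 6455, and the input is the key from the request shown earlier.

```javascript
const { createHash } = require('node:crypto');

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Concatenate key + GUID, SHA-1 hash the result, Base64-encode the digest.
function computeAcceptValue(secWebSocketKey) {
  return createHash('sha1')
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest('base64');
}

console.log(computeAcceptValue('dGhlIHNhbXBsZSBub25jZQ=='));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```

The output matches the Sec-WebSocket-Accept header in the 101 response above.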

After this handshake, the TCP connection stays open and both sides switch to the WebSocket frame format.
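
Before replying with 101, a server has to confirm that the request really is a WebSocket upgrade. A minimal sketch of that check follows; `isWebSocketUpgrade` is a hypothetical helper, and it assumes header names have already been lower-cased (as Node's http module does). A production server would check more, such as the Origin header.

```javascript
// Check an incoming request for the headers a WebSocket upgrade requires.
// Assumes `headers` uses lower-cased names.
function isWebSocketUpgrade(method, headers) {
  const upgrade = (headers['upgrade'] || '').toLowerCase();
  // Connection may list several tokens, e.g. "keep-alive, Upgrade".
  const connection = (headers['connection'] || '')
    .toLowerCase()
    .split(',')
    .map((token) => token.trim());
  return (
    method === 'GET' &&
    upgrade === 'websocket' &&
    connection.includes('upgrade') &&
    headers['sec-websocket-version'] === '13' &&
    typeof headers['sec-websocket-key'] === 'string'
  );
}

// Headers from the example request above.
const exampleHeaders = {
  host: 'example.com',
  upgrade: 'websocket',
  connection: 'Upgrade',
  'sec-websocket-key': 'dGhlIHNhbXBsZSBub25jZQ==',
  'sec-websocket-version': '13',
  origin: 'https://example.com',
};
console.log(isWebSocketUpgrade('GET', exampleHeaders)); // true
```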

Data Frames

WebSocket sends data in frames. Each frame contains:

  • FIN bit indicates whether this is the final fragment of a message. Large messages can be split across multiple frames.
  • Opcode identifies the frame type: continuation (0x0), text (0x1), binary (0x2), close (0x8), ping (0x9), or pong (0xA).
  • Mask bit and masking key are required for all client-to-server frames. This prevents cache poisoning attacks on intermediate proxies.
  • Payload length is encoded in 7 bits (for payloads up to 125 bytes), 7+16 bits (up to 65,535 bytes), or 7+64 bits (up to 2^63 - 1 bytes, since the most significant bit must be 0).
  • Payload data is the actual content.
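
The fields listed above can be read out of a raw frame with a few bit operations. The following Node.js sketch uses a hypothetical `parseFrameHeader` function; it assumes the whole header is already in the buffer and that the payload length fits in a JavaScript number.

```javascript
// Decode the header fields of a WebSocket frame held in a Buffer.
function parseFrameHeader(buf) {
  const fin = (buf[0] & 0x80) !== 0;     // top bit of byte 0
  const opcode = buf[0] & 0x0f;          // low 4 bits of byte 0
  const masked = (buf[1] & 0x80) !== 0;  // top bit of byte 1
  let payloadLength = buf[1] & 0x7f;     // low 7 bits of byte 1
  let offset = 2;

  if (payloadLength === 126) {
    payloadLength = buf.readUInt16BE(offset);            // 16-bit extended length
    offset += 2;
  } else if (payloadLength === 127) {
    payloadLength = Number(buf.readBigUInt64BE(offset)); // assumes length < 2^53
    offset += 8;
  }

  let maskingKey = null;
  if (masked) {
    maskingKey = buf.subarray(offset, offset + 4);
    offset += 4;
  }
  return { fin, opcode, masked, payloadLength, maskingKey, headerLength: offset };
}

// Unmasked (server-to-client) text frame carrying "Hello".
const header = parseFrameHeader(Buffer.from([0x81, 0x05, 0x48, 0x65, 0x6c, 0x6c, 0x6f]));
console.log(header.fin, header.opcode, header.payloadLength); // true 1 5
```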

A text frame carrying “Hello” from client to server is only 11 bytes total: 2 bytes header + 4 bytes mask + 5 bytes payload. Compare that to an HTTP request, which typically carries 200-800 bytes of headers.
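
To make the byte count concrete, here is a sketch that builds such a frame in Node.js. `buildMaskedTextFrame` is a hypothetical helper that handles only single-frame payloads of up to 125 bytes; the fixed masking key in the demo is there to make the output reproducible, whereas a real client picks a fresh random key for every frame.

```javascript
const { randomBytes } = require('node:crypto');

// Build a single-frame, masked text message (payloads up to 125 bytes only).
function buildMaskedTextFrame(text, maskingKey = randomBytes(4)) {
  const payload = Buffer.from(text, 'utf8');
  if (payload.length > 125) throw new RangeError('sketch handles short payloads only');

  const frame = Buffer.alloc(2 + 4 + payload.length);
  frame[0] = 0x80 | 0x1;            // FIN set, opcode 0x1 (text)
  frame[1] = 0x80 | payload.length; // mask bit set, 7-bit length
  maskingKey.copy(frame, 2);        // 4-byte masking key
  for (let i = 0; i < payload.length; i++) {
    frame[6 + i] = payload[i] ^ maskingKey[i % 4]; // XOR each byte with the key
  }
  return frame;
}

const frame = buildMaskedTextFrame('Hello', Buffer.from([0x37, 0xfa, 0x21, 0x3d]));
console.log(frame.length, frame.toString('hex'));
// prints: 11 818537fa213d7f9f4d5158
```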

Ping and Pong

Either side can send a ping frame to check if the connection is still alive. The other side must respond with a pong frame containing the same payload. This is useful for detecting dead connections, especially through proxies and load balancers that may silently drop idle connections.
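
The browser WebSocket API does not expose ping or pong frames, so this exchange is normally handled by the browser and by server libraries. At the frame level, the echo rule can be sketched as follows; `buildControlFrame` and `pongFor` are hypothetical helpers that produce unmasked (server-to-client) frames.

```javascript
const OPCODE_PING = 0x9;
const OPCODE_PONG = 0xa;

// Build an unmasked control frame. Control frames carry at most 125 bytes
// of payload and are never fragmented, so FIN is always set.
function buildControlFrame(opcode, payload = Buffer.alloc(0)) {
  if (payload.length > 125) throw new RangeError('control payload too large');
  return Buffer.concat([Buffer.from([0x80 | opcode, payload.length]), payload]);
}

// Answer a ping by echoing its payload back in a pong.
function pongFor(pingPayload) {
  return buildControlFrame(OPCODE_PONG, pingPayload);
}

const ping = buildControlFrame(OPCODE_PING, Buffer.from('Hello'));
const pong = pongFor(ping.subarray(2)); // payload starts after the 2-byte header
console.log(ping.toString('hex'), pong.toString('hex'));
// prints: 890548656c6c6f 8a0548656c6c6f
```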

Closing the Connection

Either side can initiate a close by sending a close frame. The close frame can include a status code and a reason string. The other side responds with its own close frame, and then both sides close the TCP connection.

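A normal closure uses status code 1000. Code 1001 means the endpoint is going away (server shutting down, browser tab closed). Codes 1002-1015 cover error conditions and reserved values; 1005, 1006, and 1015 are reserved for local reporting and are never sent in a close frame.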
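
Inside a close frame, the status code is a 2-byte big-endian integer followed by an optional UTF-8 reason. A sketch of encoding and decoding that payload in Node.js, with hypothetical helper names:

```javascript
// Encode a close frame payload: 2-byte big-endian status code, then UTF-8 reason.
function encodeClosePayload(code, reason = '') {
  const reasonBytes = Buffer.from(reason, 'utf8');
  // Control frames are limited to 125 bytes, and 2 are taken by the code.
  if (reasonBytes.length > 123) throw new RangeError('reason must fit in 123 bytes');
  const payload = Buffer.alloc(2 + reasonBytes.length);
  payload.writeUInt16BE(code, 0);
  reasonBytes.copy(payload, 2);
  return payload;
}

function decodeClosePayload(payload) {
  // An empty payload means no status was sent; APIs report that as 1005.
  if (payload.length < 2) return { code: 1005, reason: '' };
  return { code: payload.readUInt16BE(0), reason: payload.subarray(2).toString('utf8') };
}

const closePayload = encodeClosePayload(1000, 'done');
console.log(closePayload.toString('hex')); // prints: 03e8646f6e65
console.log(decodeClosePayload(closePayload).code); // prints: 1000
```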

When to Use WebSocket

WebSocket is the right choice when your application needs low-latency, bidirectional, real-time communication.

Chat applications need to deliver messages to all participants instantly. With HTTP, each client would need to poll for new messages. With WebSocket, the server pushes new messages to connected clients as soon as they arrive.

Live dashboards and monitoring benefit from WebSocket when data changes frequently. Stock tickers, server metrics, analytics dashboards, and live sports scores all fit this pattern.

Multiplayer games require tight synchronization between players. Game state updates, player movements, and actions need to be exchanged with minimal delay. WebSocket’s low overhead per message makes it practical to send dozens of updates per second.

Collaborative editing tools like Google Docs use real-time protocols to synchronize changes across multiple users. Each keystroke or cursor movement is broadcast to all participants.

IoT and sensor data often involves a continuous stream of readings from devices. WebSocket provides an efficient channel for this kind of high-frequency, low-payload communication.

Trading platforms use WebSocket for real-time price feeds. Delays of even a few hundred milliseconds can matter when prices are changing rapidly.

When Not to Use WebSocket

WebSocket is not a replacement for HTTP. Many use cases are better served by simpler approaches.

CRUD APIs that fetch, create, update, or delete resources work well with REST over HTTP. A user profile page does not need a persistent connection.

Infrequent updates that happen every few minutes or less often do not justify the complexity of WebSocket. Simple polling or SSE handles this with less code and infrastructure overhead.

Server-to-client only streaming is better handled by Server-Sent Events. SSE is simpler, works over standard HTTP, reconnects automatically, and is supported by CDNs and proxies without special configuration.

File transfers work better over HTTP, which has built-in support for range requests, caching, compression, and progress tracking.

Cacheable content is a poor fit for WebSocket. HTTP responses can be cached by browsers, CDNs, and proxies. WebSocket messages cannot.

Browser API Example

Here is a complete example of connecting to a WebSocket server in JavaScript:

const ws = new WebSocket('wss://echo.websocket.org');

ws.addEventListener('open', () => {
  console.log('Connected');
  ws.send('Hello, server!');
});

ws.addEventListener('message', (event) => {
  console.log('Received:', event.data);
});

ws.addEventListener('close', (event) => {
  console.log('Disconnected:', event.code, event.reason);
});

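ws.addEventListener('error', (event) => {
  // The error event is a plain Event with no failure details;
  // the close event that follows carries the code and reason.
  console.error('WebSocket error:', event);
});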

The WebSocket constructor takes a URL starting with ws:// (unencrypted) or wss:// (encrypted over TLS). You should always use wss:// in production. Unencrypted WebSocket connections can be intercepted and are blocked by many proxies and firewalls.
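
One common way to stay on the right scheme is to derive the WebSocket URL from the page's own URL, so that an https:// page always connects over wss://. A minimal sketch, with a hypothetical helper name:

```javascript
// Map an http(s) URL to the matching ws(s) URL, keeping host, port, and path.
function toWebSocketUrl(httpUrl) {
  const url = new URL(httpUrl); // throws on malformed input
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`unsupported scheme: ${url.protocol}`);
  }
  // "http" -> "ws" and "https" -> "wss" differ only in the leading four letters.
  return url.href.replace(/^http/, 'ws');
}

console.log(toWebSocketUrl('https://example.com/chat')); // wss://example.com/chat
console.log(toWebSocketUrl('http://localhost:8080/feed')); // ws://localhost:8080/feed
```

In a browser, passing `window.location.href` (or a URL built from `window.location.origin`) would be one way to use it.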

The send() method accepts strings, ArrayBuffers, Blobs, and ArrayBuffer views. The message event’s data property contains the received data as a string (for text frames) or, for binary frames, as a Blob by default or an ArrayBuffer when ws.binaryType is set to 'arraybuffer'.
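
A message handler usually has to branch on which of these types it received. The sketch below shows one way to do that; `describePayload` is a hypothetical helper, and the Blob check is guarded so the function also runs in environments without a global Blob.

```javascript
// Return a short description of a received payload based on its type.
function describePayload(data) {
  if (typeof data === 'string') return `text (${data.length} chars)`;
  if (data instanceof ArrayBuffer) return `binary (${data.byteLength} bytes)`;
  if (typeof Blob !== 'undefined' && data instanceof Blob) return `blob (${data.size} bytes)`;
  return 'unknown';
}

console.log(describePayload('hi'));               // text (2 chars)
console.log(describePayload(new ArrayBuffer(4))); // binary (4 bytes)
```

In a browser it would be called from the message listener as `describePayload(event.data)`.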

You can check the connection state using ws.readyState:

  • 0 (CONNECTING) means the handshake is in progress.
  • 1 (OPEN) means the connection is ready to send and receive.
  • 2 (CLOSING) means a close frame has been sent or received.
  • 3 (CLOSED) means the connection is closed.
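
Calling send() while the socket is still CONNECTING throws an error, so code that may run before the open event often checks the state first. A small sketch, using a stand-in object in place of a real socket so it can run anywhere; `sendIfOpen` is a hypothetical helper.

```javascript
const OPEN = 1; // same value as WebSocket.OPEN

// Send only when the socket is open; report whether the message went out.
function sendIfOpen(socket, message) {
  if (socket.readyState !== OPEN) return false;
  socket.send(message);
  return true;
}

// Stand-in for demonstration; a real WebSocket exposes the same two members.
const sent = [];
const fakeSocket = { readyState: 0, send: (m) => sent.push(m) };
console.log(sendIfOpen(fakeSocket, 'early')); // false, still CONNECTING
fakeSocket.readyState = 1;
console.log(sendIfOpen(fakeSocket, 'hello')); // true
```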

For a detailed walkthrough, see the JavaScript WebSocket guide.

Security Considerations

WebSocket connections are not restricted by the browser’s same-origin policy. The browser sends the Origin header during the handshake, but unlike HTTP CORS, it does not enforce any cross-origin restriction, so a page on any origin can open a connection to your server. The server is responsible for checking the Origin header and rejecting connections from untrusted domains.
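
A server-side Origin check can be as simple as an exact match against an allow-list. The sketch below uses made-up origins, and its choice to reject requests with no Origin header at all (as non-browser clients typically send) is an assumption that may not suit every deployment.

```javascript
// Hypothetical allow-list; replace with the origins your application serves.
const ALLOWED_ORIGINS = new Set(['https://example.com', 'https://app.example.com']);

// Exact string match on the Origin header value. Substring or prefix
// matching would let "https://example.com.evil.net" through.
function isAllowedOrigin(originHeader) {
  return typeof originHeader === 'string' && ALLOWED_ORIGINS.has(originHeader);
}

console.log(isAllowedOrigin('https://example.com'));          // true
console.log(isAllowedOrigin('https://example.com.evil.net')); // false
```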

Other security concerns include:

  • Always use WSS in production. TLS encrypts the connection and prevents man-in-the-middle attacks.
  • Authenticate the connection during or immediately after the handshake. Common approaches include sending a token in the URL query string, in a cookie, or as the first message after connecting. Tokens in query strings can end up in server and proxy logs, so keep them short-lived.
  • Validate all incoming messages on the server. Treat WebSocket input with the same caution as HTTP request bodies.
  • Implement rate limiting to prevent abuse. A single client can send thousands of messages per second over a WebSocket connection.
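
For the rate-limiting point, one common technique is a per-connection token bucket. The following is a sketch under assumed limits, not a recommendation for specific numbers; the `TokenBucket` class and its injectable clock are illustrative.

```javascript
// Token bucket: up to `capacity` messages may burst, refilled at `refillPerSec`.
class TokenBucket {
  constructor(capacity, refillPerSec, now = Date.now) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now; // injectable clock, in milliseconds
    this.tokens = capacity;
    this.lastRefill = now();
  }

  // Returns true if the message is allowed, false if it should be dropped.
  tryConsume() {
    const t = this.now();
    const elapsedSec = (t - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = t;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Demonstration with a fake clock so the result is deterministic.
let clock = 0;
const bucket = new TokenBucket(2, 1, () => clock);
const results = [bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()];
clock += 1000; // one second later, one token has been refilled
results.push(bucket.tryConsume());
console.log(results); // [ true, true, false, true ]
```

A server would keep one bucket per connection and call tryConsume() for each incoming message, dropping the message or closing the connection when it returns false.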

For a deeper look, see WebSocket Security Best Practices.

WebSocket Compared to Alternatives

| Feature | HTTP Polling | Long Polling | SSE | WebSocket |
| --- | --- | --- | --- | --- |
| Direction | Client to server | Client to server | Server to client | Both |
| Connection | New each time | Held open, reconnects | Persistent | Persistent |
| Overhead | High (headers every request) | Medium | Low | Very low |
| Binary data | Yes | Yes | No | Yes |
| Browser support | Universal | Universal | Modern browsers | Modern browsers |
| Works through proxies | Always | Usually | Usually | Sometimes needs config |

For a full comparison with HTTP, see WebSocket vs HTTP.