$ cat ./blog/traintracker.md

Watching Britain Move: Building a Real-Time Train Tracker

--data-visualization --real-time --typescript --maplibre --transit

What happens when you pipe Network Rail's firehose of train data into a browser? A meditation on real-time systems and the hypnotic beauty of transit visualization.

Stand on a bridge over a busy rail junction and watch for long enough, and patterns begin to emerge. The morning rush flows in one direction; the evening rush reverses it. Express services streak through while local stoppers dawdle. Delays cascade through the network like ripples in a pond.

TrainTracker captures this same sensation, but at scale—the entire UK rail network rendered as a constellation of colored dots on a dark map, each one a train carrying passengers somewhere important.

The Data Firehose

Network Rail’s TRUST system (Train Running System on TOPS) is the central nervous system of British railways. Every movement generates a message: every arrival, every departure, every signal stop. Subscribe to the public data feed and you’ll receive several thousand events per minute during peak hours.

The challenge isn’t accessing this data. Network Rail generously provides it via a STOMP message queue. The challenge is making sense of it.

A typical TRUST message arrives as terse JSON:

{
  "header": { "msg_type": "0003" },
  "body": {
    "train_id": "515G531I24",
    "event_type": "DEPARTURE",
    "loc_stanox": "87701",
    "actual_timestamp": "1704290400000",
    "timetable_variation": "2"
  }
}

That loc_stanox value—87701—is a STANOX code. To place this train on a map, you need to know that 87701 corresponds to London Euston, latitude 51.528, longitude -0.134. Multiply this lookup by thousands of stations and you have a coordinate system for the entire network.
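To make the lookup concrete, here is a minimal sketch of turning a movement body into a point on the map. The field names come from the message above; `STANOX_LOOKUP`, `locateMovement`, and the `PositionedEvent` shape are hypothetical names for illustration, not the project's actual code, and the table holds only the one entry quoted in the text.

```typescript
// Fields used from a TRUST movement body. Everything arrives as a string.
interface TrustMovementBody {
  train_id: string;
  event_type: string;
  loc_stanox: string;
  actual_timestamp: string; // milliseconds since the epoch, as a string
  timetable_variation: string; // minutes, as a string
}

interface StationCoordinates {
  name: string;
  lat: number;
  lng: number;
}

// Hypothetical lookup table; the single entry is the example from the text.
const STANOX_LOOKUP: ReadonlyMap<string, StationCoordinates> = new Map<string, StationCoordinates>([
  ["87701", { name: "London Euston", lat: 51.528, lng: -0.134 }],
]);

interface PositionedEvent {
  trainId: string;
  eventType: string;
  lat: number;
  lng: number;
  timestamp: number;
  variationMinutes: number;
}

// Returns null when the STANOX is not in the table, so callers can skip
// events that cannot be placed on the map.
function locateMovement(
  body: TrustMovementBody,
  lookup: ReadonlyMap<string, StationCoordinates> = STANOX_LOOKUP,
): PositionedEvent | null {
  const place = lookup.get(body.loc_stanox);
  if (place === undefined) return null;
  return {
    trainId: body.train_id,
    eventType: body.event_type,
    lat: place.lat,
    lng: place.lng,
    timestamp: Number(body.actual_timestamp),
    variationMinutes: Number(body.timetable_variation),
  };
}
```

Returning `null` rather than throwing keeps the hot path simple: an unknown location is an expected case, not an error.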

The Architecture That Emerged

After several false starts, the system settled into a pattern that feels natural for real-time data:

  1. A Node.js server maintains an in-memory map of every active train
  2. TRUST messages update this map—creating, modifying, or deleting entries
  3. Changes broadcast to connected browsers via WebSocket
  4. The React frontend renders trains as dots on a MapLibre map
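Steps 1 to 3 can be sketched as a single function that mutates the in-memory map and returns whatever should be broadcast. This is an illustration under assumptions: it takes `"0001"`, `"0002"`, and `"0003"` to be the activation, cancellation, and movement message types, and the `TrainState` shape, the `"remove"` delta, and `applyTrustMessage` are hypothetical names rather than the project's real ones.

```typescript
interface TrainState {
  trainId: string;
  stanox: string | null; // last reported location, if any
  lastSeen: number; // epoch milliseconds
}

// What the server would broadcast after a change. "remove" is an assumed
// message type; the text only shows "update" and "snapshot".
type Delta =
  | { type: "update"; train: TrainState }
  | { type: "remove"; trainId: string };

interface TrustMessage {
  header: { msg_type: string };
  body: { train_id: string; loc_stanox?: string; actual_timestamp?: string };
}

// Applies one message to the map and returns the delta to broadcast,
// or null when nothing changed.
function applyTrustMessage(
  trains: Map<string, TrainState>,
  msg: TrustMessage,
  now: number,
): Delta | null {
  const id = msg.body.train_id;
  switch (msg.header.msg_type) {
    case "0001": { // activation: create the entry
      const train: TrainState = { trainId: id, stanox: null, lastSeen: now };
      trains.set(id, train);
      return { type: "update", train };
    }
    case "0003": { // movement: modify (or create) the entry
      const previous = trains.get(id);
      const train: TrainState = {
        trainId: id,
        stanox: msg.body.loc_stanox ?? previous?.stanox ?? null,
        lastSeen: Number(msg.body.actual_timestamp ?? now),
      };
      trains.set(id, train);
      return { type: "update", train };
    }
    case "0002": // cancellation: delete the entry if we had one
      return trains.delete(id) ? { type: "remove", trainId: id } : null;
    default:
      return null; // other message types are ignored in this sketch
  }
}
```

A WebSocket layer would then serialize each non-null delta and send it to every connected browser.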

The critical insight was to send deltas rather than snapshots. When a train moves, the server broadcasts only that train’s new state. Browsers maintain their own local copy of the world, applying updates as they arrive. This keeps bandwidth manageable even during rush hour.

// Server broadcasts only what changed
{ type: "update", train: { trainId: "515G531I24", lat: 52.477, lng: -1.898, status: "on-time" }}

// Not the entire state of the world
{ type: "snapshot", trains: [/* thousands of trains */]}
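On the browser side, applying those messages to a local copy of the world might look like the sketch below. It assumes a snapshot is still sent occasionally (for example once on connect), which the text does not confirm; `applyServerMessage` and the `Train` shape are hypothetical, modeled on the example payloads above.

```typescript
interface Train {
  trainId: string;
  lat: number;
  lng: number;
  status: string;
}

type ServerMessage =
  | { type: "update"; train: Train }
  | { type: "snapshot"; trains: Train[] };

// Returns a new Map instead of mutating the old one, so a React state
// setter sees a changed reference and re-renders.
function applyServerMessage(
  world: ReadonlyMap<string, Train>,
  msg: ServerMessage,
): Map<string, Train> {
  switch (msg.type) {
    case "update": {
      // Delta: copy the current world and overwrite a single train.
      const next = new Map(world);
      next.set(msg.train.trainId, msg.train);
      return next;
    }
    case "snapshot":
      // Full state: discard the local copy and rebuild it from scratch.
      return new Map(msg.trains.map((t) => [t.trainId, t] as const));
  }
}
```

Copying the map on every update is the simplest option; at several thousand events per minute, batching updates per animation frame would be a reasonable refinement.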

The Color Language

The visualization uses three colors, chosen for instant comprehension:

  • Green: On time. All is well.
  • Amber: 1-5 minutes late. Minor delay. You might make your connection.
  • Red: More than 5 minutes late. Significant delay. Adjust your plans.

This traffic-light metaphor requires no legend (though one is provided). A glance at the map tells you whether today is a good day for British rail.
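The thresholds translate directly into a small classifier. This sketch assumes the delay is available as a signed number of minutes late, with early or on-time trains at zero or below; the real feed may report the early/late direction in a separate field. `delayColor` and `parseMinutesLate` are hypothetical names.

```typescript
type DelayColor = "green" | "amber" | "red";

// timetable_variation arrives as a string; treat anything unparseable as 0.
function parseMinutesLate(raw: string): number {
  const minutes = Number.parseInt(raw, 10);
  return Number.isNaN(minutes) ? 0 : minutes;
}

// Green: under a minute late. Amber: 1 to 5 minutes. Red: more than 5.
function delayColor(minutesLate: number): DelayColor {
  if (minutesLate < 1) return "green";
  if (minutesLate <= 5) return "amber";
  return "red";
}
```

With the example message from earlier, `delayColor(parseMinutesLate("2"))` yields `"amber"`.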

The color distribution tells its own story. A sea of green with scattered amber feels like a system under control. Spreading red, particularly clustered around major junctions, signals trouble—often a single incident cascading through interconnected services.

State Management and Resilience

Trains don’t simply appear and disappear. They follow a lifecycle:

  • Activation: The train enters the system, usually early morning
  • Movement: The train arrives at or departs from stations
  • Cancellation: The train is withdrawn from service
  • Reinstatement: A cancelled train returns (rare but it happens)
  • Identity Change: The train’s ID changes mid-journey (rarer still)

The server tracks this lifecycle, maintaining a recentStops array for each train—the last five stations visited, with timestamps and delay information. Hover over any dot and you see not just where the train is, but where it’s been.
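Keeping only the last five stops is a one-liner once the shape is settled. The `Stop` fields and the `pushStop` helper below are assumptions for illustration; only the `recentStops` name and the limit of five come from the description above.

```typescript
interface Stop {
  stanox: string;
  timestamp: number; // epoch milliseconds
  minutesLate: number;
}

const MAX_RECENT_STOPS = 5;

// Appends a stop and trims the oldest entries so at most five remain.
// Returns a new array; the input is left untouched.
function pushStop(recentStops: readonly Stop[], stop: Stop): Stop[] {
  return [...recentStops, stop].slice(-MAX_RECENT_STOPS);
}
```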

Persistence adds another layer of resilience. Every 30 seconds, the server writes its state to disk. On restart, it loads the previous state, discarding only trains that haven’t reported in over two hours. This means a deployment doesn’t wipe the map clean—trains continue their journeys uninterrupted.
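The save-and-restore cycle reduces to two pure functions plus a timer. The sketch below covers only the serialization and the two-hour pruning rule; `PersistedTrain`, `serializeState`, and `restoreState` are hypothetical names, and the on-disk format is assumed to be a plain JSON array.

```typescript
interface PersistedTrain {
  trainId: string;
  lastSeen: number; // epoch milliseconds of the last report
}

const STALE_AFTER_MS = 2 * 60 * 60 * 1000; // two hours

// Flattens the in-memory map into a JSON array suitable for writing to disk.
function serializeState(trains: ReadonlyMap<string, PersistedTrain>): string {
  return JSON.stringify(Array.from(trains.values()));
}

// Rebuilds the map from disk, dropping trains silent for over two hours.
function restoreState(json: string, now: number): Map<string, PersistedTrain> {
  const saved = JSON.parse(json) as PersistedTrain[];
  const fresh = saved.filter((t) => now - t.lastSeen <= STALE_AFTER_MS);
  return new Map(fresh.map((t) => [t.trainId, t] as const));
}
```

In a Node.js server, a `setInterval` firing every 30,000 ms could write the output of `serializeState` to a file, and startup would pass that file's contents to `restoreState` along with `Date.now()`.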

The Mock Data Fallback

Real-time systems fail. Network Rail’s feed occasionally drops. Connections time out. When the WebSocket can’t reach the server after three attempts, the frontend switches to mock data: 12 pre-positioned trains at major UK stations, with varying delay statuses.
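The fallback rule is easiest to reason about as a tiny state machine, separate from the WebSocket plumbing. The mode names and `nextConnectionState` below are hypothetical; only the three-attempt threshold comes from the description above.

```typescript
type FeedMode = "live" | "retrying" | "mock";

interface ConnectionState {
  failedAttempts: number;
  mode: FeedMode;
}

type ConnectionEvent = "open" | "failure";

const MAX_ATTEMPTS = 3;

const INITIAL_CONNECTION: ConnectionState = { failedAttempts: 0, mode: "retrying" };

// A successful open resets the counter; the third consecutive failure
// switches the UI to mock data.
function nextConnectionState(
  state: ConnectionState,
  event: ConnectionEvent,
): ConnectionState {
  if (event === "open") return { failedAttempts: 0, mode: "live" };
  const failedAttempts = state.failedAttempts + 1;
  return {
    failedAttempts,
    mode: failedAttempts >= MAX_ATTEMPTS ? "mock" : "retrying",
  };
}
```

The WebSocket's `onopen` and `onclose`/`onerror` handlers would feed `"open"` and `"failure"` into this function, and the renderer would pick its data source from `mode`.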

This isn’t just for development convenience. It ensures the application always renders something meaningful. A visitor encountering an empty map might assume the product is broken. A visitor seeing mock data with a “reconnecting” indicator understands the situation.

What the Patterns Reveal

The first time you watch TrainTracker during a winter storm, you understand something about infrastructure you couldn’t learn from news reports. The delays don’t happen uniformly—they cluster, they propagate, they cascade through junctions like falling dominoes.

Morning rush hour on a clear day is different. Trains stream into London from all directions, a slow-motion implosion of green dots converging on a handful of termini. The evening rush reverses the flow, an explosion outward to suburbs and satellite cities.

Late at night, activity drops to a trickle. A handful of dots creep across the map—freight services, overnight maintenance trains, the last passenger services making their way to depots.

It’s mesmerizing in the way that all complex systems become mesmerizing when you can finally see them.

The Technical Bits

For those who care about such things:

| Component | Technology | Why |
| --- | --- | --- |
| Frontend | React 19 + MapLibre GL | Modern rendering, excellent map performance |
| Backend | Node.js + WebSocket | Native async, efficient streaming |
| Protocol | STOMP | Network Rail’s choice, not ours |
| Hosting | Cloudflare + Docker | Edge distribution, containerized backend |

The STANOX lookup table—mapping station codes to coordinates—contains over 2,000 entries. Building it required parsing Network Rail’s reference data (a 7MB JSON corpus) and cross-referencing with TIPLOC coordinate spreadsheets. The hit rate hovers around 95%; the remaining 5% are obscure sidings, depots, and freight-only locations that don’t appear on passenger maps.
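The cross-referencing step is essentially a join on TIPLOC, and the hit rate is a ratio over the STANOX codes actually seen in the feed. In this sketch the `STANOX` and `TIPLOC` field names are an assumption about the reference data's layout, and `buildStanoxLookup` and `hitRate` are hypothetical helpers, not the project's build script.

```typescript
// One record from the reference data. Field names are assumed.
interface CorpusEntry {
  STANOX: string;
  TIPLOC: string;
}

interface LatLng {
  lat: number;
  lng: number;
}

// Joins reference records to coordinates via TIPLOC, keyed by STANOX.
// Records with a blank STANOX or no known coordinates are skipped, and
// the first match wins when several records share a STANOX.
function buildStanoxLookup(
  corpus: readonly CorpusEntry[],
  tiplocCoords: ReadonlyMap<string, LatLng>,
): Map<string, LatLng> {
  const lookup = new Map<string, LatLng>();
  for (const entry of corpus) {
    const stanox = entry.STANOX.trim();
    const coords = tiplocCoords.get(entry.TIPLOC.trim());
    if (stanox !== "" && coords !== undefined && !lookup.has(stanox)) {
      lookup.set(stanox, coords);
    }
  }
  return lookup;
}

// Fraction of observed STANOX codes that the lookup can place on the map.
function hitRate(
  seenStanox: readonly string[],
  lookup: ReadonlyMap<string, LatLng>,
): number {
  if (seenStanox.length === 0) return 0;
  const hits = seenStanox.filter((s) => lookup.has(s)).length;
  return hits / seenStanox.length;
}
```

Logging the misses rather than just counting them gives a to-do list of sidings and depots worth adding by hand.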

Why Build This?

Every city has its transit nerds—the people who can recite bus routes from memory, who notice when a train’s coupling sounds different, who find peace in the rhythm of departures and arrivals. This project is for them.

But it’s also an exercise in real-time system design. The challenges are universal: maintaining state under constant mutation, gracefully handling disconnection, optimizing for bandwidth while preserving fidelity, visualizing complexity without overwhelming the user.

TrainTracker doesn’t solve a business problem. It doesn’t have a revenue model. It’s a window into a system that most people experience only as “the 8:47 is late again.”

Sometimes that’s enough reason to build something.


The source code is available on GitHub. Data provided by Network Rail.