Server Architecture Walkthrough

This is a plain-language map of the Rust server: what owns what, how data moves, which APIs connect the pieces, and where the service layer is loosely or tightly coupled.

It is a companion to architecture.md and server-sim.md. Those remain the contract docs. This file is for understanding the shape of the system before changing it.

The Short Version

The server is three nested systems:

  1. The process shell in server/src/main.rs serves HTTP routes, static client files, and WebSocket upgrades.
  2. The lobby layer in server/src/lobby/ owns rooms. Each room is one Tokio task, and that task is the only owner of its live Game.
  3. The simulation crate in server/crates/sim/ owns the authoritative RTS world. Game::tick() runs a fixed sequence of services, then Game builds fog-filtered snapshots for each player.

flowchart TD
    Browser["Browser client"] -->|"ClientMessage JSON over /ws"| Main["server/src/main.rs"]
    Main -->|"RoomEvent over mpsc"| Lobby["Lobby registry"]
    Lobby -->|"spawn or route"| Room["one RoomTask per room"]
    Room -->|"Game::enqueue"| Game["rts_sim::game::Game"]
    Room -->|"Game::tick"| Game
    Game --> Systems["systems::run_tick"]
    Systems --> Services["services/*"]
    Game -->|"snapshot_for(player)"| Room
    Room -->|"ServerMessage::Snapshot"| Browser

The most important boundary is the Game public API. The networking and lobby code do not reach into simulation internals. They enqueue commands, tick the game, and ask for snapshots.
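
A minimal sketch of that seam, under stated assumptions: the sim module below is a toy stand-in, not the real rts_sim crate, and PlayerId, Snapshot, the Move variant, and room_step are hypothetical names. The point is only that room-side code compiles against three public methods and cannot touch the private fields.

```rust
mod sim {
    // Hypothetical stand-ins for the real sim crate types.
    pub type PlayerId = u32;

    #[derive(Debug, Clone, PartialEq)]
    pub enum SimCommand {
        Move { unit: u32, x: i32, y: i32 },
    }

    #[derive(Debug, Clone, PartialEq)]
    pub struct Snapshot {
        pub tick: u64,
        pub commands_applied: usize,
    }

    pub struct Game {
        // Private fields: code outside `sim` cannot read or mutate these.
        tick: u64,
        pending: Vec<(PlayerId, SimCommand)>,
        commands_applied: usize,
    }

    impl Game {
        pub fn new() -> Self {
            Game { tick: 0, pending: Vec::new(), commands_applied: 0 }
        }

        pub fn enqueue(&mut self, player: PlayerId, command: SimCommand) {
            self.pending.push((player, command));
        }

        pub fn tick(&mut self) {
            self.tick += 1;
            self.commands_applied += self.pending.drain(..).count();
        }

        pub fn snapshot_for(&self, _player: PlayerId) -> Snapshot {
            Snapshot { tick: self.tick, commands_applied: self.commands_applied }
        }
    }
}

use sim::{Game, PlayerId, SimCommand, Snapshot};

/// Everything a room-side caller does to the sim: enqueue, tick, snapshot.
pub fn room_step(
    game: &mut Game,
    inputs: Vec<(PlayerId, SimCommand)>,
    viewer: PlayerId,
) -> Snapshot {
    for (player, command) in inputs {
        game.enqueue(player, command);
    }
    game.tick();
    game.snapshot_for(viewer)
}
```

Writing game.pending or game.tick += 1 inside room_step would fail to compile, which is the property the real boundary relies on.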

Coupling Scale

Coupling here means “how much one piece needs to know about another piece.”

| Coupling | Meaning in this codebase |
| --- | --- |
| Low | A narrow API, usually read-only inputs or plain data, with little knowledge of tick order. |
| Medium | The caller must pass several state handles or phase-specific derived indexes, but the service still has one clear job. |
| High | The service mutates several parts of the world, depends on tick order, or knows multiple gameplay rules at once. |

High coupling is not automatically wrong in a game simulation. Some systems, like command validation and combat, really do need broad context. Treat it as a risk flag: changes there should be smaller and tested close to the behavior they affect.

Phase 1: The Server Shell

The executable starts in server/src/main.rs.

It creates the Axum router, serves the JS client, exposes utility routes such as /version, /wiki, /dev/scenarios, /dev/replay-artifact, and /api/matches, and upgrades /ws to a WebSocket. It also builds one shared Lobby.

main.rs does not own a Game. It is the edge of the server, not the simulation.

flowchart LR
    HTTP["HTTP routes"] --> Router["Axum Router"]
    Static["client/ static files"] --> Router
    WS["GET /ws"] --> Conn["connection task"]
    Conn -->|"decoded ClientMessage"| Lobby["Lobby"]
    Lobby -->|"RoomEvent"| Room["RoomTask"]

The connection task reads client WebSocket frames, decodes protocol messages, and sends room events into the lobby/room layer. Outbound messages go through a connection sink. Reliable messages use a bounded queue, while snapshots use a replace-latest path so stale snapshots do not pile up behind a slow client.
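
The two delivery paths can be sketched with the standard library alone. This is an illustration, not the server's ConnectionWriter: the real code presumably uses async channels, and ConnectionSink, ConnectionWriterEnd, connection_sink, and the String payloads are all hypothetical.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};
use std::sync::{Arc, Mutex};

/// Producer side: what a room would hold for one connection (hypothetical).
pub struct ConnectionSink {
    reliable_tx: SyncSender<String>,
    latest_snapshot: Arc<Mutex<Option<String>>>,
}

/// Consumer side: what the socket writer would drain (hypothetical).
pub struct ConnectionWriterEnd {
    reliable_rx: Receiver<String>,
    latest_snapshot: Arc<Mutex<Option<String>>>,
}

pub fn connection_sink(reliable_capacity: usize) -> (ConnectionSink, ConnectionWriterEnd) {
    let (reliable_tx, reliable_rx) = sync_channel(reliable_capacity);
    let latest_snapshot = Arc::new(Mutex::new(None));
    (
        ConnectionSink { reliable_tx, latest_snapshot: Arc::clone(&latest_snapshot) },
        ConnectionWriterEnd { reliable_rx, latest_snapshot },
    )
}

impl ConnectionSink {
    /// Reliable messages are never dropped silently: a full queue is an error
    /// the caller has to handle (for example by disconnecting the client).
    pub fn send_reliable(&self, msg: String) -> Result<(), String> {
        match self.reliable_tx.try_send(msg) {
            Ok(()) => Ok(()),
            Err(TrySendError::Full(_)) => Err("reliable queue full".to_string()),
            Err(TrySendError::Disconnected(_)) => Err("connection closed".to_string()),
        }
    }

    /// Snapshots overwrite whatever has not been written yet.
    /// Returns true when an unsent, older snapshot was replaced.
    pub fn publish_snapshot(&self, snapshot: String) -> bool {
        let mut slot = self.latest_snapshot.lock().unwrap();
        slot.replace(snapshot).is_some()
    }
}

impl ConnectionWriterEnd {
    pub fn next_reliable(&self) -> Option<String> {
        self.reliable_rx.try_recv().ok()
    }

    pub fn take_snapshot(&self) -> Option<String> {
        self.latest_snapshot.lock().unwrap().take()
    }
}
```

A slow client therefore costs at most one snapshot of memory, while the bounded queue turns reliable-message backlog into an explicit error instead of unbounded growth.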

Server Shell API Map

| Owner | Main APIs | Coupling | Why |
| --- | --- | --- | --- |
| main.rs | Axum route handlers, ws_handler, handle_connection | Medium | It knows HTTP/WebSocket routing and protocol envelopes, but not sim internals. |
| Lobby | room lookup/create, room event sender, replay room creation, drain state | Medium | It owns room registry and process-wide coordination, but sends messages instead of mutating rooms directly. |
| ConnectionWriter / connection sink | send_or_log, latest snapshot slot, reliable queue | Low to medium | Narrow delivery API; coupling rises because snapshot delivery has performance/backpressure policy. |
| protocol DTOs | ClientMessage, ServerMessage, compact snapshot encoding | Medium | Wire shape is shared with client and replay paths, so changes are contract changes. |

Phase 2: Rooms, Lobby State, and Live Ticks

A room is the server’s unit of ownership. One room task owns one room’s lobby state and, during a match, one Game.

Players do not mutate the room directly. Connections send RoomEvents over an MPSC channel. The room task runs a loop that alternates between:

  1. Receiving room events, such as join, ready, command, give up, and replay seek.
  2. Running scheduled ticks.

sequenceDiagram
    participant C as Connection task
    participant R as RoomTask
    participant G as Game
    participant A as AiController

    C->>R: RoomEvent::Command { player, seq, SimCommand }
    R->>G: Game::enqueue(player, command)
    loop every tick
        R->>A: think(snapshot)
        A-->>R: SimCommand list
        R->>G: enqueue AI commands
        R->>G: tick()
        G-->>R: per-player events
        R->>G: snapshot_for(player)
        R-->>C: ServerMessage::Snapshot
    end

RoomTask owns lifecycle: who is in the lobby, who is host, team/faction assignment, AI slots, which mode the room is in (lobby, live game, replay viewer, or branch staging), and when a match starts or ends.
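
The event/tick alternation can be sketched with a blocking loop. This is a simplified stand-in that assumes a std channel and recv_timeout in place of the Tokio task the real room uses; Room, run_room, and the reduced RoomEvent variants here are hypothetical, and on_tick only clears the queue where the real room would drive Game.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

#[derive(Debug)]
pub enum RoomEvent {
    Join { player: u32 },
    Command { player: u32, command: String },
    Leave { player: u32 },
}

#[derive(Default)]
pub struct Room {
    pub players: Vec<u32>,
    pub pending: Vec<(u32, String)>,
    pub ticks: u64,
}

impl Room {
    /// Only the room loop calls this, so room state has a single writer.
    pub fn handle_event(&mut self, event: RoomEvent) {
        match event {
            RoomEvent::Join { player } => self.players.push(player),
            RoomEvent::Leave { player } => self.players.retain(|p| *p != player),
            RoomEvent::Command { player, command } => {
                // Commands from players outside the room are ignored.
                if self.players.contains(&player) {
                    self.pending.push((player, command));
                }
            }
        }
    }

    /// Placeholder for the live tick: the real room would enqueue into Game,
    /// tick it, and fan out snapshots here.
    pub fn on_tick(&mut self) {
        self.ticks += 1;
        self.pending.clear();
    }
}

/// Alternate between handling events and running ticks until every sender is gone.
pub fn run_room(mut room: Room, events: Receiver<RoomEvent>, tick_every: Duration) -> Room {
    let mut next_tick = Instant::now() + tick_every;
    loop {
        let wait = next_tick.saturating_duration_since(Instant::now());
        match events.recv_timeout(wait) {
            Ok(event) => room.handle_event(event),
            Err(RecvTimeoutError::Timeout) => {
                room.on_tick();
                next_tick += tick_every;
            }
            Err(RecvTimeoutError::Disconnected) => return room,
        }
    }
}
```

Because one loop both receives events and ticks, no lock is needed around room state: connections can only influence it by sending messages.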

LiveTickDriver owns the live-match tick wrapper. It asks AI controllers for ordinary SimCommands, calls Game::tick_with_perf, fans out snapshots, sends observer analysis, checks victory, and catches simulation panics so the room can write a crash replay instead of silently losing the match.
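
Panic capture around a tick can be sketched with std::panic::catch_unwind. The types below are hypothetical (ToyGame, TickOutcome, guarded_tick), the sketch assumes the default unwinding panic strategy rather than panic = "abort", and it does not claim to match how LiveTickDriver is actually written.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Toy stand-in for the sim; `poisoned_tick` lets the example force a panic.
pub struct ToyGame {
    pub tick_count: u64,
    pub command_log: Vec<String>,
    pub poisoned_tick: Option<u64>,
}

impl ToyGame {
    pub fn tick(&mut self) {
        self.tick_count += 1;
        if self.poisoned_tick == Some(self.tick_count) {
            panic!("simulated sim bug at tick {}", self.tick_count);
        }
    }
}

#[derive(Debug, PartialEq)]
pub enum TickOutcome {
    Continue,
    /// Enough data for the room to write a crash replay.
    Crashed { tick: u64, message: String, replay_commands: Vec<String> },
}

pub fn guarded_tick(game: &mut ToyGame) -> TickOutcome {
    // AssertUnwindSafe: after a panic the game is treated as dead and is only
    // read to extract replay data, never ticked again.
    let outcome = catch_unwind(AssertUnwindSafe(|| game.tick()));
    match outcome {
        Ok(()) => TickOutcome::Continue,
        Err(payload) => {
            // Panic payloads are usually &str or String; fall back otherwise.
            let message = if let Some(s) = payload.downcast_ref::<&str>() {
                s.to_string()
            } else if let Some(s) = payload.downcast_ref::<String>() {
                s.clone()
            } else {
                "unknown panic".to_string()
            };
            TickOutcome::Crashed {
                tick: game.tick_count,
                message,
                replay_commands: game.command_log.clone(),
            }
        }
    }
}
```

The design choice worth noting is that the room survives the panic: it loses the match, but keeps the command log needed to reproduce the failure.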

Room Layer API Map

| Owner | API between systems | Coupling | Why |
| --- | --- | --- | --- |
| RoomEvent | Join, Leave, Ready, StartRequest, Command, replay and branch events | Low | Plain message enum. It decouples connection tasks from room state. |
| RoomTask | run(event_rx), handle_event, on_tick, phase transition helpers | High | It owns membership, lifecycle, match starts, match ends, replay transitions, AI slots, and the live Game. |
| LiveTickDriver | run(Box<Game>) -> LiveTickResult | Medium to high | It is narrower than RoomTask, but knows AI enqueue, snapshots, defeat/game-over, perf, and panic capture. |
| AiController | think(AiThinkContext) -> Vec<SimCommand> | Low | AI only sees start payload, fog-filtered snapshot, alive ids, and retreat commands. It returns normal player commands. |
| SnapshotFanout | send_to_recipients(players, recipients, snapshot_for) | Medium | Delivery is narrow, but it mutates net-status fields and tracks backpressure/perf. |
| ReplaySession | replay seek/speed/vision/start payload/snapshot helpers | Medium | Replay owns playback state, but still drives a rebuilt Game through the same sim API. |
| ReplayBranch | branch staging and seat claim/release/launch data | Medium | Branching is isolated from live tick code, but tied to room membership and replay seats. |
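
The AiController row is the easiest one to picture in code. The sketch below assumes a much smaller context than the real AiThinkContext; the field names, the SimCommand variants, and FocusFireAi are hypothetical. What it keeps from the table is the shape: the AI reads a fog-filtered view and returns ordinary commands.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum SimCommand {
    Attack { unit: u32, target: u32 },
    Hold { unit: u32 },
}

pub struct VisibleEnemy {
    pub id: u32,
}

/// Hypothetical, reduced context: only what the AI's own team can see.
pub struct AiThinkContext<'a> {
    pub own_units: &'a [u32],
    pub visible_enemies: &'a [VisibleEnemy],
}

pub trait AiController {
    /// No access to the sim itself: input is a view, output is plain commands.
    fn think(&mut self, ctx: AiThinkContext<'_>) -> Vec<SimCommand>;
}

/// Toy policy: everyone attacks the first visible enemy, otherwise holds.
pub struct FocusFireAi;

impl AiController for FocusFireAi {
    fn think(&mut self, ctx: AiThinkContext<'_>) -> Vec<SimCommand> {
        match ctx.visible_enemies.first() {
            Some(enemy) => ctx
                .own_units
                .iter()
                .map(|&unit| SimCommand::Attack { unit, target: enemy.id })
                .collect(),
            None => ctx
                .own_units
                .iter()
                .map(|&unit| SimCommand::Hold { unit })
                .collect(),
        }
    }
}
```

Since the output is the same command type players send, AI orders pass through the same validation as human input.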

Phase 3: The Simulation and Services

rts_sim::game::Game is the authoritative world. It stores the map, entity store, fog grids, player state, pending commands, command log, persistent pathing cache, smoke clouds, ability runtime, delayed shells, replay metadata, and debug/perf state.

The public API used by the server layer is intentionally small:

| Game API | Used for |
| --- | --- |
| new(...) and replay constructors | Create live or replay simulation state from lobby/replay player records. |
| start_payload() | Send terrain, starts, and setup data once at match start. |
| enqueue(player, SimCommand) | Queue a command; validation happens when the tick applies it. |
| worker_retreat_commands_for(player) | Let AI ask for a narrow sim-derived reflex without reading private entity state. |
| tick() / tick_with_perf() | Advance the authoritative world by one fixed tick. |
| snapshot_for(player) | Build one player’s fog-filtered snapshot. |
| snapshot_for_spectator(players) | Build a union-fog spectator snapshot. |
| snapshot_full_for(player) | Dev-only full-world watch snapshot. |
| alive_players(), alive_team_ids(), scores() | Room outcome and score-screen data. |
| command_log(), player_inits() | Replay and crash artifact data. |
| eliminate(player) | Remove a leaving player’s army so matches can resolve. |

Game::tick() increments the tick counter, drains pending commands, and calls systems::run_tick. After the services finish, it recomputes live fog and refreshes remembered enemy buildings.
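
A toy version of that sequence shows why enqueue can stay cheap: nothing is checked until the tick drains the queue. ToyGame, its fields, and the single Move variant are hypothetical; the real run_tick does far more than this loop.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
pub enum SimCommand {
    Move { unit: u32, dx: i32 },
}

pub struct ToyGame {
    pub tick_count: u64,
    pub unit_x: BTreeMap<u32, i32>,
    pub rejected: u32,
    pending: Vec<(u32, SimCommand)>,
}

impl ToyGame {
    pub fn new(unit_x: BTreeMap<u32, i32>) -> Self {
        ToyGame { tick_count: 0, unit_x, rejected: 0, pending: Vec::new() }
    }

    /// Queue only. The unit may no longer exist by the time the tick runs.
    pub fn enqueue(&mut self, player: u32, command: SimCommand) {
        self.pending.push((player, command));
    }

    pub fn tick(&mut self) {
        self.tick_count += 1;
        // Take the queue so commands enqueued later land in the next tick.
        let pending = std::mem::take(&mut self.pending);
        self.run_tick(pending);
        // The real Game recomputes fog and remembered buildings after this.
    }

    fn run_tick(&mut self, pending: Vec<(u32, SimCommand)>) {
        for (_player, command) in pending {
            match command {
                SimCommand::Move { unit, dx } => match self.unit_x.get_mut(&unit) {
                    Some(x) => *x += dx,
                    // Stale id: validated and rejected at apply time.
                    None => self.rejected += 1,
                },
            }
        }
    }
}
```

Validating at apply time rather than at enqueue time means every command is judged against the world state of the tick that executes it.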

flowchart TD
    Tick["Game::tick"] --> Pending["take pending commands"]
    Pending --> Run["systems::run_tick"]
    Run --> C0["rebuild pre-command occupancy + spatial"]
    C0 --> Commands["commands::apply_commands"]
    Commands --> Paths1["MoveCoordinator::process_awaiting_paths"]
    Paths1 --> Movement["movement::movement_system_with_events"]
    Movement --> Queue["order_queue::promote_ready_orders"]
    Queue --> C1["rebuild post-movement occupancy + spatial"]
    C1 --> Combat["combat::combat_system"]
    Combat --> Economy["economy::gather_system"]
    Economy --> Production["production::production_system"]
    Production --> Construction["construction::construction_system"]
    Construction --> Impacts["mortar/artillery/ability runtime"]
    Impacts --> Death["death::death_system"]
    Death --> Collision["movement::resolve_collisions"]
    Collision --> Supply["supply::recompute_supply"]
    Supply --> Final["final spatial index"]
    Final --> Fog["Game recomputes fog"]

The services are not independent actors. They are ordinary Rust functions called in a strict order. The order matters because each phase sees a particular view of the world:

  1. Pre-command derived state is valid before commands mutate orders.
  2. Post-movement derived state is valid after units move.
  3. Pre-collision derived state is valid after damage, production, construction, and death, but before collision cleanup.
  4. Final spatial state is valid for snapshot interest filtering.
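
The phase-view idea can be sketched with one derived index and one mutating phase. Everything here is a toy assumption (Tile, Unit, a one-step movement_phase, and this two-rebuild run_tick); the real orchestrator rebuilds occupancy and the spatial index at more boundaries than shown.

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Tile(pub i32, pub i32);

pub struct Unit {
    pub id: u32,
    pub tile: Tile,
}

/// Derived read model: never patched in place, only rebuilt from the units.
pub struct Occupancy {
    occupied: HashSet<Tile>,
}

impl Occupancy {
    pub fn build(units: &[Unit]) -> Self {
        Occupancy { occupied: units.iter().map(|u| u.tile).collect() }
    }

    pub fn is_blocked(&self, tile: Tile) -> bool {
        self.occupied.contains(&tile)
    }
}

/// Each unit tries to step one tile to the right, judged against the
/// occupancy that was valid when the phase started.
pub fn movement_phase(units: &mut [Unit], pre_move: &Occupancy) {
    for unit in units.iter_mut() {
        let next = Tile(unit.tile.0 + 1, unit.tile.1);
        if !pre_move.is_blocked(next) {
            unit.tile = next;
        }
    }
}

/// Build, mutate, rebuild: later phases must not reuse the stale index.
pub fn run_tick(units: &mut [Unit]) -> Occupancy {
    let pre_move = Occupancy::build(&*units);
    movement_phase(units, &pre_move);
    Occupancy::build(&*units)
}
```

With units on tiles (0,0) and (1,0), the first unit stays put because the pre-move view still shows (1,0) as blocked, even though the second unit leaves it during the same phase. Only the rebuilt index reflects that.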

Simulation Service API Map

| Service | Main API | Coupling | Notes |
| --- | --- | --- | --- |
| systems.rs | run_tick(map, entities, players, fog, pathing, rng, stores, pending, events, tick, perf) -> SpatialIndex | High | It is the orchestrator. It knows every phase, rebuild point, and service call order. |
| commands.rs | apply_commands(... pending: Vec<(player, SimCommand)> ...) | High | Validates ownership, command budgets, costs, tech, faction legality, placement, fog, ability use, and order application. |
| order_planner.rs | plan_order(config, facts, request) -> PlannerOutput | Low | Pure policy. It does not mutate the world and uses plain facts/actions. |
| order_execution.rs | focused helpers for setup, teardown, artillery point-fire orders | Medium | Shared mutation helpers reduce duplication between issue-time commands and queued promotion. |
| move_coordinator.rs | order_group_move, order_attack, order_gather, order_build, order_ability, process_awaiting_paths | High | Central movement/order gateway. It wraps pathing, occupancy, spawn search, formation spread, and order state. |
| pathing.rs | PathingService::request, advance_tick, cached tile paths | Medium | Stateful cache and per-tick budgeting, but isolated from combat/economy rules. |
| movement/ | movement_system_with_events, resolve_collisions | Medium to high | Owns waypoint advancement, vehicle steering, smoke movement status, cooldown ticks, and collision cleanup. |
| combat/ | combat_system(...) | High | Needs teams, LOS, fog, smoke, spatial index, move coordinator, ability runtime, mortar shells, RNG, and events. |
| economy.rs | gather_system(map, entities, players, occ, spatial, coordinator) | Medium | Focused on workers/resources, but mutates workers, resources, player stockpiles, and paths. |
| production.rs | production_system(map, entities, players, coordinator, events) | Medium | Advances queues, spends completed production, and asks coordinator for spawn/rally positions. |
| construction.rs | construction_system(map, entities, players, events, fog, active_sites) | Medium to high | Tied to build orders, placement legality, faction/economy rules, progress, and notices. |
| ability_orders.rs | order_or_launch_world_ability, launch_world_ability, launch_self_ability, predicates | High | Ability legality and effects span commands, queued orders, resources, cooldowns, smoke, runtime stores, and events. |
| ability_runtime.rs | projectile/anchor/return-marker tick and state helpers | Medium | Owns persistent ability state; combat and ability orders both touch it. |
| mortar.rs / artillery.rs | delayed shell schedule and resolve_due | Medium | Small stores, but resolution applies combat damage/events and depends on team/fog policy. |
| death.rs | death_system(entities, fog, smokes, teams, players, lingering_sight, events, tick) | High | Death affects entities, scoring, lingering sight, resource cleanup, smoke/runtime cleanup, and event routing. |
| supply.rs | recompute_supply(players, entities) | Low | Narrow recalculation from authoritative entity state. |
| occupancy.rs | Occupancy::build, footprint and clearance queries | Low | Derived read model over map/entities. Rebuilt by the orchestrator at phase boundaries. |
| spatial.rs | SpatialIndex::build, rectangle/circle id queries | Low | Derived index with a narrow query API. |
| geometry.rs | body/rect/intersection helpers | Low | Pure geometry helpers. |
| standability.rs | static placement/body legality helpers | Medium | Mostly pure, but knows map, occupancy, spatial index, and unit/building body rules. |
| line_of_sight.rs | LineOfSight::clear_between_world_points, smoke-aware LOS | Low | Narrow read-only query surface. |
| world_query.rs | owned/enemy/nearest query helpers | Medium | Read-only queries, but encode targetability and team relationship rules. |
| hero.rs | hero_regeneration_system(entities, tick) | Low | Small single-purpose mutation. |
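
For contrast with the high-coupling rows, here is what a low-coupling service can look like. This sketch only borrows the recompute_supply(players, entities) shape from the table; the Entity and PlayerSupply fields are hypothetical and the real supply rules may differ.

```rust
use std::collections::BTreeMap;

pub struct Entity {
    pub owner: u32,
    pub alive: bool,
    pub supply_cost: u32,
    pub supply_provided: u32,
}

#[derive(Debug, Default, PartialEq)]
pub struct PlayerSupply {
    pub used: u32,
    pub cap: u32,
}

/// Recalculate from scratch: no tick-order knowledge, no other stores.
pub fn recompute_supply(players: &mut BTreeMap<u32, PlayerSupply>, entities: &[Entity]) {
    for supply in players.values_mut() {
        *supply = PlayerSupply::default();
    }
    for entity in entities.iter().filter(|e| e.alive) {
        // Entities owned by unknown players are ignored rather than created.
        if let Some(supply) = players.get_mut(&entity.owner) {
            supply.used += entity.supply_cost;
            supply.cap += entity.supply_provided;
        }
    }
}
```

Recomputing instead of incrementally adjusting keeps the function independent of who killed or built what earlier in the tick, which is why it can sit late in the order with a two-argument signature.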

Important API Boundaries Inside the Tick

The cleanest boundary is command planning:

flowchart LR
    Commands["commands.rs validates real world facts"] --> Facts["UnitFacts + OrderRequest"]
    Facts --> Planner["order_planner::plan_order"]
    Planner --> Actions["PlannedAction list"]
    Actions --> Commands
    Commands --> Mutate["apply orders through MoveCoordinator / helpers"]

order_planner is low coupling because it does not know EntityStore, Map, Fog, resources, or factions. It only knows facts and returns planned actions.
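
A pure planner in that style might look like the following. It reuses the plan_order(config, facts, request) argument order from the service table, but the field names, the PlannedAction variants, and the Vec return type are assumptions made for this sketch; the real function returns a PlannerOutput whose shape is not shown in this document.

```rust
pub struct PlannerConfig {
    pub max_queued_orders: usize,
}

/// Plain facts extracted by the caller; no EntityStore, Map, or Fog in sight.
pub struct UnitFacts {
    pub unit: u32,
    pub can_move: bool,
    pub queued_orders: usize,
}

pub struct OrderRequest {
    pub target: (i32, i32),
    pub append_to_queue: bool,
}

#[derive(Debug, PartialEq)]
pub enum PlannedAction {
    ReplaceOrders { unit: u32, target: (i32, i32) },
    AppendOrder { unit: u32, target: (i32, i32) },
    Skip { unit: u32, reason: &'static str },
}

/// Pure function: same inputs always give the same actions, nothing is mutated.
pub fn plan_order(
    config: &PlannerConfig,
    facts: &[UnitFacts],
    request: &OrderRequest,
) -> Vec<PlannedAction> {
    facts
        .iter()
        .map(|f| {
            if !f.can_move {
                PlannedAction::Skip { unit: f.unit, reason: "immobile" }
            } else if !request.append_to_queue {
                PlannedAction::ReplaceOrders { unit: f.unit, target: request.target }
            } else if f.queued_orders >= config.max_queued_orders {
                PlannedAction::Skip { unit: f.unit, reason: "queue full" }
            } else {
                PlannedAction::AppendOrder { unit: f.unit, target: request.target }
            }
        })
        .collect()
}
```

A planner like this can be unit tested with hand-written facts, with no map, entity store, or tick loop in the test setup.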

The highest-coupling area is command-to-order execution:

flowchart TD
    Command["SimCommand"] --> Commands["commands::apply_commands"]
    Commands --> Planner["order_planner"]
    Commands --> Ability["ability_orders"]
    Commands --> Build["construction placement helpers"]
    Commands --> Query["world_query + fog + teams"]
    Commands --> Coordinator["MoveCoordinator"]
    Coordinator --> Pathing["PathingService"]
    Coordinator --> EntityOrders["Entity orders and move phases"]

That coupling exists because player input crosses trust boundaries. The server must check ownership, stale ids, max command sizes, team hostility, fog visibility, faction legality, resources, tech, placement, queue length, and path staging before mutating orders.
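
A reduced sketch of "check everything, then mutate" covers three of those checks: command size, stale ids, and ownership. World, Reject, MAX_UNITS_PER_COMMAND, validate_move, and apply_move are hypothetical names, and the limit value is arbitrary.

```rust
use std::collections::HashMap;

/// Arbitrary illustrative limit, not the server's real budget.
pub const MAX_UNITS_PER_COMMAND: usize = 3;

#[derive(Debug, PartialEq)]
pub enum Reject {
    EmptyOrTooLarge,
    StaleId(u32),
    NotOwner(u32),
}

pub struct World {
    /// unit id -> owning player
    pub owner_of: HashMap<u32, u32>,
    /// unit id -> current move target
    pub targets: HashMap<u32, (i32, i32)>,
}

/// Read-only pass over untrusted input. Nothing is mutated here.
pub fn validate_move(world: &World, player: u32, units: &[u32]) -> Result<(), Reject> {
    if units.is_empty() || units.len() > MAX_UNITS_PER_COMMAND {
        return Err(Reject::EmptyOrTooLarge);
    }
    for &unit in units {
        match world.owner_of.get(&unit) {
            None => return Err(Reject::StaleId(unit)),
            Some(&owner) if owner != player => return Err(Reject::NotOwner(unit)),
            Some(_) => {}
        }
    }
    Ok(())
}

/// Mutation happens only after the whole command has passed validation,
/// so a rejected command never leaves a half-applied order behind.
pub fn apply_move(
    world: &mut World,
    player: u32,
    units: &[u32],
    target: (i32, i32),
) -> Result<(), Reject> {
    validate_move(world, player, units)?;
    for &unit in units {
        world.targets.insert(unit, target);
    }
    Ok(())
}
```

The real list is longer because each extra rule (fog, tech, resources, placement) needs another store as input, which is exactly where the coupling comes from.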

Combat is also tightly coupled:

flowchart LR
    Combat["combat_system"] --> Target["target acquisition"]
    Combat --> LOS["line of sight + smoke"]
    Combat --> Move["chase via MoveCoordinator"]
    Combat --> Damage["damage + events"]
    Combat --> AbilityRuntime["anchors/projectiles"]
    Combat --> Mortar["mortar shell scheduling"]

This is expected for an RTS combat phase. The main safety rule is to keep the coupling contained inside combat/ and use narrower helper modules inside that folder for acquisition, chase, weapons, projection, events, and damage.

Snapshot and Fog Walkthrough

Snapshots are pulled after the tick, not pushed from inside services.

For normal players, Game::snapshot_for(player) builds a view from authoritative state and hides enemies outside that player’s living team vision. Spectators use snapshot_for_spectator(visible_players), which unions selected players’ current fog. Dev watch paths may use snapshot_full_for, but normal gameplay must not.
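
The difference between the player and spectator views comes down to which vision set filters the entities. The sketch below assumes vision is a set of visible tiles and returns only entity ids; Tile, Entity, and both function signatures are simplified stand-ins for the real snapshot builders.

```rust
use std::collections::HashSet;

pub type Tile = (i32, i32);

#[derive(Debug, Clone, PartialEq)]
pub struct Entity {
    pub id: u32,
    pub team: u32,
    pub tile: Tile,
}

/// Player view: own team is always included, enemies only inside team vision.
pub fn snapshot_for(entities: &[Entity], viewer_team: u32, team_vision: &HashSet<Tile>) -> Vec<u32> {
    entities
        .iter()
        .filter(|e| e.team == viewer_team || team_vision.contains(&e.tile))
        .map(|e| e.id)
        .collect()
}

/// Spectator view: an entity is included if any selected player's vision covers it.
pub fn snapshot_for_spectator(entities: &[Entity], selected_vision: &[&HashSet<Tile>]) -> Vec<u32> {
    entities
        .iter()
        .filter(|e| selected_vision.iter().any(|vision| vision.contains(&e.tile)))
        .map(|e| e.id)
        .collect()
}
```

Filtering on the server, before anything is serialized, is what keeps hidden units out of the client's memory entirely rather than merely off its screen.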

flowchart TD
    Entities["authoritative entities"] --> Projection["rules/projection"]
    Fog["Fog grids + smoke"] --> Projection
    BuildingMemory["remembered enemy buildings"] --> Projection
    Projection --> Snapshot["Snapshot"]
    Snapshot --> Compact["lobby::compact_snapshot_for_wire"]
    Compact --> Wire["ServerMessage::Snapshot"]

The coupling here is deliberately split:

| Piece | Coupling | Why |
| --- | --- | --- |
| Game::snapshot_for | Medium | It reads many world stores, but does not advance simulation. |
| rules/projection.rs | Medium | It is the central visibility policy exception that reads sim state to produce protocol views. |
| compact_snapshot_for_wire | Low | It only trims resource entities into compact resource deltas before wire send. |
| client rendering | Low (to server internals) | The client receives snapshots; it does not know EntityStore or tick services. |

How To Read The Server Code

Start with the outer shell, then move inward:

  1. Read server/src/main.rs to see routes, WebSocket upgrade, shutdown drain, match-history endpoints, and dev endpoints.
  2. Read server/src/lobby/mod.rs for the room concurrency model and the RoomEvent message contract.
  3. Read server/src/lobby/room_task.rs for lifecycle: join, ready, start, leave, lobby state, replay state, branch state, and match reset.
  4. Read server/src/lobby/live_tick.rs for what happens during one live room tick around Game.
  5. Read server/crates/sim/src/game/mod.rs for what Game stores and exposes.
  6. Read server/crates/sim/src/game/systems.rs for the exact service order.
  7. Read individual files under server/crates/sim/src/game/services/ only when you need one gameplay phase.

The main mental model is:

Connections send intent. Rooms serialize that intent. Game applies it in a deterministic tick. Services mutate the world in a fixed order. Snapshots are derived views of the authoritative world.

Change Risk Guide

| Change area | Risk | Reason |
| --- | --- | --- |
| Pure helpers like geometry, line_of_sight, supply | Lower | Narrow inputs and outputs. Focused tests are usually enough. |
| order_planner | Lower to medium | Pure and easy to unit test, but affects all command ordering. |
| Snapshot projection/fog | High | A mistake can leak hidden information or hide legal information. Check protocol and fog docs. |
| commands, combat, move_coordinator, ability_orders, death | High | These mutate several stores and rely on tick order. Use focused behavior tests. |
| Game public API | High | This is the lobby/sim seam. Update design docs and all callers together. |
| protocol DTOs | High | Server, client, replay, and docs must agree. |

Glossary

| Term | Meaning |
| --- | --- |
| Room task | One Tokio task that owns one room and its Game. |
| RoomEvent | Internal message from connections/lobby into a room task. |
| SimCommand | Domain command queued into Game; not raw socket metadata. |
| EntityStore | Authoritative collection of units, buildings, resource nodes, and their mutable state. |
| Occupancy | Derived map/building clearance state for pathing and placement. |
| Spatial index | Derived broad-phase entity index for nearby queries. |
| Fog | Server-authoritative visibility grids. |
| Service | A function/module called by systems::run_tick for one simulation phase or shared query surface. |
| Projection | Conversion from authoritative state into a player-safe protocol view. |