Game Engine Service Merge (#946)
Merge physics and tick-engine into a single game-engine service.
Motivation
Physics and tick-engine are tightly coupled single-replica services. Every tick involves ~20ms of redundant Redis I/O and gRPC serialization passing data between them. Merging eliminates this overhead.
Architecture
Before
API Gateway → gRPC → Tick-Engine → gRPC → Physics → Redis → Tick-Engine → Redis
After
API Gateway → gRPC → Game-Engine (in-memory) → Redis (async write)
In-Memory State
Entities (bodies, ships, stations, jumpgates) live in memory as the authoritative state. Redis receives writes for:
- API gateway state broadcasts (via tick_completed pub/sub)
- Crash recovery (supplement to PostgreSQL snapshots)
Redis Write Strategy
Not all entities need Redis writes every tick. Bodies, stations, and jumpgates move on smooth, predictable orbital paths — writing them every tick wastes I/O. Ships change abruptly (thrust, attitude, docking) and must be written every tick.
| Entity | Memory | Redis write frequency | Rationale |
|---|---|---|---|
| Bodies | Every tick | Every REDIS_WRITE_INTERVAL ticks (default 10) + on flush | Smooth Keplerian motion; clients don't need per-tick updates |
| Ships | Every tick | Every tick | Abrupt state changes (thrust, landing, docking) |
| Stations | Every tick | Every REDIS_WRITE_INTERVAL ticks + on flush | Passive orbital motion |
| Jumpgates | Every tick | Every REDIS_WRITE_INTERVAL ticks + on flush | Passive orbital motion |
Flush triggers (write cached entities to Redis immediately):
- gRPC GetAllBodies/GetAllStations/GetAllJumpGates/GetAllState: return in-memory state directly, no Redis read needed
- Game pause: ensure Redis is consistent for snapshots
- Snapshot creation
- Service shutdown (graceful)
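The write strategy and flush triggers above can be sketched as a small scheduling function. This is a minimal illustration, not the service's actual code: the function name groups_due and the group labels are hypothetical, and only REDIS_WRITE_INTERVAL and its default of 10 come from the table.

```python
REDIS_WRITE_INTERVAL = 10  # default taken from the table above

# Entity groups on smooth orbital paths; written only periodically or on flush.
SLOW_GROUPS = ("bodies", "stations", "jumpgates")


def groups_due(tick: int, interval: int = REDIS_WRITE_INTERVAL, flush: bool = False) -> list:
    """Return the entity groups that should be written to Redis for this tick.

    `flush=True` models a flush trigger (pause, snapshot, graceful shutdown).
    """
    groups = ["ships"]  # ships change abruptly, so they are written every tick
    if flush or tick % interval == 0:
        groups.extend(SLOW_GROUPS)
    return groups
```

With the default interval, ticks 1 through 9 would write only ships, and tick 10 (or any tick with a flush requested) would also write bodies, stations, and jumpgates.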
Read path: process_tick reads bodies, stations, and jumpgates from self._bodies / self._stations / self._jumpgates (in-memory). Ships are the exception: they are read from Redis every tick because automation and api-gateway also write ship fields. On startup, restore_from_redis() populates memory from Redis.
Tick Loop Overhead Reduction
The tick loop caches frequently-read Redis values in memory to avoid per-tick round-trips:
| Value | Cache strategy | Invalidation |
|---|---|---|
| game:paused | In-memory bool, set by pause()/resume() methods | Direct (same process controls pause state) |
| game:tick_rate | In-memory float, set by set_tick_rate() | Direct (same process controls tick rate) |
| game:adaptive_tick_rate | In-memory bool, set by set_adaptive_tick_rate() | Direct (same process controls adaptive mode) |
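As a rough sketch of the caching in the table above: the setters update an in-memory field and persist the value, while the tick loop only ever reads the field. The class name TickLoopSettings, the persist callback, the default values, and the stored value format are assumptions made for illustration; only the key names and method names come from the table.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TickLoopSettings:
    """In-memory copies of values the tick loop would otherwise read from Redis."""

    persist: Callable[[str, object], None]  # hypothetical hook that writes to Redis
    paused: bool = False
    tick_rate: float = 1.0  # placeholder default, not from the source
    adaptive_tick_rate: bool = False

    def pause(self) -> None:
        self.paused = True
        self.persist("game:paused", True)

    def resume(self) -> None:
        self.paused = False
        self.persist("game:paused", False)

    def set_tick_rate(self, rate: float) -> None:
        if rate <= 0:
            raise ValueError("tick rate must be positive")
        self.tick_rate = float(rate)
        self.persist("game:tick_rate", self.tick_rate)

    def set_adaptive_tick_rate(self, enabled: bool) -> None:
        self.adaptive_tick_rate = bool(enabled)
        self.persist("game:adaptive_tick_rate", self.adaptive_tick_rate)
```

Because the same process owns every setter, the cached fields cannot go stale, which is why no separate invalidation step is needed.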
Tick timing: When the computed sleep time is below 5ms, the tick loop uses a busy-wait spin (while time.monotonic() < deadline: pass) instead of asyncio.sleep(). This eliminates OS scheduler jitter (measured 1-36ms variance) at the cost of pinning one CPU core. Above 5ms, asyncio.sleep() is used to avoid unnecessary CPU usage.
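The hybrid wait described above could look roughly like the following. The function name wait_until and the constant name are hypothetical; the 5ms threshold and the spin loop are taken from the text.

```python
import asyncio
import time

SPIN_THRESHOLD_S = 0.005  # 5 ms threshold from the text above


async def wait_until(deadline: float) -> None:
    """Wait until time.monotonic() reaches `deadline` (seconds)."""
    remaining = deadline - time.monotonic()
    if remaining <= 0:
        return  # already late; start the next tick immediately
    if remaining < SPIN_THRESHOLD_S:
        # Short wait: spin to avoid scheduler jitter, at the cost of CPU.
        while time.monotonic() < deadline:
            pass
    else:
        # Long wait: yield to the event loop instead of burning a core.
        await asyncio.sleep(remaining)
```

One consequence worth keeping in mind: the spin branch does not yield, so other coroutines on the same event loop are blocked for up to 5ms while it runs.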
Tick-end Redis pipeline: set_game_time, set_current_tick, and publish_tick_completed are batched into a single Redis pipeline instead of sequential calls.
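A sketch of the batching, assuming a client whose pipeline() object queues set and publish calls and sends them on execute(), as redis-py's asyncio pipeline does. The key names game:time and game:tick, the message payload, and the function name finish_tick are assumptions; only the tick_completed channel appears in this document. FakeRedis and FakePipeline are stand-ins so the example runs without a Redis server.

```python
import asyncio
import json


async def finish_tick(redis_client, tick: int, game_time: float) -> list:
    """Queue the three tick-end commands and send them in one round-trip."""
    pipe = redis_client.pipeline()
    pipe.set("game:time", game_time)  # assumed key name
    pipe.set("game:tick", tick)       # assumed key name
    pipe.publish("tick_completed", json.dumps({"tick": tick, "game_time": game_time}))
    return await pipe.execute()


class FakePipeline:
    """Stand-in that records queued commands; not part of redis-py."""

    def __init__(self):
        self.queued = []
        self.round_trips = 0

    def set(self, key, value):
        self.queued.append(("SET", key, value))
        return self

    def publish(self, channel, message):
        self.queued.append(("PUBLISH", channel, message))
        return self

    async def execute(self):
        self.round_trips += 1  # one network round-trip per execute()
        return [True] * len(self.queued)


class FakeRedis:
    """Stand-in client exposing only pipeline()."""

    def __init__(self):
        self.pipe = FakePipeline()

    def pipeline(self):
        return self.pipe


fake = FakeRedis()
results = asyncio.run(finish_tick(fake, tick=7, game_time=3.5))
```

Three sequential calls would cost three round-trips per tick; the pipeline version costs one, regardless of how many commands are queued.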
gRPC Interface
Game-engine exposes both the Physics and TickEngine gRPC services:
- Physics RPCs (GetAllShips, SetSteeringCommand, etc.) — used by API gateway
- TickEngine RPCs (Pause, Resume, SetTickRate, etc.) — used by API gateway admin
ProcessTick becomes a direct function call (no longer an RPC).
Migration Phases
Phase 1: Create game-engine service structure
- New services/game-engine/ directory
- Copy physics src + tick-engine src
- Unified Dockerfile, requirements.txt, config
- Single proto directory
Phase 2: Wire direct function calls
- Replace physics_stub.ProcessTick() gRPC call with direct simulation.process_tick()
- Keep entities in memory between ticks (no per-tick Redis reads)
- Write to Redis after each tick (async)
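The shape of Phase 2 can be sketched as a loop that calls the simulation directly and hands the Redis write to a background task. Everything here is illustrative: the Simulation class, the process_tick signature, the run_ticks function, and the write_state callback are hypothetical stand-ins, not the service's real interfaces.

```python
import asyncio


class Simulation:
    """Hypothetical stand-in for the physics simulation; state stays in memory."""

    def __init__(self):
        self.bodies = {"earth": 0.0}

    def process_tick(self, tick: int, dt: float) -> dict:
        for name in self.bodies:
            self.bodies[name] += dt
        return dict(self.bodies)  # copy, so a pending write sees a stable snapshot


async def run_ticks(simulation, write_state, n_ticks: int, dt: float) -> None:
    pending = set()
    for tick in range(1, n_ticks + 1):
        state = simulation.process_tick(tick, dt)  # direct call, no gRPC hop
        task = asyncio.create_task(write_state(tick, state))  # write off the hot path
        pending.add(task)  # keep a reference so the task is not garbage-collected
        task.add_done_callback(pending.discard)
        await asyncio.sleep(0)  # yield so queued writes can make progress
    if pending:
        await asyncio.gather(*pending)  # drain outstanding writes before returning


written = []


async def fake_write(tick, state):
    # Stand-in for the Redis write.
    written.append((tick, state["earth"]))


asyncio.run(run_ticks(Simulation(), fake_write, n_ticks=3, dt=0.5))
```

Returning a copy from process_tick matters in this arrangement: the write task may run after the next tick has already mutated the in-memory state.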
Phase 3: Unified state manager
- Merge RedisState + TickEngineState into GameEngineState
- Single Redis connection pool
- Single PostgreSQL connection pool
Phase 4: Update deployment
- New game-engine k8s manifests
- Remove physics + tick-engine deployments
- Update API gateway to point to game-engine
- Update CI/CD
Phase 5: Cleanup
- Remove old services/physics and services/tick-engine directories
- Update documentation
Status
All phases implemented and cleanup complete. The game-engine service is the unified replacement for the separate physics and tick-engine services. The old service directories have been removed (#1040).
Service mapping
- services/game-engine/src/physics/: N-body simulation, ship physics
- services/game-engine/src/tick_engine/: game loop, automation, maneuvers
- services/game-engine/src/main.py: unified entry point
- services/game-engine/src/state.py: GameEngineState (shared Redis connection)
- services/game-engine/src/health.py: combined health/metrics endpoint