Ship Respawn Guard on Service Restart

Problem

When the physics service restarts, ships get respawned at their default position with full fuel, destroying in-progress maneuvers. This happens because:

  1. Player’s WebSocket reconnects after service restart, triggering re-login
  2. Players service calls _check_ship_exists → GetShipState on physics
  3. If GetShipState fails for ANY reason (gRPC error, proto mismatch, timeout), _check_ship_exists returns False
  4. Players service calls SpawnShip → overwrites the ship in Redis at spawn position with full fuel
  5. Ship’s in-progress position, velocity, fuel, and maneuver state are lost

The ship data was safe in Redis the entire time; it was the SpawnShip call that destroyed it.

Root Cause

Two issues contribute:

  1. SpawnShip unconditionally overwrites: No check for whether the ship already exists in Redis. A spawn request for an existing ship should be a no-op (return existing state).

  2. _check_ship_exists treats all errors as “missing”: Any gRPC error (including proto mismatches, timeouts, service unavailable) is caught and returns False, triggering a respawn.

Fix

1. Guard SpawnShip against existing ships (physics service)

In simulation.py spawn_ship(), check Redis for an existing ship with the same ship_id before creating a new one. If found, return the existing ship without modification.
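
The guard can be sketched as follows. This is a minimal illustration, not the actual simulation.py code: the Ship fields, the ShipStore class (an in-memory stand-in for Redis), the default values, and the (ship, created) return shape are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical defaults; the real spawn position and fuel capacity may differ.
DEFAULT_POSITION = (0.0, 0.0, 0.0)
DEFAULT_VELOCITY = (0.0, 0.0, 0.0)
FULL_FUEL = 100.0


@dataclass(frozen=True)
class Ship:
    """Hypothetical ship record; field names are assumptions."""
    ship_id: str
    position: tuple
    velocity: tuple
    fuel: float


class ShipStore:
    """In-memory stand-in for the Redis-backed ship storage."""

    def __init__(self):
        self._ships = {}

    def get(self, ship_id):
        # Returns None when the ship is not stored.
        return self._ships.get(ship_id)

    def put(self, ship):
        self._ships[ship.ship_id] = ship


def spawn_ship(store, ship_id):
    """Create a ship only if none exists; otherwise return it untouched.

    Returns (ship, created) so the caller can tell a fresh spawn from a no-op.
    """
    existing = store.get(ship_id)
    if existing is not None:
        # Existing ship: do not touch position, velocity, or fuel.
        return existing, False

    ship = Ship(ship_id, DEFAULT_POSITION, DEFAULT_VELOCITY, FULL_FUEL)
    store.put(ship)
    return ship, True
```

Against a real Redis instance, a separate read followed by a write leaves a small window in which two concurrent spawn requests could both see "missing". An atomic conditional write (for example, the NX option of the Redis SET command) would likely be the safer way to implement the same check.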

2. Distinguish NOT_FOUND from errors (players service)

In _check_ship_exists, only return False for an actual NOT_FOUND gRPC status. For all other errors (UNAVAILABLE, UNKNOWN, DEADLINE_EXCEEDED, etc.), raise the exception so the caller doesn’t trigger a respawn.
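
A sketch of the intended error handling is shown below. To keep it self-contained, StatusCode and RpcError are local stand-ins that mimic the shape of grpc.StatusCode and grpc.RpcError from the Python gRPC library (whose errors raised by client calls expose a code() method); the get_ship_state and spawn_ship callables and the on_login helper are hypothetical, not the players service's real signatures.

```python
from enum import Enum


class StatusCode(Enum):
    """Local stand-in for grpc.StatusCode (subset only)."""
    NOT_FOUND = "not_found"
    UNAVAILABLE = "unavailable"
    DEADLINE_EXCEEDED = "deadline_exceeded"
    UNKNOWN = "unknown"


class RpcError(Exception):
    """Local stand-in for grpc.RpcError, exposing code() like a call error."""

    def __init__(self, code):
        super().__init__(code.name)
        self._code = code

    def code(self):
        return self._code


def check_ship_exists(get_ship_state, ship_id):
    """Return False only when physics reports NOT_FOUND.

    Any other RPC failure is re-raised, because "could not ask" must not be
    treated as "ship does not exist".
    """
    try:
        get_ship_state(ship_id)
    except RpcError as err:
        if err.code() is StatusCode.NOT_FOUND:
            return False
        raise
    return True


def on_login(get_ship_state, spawn_ship, ship_id):
    """Hypothetical login hook: spawn only when the ship is confirmed missing."""
    if not check_ship_exists(get_ship_state, ship_id):
        spawn_ship(ship_id)
```

With this shape, a timeout or UNAVAILABLE during re-login propagates out of on_login before spawn_ship is reached, so the caller can retry or report the failure while the stored ship state stays as it was.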

