Error Handling Policy

Exception Catching Rules

1. Catch Specific Exceptions for Known Operations

When calling a known external service or performing a specific I/O operation, catch the specific exception type for that operation:

Operation               Specific Exception(s)
gRPC calls              grpc.aio.AioRpcError
Redis operations        redis.RedisError
HTTP requests (httpx)   httpx.RequestError, httpx.HTTPStatusError
WebSocket sends         WebSocketDisconnect, ConnectionError, RuntimeError
File I/O                FileNotFoundError, OSError
JSON parsing            json.JSONDecodeError, ValueError, KeyError
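
As an illustration of the File I/O and JSON parsing rows, here is a minimal sketch using only the standard library. The function name load_config and the choice to return None on failure are assumptions made for this example, and it uses stdlib logging with %-style arguments rather than the keyword-argument logger shown elsewhere on this page.

```python
import json
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)


def load_config(path: Path) -> Optional[dict]:
    """Read a JSON object from disk; return None if it is missing or invalid."""
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError:
        logger.warning("Config file not found: %s", path)
        return None
    except OSError as e:
        # Other I/O failures, e.g. permission denied or the path is a directory.
        logger.error("Could not read %s: %s: %s", path, type(e).__name__, e)
        return None

    try:
        data = json.loads(text)
    except json.JSONDecodeError as e:
        logger.error("Invalid JSON in %s (line %d): %s", path, e.lineno, e.msg)
        return None

    if not isinstance(data, dict):
        logger.error("Expected a JSON object in %s, got %s", path, type(data).__name__)
        return None
    return data
```

FileNotFoundError is listed before OSError because it is a subclass of OSError; the more specific handler has to come first to be reached.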

2. Log with Exception Type Context

When catching specific exceptions, log details that aid diagnosis:

  • gRPC: Log e.code().name and e.details() to capture the gRPC status code.
  • Redis: Log the exception type and message.
  • HTTP: Log the status code for HTTPStatusError and the request URL for RequestError.

3. Keep except Exception as Safety Nets Only

Generic except Exception catches are allowed only in:

  • Event loop top-level: The main while True loop in tick processing, WebSocket event streaming, and similar long-running loops. These prevent a single unexpected error from crashing the service.
  • Startup/shutdown cleanup: Where failure should be logged but not propagate (e.g., closing connections during shutdown).
  • CLI boundaries: Top-level entry points where any unhandled exception should produce a user-friendly error.

In all other locations, use specific exception types.
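
A CLI boundary safety net might look like the sketch below. The names run and main, and the exit code 1, are hypothetical choices for illustration; the only point is that the generic catch sits at the outermost entry point and still logs the exception type.

```python
import logging
import sys
from typing import Sequence

logger = logging.getLogger(__name__)


def run(argv: Sequence[str]) -> None:
    """Hypothetical application logic; raises on bad input."""
    if not argv:
        raise ValueError("no command given")
    print(f"running {argv[0]}")


def main(argv: Sequence[str]) -> int:
    """Top-level entry point: the generic catch lives here and nowhere deeper."""
    try:
        run(argv)
    except Exception as e:
        # Full traceback goes to the log; the user sees a short message.
        logger.exception("Unhandled error: %s", type(e).__name__)
        print(f"error: {e}", file=sys.stderr)
        return 1
    return 0
```

A script would typically invoke this as sys.exit(main(sys.argv[1:])).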

4. Never Silently Swallow Exceptions

Every except block must do at least one of the following:

  • Log the exception (at minimum logger.error or logger.warning)
  • Re-raise the exception
  • Return an explicit error value

The only exception is cleanup code (e.g., closing WebSockets during shutdown) where logging would be noise.

This rule also applies to client-side JavaScript: an empty .catch(() => {}) on a promise chain is not acceptable; the handler must at minimum log the failure with console.warn.
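
The three acceptable shapes for an except block can be sketched as follows. The parse_port_* functions and the 8080 default are invented for this example; the second variant re-raises with added context via raise ... from, which is one common way to satisfy the re-raise option.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def parse_port_logged(raw: str) -> int:
    """Option 1: log the exception, then fall back to a default."""
    try:
        return int(raw)
    except ValueError as e:
        logger.warning("Invalid port %r, using 8080: %s", raw, e)
        return 8080


def parse_port_strict(raw: str) -> int:
    """Option 2: re-raise, adding context and keeping the original as the cause."""
    try:
        return int(raw)
    except ValueError as e:
        raise ValueError(f"PORT must be an integer, got {raw!r}") from e


def parse_port_optional(raw: str) -> Optional[int]:
    """Option 3: return an explicit error value that the caller must check."""
    try:
        return int(raw)
    except ValueError:
        return None
```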

5. Preserve except HTTPException: raise Guards

In FastAPI route handlers, when except Exception follows code that may raise HTTPException, always include except HTTPException: raise before the generic catch to avoid converting client errors into 500s.
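
The guard can be sketched as below. So that the example runs without FastAPI installed, it defines a small stand-in HTTPException class; in a real handler this would be imported from fastapi instead. The load_ship and get_ship functions are hypothetical, and the handler is shown as a plain synchronous function for brevity.

```python
import logging

logger = logging.getLogger(__name__)


class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException (assumption: same two attributes)."""

    def __init__(self, status_code: int, detail: str = "") -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def load_ship(ship_id: str) -> dict:
    """Hypothetical lookup that signals a client error with a 404."""
    ships = {"s1": {"name": "Kestrel"}}
    if ship_id not in ships:
        raise HTTPException(status_code=404, detail="Ship not found")
    return ships[ship_id]


def get_ship(ship_id: str) -> dict:
    """Body of a route handler that also has a generic catch."""
    try:
        return load_ship(ship_id)
    except HTTPException:
        # Deliberate HTTP responses pass through untouched.
        raise
    except Exception as e:
        logger.error("get_ship failed: %s: %s", type(e).__name__, e)
        raise HTTPException(status_code=500, detail="Internal server error") from e
```

Without the except HTTPException: raise clause, the 404 raised by load_ship would be caught by the generic handler and returned to the client as a 500.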

6. Guard Uninitialized Resources

Service classes that hold connection pools, clients, or other resources that require explicit initialization (e.g., Database._pool, gRPC channels) must guard every access with a None check. If the resource is None, raise a descriptive RuntimeError rather than allowing an opaque AttributeError to propagate.

# Good: Explicit guard with descriptive error
if self._pool is None:
    raise RuntimeError("Database not connected. Call connect() first.")
async with self._pool.acquire(timeout=5) as conn:
    ...

# Bad: No guard — raises AttributeError: 'NoneType' object has no attribute 'acquire'
async with self._pool.acquire(timeout=5) as conn:
    ...

This applies to all methods that use the resource, not just the first one. A private helper method _ensure_connected() may be used to centralize the check.
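
One way to centralize the check is sketched below. FakePool and its fetch_value method are stand-ins invented so the example runs on its own; a real service would hold its actual pool type. Having _ensure_connected() return the pool is a design choice of this sketch: callers get a non-None value and never touch self._pool directly.

```python
import asyncio
from typing import Optional


class FakePool:
    """Hypothetical stand-in for a real connection pool."""

    async def fetch_value(self) -> int:
        return 1


class Database:
    def __init__(self) -> None:
        self._pool: Optional[FakePool] = None

    async def connect(self) -> None:
        self._pool = FakePool()

    def _ensure_connected(self) -> FakePool:
        """Return the pool, or raise a descriptive error if connect() was never called."""
        if self._pool is None:
            raise RuntimeError("Database not connected. Call connect() first.")
        return self._pool

    async def ping(self) -> int:
        # Every method that uses the pool goes through the same guard.
        pool = self._ensure_connected()
        return await pool.fetch_value()
```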

Examples

Good: Specific gRPC catch

try:
    response = await stub.ProcessTick(request, timeout=5)
except grpc.aio.AioRpcError as e:
    logger.error("ProcessTick RPC failed", code=e.code().name, details=e.details())
    raise HTTPException(status_code=502, detail="Physics service unavailable")

Good: Safety net in event loop

while not shutdown_event.is_set():
    try:
        await process_next_event()
    except asyncio.CancelledError:
        break
    except Exception as e:
        logger.error("Event loop error", error=str(e))
        await asyncio.sleep(1)

Good: Safe Redis value parsing with defaults

# Redis returns all values as strings. Corrupted data must not crash the handler.
try:
    tick = int(maneuver.get("started_tick", 0))
except (ValueError, TypeError):
    logger.warning("Invalid Redis value for started_tick", raw=maneuver.get("started_tick"))
    tick = 0

try:
    # Use .get() so a missing field raises TypeError (float(None)) instead of an uncaught KeyError.
    inclination = float(maneuver.get("target_inclination"))
except (ValueError, TypeError):
    logger.warning("Invalid Redis value for target_inclination", raw=maneuver.get("target_inclination"))
    inclination = None

Rule: All int() and float() conversions of Redis string values must be wrapped in try/except (ValueError, TypeError) with:

  • A sensible default (0 for ints, 0.0 for floats, or None where the field is optional)
  • A logger.warning call identifying the field and raw value

This applies to both routes.py (WebSocket message handlers) and websocket_manager.py (event loop and state broadcasting).
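
To avoid repeating the try/except at every call site, the rule could be wrapped in small helpers along these lines. The names redis_int and redis_float are hypothetical, and the sketch uses stdlib logging rather than the keyword-argument logger in the examples above. It assumes a missing field should yield the default silently, matching the started_tick example, and only present-but-unparseable values are logged.

```python
import logging
from typing import Any, Mapping, Optional

logger = logging.getLogger(__name__)


def redis_int(data: Mapping[str, Any], field: str, default: int = 0) -> int:
    """Convert a Redis string field to int, falling back to a default."""
    if field not in data:
        return default  # absent field: use the default without a warning
    raw = data[field]
    try:
        return int(raw)
    except (ValueError, TypeError):
        logger.warning("Invalid Redis value for %s: %r", field, raw)
        return default


def redis_float(
    data: Mapping[str, Any], field: str, default: Optional[float] = 0.0
) -> Optional[float]:
    """Convert a Redis string field to float; pass default=None for optional fields."""
    if field not in data:
        return default
    raw = data[field]
    try:
        return float(raw)
    except (ValueError, TypeError):
        logger.warning("Invalid Redis value for %s: %r", field, raw)
        return default
```

With these in place, the two conversions above would reduce to redis_int(maneuver, "started_tick") and redis_float(maneuver, "target_inclination", default=None).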

Bad: Generic catch hiding error type

try:
    response = await stub.GetStatus(request, timeout=5)
except Exception as e:  # Hides whether this is a network error, timeout, or bug
    logger.error("GetStatus failed", error=str(e))

Galaxy — Kubernetes-based multiplayer space game
