Implementation Guide

How to generate Galaxy services from specifications.

Overview

Code is generated from specs using AI-assisted test-driven development. This guide defines the implementation order, technology stack, and verification process.

Technology Stack

Python Services

Component            Technology              Version
Runtime              Python                  3.12
Async framework      asyncio                 stdlib
HTTP/REST            FastAPI                 0.109+
WebSocket            FastAPI WebSockets      (included)
gRPC                 grpcio, grpcio-tools    1.60+
Database             asyncpg (PostgreSQL)    0.29+
Redis                redis-py (async)        5.0+
Validation           Pydantic                2.5+
Logging              structlog               24.1+
Testing              pytest, pytest-asyncio  8.0+
HTTP client (tests)  httpx                   0.26+

Web Client

Component     Technology               Version
Runtime       Node.js                  20 LTS
3D Rendering  Three.js                 0.160+
Build tool    Vite                     5.0+
Testing       Vitest, Testing Library  latest
E2E Testing   Playwright               1.40+

Infrastructure

Component          Technology                Version
Container runtime  Docker                    24+
Orchestration      Kubernetes                1.28+
Ingress            NGINX Ingress Controller  1.9+
Database           PostgreSQL                16
Cache/Queue        Redis                     7

Implementation Phases

Phase 0: Infrastructure (Parallel)

Generate Kubernetes manifests and base configurations.

┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│ k8s/namespace   │  │ k8s/postgres    │  │ k8s/redis       │
└─────────────────┘  └─────────────────┘  └─────────────────┘
         │                   │                    │
         └───────────────────┴────────────────────┘
                             │
                    ┌────────▼────────┐
                    │ k8s/configmap   │
                    │ k8s/secrets     │
                    └─────────────────┘

Deliverables:

  • k8s/namespace.yaml
  • k8s/postgres.yaml (includes postgres-init ConfigMap with schema)
  • k8s/redis.yaml (includes redis-config ConfigMap)
  • k8s/configmap.yaml (galaxy-config, frontend-config, nginx-config)
  • k8s/secrets.yaml (template only; actual secrets created via kubectl)
  • config/ephemeris-j2000.json (bundled fallback ephemeris with body properties; see services.md)

Phase 1: API Specifications

Two types of API specifications are needed:

Protocol Buffers (gRPC for internal service-to-service communication):

specs/data/grpc.md
        │
        ▼
┌───────────────────┐
│ specs/api/*.proto │
└───────────────────┘
        │
        ▼
┌─────────────────┐
│ protoc compile  │
└─────────────────┘
        │
        ▼
┌─────────────────┐
│ *_pb2.py files  │
│ *_pb2_grpc.py   │
└─────────────────┘
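
The compile step above can be driven from Python through the grpc_tools.protoc module that ships with grpcio-tools. The following is a minimal sketch, not the project's actual build command: the helper names protoc_args and compile_protos are hypothetical, and the choice to write generated files into the same directory as the sources is an assumption.

```python
from pathlib import Path


def protoc_args(proto_dir: Path, out_dir: Path) -> list[str]:
    """Build the argument list for grpc_tools.protoc (paths are illustrative)."""
    protos = sorted(str(p) for p in proto_dir.glob("*.proto"))
    return [
        "grpc_tools.protoc",             # argv[0] placeholder expected by protoc.main
        f"--proto_path={proto_dir}",     # lets the protos import each other
        f"--python_out={out_dir}",       # emits *_pb2.py
        f"--grpc_python_out={out_dir}",  # emits *_pb2_grpc.py
        *protos,
    ]


def compile_protos(proto_dir: Path, out_dir: Path) -> int:
    """Run the compiler and return its exit code (0 on success)."""
    from grpc_tools import protoc  # requires the grpcio-tools package

    return protoc.main(protoc_args(proto_dir, out_dir))
```

Keeping the argument construction separate from the call makes it possible to unit-test the file selection without grpcio-tools installed.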

OpenAPI (REST for external client-to-gateway communication):

specs/api/api-gateway.yaml  →  FastAPI implementation

Deliverables:

Proto files (gRPC):

  • specs/api/common.proto
  • specs/api/physics.proto
  • specs/api/players.proto
  • specs/api/galaxy.proto
  • specs/api/tick_engine.proto

OpenAPI (REST):

  • specs/api/api-gateway.yaml (REST and WebSocket endpoints)

Phase 2: Independent Services (Parallel)

Services with no inter-service dependencies can be built simultaneously.

┌─────────────────┐  ┌─────────────────┐
│     galaxy      │  │     players     │
│                 │  │                 │
│ - Body config   │  │ - Auth logic    │
│ - Ephemeris     │  │ - Registration  │
│ - gRPC server   │  │ - JWT tokens    │
└─────────────────┘  └─────────────────┘

galaxy service:

  1. Tests for ephemeris loading (JPL fetch, fallback)
  2. Tests for body state management
  3. Tests for gRPC endpoints
  4. Implementation
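
The "JPL fetch, fallback" behaviour in step 1 can be sketched as a single function that prefers the remote source and falls back to the bundled file. This is an illustration under assumptions: fetch_remote is a hypothetical callable standing in for the JPL request, the function is synchronous for brevity (the real service is presumably async), and the returned source label is invented for the example.

```python
import json
from pathlib import Path
from typing import Any, Callable


def load_ephemeris(
    fetch_remote: Callable[[], dict[str, Any]],
    fallback_path: Path,
) -> tuple[dict[str, Any], str]:
    """Return (ephemeris, source), preferring the remote fetch.

    fetch_remote is a hypothetical wrapper around the JPL request; any
    exception it raises triggers the bundled fallback file.
    """
    try:
        return fetch_remote(), "remote"
    except Exception:
        # Broad catch is deliberate in this sketch: network, timeout, and
        # parse errors all lead to the same fallback path.
        with fallback_path.open(encoding="utf-8") as fh:
            return json.load(fh), "fallback"
```

Passing the fetcher in as a callable lets the tests exercise both branches without network access.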

players service:

  1. Tests for registration (validation, hashing)
  2. Tests for authentication (JWT generation, validation)
  3. Tests for gRPC endpoints
  4. Implementation

Phase 3: Physics Service

Depends on the galaxy service for celestial body data. Physics does not call galaxy directly over gRPC; it receives the data through tick-engine’s InitializeBodies call.

┌─────────────────┐
│     physics     │
│                 │
│ - N-body sim    │
│ - Ship movement │
│ - Attitude ctrl │
│ - Fuel tracking │
└─────────────────┘

Implementation order:

  1. Tests for N-body gravity calculation
  2. Tests for Leapfrog integrator
  3. Tests for ship thrust/movement
  4. Tests for attitude control (wheels, RCS)
  5. Tests for fuel consumption
  6. Tests for service requests (fuel, reset)
  7. Tests for gRPC endpoints
  8. Implementation
  9. Physics validation tests (orbital periods, energy conservation)
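
Steps 1, 2, and 9 can be illustrated with a small pure-Python sketch of N-body gravity, a leapfrog step, and an energy function for conservation checks. It assumes the kick-drift-kick form of leapfrog, SI units, and no softening term; the guide does not state which variant the physics service uses, and the Body class and function names are hypothetical.

```python
import math
from dataclasses import dataclass

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


@dataclass
class Body:
    mass: float       # kilograms
    pos: list[float]  # metres, [x, y, z]
    vel: list[float]  # metres per second, [vx, vy, vz]


def accelerations(bodies: list[Body]) -> list[list[float]]:
    """Newtonian gravitational acceleration on each body from all others."""
    acc = [[0.0, 0.0, 0.0] for _ in bodies]
    for i, a in enumerate(bodies):
        for j, b in enumerate(bodies):
            if i == j:
                continue
            d = [b.pos[k] - a.pos[k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            for k in range(3):
                acc[i][k] += G * b.mass * d[k] / r**3
    return acc


def leapfrog_step(bodies: list[Body], dt: float) -> None:
    """Advance all bodies by dt using kick-drift-kick leapfrog."""
    acc = accelerations(bodies)  # evaluated at the starting positions
    for body, a in zip(bodies, acc):
        for k in range(3):
            body.vel[k] += 0.5 * dt * a[k]   # first half kick
            body.pos[k] += dt * body.vel[k]  # full drift
    acc = accelerations(bodies)  # re-evaluated at the new positions
    for body, a in zip(bodies, acc):
        for k in range(3):
            body.vel[k] += 0.5 * dt * a[k]   # second half kick


def total_energy(bodies: list[Body]) -> float:
    """Kinetic plus pairwise potential energy, for conservation checks."""
    kinetic = sum(0.5 * b.mass * sum(v * v for v in b.vel) for b in bodies)
    potential = 0.0
    for i, a in enumerate(bodies):
        for b in bodies[i + 1:]:
            potential -= G * a.mass * b.mass / math.dist(a.pos, b.pos)
    return kinetic + potential
```

A validation test in the spirit of step 9 would place a planet on a near-circular orbit, run many steps, and assert that the relative change in total_energy stays below a chosen tolerance.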

Phase 4: Tick Engine

Depends on physics service.

┌─────────────────┐
│   tick-engine   │
│                 │
│ - Tick loop     │
│ - Timing        │
│ - Catch-up      │
│ - Pause/resume  │
└─────────────────┘

Implementation order:

  1. Tests for tick timing
  2. Tests for catch-up behavior
  3. Tests for pause/resume
  4. Tests for snapshot creation and recovery
  5. Tests for Redis event publishing
  6. Tests for gRPC endpoints
  7. Implementation
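
The timing and catch-up tests in steps 1 and 2 are easier to write when the scheduling decision is a pure function of timestamps. The sketch below is one possible policy, not the one defined in the spec: the function name ticks_due, the max_catch_up cap, and the choice to drop the backlog once the cap is exceeded are all assumptions.

```python
def ticks_due(
    now: float,
    last_tick_time: float,
    tick_interval: float,
    max_catch_up: int,
) -> tuple[int, float]:
    """Return (number of ticks to run now, updated last_tick_time).

    Assumed policy: run every missed tick up to max_catch_up; beyond that,
    run a bounded burst and resynchronise to `now`, dropping the backlog.
    """
    if tick_interval <= 0:
        raise ValueError("tick_interval must be positive")
    due = int((now - last_tick_time) // tick_interval)
    if due <= 0:
        return 0, last_tick_time  # not yet time for the next tick
    if due > max_catch_up:
        return max_catch_up, now  # too far behind: bounded burst, then resync
    # Advance by whole intervals so fractional lateness carries over.
    return due, last_tick_time + due * tick_interval
```

In a real loop the timestamps would come from a monotonic clock such as time.monotonic(), so that wall-clock adjustments cannot trigger a spurious catch-up.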

Phase 5: API Gateway

Depends on all game services.

┌─────────────────┐
│   api-gateway   │
│                 │
│ - REST API      │
│ - WebSocket     │
│ - Auth proxy    │
│ - Rate limiting │
└─────────────────┘

Implementation order:

  1. Tests for REST endpoints (auth, services)
  2. Tests for WebSocket connection lifecycle
  3. Tests for message handling (rotate, thrust, service)
  4. Tests for state broadcasting
  5. Tests for rate limiting
  6. Tests for error handling
  7. Implementation

Phase 6: Web Clients (Parallel)

┌─────────────────┐  ┌─────────────────┐
│   web-client    │  │ admin-dashboard │
└─────────────────┘  └─────────────────┘

web-client:

  1. Three.js scene setup tests
  2. HUD component tests
  3. WebSocket connection tests
  4. Control input tests
  5. Implementation

admin-dashboard:

  1. Auth flow tests
  2. Control panel tests
  3. Metrics display tests
  4. Implementation

Phase 7: Admin CLI

┌─────────────────┐
│    admin-cli    │
└─────────────────┘

Implementation order:

  1. Tests for command parsing
  2. Tests for REST API client calls (admin-cli communicates via api-gateway REST endpoints)
  3. Implementation

Phase 8: Integration & E2E

After all services are complete:

  1. Integration tests (service-to-service)
  2. E2E tests (full player flow)
  3. Load testing

Per-Service Structure

Each Python service follows this structure:

services/{service}/
├── Dockerfile
├── README.md
├── requirements.txt
├── pyproject.toml
├── src/
│   ├── __init__.py
│   ├── main.py           # Entry point (includes logging config)
│   ├── config.py         # Configuration
│   ├── models.py         # Pydantic models
│   ├── service.py        # Business logic
│   ├── grpc_server.py    # gRPC implementation
│   └── health.py         # Health endpoints
├── tests/
│   ├── __init__.py
│   ├── conftest.py       # Fixtures
│   ├── test_models.py
│   ├── test_service.py
│   └── test_grpc.py
└── proto/                # Source proto files (copied from specs/api/)
    ├── common.proto      # Compiled to *_pb2.py during Docker build
    ├── physics.proto
    ├── players.proto
    ├── galaxy.proto
    └── tick_engine.proto

Note: Each service needs all proto files even if it only uses some, because protos import each other (e.g., all import common.proto). The Dockerfile compiles these to *_pb2.py and *_pb2_grpc.py files during build.

Import strategy: Each Dockerfile sets ENV PYTHONPATH=/app/proto:/app, which allows clean imports like from proto import physics_pb2. Production source code must not use sys.path.insert for proto imports — rely on PYTHONPATH instead. Test files may use sys.path.insert with relative paths (via Path(__file__).parent.parent) for running outside Docker.
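
For test files run outside Docker, the path setup described above might look like the following sketch. The helper name add_service_paths is hypothetical; it mirrors PYTHONPATH=/app/proto:/app by putting the service's proto/ directory first and the service root second.

```python
import sys
from pathlib import Path


def add_service_paths(test_file: str) -> list[str]:
    """Prepend the service root and its proto/ dir to sys.path (tests only).

    Returns the entries that were actually added, so repeated calls are no-ops.
    """
    service_root = Path(test_file).resolve().parent.parent
    added = []
    # Insert the root first, then proto/, so proto/ ends up at index 0,
    # matching the order in PYTHONPATH=/app/proto:/app.
    for path in (service_root, service_root / "proto"):
        entry = str(path)
        if entry not in sys.path:
            sys.path.insert(0, entry)
            added.append(entry)
    return added


# Typical use at the top of tests/conftest.py:
#     add_service_paths(__file__)
```

Keeping this in conftest.py confines sys.path manipulation to the test suite, so production modules continue to rely on PYTHONPATH alone.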

galaxy service additional files:

The galaxy service requires the bundled ephemeris file:

services/galaxy/
├── config/
│   └── ephemeris-j2000.json  # Copied from project config/
└── ... (standard structure)

Code Generation Workflow

For each service:

1. Read Spec
      │
      ▼
2. Generate Tests (TDD)
      │
      ├── Unit tests for models
      ├── Unit tests for business logic
      ├── Integration tests for gRPC
      └── Integration tests for dependencies
      │
      ▼
3. Run Tests (expect failures)
      │
      ▼
4. Generate Implementation
      │
      ▼
5. Run Tests (expect pass)
      │
      ▼
6. Generate Dockerfile
      │
      ▼
7. Build & Test Container
      │
      ▼
8. Generate K8s Manifest
      │
      ▼
9. Deploy to galaxy-dev
      │
      ▼
10. Smoke Test

Parallel Execution Strategy

Phase Parallelization

Phase               Can Parallelize
0 - Infrastructure  All manifests in parallel
1 - API Specs       All protos in parallel, OpenAPI separately
2 - Independent     galaxy ∥ players
3 - Physics         Sequential (depends on galaxy)
4 - Tick Engine     Sequential (depends on physics)
5 - API Gateway     Sequential (depends on all)
6 - Web Clients     web-client ∥ admin-dashboard
7 - Admin CLI       Sequential
8 - Integration     After all services
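
The backend portion of this schedule follows directly from the service dependencies, and can be derived mechanically with the standard library's graphlib. In this sketch the DEPENDS_ON mapping is transcribed from the phase descriptions in this guide ("all game services" is read as the four backend services), and build_batches is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Service -> services it depends on, as stated in the phase descriptions.
DEPENDS_ON: dict[str, set[str]] = {
    "galaxy": set(),
    "players": set(),
    "physics": {"galaxy"},
    "tick-engine": {"physics"},
    "api-gateway": {"galaxy", "players", "physics", "tick-engine"},
}


def build_batches(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group services into batches; each batch can be built in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose deps are done
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

Calling build_batches(DEPENDS_ON) yields one batch containing galaxy and players, followed by single-service batches for physics, tick-engine, and api-gateway, which matches phases 2 through 5.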

Within-Service Parallelization

For larger services, tests and implementation can be split by feature area and developed in parallel, then merged:

physics service:
├── Agent A: N-body tests + implementation
├── Agent B: Ship movement tests + implementation
├── Agent C: Attitude control tests + implementation
└── Merge → Integration tests → Final assembly

Build Parallelization

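# Build independent service images concurrently as background jobs.
docker build -t ghcr.io/erikevenson/galaxy/galaxy:1.0.0 services/galaxy &
docker build -t ghcr.io/erikevenson/galaxy/players:1.0.0 services/players &
docker build -t ghcr.io/erikevenson/galaxy/physics:1.0.0 services/physics &
# A bare `wait` blocks until all jobs finish but returns 0 even if one failed,
# so check the build output before relying on the images.
wait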

Verification Checklist

Before marking a service complete:

Code Quality

  • All tests pass
  • Coverage meets target (see testing.md)
  • No linting errors (ruff)
  • Type hints complete (mypy passes)

Functionality

  • All spec requirements implemented
  • All gRPC methods work
  • Health endpoints respond
  • Graceful shutdown works

Container

  • Docker build succeeds
  • Container starts without errors
  • Health check passes
  • Runs as non-root user

Kubernetes

  • Manifest is valid (kubectl apply --dry-run=client)
  • Deploys to galaxy-dev
  • Readiness probe passes
  • Liveness probe passes
  • Can communicate with dependencies

Documentation

  • README.md complete
  • Configuration documented
  • API documented (if applicable)

Handling Missing Requirements

During implementation, gaps in specifications may be discovered. Follow this process:

Workflow

Implementing service...
       │
       ▼
Discover missing requirement
       │
       ▼
┌──────────────────────────────┐
│ 1. STOP implementation       │
│    Do not guess or improvise │
└──────────────────────────────┘
       │
       ▼
┌──────────────────────────────┐
│ 2. Create GitHub issue       │
│    Label: bug, spec-gap      │
│    Title: "Spec: <what's     │
│            missing>"         │
└──────────────────────────────┘
       │
       ▼
┌──────────────────────────────┐
│ 3. Ask for decision          │
│    Present options if known  │
└──────────────────────────────┘
       │
       ▼
┌──────────────────────────────┐
│ 4. Update spec FIRST         │
│    Document the requirement  │
│    Close the issue           │
└──────────────────────────────┘
       │
       ▼
┌──────────────────────────────┐
│ 5. Continue implementation   │
│    From updated spec         │
└──────────────────────────────┘

Rationale

  • Specs are source code — Implementation must match spec exactly
  • Regeneration safety — Undocumented features would be lost on regeneration
  • Consistency — All requirements in one place, not scattered in code comments

Example

Implementing physics service...
  ↓
Realize: spec doesn't say what happens when ship enters body's radius
  ↓
Stop. Create issue: "Spec: Ship-body collision behavior undefined"
  ↓
Ask: "Should ships be destroyed? Pass through? Bounce?"
  ↓
Answer received: "Pass through for MVP, log warning"
  ↓
Update services.md physics section with collision handling
  ↓
Close issue
  ↓
Continue implementation

What Qualifies as Missing Requirement

Gap Type                                      Action
Behavior undefined                            Stop, ask, update spec
Edge case not covered                         Stop, ask, update spec
Error handling unclear                        Stop, ask, update spec
Formula incomplete                            Stop, ask, update spec
Obvious typo in spec                          Fix spec, note in commit, continue
Implementation detail (e.g., variable names)  Use judgment, continue

Rule of thumb: If the decision affects observable behavior or would change test assertions, it needs to be in the spec.

Regeneration Process

To regenerate a service from spec:

  1. Identify spec version — Note which spec commit to use
  2. Reference old code — Old code is available via git history as a development aid for patterns and edge cases (use git show HEAD:services/{service}/ to view)
  3. Clear service directory — Remove services/{service}/ directory for clean generation
  4. Follow workflow — Execute code generation workflow above
  5. Increment version — Update version in pyproject.toml
  6. Deploy — Roll out new version to Kubernetes

Note: Git history preserves all previous implementations. If needed, create a branch or tag before regeneration to make old code easier to reference.

Estimated Effort

Phase  Services           Parallelism  Relative Effort
0      Infrastructure     Full         Low
1      API Specs          Full         Low
2      galaxy, players    2x           Medium
3      physics            1x           High (complex logic)
4      tick-engine        1x           Medium
5      api-gateway        1x           Medium-High
6      web-client, admin  2x           Medium
7      admin-cli          1x           Low
8      Integration        1x           Medium

Physics service is the most complex due to N-body simulation, attitude control, and fuel management logic.

