Implementation Guide
How to generate Galaxy services from specifications.
Overview
Code is generated from specs using AI-assisted test-driven development. This guide defines the implementation order, technology stack, and verification process.
Technology Stack
Python Services
| Component | Technology | Version |
|---|---|---|
| Runtime | Python | 3.12 |
| Async framework | asyncio | stdlib |
| HTTP/REST | FastAPI | 0.109+ |
| WebSocket | FastAPI WebSockets | (included) |
| gRPC | grpcio, grpcio-tools | 1.60+ |
| Database | asyncpg (PostgreSQL) | 0.29+ |
| Redis | redis-py (async) | 5.0+ |
| Validation | Pydantic | 2.5+ |
| Logging | structlog | 24.1+ |
| Testing | pytest, pytest-asyncio | 8.0+ |
| HTTP client (tests) | httpx | 0.26+ |
Web Client
| Component | Technology | Version |
|---|---|---|
| Runtime | Node.js | 20 LTS |
| 3D Rendering | Three.js | 0.160+ |
| Build tool | Vite | 5.0+ |
| Testing | Vitest, Testing Library | latest |
| E2E Testing | Playwright | 1.40+ |
Infrastructure
| Component | Technology | Version |
|---|---|---|
| Container runtime | Docker | 24+ |
| Orchestration | Kubernetes | 1.28+ |
| Ingress | NGINX Ingress Controller | 1.9+ |
| Database | PostgreSQL | 16 |
| Cache/Queue | Redis | 7 |
Implementation Phases
Phase 0: Infrastructure (Parallel)
Generate Kubernetes manifests and base configurations.
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ k8s/namespace │ │ k8s/postgres │ │ k8s/redis │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
└───────────────────┴────────────────────┘
│
┌────────▼────────┐
│ k8s/configmap │
│ k8s/secrets │
└─────────────────┘
Deliverables:
- k8s/namespace.yaml
- k8s/postgres.yaml (includes postgres-init ConfigMap with schema)
- k8s/redis.yaml (includes redis-config ConfigMap)
- k8s/configmap.yaml (galaxy-config, frontend-config, nginx-config)
- k8s/secrets.yaml (template only; actual secrets created via kubectl)
- config/ephemeris-j2000.json (bundled fallback ephemeris with body properties; see services.md)
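As one concrete example of these deliverables, a minimal namespace manifest might look like the following (the galaxy-dev name matches the deployment target used later in this guide; labels are illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: galaxy-dev
  labels:
    app.kubernetes.io/part-of: galaxy
```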
Phase 1: API Specifications
Two types of API specifications are needed:
Protocol Buffers (gRPC for internal service-to-service communication):
specs/data/grpc.md
│
▼
┌─────────────────┐
│ specs/api/*.proto │
└─────────────────┘
│
▼
┌─────────────────┐
│ protoc compile │
└─────────────────┘
│
▼
┌─────────────────┐
│ *_pb2.py files │
│ *_pb2_grpc.py │
└─────────────────┘
OpenAPI (REST for external client-to-gateway communication):
specs/api/api-gateway.yaml → FastAPI implementation
Deliverables:
Proto files (gRPC):
- specs/api/common.proto
- specs/api/physics.proto
- specs/api/players.proto
- specs/api/galaxy.proto
- specs/api/tick_engine.proto
OpenAPI (REST):
- specs/api/api-gateway.yaml (REST and WebSocket endpoints)
Phase 2: Independent Services (Parallel)
Services with no inter-service dependencies can be built simultaneously.
┌─────────────────┐ ┌─────────────────┐
│ galaxy │ │ players │
│ │ │ │
│ - Body config │ │ - Auth logic │
│ - Ephemeris │ │ - Registration │
│ - gRPC server │ │ - JWT tokens │
└─────────────────┘ └─────────────────┘
galaxy service:
- Tests for ephemeris loading (JPL fetch, fallback)
- Tests for body state management
- Tests for gRPC endpoints
- Implementation
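The JPL-fetch-with-fallback behavior is easiest to test when the fetch step is injected. A minimal sketch, assuming this structure (load_ephemeris and the fetch callable are hypothetical names; the real interface comes from the galaxy spec):

```python
import json
from pathlib import Path


def load_ephemeris(fetch, fallback_path: Path) -> dict:
    """Try the remote fetch first; fall back to the bundled J2000 file.

    'fetch' is any zero-argument callable returning ephemeris data, so
    tests can inject a stub instead of making a real JPL HTTP call.
    """
    try:
        return fetch()
    except Exception:
        # Any fetch failure (network, parse, timeout) uses the bundled file
        return json.loads(fallback_path.read_text())
```

Injecting the fetch callable keeps the fallback path fully unit-testable without network access, matching the "JPL fetch, fallback" test item above.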
players service:
- Tests for registration (validation, hashing)
- Tests for authentication (JWT generation, validation)
- Tests for gRPC endpoints
- Implementation
Phase 3: Physics Service
Depends on galaxy service for celestial body data (received via tick-engine’s InitializeBodies call, not direct gRPC).
┌─────────────────┐
│ physics │
│ │
│ - N-body sim │
│ - Ship movement │
│ - Attitude ctrl │
│ - Fuel tracking │
└─────────────────┘
Implementation order:
- Tests for N-body gravity calculation
- Tests for Leapfrog integrator
- Tests for ship thrust/movement
- Tests for attitude control (wheels, RCS)
- Tests for fuel consumption
- Tests for service requests (fuel, reset)
- Tests for gRPC endpoints
- Implementation
- Physics validation tests (orbital periods, energy conservation)
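The Leapfrog integrator and its energy-conservation validation can be sketched together. This is a minimal 2D point-mass version, not the service's actual implementation; a circular orbit with GM = 1 and r = 1 has total energy -0.5, which a symplectic integrator should hold nearly constant:

```python
import math


def accel(pos, gm=1.0):
    # Point-mass gravity toward the origin: a = -GM * r / |r|^3
    r = math.hypot(pos[0], pos[1])
    f = -gm / r**3
    return (f * pos[0], f * pos[1])


def leapfrog_step(pos, vel, dt, gm=1.0):
    # Kick-drift-kick Leapfrog: symplectic, bounded long-term energy error
    ax, ay = accel(pos, gm)
    vx = vel[0] + 0.5 * dt * ax
    vy = vel[1] + 0.5 * dt * ay
    px = pos[0] + dt * vx
    py = pos[1] + dt * vy
    ax, ay = accel((px, py), gm)
    return (px, py), (vx + 0.5 * dt * ax, vy + 0.5 * dt * ay)


def energy(pos, vel, gm=1.0):
    # Specific orbital energy: kinetic + potential
    r = math.hypot(pos[0], pos[1])
    return 0.5 * (vel[0] ** 2 + vel[1] ** 2) - gm / r


# Circular orbit: GM = 1, r = 1, v = 1, period 2*pi
pos, vel = (1.0, 0.0), (0.0, 1.0)
e0 = energy(pos, vel)
dt = 1e-3
for _ in range(int(2 * math.pi / dt)):
    pos, vel = leapfrog_step(pos, vel, dt)
drift = abs(energy(pos, vel) - e0)
```

The validation test then asserts that the energy drift over one full orbit stays below a tolerance, which catches both integrator bugs and sign errors in the force law.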
Phase 4: Tick Engine
Depends on physics service.
┌─────────────────┐
│ tick-engine │
│ │
│ - Tick loop │
│ - Timing │
│ - Catch-up │
│ - Pause/resume │
└─────────────────┘
Implementation order:
- Tests for tick timing
- Tests for catch-up behavior
- Tests for pause/resume
- Tests for snapshot creation and recovery
- Tests for Redis event publishing
- Tests for gRPC endpoints
- Implementation
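The catch-up behavior reduces to a pure function that is easy to unit test. A sketch with hypothetical names, including a cap so a long stall skips ahead instead of spiraling into an ever-growing backlog (time units are whatever the caller uses, e.g. milliseconds):

```python
def ticks_due(last_tick, now, interval, max_catch_up=10):
    """Return (ticks_to_run, new_last_tick).

    If the loop fell further behind than max_catch_up ticks, drop the
    backlog and resynchronize to 'now' rather than simulate every missed
    tick (which would only make the loop fall further behind).
    """
    elapsed = now - last_tick
    if elapsed < interval:
        return 0, last_tick
    missed = int(elapsed // interval)
    if missed > max_catch_up:
        return max_catch_up, now
    return missed, last_tick + missed * interval
```

Keeping this logic separate from the asyncio loop itself means the tick-timing and catch-up tests need no sleeps or wall-clock dependence.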
Phase 5: API Gateway
Depends on all game services.
┌─────────────────┐
│ api-gateway │
│ │
│ - REST API │
│ - WebSocket │
│ - Auth proxy │
│ - Rate limiting │
└─────────────────┘
Implementation order:
- Tests for REST endpoints (auth, services)
- Tests for WebSocket connection lifecycle
- Tests for message handling (rotate, thrust, service)
- Tests for state broadcasting
- Tests for rate limiting
- Tests for error handling
- Implementation
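Rate limiting is commonly implemented as a token bucket; a minimal sketch (the class name and parameters are illustrative, not the gateway's actual API — a deployed gateway would more likely keep buckets in Redis so limits survive restarts and apply across replicas):

```python
class TokenBucket:
    """Allow bursts up to 'capacity', refilling at 'rate' tokens/second."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill based on elapsed time, then spend one token if available
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Passing the clock in as an argument (rather than calling time.monotonic() inside) is what makes the rate-limiting tests deterministic.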
Phase 6: Web Clients (Parallel)
┌─────────────────┐ ┌─────────────────┐
│ web-client │ │ admin-dashboard │
└─────────────────┘ └─────────────────┘
web-client:
- Three.js scene setup tests
- HUD component tests
- WebSocket connection tests
- Control input tests
- Implementation
admin-dashboard:
- Auth flow tests
- Control panel tests
- Metrics display tests
- Implementation
Phase 7: Admin CLI
┌─────────────────┐
│ admin-cli │
└─────────────────┘
Implementation order:
- Tests for command parsing
- Tests for REST API client calls (admin-cli communicates via api-gateway REST endpoints)
- Implementation
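Command parsing can be tested without touching the network by building the parser in a function. A sketch with hypothetical subcommands (the real command set comes from the admin-cli spec):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Illustrative subcommand layout only; actual commands come from the spec
    parser = argparse.ArgumentParser(prog="galaxy-admin")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("pause", help="Pause the tick engine")
    sub.add_parser("resume", help="Resume the tick engine")

    player = sub.add_parser("player", help="Player management")
    player.add_argument("player_id")
    player.add_argument("--reset", action="store_true")
    return parser
```

Parsing tests then assert on the namespace, while the REST-client tests mock the api-gateway calls separately.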
Phase 8: Integration & E2E
After all services complete:
- Integration tests (service-to-service)
- E2E tests (full player flow)
- Load testing
Per-Service Structure
Each Python service follows this structure:
services/{service}/
├── Dockerfile
├── README.md
├── requirements.txt
├── pyproject.toml
├── src/
│ ├── __init__.py
│ ├── main.py # Entry point (includes logging config)
│ ├── config.py # Configuration
│ ├── models.py # Pydantic models
│ ├── service.py # Business logic
│ ├── grpc_server.py # gRPC implementation
│ └── health.py # Health endpoints
├── tests/
│ ├── __init__.py
│ ├── conftest.py # Fixtures
│ ├── test_models.py
│ ├── test_service.py
│ └── test_grpc.py
└── proto/ # Source proto files (copied from specs/api/proto/)
├── common.proto # Compiled to *_pb2.py during Docker build
├── physics.proto
├── players.proto
├── galaxy.proto
└── tick_engine.proto
Note: Each service needs all proto files even if it only uses some, because protos import each other (e.g., all import common.proto). The Dockerfile compiles these to *_pb2.py and *_pb2_grpc.py files during build.
Import strategy: Each Dockerfile sets ENV PYTHONPATH=/app/proto:/app, which allows clean imports like from proto import physics_pb2. Production source code must not use sys.path.insert for proto imports — rely on PYTHONPATH instead. Test files may use sys.path.insert with relative paths (via Path(__file__).parent.parent) for running outside Docker.
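A conftest.py sketch of that out-of-Docker shim (paths assume the per-service layout shown in this guide):

```python
# tests/conftest.py — make src/ and proto/ importable when tests run
# outside the container, where PYTHONPATH=/app/proto:/app is not set.
import sys
from pathlib import Path

ROOT = Path(__file__).parent.parent
sys.path.insert(0, str(ROOT / "proto"))
sys.path.insert(0, str(ROOT))
```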
galaxy service additional files:
The galaxy service requires the bundled ephemeris file:
services/galaxy/
├── config/
│ └── ephemeris-j2000.json # Copied from project config/
└── ... (standard structure)
Code Generation Workflow
For each service:
1. Read Spec
│
▼
2. Generate Tests (TDD)
│
├── Unit tests for models
├── Unit tests for business logic
├── Integration tests for gRPC
└── Integration tests for dependencies
│
▼
3. Run Tests (expect failures)
│
▼
4. Generate Implementation
│
▼
5. Run Tests (expect pass)
│
▼
6. Generate Dockerfile
│
▼
7. Build & Test Container
│
▼
8. Generate K8s Manifest
│
▼
9. Deploy to galaxy-dev
│
▼
10. Smoke Test
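Steps 2 through 5 above are the standard red-green loop. A deliberately tiny illustration (clamp_throttle is a made-up function; in practice the test lives in its own file, is written first, and fails until step 4 supplies the implementation):

```python
def clamp_throttle(value: float) -> float:
    # Step 4: the minimal implementation that makes the test pass.
    # (Hypothetical function; real behavior comes from the physics spec.)
    return max(0.0, min(1.0, value))


def test_clamp_throttle():
    # Step 2: these assertions existed before the function body did
    assert clamp_throttle(1.5) == 1.0
    assert clamp_throttle(-0.2) == 0.0
    assert clamp_throttle(0.5) == 0.5
```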
Parallel Execution Strategy
Phase Parallelization
| Phase | Can Parallelize |
|---|---|
| 0 - Infrastructure | All manifests in parallel |
| 1 - API Specs | All protos in parallel, OpenAPI separately |
| 2 - Independent | galaxy ∥ players |
| 3 - Physics | Sequential (depends on galaxy) |
| 4 - Tick Engine | Sequential (depends on physics) |
| 5 - API Gateway | Sequential (depends on all) |
| 6 - Web Clients | web-client ∥ admin-dashboard |
| 7 - Admin CLI | Sequential |
| 8 - Integration | After all services |
Within-Service Parallelization
For larger services, tests can be written in parallel:
physics service:
├── Agent A: N-body tests + implementation
├── Agent B: Ship movement tests + implementation
├── Agent C: Attitude control tests + implementation
└── Merge → Integration tests → Final assembly
Build Parallelization
docker build -t ghcr.io/erikevenson/galaxy/galaxy:1.0.0 services/galaxy &
docker build -t ghcr.io/erikevenson/galaxy/players:1.0.0 services/players &
docker build -t ghcr.io/erikevenson/galaxy/physics:1.0.0 services/physics &
wait
Verification Checklist
Before marking a service complete:
Code Quality
- All tests pass
- Coverage meets target (see testing.md)
- No linting errors (ruff)
- Type hints complete (mypy passes)
Functionality
- All spec requirements implemented
- All gRPC methods work
- Health endpoints respond
- Graceful shutdown works
Container
- Docker build succeeds
- Container starts without errors
- Health check passes
- Runs as non-root user
Kubernetes
- Manifest is valid (kubectl dry-run)
- Deploys to galaxy-dev
- Readiness probe passes
- Liveness probe passes
- Can communicate with dependencies
Documentation
- README.md complete
- Configuration documented
- API documented (if applicable)
Handling Missing Requirements
During implementation, gaps in specifications may be discovered. Follow this process:
Workflow
Implementing service...
│
▼
Discover missing requirement
│
▼
┌──────────────────────────────┐
│ 1. STOP implementation │
│ Do not guess or improvise │
└──────────────────────────────┘
│
▼
┌──────────────────────────────┐
│ 2. Create GitHub issue │
│ Label: bug, spec-gap │
│ Title: "Spec: <what's │
│ missing>" │
└──────────────────────────────┘
│
▼
┌──────────────────────────────┐
│ 3. Ask for decision │
│ Present options if known │
└──────────────────────────────┘
│
▼
┌──────────────────────────────┐
│ 4. Update spec FIRST │
│ Document the requirement │
│ Close the issue │
└──────────────────────────────┘
│
▼
┌──────────────────────────────┐
│ 5. Continue implementation │
│ From updated spec │
└──────────────────────────────┘
Rationale
- Specs are source code — Implementation must match spec exactly
- Regeneration safety — Undocumented features would be lost on regeneration
- Consistency — All requirements in one place, not scattered in code comments
Example
Implementing physics service...
↓
Realize: spec doesn't say what happens when ship enters body's radius
↓
Stop. Create issue: "Spec: Ship-body collision behavior undefined"
↓
Ask: "Should ships be destroyed? Pass through? Bounce?"
↓
Answer received: "Pass through for MVP, log warning"
↓
Update services.md tick-engine section with collision handling
↓
Close issue
↓
Continue implementation
What Qualifies as a Missing Requirement
| Gap Type | Action |
|---|---|
| Behavior undefined | Stop, ask, update spec |
| Edge case not covered | Stop, ask, update spec |
| Error handling unclear | Stop, ask, update spec |
| Formula incomplete | Stop, ask, update spec |
| Obvious typo in spec | Fix spec, note in commit, continue |
| Implementation detail (e.g., variable names) | Use judgment, continue |
Rule of thumb: If the decision affects observable behavior or would change test assertions, it needs to be in the spec.
Regeneration Process
To regenerate a service from spec:
1. Identify spec version — Note which spec commit to use
2. Reference old code — Old code is available via git history as a development aid for patterns and edge cases (use git show HEAD:services/{service}/ to view)
3. Clear service directory — Remove the services/{service}/ directory for clean generation
4. Follow workflow — Execute the code generation workflow above
5. Increment version — Update the version in pyproject.toml
6. Deploy — Roll out the new version to Kubernetes
Note: Git history preserves all previous implementations. If needed, create a branch or tag before regeneration to make old code easier to reference.
Estimated Effort
| Phase | Services | Parallelism | Relative Effort |
|---|---|---|---|
| 0 | Infrastructure | Full | Low |
| 1 | API Specs | Full | Low |
| 2 | galaxy, players | 2x | Medium |
| 3 | physics | 1x | High (complex logic) |
| 4 | tick-engine | 1x | Medium |
| 5 | api-gateway | 1x | Medium-High |
| 6 | web-client, admin-dashboard | 2x | Medium |
| 7 | admin-cli | 1x | Low |
| 8 | Integration | 1x | Medium |
Physics service is the most complex due to N-body simulation, attitude control, and fuel management logic.