Implementation Guide
How to generate Galaxy services from specifications.
Overview
Code is generated from specs using AI-assisted test-driven development. This guide defines the implementation order, technology stack, and verification process.
Technology Stack
Python Services
| Component | Technology | Version |
|---|---|---|
| Runtime | Python | 3.12 |
| Async framework | asyncio | stdlib |
| HTTP/REST | FastAPI | 0.109+ |
| WebSocket | FastAPI WebSockets | (included) |
| gRPC | grpcio, grpcio-tools | 1.60+ |
| Database | asyncpg (PostgreSQL) | 0.29+ |
| Redis | redis-py (async) | 5.0+ |
| Validation | Pydantic | 2.5+ |
| Logging | structlog | 24.1+ |
| Testing | pytest, pytest-asyncio | 8.0+ |
| HTTP client (tests) | httpx | 0.26+ |
Web Client
| Component | Technology | Version |
|---|---|---|
| Runtime | Node.js | 20 LTS |
| 3D Rendering | Three.js | 0.160+ |
| Build tool | Vite | 5.0+ |
| Testing | Vitest, Testing Library | latest |
| E2E Testing | Playwright | 1.40+ |
Infrastructure
| Component | Technology | Version |
|---|---|---|
| Container runtime | Docker | 24+ |
| Orchestration | Kubernetes | 1.28+ |
| Ingress | NGINX Ingress Controller | 1.9+ |
| Database | PostgreSQL | 16 |
| Cache/Queue | Redis | 7 |
Implementation Phases
Phase 0: Infrastructure (Parallel)
Generate Kubernetes manifests and base configurations.
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ k8s/namespace │ │ k8s/postgres │ │ k8s/redis │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
└───────────────────┴────────────────────┘
│
┌────────▼────────┐
│ k8s/configmap │
│ k8s/secrets │
└─────────────────┘
Deliverables:
- k8s/namespace.yaml
- k8s/postgres.yaml (includes postgres-init ConfigMap with schema)
- k8s/redis.yaml (includes redis-config ConfigMap)
- k8s/configmap.yaml (galaxy-config, frontend-config, nginx-config)
- k8s/secrets.yaml (template only; actual secrets created via kubectl)
- config/ephemeris-j2000.json (bundled fallback ephemeris with body properties; see services.md)
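As one concrete example of these deliverables, a minimal namespace manifest might look like the following (the galaxy-dev name matches the deployment target used later in this guide; labels are illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: galaxy-dev
  labels:
    app.kubernetes.io/part-of: galaxy
```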
Phase 1: API Specifications
Two types of API specifications are needed:
Protocol Buffers (gRPC for internal service-to-service communication):
specs/data/grpc.md
│
▼
┌─────────────────┐
│ specs/api/*.proto │
└─────────────────┘
│
▼
┌─────────────────┐
│ protoc compile │
└─────────────────┘
│
▼
┌─────────────────┐
│ *_pb2.py files │
│ *_pb2_grpc.py │
└─────────────────┘
OpenAPI (REST for external client-to-gateway communication):
specs/api/api-gateway.yaml → FastAPI implementation
Deliverables:
Proto files (gRPC):
- specs/api/common.proto
- specs/api/physics.proto
- specs/api/players.proto
- specs/api/galaxy.proto
- specs/api/tick_engine.proto
OpenAPI (REST):
- specs/api/api-gateway.yaml (REST and WebSocket endpoints)
Phase 2: Independent Services (Parallel)
Services with no inter-service dependencies can be built simultaneously.
┌─────────────────┐ ┌─────────────────┐
│ galaxy │ │ players │
│ │ │ │
│ - Body config │ │ - Auth logic │
│ - Ephemeris │ │ - Registration │
│ - gRPC server │ │ - JWT tokens │
└─────────────────┘ └─────────────────┘
galaxy service:
- Tests for ephemeris loading (JPL fetch, fallback)
- Tests for body state management
- Tests for gRPC endpoints
- Implementation
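The JPL-fetch-with-fallback behavior is easiest to test when the fetch step is injected. A minimal sketch, assuming this structure (load_ephemeris and the fetch callable are hypothetical names; the real interface comes from the galaxy spec):

```python
import json
from pathlib import Path


def load_ephemeris(fetch, fallback_path: Path) -> dict:
    """Try the remote fetch first; fall back to the bundled J2000 file.

    'fetch' is any zero-argument callable returning ephemeris data, so
    tests can inject a stub instead of making a real JPL HTTP call.
    """
    try:
        return fetch()
    except Exception:
        # Any fetch failure (network, parse, timeout) uses the bundled file
        return json.loads(fallback_path.read_text())
```

Injecting the fetch callable keeps the fallback path fully unit-testable without network access, matching the "JPL fetch, fallback" test item above.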
players service:
- Tests for registration (validation, hashing)
- Tests for authentication (JWT generation, validation)
- Tests for gRPC endpoints
- Implementation
Phase 3: Physics Service
Depends on galaxy service for celestial body data (received via tick-engine’s InitializeBodies call, not direct gRPC).
┌─────────────────┐
│ physics │
│ │
│ - N-body sim │
│ - Ship movement │
│ - Attitude ctrl │
│ - Fuel tracking │
└─────────────────┘
Implementation order:
- Tests for N-body gravity calculation
- Tests for Leapfrog integrator
- Tests for ship thrust/movement
- Tests for attitude control (wheels, RCS)
- Tests for fuel consumption
- Tests for service requests (fuel, reset)
- Tests for gRPC endpoints
- Implementation
- Physics validation tests (orbital periods, energy conservation)
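The Leapfrog integrator and its energy-conservation validation can be sketched together. This is a minimal 2D point-mass version, not the service's actual implementation; a circular orbit with GM = 1 and r = 1 has total energy -0.5, which a symplectic integrator should hold nearly constant:

```python
import math


def accel(pos, gm=1.0):
    # Point-mass gravity toward the origin: a = -GM * r / |r|^3
    r = math.hypot(pos[0], pos[1])
    f = -gm / r**3
    return (f * pos[0], f * pos[1])


def leapfrog_step(pos, vel, dt, gm=1.0):
    # Kick-drift-kick Leapfrog: symplectic, bounded long-term energy error
    ax, ay = accel(pos, gm)
    vx = vel[0] + 0.5 * dt * ax
    vy = vel[1] + 0.5 * dt * ay
    px = pos[0] + dt * vx
    py = pos[1] + dt * vy
    ax, ay = accel((px, py), gm)
    return (px, py), (vx + 0.5 * dt * ax, vy + 0.5 * dt * ay)


def energy(pos, vel, gm=1.0):
    # Specific orbital energy: kinetic + potential
    r = math.hypot(pos[0], pos[1])
    return 0.5 * (vel[0] ** 2 + vel[1] ** 2) - gm / r


# Circular orbit: GM = 1, r = 1, v = 1, period 2*pi
pos, vel = (1.0, 0.0), (0.0, 1.0)
e0 = energy(pos, vel)
dt = 1e-3
for _ in range(int(2 * math.pi / dt)):
    pos, vel = leapfrog_step(pos, vel, dt)
drift = abs(energy(pos, vel) - e0)
```

The validation test then asserts that the energy drift over one full orbit stays below a tolerance, which catches both integrator bugs and sign errors in the force law.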
Phase 4: Tick Engine
Depends on physics service.
┌─────────────────┐
│ tick-engine │
│ │
│ - Tick loop │
│ - Timing │
│ - Catch-up │
│ - Pause/resume │
└─────────────────┘
Implementation order:
- Tests for tick timing
- Tests for catch-up behavior
- Tests for pause/resume
- Tests for snapshot creation and recovery
- Tests for Redis event publishing
- Tests for gRPC endpoints
- Implementation
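The catch-up behavior reduces to a pure function that is easy to unit test. A sketch with hypothetical names, including a cap so a long stall skips ahead instead of spiraling into an ever-growing backlog (time units are whatever the caller uses, e.g. milliseconds):

```python
def ticks_due(last_tick, now, interval, max_catch_up=10):
    """Return (ticks_to_run, new_last_tick).

    If the loop fell further behind than max_catch_up ticks, drop the
    backlog and resynchronize to 'now' rather than simulate every missed
    tick (which would only make the loop fall further behind).
    """
    elapsed = now - last_tick
    if elapsed < interval:
        return 0, last_tick
    missed = int(elapsed // interval)
    if missed > max_catch_up:
        return max_catch_up, now
    return missed, last_tick + missed * interval
```

Keeping this logic separate from the asyncio loop itself means the tick-timing and catch-up tests need no sleeps or wall-clock dependence.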
Phase 5: API Gateway
Depends on all game services.
┌─────────────────┐
│ api-gateway │
│ │
│ - REST API │
│ - WebSocket │
│ - Auth proxy │
│ - Rate limiting │
└─────────────────┘
Implementation order:
- Tests for REST endpoints (auth, services)
- Tests for WebSocket connection lifecycle
- Tests for message handling (rotate, thrust, service)
- Tests for state broadcasting
- Tests for rate limiting
- Tests for error handling
- Implementation
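Rate limiting is commonly implemented as a token bucket; a minimal sketch (the class name and parameters are illustrative, not the gateway's actual API — a deployed gateway would more likely keep buckets in Redis so limits survive restarts and apply across replicas):

```python
class TokenBucket:
    """Allow bursts up to 'capacity', refilling at 'rate' tokens/second."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill based on elapsed time, then spend one token if available
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Passing the clock in as an argument (rather than calling time.monotonic() inside) is what makes the rate-limiting tests deterministic.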
Phase 6: Web Clients (Parallel)
┌─────────────────┐ ┌─────────────────┐
│ web-client │ │ admin-dashboard │
└─────────────────┘ └─────────────────┘
web-client:
- Three.js scene setup tests
- HUD component tests
- WebSocket connection tests
- Control input tests
- Implementation
admin-dashboard:
- Auth flow tests
- Control panel tests
- Metrics display tests
- Implementation
Phase 7: Admin CLI
┌─────────────────┐
│ admin-cli │
└─────────────────┘
Implementation order:
- Tests for command parsing
- Tests for REST API client calls (admin-cli communicates via api-gateway REST endpoints)
- Implementation
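Command parsing can be tested without touching the network by building the parser in a function. A sketch with hypothetical subcommands (the real command set comes from the admin-cli spec):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Illustrative subcommand layout only; actual commands come from the spec
    parser = argparse.ArgumentParser(prog="galaxy-admin")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("pause", help="Pause the tick engine")
    sub.add_parser("resume", help="Resume the tick engine")

    player = sub.add_parser("player", help="Player management")
    player.add_argument("player_id")
    player.add_argument("--reset", action="store_true")
    return parser
```

Parsing tests then assert on the namespace, while the REST-client tests mock the api-gateway calls separately.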
Phase 8: Integration & E2E
After all services complete:
- Integration tests (service-to-service)
- E2E tests (full player flow)
- Load testing
Per-Service Structure
Each Python service follows this structure:
services/{service}/
├── Dockerfile
├── README.md
├── requirements.txt
├── pyproject.toml
├── src/
│ ├── __init__.py
│ ├── main.py # Entry point (includes logging config)
│ ├── config.py # Configuration
│ ├── models.py # Pydantic models
│ ├── service.py # Business logic
│ ├── grpc_server.py # gRPC implementation
│ └── health.py # Health endpoints
├── tests/
│ ├── __init__.py
│ ├── conftest.py # Fixtures
│ ├── test_models.py
│ ├── test_service.py
│ └── test_grpc.py
└── proto/ # Source proto files (copied from specs/api/proto/)
├── common.proto # Compiled to *_pb2.py during Docker build
├── physics.proto
├── players.proto
├── galaxy.proto
└── tick_engine.proto
Note: Each service needs all proto files even if it only uses some, because protos import each other (e.g., all import common.proto). The Dockerfile compiles these to *_pb2.py and *_pb2_grpc.py files during build.
Import strategy: Each Dockerfile sets ENV PYTHONPATH=/app/proto:/app, which allows clean imports like from proto import physics_pb2. Production source code must not use sys.path.insert for proto imports — rely on PYTHONPATH instead. Test files may use sys.path.insert with relative paths (via Path(__file__).parent.parent) for running outside Docker.
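A conftest.py sketch of that out-of-Docker shim (paths assume the per-service layout shown in this guide):

```python
# tests/conftest.py — make src/ and proto/ importable when tests run
# outside the container, where PYTHONPATH=/app/proto:/app is not set.
import sys
from pathlib import Path

ROOT = Path(__file__).parent.parent
sys.path.insert(0, str(ROOT / "proto"))
sys.path.insert(0, str(ROOT))
```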
galaxy service additional files:
The galaxy service requires the bundled ephemeris file:
services/galaxy/
├── config/
│ └── ephemeris-j2000.json # Copied from project config/
└── ... (standard structure)
Code Generation Workflow
For each service:
1. Read Spec
│
▼
2. Generate Tests (TDD)
│
├── Unit tests for models
├── Unit tests for business logic
├── Integration tests for gRPC
└── Integration tests for dependencies
│
▼
3. Run Tests (expect failures)
│
▼
4. Generate Implementation
│
▼
5. Run Tests (expect pass)
│
▼
6. Generate Dockerfile
│
▼
7. Build & Test Container
│
▼
8. Generate K8s Manifest
│
▼
9. Deploy to galaxy-dev
│
▼
10. Smoke Test
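Steps 2 through 5 above are the standard red-green loop. A deliberately tiny illustration (clamp_throttle is a made-up function; in practice the test lives in its own file, is written first, and fails until step 4 supplies the implementation):

```python
def clamp_throttle(value: float) -> float:
    # Step 4: the minimal implementation that makes the test pass.
    # (Hypothetical function; real behavior comes from the physics spec.)
    return max(0.0, min(1.0, value))


def test_clamp_throttle():
    # Step 2: these assertions existed before the function body did
    assert clamp_throttle(1.5) == 1.0
    assert clamp_throttle(-0.2) == 0.0
    assert clamp_throttle(0.5) == 0.5
```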
Parallel Execution Strategy
Phase Parallelization
| Phase | Can Parallelize |
|---|---|
| 0 - Infrastructure | All manifests in parallel |
| 1 - API Specs | All protos in parallel, OpenAPI separately |
| 2 - Independent | galaxy ∥ players |
| 3 - Physics | Sequential (depends on galaxy) |
| 4 - Tick Engine | Sequential (depends on physics) |
| 5 - API Gateway | Sequential (depends on all) |
| 6 - Web Clients | web-client ∥ admin-dashboard |
| 7 - Admin CLI | Sequential |
| 8 - Integration | After all services |
Within-Service Parallelization
For larger services, tests can be written in parallel:
physics service:
├── Agent A: N-body tests + implementation
├── Agent B: Ship movement tests + implementation
├── Agent C: Attitude control tests + implementation
└── Merge → Integration tests → Final assembly
Build Parallelization
docker build -t ghcr.io/erikevenson/galaxy/galaxy:1.0.0 services/galaxy &
docker build -t ghcr.io/erikevenson/galaxy/players:1.0.0 services/players &
docker build -t ghcr.io/erikevenson/galaxy/physics:1.0.0 services/physics &
wait
Verification Checklist
Before marking a service complete:
Code Quality
- All tests pass
- Coverage meets target (see testing.md)
- No linting errors (ruff)
- Type hints complete (mypy passes)
Functionality
- All spec requirements implemented
- All gRPC methods work
- Health endpoints respond
- Graceful shutdown works
Container
- Docker build succeeds
- Container starts without errors
- Health check passes
- Runs as non-root user
Kubernetes
- Manifest is valid (kubectl dry-run)
- Deploys to galaxy-dev
- Readiness probe passes
- Liveness probe passes
- Can communicate with dependencies
Documentation
- README.md complete
- Configuration documented
- API documented (if applicable)
Handling Missing Requirements
During implementation, gaps in specifications may be discovered. Follow this process:
Workflow
Implementing service...
│
▼
Discover missing requirement
│
▼
┌──────────────────────────────┐
│ 1. STOP implementation │
│ Do not guess or improvise │
└──────────────────────────────┘
│
▼
┌──────────────────────────────┐
│ 2. Create GitHub issue │
│ Label: bug, spec-gap │
│ Title: "Spec: <what's │
│ missing>" │
└──────────────────────────────┘
│
▼
┌──────────────────────────────┐
│ 3. Ask for decision │
│ Present options if known │
└──────────────────────────────┘
│
▼
┌──────────────────────────────┐
│ 4. Update spec FIRST │
│ Document the requirement │
│ Close the issue │
└──────────────────────────────┘
│
▼
┌──────────────────────────────┐
│ 5. Continue implementation │
│ From updated spec │
└──────────────────────────────┘
Rationale
- Specs are source code — Implementation must match spec exactly
- Regeneration safety — Undocumented features would be lost on regeneration
- Consistency — All requirements in one place, not scattered in code comments
Example
Implementing physics service...
↓
Realize: spec doesn't say what happens when ship enters body's radius
↓
Stop. Create issue: "Spec: Ship-body collision behavior undefined"
↓
Ask: "Should ships be destroyed? Pass through? Bounce?"
↓
Answer received: "Pass through for MVP, log warning"
↓
Update services.md tick-engine section with collision handling
↓
Close issue
↓
Continue implementation
What Qualifies as a Missing Requirement
| Gap Type | Action |
|---|---|
| Behavior undefined | Stop, ask, update spec |
| Edge case not covered | Stop, ask, update spec |
| Error handling unclear | Stop, ask, update spec |
| Formula incomplete | Stop, ask, update spec |
| Obvious typo in spec | Fix spec, note in commit, continue |
| Implementation detail (e.g., variable names) | Use judgment, continue |
Rule of thumb: If the decision affects observable behavior or would change test assertions, it needs to be in the spec.
Regeneration Process
To regenerate a service from spec:
1. Identify spec version — Note which spec commit to use
2. Reference old code — Old code is available via git history as a development aid for patterns and edge cases (use git show HEAD:services/{service}/ to view)
3. Clear service directory — Remove the services/{service}/ directory for clean generation
4. Follow workflow — Execute the code generation workflow above
5. Increment version — Update the version in pyproject.toml
6. Deploy — Roll out the new version to Kubernetes
Note: Git history preserves all previous implementations. If needed, create a branch or tag before regeneration to make old code easier to reference.
Estimated Effort
| Phase | Services | Parallelism | Relative Effort |
|---|---|---|---|
| 0 | Infrastructure | Full | Low |
| 1 | API Specs | Full | Low |
| 2 | galaxy, players | 2x | Medium |
| 3 | physics | 1x | High (complex logic) |
| 4 | tick-engine | 1x | Medium |
| 5 | api-gateway | 1x | Medium-High |
| 6 | web-client, admin-dashboard | 2x | Medium |
| 7 | admin-cli | 1x | Low |
| 8 | Integration | 1x | Medium |
Physics service is the most complex due to N-body simulation, attitude control, and fuel management logic.