Database Maintenance
Rack Gateway uses PostgreSQL for persistent storage. This guide covers database maintenance tasks including migrations, backups, and troubleshooting.
Database Commands
Section titled “Database Commands”The rack-gateway-api binary includes maintenance subcommands for database operations.
Apply Migrations
Section titled “Apply Migrations”Apply pending database migrations:
rack-gateway-api migrateThis command:
- Connects using
DATABASE_URLorPG*environment variables - Applies migrations in
internal/gateway/db/migrations/ - Uses
schema_migrationstable to track applied migrations - Is safe to run repeatedly (idempotent)
Reset Database
Section titled “Reset Database”Reset the database to a clean state (development only):
# Required safety guardsexport RESET_RACK_GATEWAY_DATABASE=DELETE_ALL_DATAexport DEV_MODE=true
rack-gateway-api reset-dbSafety guards:
RESET_RACK_GATEWAY_DATABASE=DELETE_ALL_DATA- Must be set exactly- Database must be marked as development, OR set
DISABLE_DATABASE_ENVIRONMENT_CHECK=1 - For fresh databases,
DEV_MODE=trueis required
Check Migration Status
Section titled “Check Migration Status”Verify which migrations have been applied:
# Connect to database and checkpsql $DATABASE_URL -c "SELECT * FROM schema_migrations ORDER BY version;"Backup Procedures
Section titled “Backup Procedures”Automated Backups (AWS RDS)
Section titled “Automated Backups (AWS RDS)”If using AWS RDS, configure automated backups:
# Terraform configurationresource "aws_db_instance" "gateway" { # Automated backups backup_retention_period = 30 backup_window = "03:00-04:00"
# Point-in-time recovery copy_tags_to_snapshot = true}Manual Backup
Section titled “Manual Backup”Create a manual backup using pg_dump:
# Full database backuppg_dump $DATABASE_URL > rack_gateway_backup_$(date +%Y%m%d).sql
# Compressed backuppg_dump $DATABASE_URL | gzip > rack_gateway_backup_$(date +%Y%m%d).sql.gzRestore from Backup
Section titled “Restore from Backup”-
Stop the gateway service
Terminal window convox scale gateway --count 0 -
Restore the database
Terminal window psql $DATABASE_URL < rack_gateway_backup_20240115.sql -
Run any pending migrations
Terminal window rack-gateway-api migrate -
Restart the service
Terminal window convox scale gateway --count 2
Schema Overview
Section titled “Schema Overview”The database contains these main tables:
| Table | Purpose |
|---|---|
users | User accounts from OAuth |
user_sessions | Active user sessions |
api_tokens | API tokens for CI/CD |
mfa_methods | User MFA enrollments |
audit.audit_event | Immutable audit log entries |
mfa_backup_codes | MFA backup codes |
trusted_devices | Trusted devices for step-up |
settings | Gateway configuration |
deploy_approval_requests | Pending deploy approvals |
Key Relationships
Section titled “Key Relationships”Maintenance Tasks
Section titled “Maintenance Tasks”Cleanup Old Sessions
Section titled “Cleanup Old Sessions”Sessions expire automatically, but you can manually clean up:
-- Delete sessions older than 30 daysDELETE FROM user_sessionsWHERE created_at < NOW() - INTERVAL '30 days';Cleanup Old Audit Logs
Section titled “Cleanup Old Audit Logs”Audit log retention is policy-driven. WORM anchors use AUDIT_ANCHOR_RETENTION_DAYS.
-- Check audit log volumeSELECT COUNT(*), DATE(timestamp) as dayFROM audit.audit_eventGROUP BY DATE(timestamp)ORDER BY day DESCLIMIT 10;
-- Manual cleanup (if needed)DELETE FROM audit.audit_eventWHERE timestamp < NOW() - INTERVAL '400 days';Vacuum and Analyze
Section titled “Vacuum and Analyze”Regular maintenance improves performance:
-- Analyze query statisticsANALYZE;
-- Reclaim space from deleted rowsVACUUM ANALYZE;For RDS, this happens automatically. For self-managed PostgreSQL, schedule regular maintenance.
Connection Management
Section titled “Connection Management”Connection Pool Settings
Section titled “Connection Pool Settings”Configure connection pooling:
DB_MAX_OPEN_CONNS=25 # Maximum concurrent connectionsDB_MAX_IDLE_CONNS=5 # Idle connections to keep warmDB_CONN_MAX_LIFETIME=30m # Connection lifetimeDB_CONN_MAX_IDLE_TIME=10m # Idle timeoutMonitoring Connections
Section titled “Monitoring Connections”-- Current active connectionsSELECT count(*) FROM pg_stat_activityWHERE datname = 'rack_gateway';
-- Connection detailsSELECT usename, application_name, client_addr, stateFROM pg_stat_activityWHERE datname = 'rack_gateway';Troubleshooting
Section titled “Troubleshooting”Migration Failures
Section titled “Migration Failures”If a migration fails:
- Check the error message for the specific issue
- Fix the underlying problem (permissions, constraints, etc.)
- Re-run
rack-gateway migrate
Migrations are transactional—partial failures won’t leave the database in an inconsistent state.
Connection Issues
Section titled “Connection Issues”If the gateway can’t connect to the database:
# Test connectivitypsql $DATABASE_URL -c "SELECT 1;"
# Check if host is reachablenc -zv $PGHOST $PGPORT
# Verify SSL settingspsql "$DATABASE_URL?sslmode=require" -c "SELECT 1;"Slow Queries
Section titled “Slow Queries”Identify slow queries:
-- Enable query logging (if not already)ALTER SYSTEM SET log_min_duration_statement = '1000'; -- Log queries > 1s
-- Check for long-running queriesSELECT pid, now() - pg_stat_activity.query_start AS duration, queryFROM pg_stat_activityWHERE state = 'active'ORDER BY duration DESC;Disk Space
Section titled “Disk Space”Check database size:
SELECT pg_size_pretty(pg_database_size('rack_gateway'));
-- Size by tableSELECT relname, pg_size_pretty(pg_total_relation_size(relid))FROM pg_catalog.pg_statio_user_tablesORDER BY pg_total_relation_size(relid) DESC;Docker Development
Section titled “Docker Development”When using the Docker development stack:
Migrate
Section titled “Migrate”task docker:db:migratetask docker:db:resetConnect to Database
Section titled “Connect to Database”docker exec -it rack-gateway-postgres-1 psql -U postgres -d gateway_devBest Practices
Section titled “Best Practices”- Always run migrations before starting the app - Include in deployment scripts
- Use automated backups - Configure RDS backup retention
- Monitor connection usage - Prevent connection exhaustion
- Keep audit logs - Don’t delete before retention period expires
- Test restore procedures - Verify backups actually work
Further Reading
Section titled “Further Reading”- AWS Infrastructure - RDS configuration
- Production Checklist - Pre-deployment verification
- Troubleshooting - Common issues