xpeditis2.0/docker/PORTAINER_DEPLOYMENT_GUIDE.md
David-Henri ARNAUD 5d06ad791f feat: Portainer stacks for staging & production deployment with Traefik
🐳 Docker Deployment Infrastructure
Complete Portainer stacks with Traefik reverse proxy integration for zero-downtime deployments

## Stack Files Created

### 1. Staging Stack (docker/portainer-stack-staging.yml)
**Services** (4 containers):
- `postgres-staging`: PostgreSQL 15 (db.t3.medium equivalent)
- `redis-staging`: Redis 7 with 512MB cache
- `backend-staging`: NestJS API (1 instance)
- `frontend-staging`: Next.js app (1 instance)

**Domains**:
- Frontend: `staging.xpeditis.com`
- Backend API: `api-staging.xpeditis.com`

**Features**:
- HTTP → HTTPS redirect
- Let's Encrypt SSL certificates
- Health checks on all services
- Security headers (HSTS, XSS protection, frame deny)
- Rate limiting via Traefik
- Sandbox carrier APIs
- Sentry monitoring (10% sampling)

### 2. Production Stack (docker/portainer-stack-production.yml)
**Services** (6 containers for High Availability):
- `postgres-prod`: PostgreSQL 15 with automated backups
- `redis-prod`: Redis 7 with persistence (1GB cache)
- `backend-prod-1` & `backend-prod-2`: NestJS API (2 instances, load balanced)
- `frontend-prod-1` & `frontend-prod-2`: Next.js app (2 instances, load balanced)

**Domains**:
- Frontend: `xpeditis.com` + `www.xpeditis.com` (auto-redirect to non-www)
- Backend API: `api.xpeditis.com`

**Features**:
- **Zero-downtime deployments** (rolling updates with 2 instances)
- **Load balancing** with sticky sessions
- **Strict security headers** (HSTS 2 years, CSP, force TLS)
- **Resource limits** (CPU, memory)
- **Production carrier APIs** (Maersk, MSC, CMA CGM, Hapag-Lloyd, ONE)
- **Enhanced monitoring** (Sentry + Google Analytics)
- **WWW redirect** (www → non-www)
- **Rate limiting** (stricter than staging)

### 3. Environment Files
- `docker/.env.staging.example`: Template for staging environment variables
- `docker/.env.production.example`: Template for production environment variables

**Variables** (30+ required):
- Database credentials (PostgreSQL, Redis)
- JWT secrets (256-512 bits)
- AWS configuration (S3, SES, region)
- Carrier API keys (Maersk, MSC, CMA CGM, etc.)
- Monitoring (Sentry DSN, Google Analytics)
- Email service configuration

### 4. Deployment Guide (docker/PORTAINER_DEPLOYMENT_GUIDE.md)
**Comprehensive 400+ line guide** covering:
- Prerequisites (server, Traefik, DNS, Docker images)
- Step-by-step Portainer deployment
- Environment variables configuration
- SSL/TLS certificate verification
- Health check validation
- Troubleshooting (5 common issues with solutions)
- Rolling updates (zero-downtime)
- Monitoring setup (Portainer, Sentry, logs)
- Security best practices (12 recommendations)
- Backup procedures

## 🏗️ Architecture Highlights

### High Availability (Production)
```
Traefik Load Balancer
    ├── frontend-prod-1 ──┐
    └── frontend-prod-2 ──┼── Sticky Sessions
                          │
    ├── backend-prod-1 ───┤
    └── backend-prod-2 ───┘
            │
            ├── postgres-prod (Single instance with backups)
            └── redis-prod (Persistence enabled)
```

### Traefik Labels Integration
- **HTTPS Routing**: Host-based routing with SSL termination
- **HTTP Redirect**: Automatic HTTP → HTTPS (permanent 301)
- **Security Middleware**: Custom headers, HSTS, XSS protection
- **Compression**: Gzip compression for responses
- **Rate Limiting**: Traefik-level + application-level
- **Health Checks**: Unhealthy containers are automatically removed from the load-balancer rotation
- **Sticky Sessions**: Cookie-based session affinity
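
As a rough illustration of how the features above can map onto Traefik v2-style Docker labels, the sketch below prints a `labels:` fragment for one production backend instance. It is not copied from the stack files: the router, service, and middleware names (`xpeditis-api`, `api-headers`, `api-ratelimit`, `api-compress`), the `websecure` entry point, port `4000`, the cookie name, and the rate-limit numbers are assumptions chosen for the example. Only the `letsencrypt` resolver, the `traefik_network` network, the `/health` path, and the `api.xpeditis.com` domain come from this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints an illustrative Traefik label fragment.
# Names, port, and entry point are assumptions, not values from the real stack file.
print_backend_labels() {
  # The quoted heredoc delimiter keeps the backticks in Host(`...`) literal.
  cat <<'EOF'
labels:
  - "traefik.enable=true"
  - "traefik.docker.network=traefik_network"
  # HTTPS routing: host-based rule, certificate from the letsencrypt resolver
  - "traefik.http.routers.xpeditis-api.rule=Host(`api.xpeditis.com`)"
  - "traefik.http.routers.xpeditis-api.entrypoints=websecure"
  - "traefik.http.routers.xpeditis-api.tls.certresolver=letsencrypt"
  - "traefik.http.routers.xpeditis-api.middlewares=api-headers,api-ratelimit,api-compress"
  # Containers declaring the same service name are load balanced together
  - "traefik.http.services.xpeditis-api.loadbalancer.server.port=4000"
  - "traefik.http.services.xpeditis-api.loadbalancer.sticky.cookie.name=xpeditis_lb"
  - "traefik.http.services.xpeditis-api.loadbalancer.healthcheck.path=/health"
  - "traefik.http.services.xpeditis-api.loadbalancer.healthcheck.interval=10s"
  # Security headers: HSTS for 2 years (63072000 s), frame deny, nosniff
  - "traefik.http.middlewares.api-headers.headers.stsSeconds=63072000"
  - "traefik.http.middlewares.api-headers.headers.frameDeny=true"
  - "traefik.http.middlewares.api-headers.headers.contentTypeNosniff=true"
  # Rate limit: 100 requests per minute on average
  - "traefik.http.middlewares.api-ratelimit.ratelimit.average=100"
  - "traefik.http.middlewares.api-ratelimit.ratelimit.period=1m"
  - "traefik.http.middlewares.api-compress.compress=true"
EOF
}
```

Calling `print_backend_labels` writes the fragment to stdout. Giving `backend-prod-1` and `backend-prod-2` the same service name is what would let Traefik balance between them under these assumptions.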

### Network Architecture
- **Internal Network**: `xpeditis_internal_staging` / `xpeditis_internal_prod` (isolated)
- **Traefik Network**: `traefik_network` (external, shared with Traefik)
- **Database/Redis**: Only accessible from internal network
- **Frontend/Backend**: Connected to both networks (internal + Traefik)

## 📊 Resource Allocation

### Staging (Single Instances)
- PostgreSQL: 2 vCPU, 4GB RAM
- Redis: 0.5 vCPU, 512MB cache
- Backend: 1 vCPU, 1GB RAM
- Frontend: 1 vCPU, 1GB RAM
- **Total**: ~4.5 vCPU, ~6.5GB RAM

### Production (High Availability)
- PostgreSQL: 2 vCPU, 4GB RAM (limits)
- Redis: 1 vCPU, 1.5GB RAM (limits)
- Backend x2: 2 vCPU, 2GB RAM each (4 vCPU, 4GB total)
- Frontend x2: 2 vCPU, 2GB RAM each (4 vCPU, 4GB total)
- **Total**: 11 vCPU, 13.5GB RAM (sum of the limits listed above)

## 🔒 Security Features

1. **SSL/TLS**: Let's Encrypt certificates with auto-renewal
2. **HSTS**: Strict-Transport-Security (1 year staging, 2 years production)
3. **Security Headers**: XSS protection, frame deny, content-type nosniff
4. **Rate Limiting**: Traefik (50-100 req/min) + Application-level
5. **Secrets Management**: Environment variables, never hardcoded
6. **Network Isolation**: Services communicate only via internal network
7. **Health Checks**: Automatic restart on failure
8. **Resource Limits**: Prevent resource exhaustion attacks

## 🚀 Deployment Process

1. **Prerequisites**: Traefik + DNS configured
2. **Build Images**: Docker build + push to registry
3. **Configure Environment**: Copy .env.example, fill secrets
4. **Deploy Stack**: Portainer UI → Add Stack → Deploy
5. **Verify**: Health checks, SSL, DNS, logs
6. **Monitor**: Sentry + Portainer stats

## 📩 Files Summary

```
docker/
├── portainer-stack-staging.yml      (250 lines) - 4 services
├── portainer-stack-production.yml   (450 lines) - 6 services
├── .env.staging.example             (80 lines)
├── .env.production.example          (100 lines)
└── PORTAINER_DEPLOYMENT_GUIDE.md    (400+ lines)
```

Total: 5 files, ~1,280 lines of infrastructure-as-code

## 🎯 Next Steps

1. Build Docker images (frontend + backend)
2. Push to Docker registry (Docker Hub / GHCR)
3. Configure DNS (staging + production domains)
4. Deploy Traefik (if not already done)
5. Copy .env files and fill secrets
6. Deploy staging stack via Portainer
7. Test staging thoroughly
8. Deploy production stack
9. Setup monitoring (Sentry, Uptime Robot)

## 🔗 Related Documentation

- [DEPLOYMENT.md](../DEPLOYMENT.md) - General deployment guide
- [ARCHITECTURE.md](../ARCHITECTURE.md) - System architecture
- [PHASE4_SUMMARY.md](../PHASE4_SUMMARY.md) - Phase 4 completion status

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-15 11:55:59 +02:00


# Portainer Deployment Guide - Xpeditis

This guide explains how to deploy the Xpeditis stacks (staging and production) on Portainer with Traefik.

---

## 📋 Prerequisites

### 1. Server Infrastructure
- **VPS or dedicated server** with Docker installed
- **Minimum**: 4 vCPU, 8 GB RAM, 100 GB SSD
- **Recommended for production**: 8 vCPU, 16 GB RAM, 200 GB SSD
- **OS**: Ubuntu 22.04 LTS or Debian 11+

### 2. Traefik Already Deployed
- The `traefik_network` network must exist
- Let's Encrypt configured (`letsencrypt` resolver)
- Ports 80 and 443 open

### 3. DNS Configured
**Staging**:
- `staging.xpeditis.com` → server IP
- `api-staging.xpeditis.com` → server IP

**Production**:
- `xpeditis.com` → server IP
- `www.xpeditis.com` → server IP
- `api.xpeditis.com` → server IP
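
Before deploying, it can help to confirm that every record already resolves to the server, since Let's Encrypt validation depends on it. The function below is a minimal sketch that assumes `dig` is installed. `check_dns` is a hypothetical helper name and `203.0.113.10` is a placeholder IP, not the real server address.

```bash
#!/usr/bin/env bash
# Sketch: check that each hostname has an A record equal to the expected server IP.
# Assumes `dig` is available (dnsutils / bind-utils package).
check_dns() {
  local expected_ip="$1"; shift
  local host failed=0
  for host in "$@"; do
    # `dig +short A` prints one resolved value per line (possibly none)
    if dig +short A "$host" | grep -qxF "$expected_ip"; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}
```

Example call for staging: `check_dns 203.0.113.10 staging.xpeditis.com api-staging.xpeditis.com`. The function returns non-zero if any hostname fails, so it can gate a deployment script.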
### 4. Docker Images
The Docker images must be built and pushed to a registry (Docker Hub, GitHub Container Registry, or a private registry):
```bash
# Run from the repository root. The build context is passed as a path,
# so no `cd` is needed between the two builds.

# Build and push the backend image
docker build -t xpeditis/backend:staging-latest apps/backend
docker push xpeditis/backend:staging-latest

# Build and push the frontend image
docker build -t xpeditis/frontend:staging-latest apps/frontend
docker push xpeditis/frontend:staging-latest
```
---
## 🚀 DĂ©ploiement sur Portainer
### Étape 1: CrĂ©er le network Traefik (si pas dĂ©jĂ  fait)
```bash
docker network create traefik_network
```
### Étape 2: PrĂ©parer les variables d'environnement
#### Pour Staging:
1. Copier `.env.staging.example` vers `.env.staging`
2. Remplir toutes les valeurs (voir section Variables d'environnement ci-dessous)
3. **IMPORTANT**: Utiliser des mots de passe forts (min 32 caractĂšres)
#### Pour Production:
1. Copier `.env.production.example` vers `.env.production`
2. Remplir toutes les valeurs avec les credentials de production
3. **IMPORTANT**: Utiliser des mots de passe ultra-forts (min 64 caractĂšres)
### Étape 3: DĂ©ployer via Portainer UI
#### A. Accéder à Portainer
- URL: `https://portainer.votre-domaine.com` (ou `http://IP:9000`)
- Login avec vos credentials admin
#### B. Créer la Stack Staging
1. **Aller dans**: Stacks → Add Stack
2. **Name**: `xpeditis-staging`
3. **Build method**: Web editor
4. **Copier le contenu** de `portainer-stack-staging.yml`
5. **Onglet "Environment variables"**:
- Cliquer sur "Load variables from .env file"
- Copier-coller le contenu de `.env.staging`
- OU ajouter manuellement chaque variable
6. **Cliquer**: Deploy the stack
7. **Vérifier**: Les 4 services doivent démarrer (postgres, redis, backend, frontend)
#### C. Créer la Stack Production
1. **Aller dans**: Stacks → Add Stack
2. **Name**: `xpeditis-production`
3. **Build method**: Web editor
4. **Copier le contenu** de `portainer-stack-production.yml`
5. **Onglet "Environment variables"**:
- Cliquer sur "Load variables from .env file"
- Copier-coller le contenu de `.env.production`
- OU ajouter manuellement chaque variable
6. **Cliquer**: Deploy the stack
7. **Vérifier**: Les 6 services doivent démarrer (postgres, redis, backend x2, frontend x2)
---
## 🔐 Variables d'environnement Critiques
### Variables Obligatoires (staging & production)
| Variable | Description | Exemple |
|----------|-------------|---------|
| `POSTGRES_PASSWORD` | Mot de passe PostgreSQL | `XpEd1t1s_pG_S3cur3_2024!` |
| `REDIS_PASSWORD` | Mot de passe Redis | `R3d1s_C4ch3_P4ssw0rd!` |
| `JWT_SECRET` | Secret pour JWT tokens | `openssl rand -base64 64` |
| `AWS_ACCESS_KEY_ID` | AWS Access Key | `AKIAIOSFODNN7EXAMPLE` |
| `AWS_SECRET_ACCESS_KEY` | AWS Secret Key | `wJalrXUtnFEMI/K7MDENG/...` |
| `SENTRY_DSN` | Sentry monitoring URL | `https://xxx@sentry.io/123` |
| `MAERSK_API_KEY` | Clé API Maersk | Voir portail Maersk |
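
A missing or empty variable typically only shows up as a crashed container after deployment, so a quick check of the `.env` file beforehand can save a round trip. The sketch below is illustrative: `check_env_file` is a hypothetical helper, and its list covers only the variables named in the table above, not the full set of 30+ variables in the templates.

```bash
#!/usr/bin/env bash
# Sketch: report required variables that are missing or empty in an env file.
# The list mirrors the table above only; extend it to match the real template.
REQUIRED_VARS="POSTGRES_PASSWORD REDIS_PASSWORD JWT_SECRET AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY SENTRY_DSN MAERSK_API_KEY"

check_env_file() {
  local file="$1" name missing=0
  for name in $REQUIRED_VARS; do
    # Match "NAME=" followed by at least one character
    if ! grep -Eq "^${name}=.+" "$file"; then
      echo "missing or empty: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_env_file .env.staging` prints one line per problem and returns non-zero if anything is missing.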
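### Generating Secure Secrets
```bash
# PostgreSQL password (48 random bytes -> 64 base64 characters)
openssl rand -base64 48
# Redis password (64 characters)
openssl rand -base64 48
# JWT secret (512 bits). openssl wraps base64 output at 64 characters,
# so strip the newlines to get a single-line value for the .env file.
openssl rand -base64 64 | tr -d '\n'; echo
# Generic secure password (requires the pwgen package)
pwgen -s 64 1
```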
---
## 🔍 VĂ©rification du DĂ©ploiement
### 1. Vérifier l'état des conteneurs
Dans Portainer:
- **Stacks** → `xpeditis-staging` (ou production)
- Tous les services doivent ĂȘtre en status **running** (vert)
### 2. Vérifier les logs
Cliquer sur chaque service → **Logs** → VĂ©rifier qu'il n'y a pas d'erreurs
```bash
# Ou via CLI
docker logs xpeditis-backend-staging -f
docker logs xpeditis-frontend-staging -f
```
### 3. Vérifier les health checks
```bash
# Backend health check
curl https://api-staging.xpeditis.com/health
# Réponse attendue: {"status":"ok","timestamp":"..."}
# Frontend health check
curl https://staging.xpeditis.com/api/health
# Réponse attendue: {"status":"ok"}
```
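
Right after a deployment the endpoints may fail for a short time while containers start, so a single `curl` can give a false alarm. The following sketch retries until the endpoint stops returning an error. `wait_for_health` is a hypothetical helper, and the default attempt count and delay are arbitrary.

```bash
#!/usr/bin/env bash
# Sketch: poll a health endpoint until it answers without an HTTP error,
# or give up after a fixed number of attempts.
wait_for_health() {
  local url="$1" attempts="${2:-10}" delay="${3:-3}" i
  for ((i = 1; i <= attempts; i++)); do
    # -f: non-zero exit on HTTP status >= 400; -sS: quiet, but still print errors
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "healthy after $i attempt(s): $url"
      return 0
    fi
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  echo "still unhealthy after $attempts attempts: $url" >&2
  return 1
}
```

For example: `wait_for_health https://api-staging.xpeditis.com/health 20 3`.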
### 4. Check Traefik
In the Traefik dashboard:
- Routers: `xpeditis-backend-staging` and `xpeditis-frontend-staging` must be listed
- Services: the load balancers must be listed with green health checks
- Certificates: Let's Encrypt must be green

### 5. Check SSL
```bash
# Inspect the response headers served over HTTPS
curl -I https://staging.xpeditis.com
# The "Strict-Transport-Security" header must be present
# Test the TLS configuration with SSL Labs:
# https://www.ssllabs.com/ssltest/analyze.html?d=staging.xpeditis.com
```
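
To check the HSTS duration rather than just the header's presence, the `max-age` value can be extracted from the response headers. This is a small sketch and `hsts_max_age` is a hypothetical helper name. The values this document states elsewhere are 1 year for staging (31536000 seconds) and 2 years for production (63072000 seconds).

```bash
#!/usr/bin/env bash
# Sketch: read HTTP response headers on stdin and print the HSTS max-age in seconds.
# Returns non-zero when no Strict-Transport-Security header with max-age is found.
hsts_max_age() {
  local value
  # Strip CRs and lowercase everything: header and directive names are case-insensitive
  value="$(tr -d '\r' | tr 'A-Z' 'a-z' \
            | grep '^strict-transport-security:' | head -n 1 \
            | sed -n 's/.*max-age=\([0-9][0-9]*\).*/\1/p')"
  [ -n "$value" ] && echo "$value"
}
```

Usage: `curl -sI https://staging.xpeditis.com | hsts_max_age`.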
### 6. End-to-End Test
1. **Frontend**: Open `https://staging.xpeditis.com` in a browser
2. **Backend**: Test an endpoint: `https://api-staging.xpeditis.com/health`
3. **Login**: Create an account and sign in
4. **Rate search**: Run a Rotterdam → Shanghai search
5. **Booking**: Create a test booking
---
## 🐛 DĂ©pannage
### ProblÚme 1: Service ne démarre pas
**SymptĂŽme**: Conteneur en status "Exited" ou "Restarting"
**Solution**:
1. VĂ©rifier les logs: Portainer → Service → Logs
2. Erreurs communes:
- `POSTGRES_PASSWORD` manquant → Ajouter la variable
- `Cannot connect to postgres` → VĂ©rifier que postgres est en running
- `Redis connection refused` → VĂ©rifier que redis est en running
- `Port already in use` → Un autre service utilise le port
### ProblĂšme 2: Traefik ne route pas vers le service
**SymptĂŽme**: 404 Not Found ou Gateway Timeout
**Solution**:
1. Vérifier que le network `traefik_network` existe:
```bash
docker network ls | grep traefik
```
2. Vérifier que les services sont connectés au network:
```bash
docker inspect xpeditis-backend-staging | grep traefik_network
```
3. VĂ©rifier les labels Traefik dans Portainer → Service → Labels
4. Restart Traefik:
```bash
docker restart traefik
```
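
Grepping the full `docker inspect` output for `traefik_network` can give a false positive, because the string may also appear in a label or environment variable of a container that is not actually attached. A stricter check, sketched below with a hypothetical `on_network` helper, lists only the networks the container is attached to.

```bash
#!/usr/bin/env bash
# Sketch: succeed only if the container is attached to the given Docker network.
on_network() {
  local container="$1" network="$2"
  # The Go template prints one attached network name per line;
  # grep -x requires the whole line to match.
  docker inspect \
    --format '{{range $k, $v := .NetworkSettings.Networks}}{{println $k}}{{end}}' \
    "$container" | grep -qxF "$network"
}
```

Usage: `on_network xpeditis-backend-staging traefik_network && echo attached`.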
### Problem 3: SSL Certificate Failed
**Symptom**: "Your connection is not private" or an invalid certificate

**Solution**:
1. Check that DNS points to the server:
```bash
nslookup staging.xpeditis.com
```
2. Check the Traefik logs:
```bash
docker logs traefik | grep -i letsencrypt
```
3. Check that ports 80 and 443 are open:
```bash
sudo ufw status
# Match the port exactly, so that e.g. 8080 or 4430 is not reported
sudo netstat -tlnp | grep -E ':(80|443)[[:space:]]'
```
4. As a last resort, delete the certificate store and redeploy. This removes every certificate Traefik has stored; all of them are requested again on restart, and Let's Encrypt rate limits apply:
```bash
docker exec traefik rm /letsencrypt/acme.json
docker restart traefik
```
### Problem 4: Database connection failed
**Symptom**: Backend logs show "Cannot connect to database"

**Solution**:
1. Check that PostgreSQL is running
2. Check the credentials:
```bash
docker exec -it xpeditis-postgres-staging psql -U xpeditis -d xpeditis_staging
```
3. Check the internal network:
```bash
docker exec -it xpeditis-backend-staging ping postgres-staging
```

### Problem 5: High memory usage
**Symptom**: Slow server, OOM killer activity

**Solution**:
1. Check memory usage:
```bash
docker stats
```
2. Lower the limits in the stack file (`deploy.resources` section)
3. Add RAM to the server
4. Optimize the PostgreSQL queries (indexes, `EXPLAIN ANALYZE`)
---
## 🔄 Mise à Jour des Stacks
### Update Rolling (Zero Downtime)
#### Staging:
1. Build et push nouvelle image:
```bash
docker build -t xpeditis/backend:staging-v1.2.0 .
docker push xpeditis/backend:staging-v1.2.0
```
2. Dans Portainer → Stacks → `xpeditis-staging` → Editor
3. Changer `BACKEND_TAG=staging-v1.2.0`
4. Cliquer "Update the stack"
5. Portainer va pull la nouvelle image et redémarrer les services
#### Production (avec High Availability):
La stack production a 2 instances de chaque service (backend-prod-1, backend-prod-2). Traefik va load balancer entre les deux.
**Mise Ă  jour sans downtime**:
1. Stopper `backend-prod-2` dans Portainer
2. Update l'image de `backend-prod-2`
3. Redémarrer `backend-prod-2`
4. Vérifier health check OK
5. Stopper `backend-prod-1`
6. Update l'image de `backend-prod-1`
7. Redémarrer `backend-prod-1`
8. Vérifier health check OK
**OU via Portainer** (plus simple):
1. Portainer → Stacks → `xpeditis-production` → Editor
2. Changer `BACKEND_TAG=v1.2.0`
3. Cliquer "Update the stack"
4. Portainer va mettre Ă  jour les services un par un (rolling update automatique)
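
In the manual procedure, the "health check OK" steps are what protect against taking the second instance down too early. Because both instances sit behind the same Traefik hostname, an HTTP request cannot tell which one answered, so it is safer to ask Docker about the specific container. The sketch below assumes the containers define a Docker health check. `wait_healthy` is a hypothetical helper, and the container name to pass is whatever `docker ps` shows for the instance.

```bash
#!/usr/bin/env bash
# Sketch: block until Docker reports the container's health check as "healthy".
# Requires a health check on the container; times are in seconds.
wait_healthy() {
  local container="$1" timeout="${2:-120}" interval="${3:-5}" waited=0 status
  while :; do
    status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null || true)"
    if [ "$status" = "healthy" ]; then
      echo "$container is healthy"
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      echo "$container not healthy after ${timeout}s (last status: ${status:-unknown})" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

A script following the manual steps would call this after restarting the second instance and only stop the first instance once it returns 0.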
---
## 📊 Monitoring
### 1. Portainer Built-in Monitoring
Portainer → Containers → SĂ©lectionner service → **Stats**
- CPU usage
- Memory usage
- Network I/O
- Block I/O
### 2. Sentry (Error Tracking)
Toutes les erreurs backend et frontend sont envoyées à Sentry (configuré via `SENTRY_DSN`)
URL: https://sentry.io/organizations/xpeditis/projects/
### 3. Logs Centralisés
**Voir tous les logs en temps réel**:
```bash
docker logs -f xpeditis-backend-staging
docker logs -f xpeditis-frontend-staging
docker logs -f xpeditis-postgres-staging
docker logs -f xpeditis-redis-staging
```
**Rechercher dans les logs**:
```bash
docker logs xpeditis-backend-staging 2>&1 | grep "ERROR"
docker logs xpeditis-backend-staging 2>&1 | grep "booking"
```
### 4. Health Checks Dashboard
Créer un dashboard custom avec:
- Uptime Robot: https://uptimerobot.com (free tier: 50 monitors)
- Grafana + Prometheus (advanced)
---
## 🔒 SĂ©curitĂ© Best Practices
### 1. Mots de passe forts
✅ Min 64 caractùres pour production
✅ GĂ©nĂ©rĂ©s alĂ©atoirement (openssl, pwgen)
✅ StockĂ©s dans un gestionnaire de secrets (AWS Secrets Manager, Vault)
### 2. Rotation des credentials
✅ Tous les 90 jours
✅ ImmĂ©diatement si compromis
### 3. Backups automatiques
✅ PostgreSQL: Backup quotidien
✅ Retention: 30 jours staging, 90 jours production
✅ Test restore mensuel

### 4. Active monitoring
- ✅ Sentry configured
- ✅ Uptime monitoring active
- ✅ Email/Slack alerts for downtime

### 5. SSL/TLS
- ✅ HSTS enabled (Strict-Transport-Security)
- ✅ TLS 1.2+ minimum
- ✅ Let's Encrypt certificate with auto-renewal

### 6. Rate Limiting
- ✅ Traefik rate limiting configured
- ✅ Application-level rate limiting (NestJS throttler)
- ✅ Brute-force protection active

### 7. Firewall
- ✅ Only ports 80 and 443 open
- ✅ PostgreSQL/Redis reachable only from the internal Docker network
- ✅ SSH with keys only (no password login)
---
## 📞 Support
### En cas de problĂšme critique:
1. **Vérifier les logs** dans Portainer
2. **Vérifier Sentry** pour les erreurs récentes
3. **Restart du service** via Portainer (si safe)
4. **Rollback**: Portainer → Stacks → Redeploy previous version
### Contacts:
- **Tech Lead**: david-henri.arnaud@3ds.com
- **DevOps**: ops@xpeditis.com
- **Support**: support@xpeditis.com
---
## 📚 Ressources
- **Portainer Docs**: https://docs.portainer.io/
- **Traefik Docs**: https://doc.traefik.io/traefik/
- **Docker Docs**: https://docs.docker.com/
- **Let's Encrypt**: https://letsencrypt.org/docs/
---
*DerniĂšre mise Ă  jour*: 2025-10-14
*Version*: 1.0.0
*Auteur*: Xpeditis DevOps Team