Self-Hosted Issues
TL;DR
Self-hosted CMDOP deployment issues commonly involve PostgreSQL connection failures, Redis connectivity, TLS certificate errors, and upgrade migration problems. Verify database connectivity with psql, check Redis with redis-cli ping, and always run pg_dump before upgrades. For Kubernetes deployments, use kubectl describe pod to diagnose CrashLoopBackOff or ImagePullBackOff errors. Enable debug logging with LOG_LEVEL=debug for detailed diagnostics.
How do I fix database connection issues?
What are the symptoms?
Error: connection to database failed
or
Error: FATAL: password authentication failed
How do I diagnose database connection problems?
# Test connection
psql -h localhost -U cmdop -d cmdop -c "SELECT 1"
# Check PostgreSQL is running
systemctl status postgresql
# Check logs
tail -100 /var/log/postgresql/postgresql-15-main.log
How do I fix database connection failures?
Check Connection String
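Printing DATABASE_URL as-is puts the password into terminal scrollback and logs. As a minimal sketch, assuming the password is embedded in the usual `scheme://user:password@host` form, the helper below masks it before display. The name `redact_db_url` is hypothetical and not part of CMDOP.

```shell
#!/usr/bin/env bash
# Sketch: print a database URL with the password masked.
# redact_db_url is a hypothetical helper, not a CMDOP command.
redact_db_url() {
  local url="$1"
  # scheme://user:password@rest, capturing everything except the password
  local re='^([a-z]+://[^:/@]+):([^@]*)@(.*)$'
  if [[ "$url" =~ $re ]]; then
    printf '%s:****@%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}"
  else
    # No embedded password found; print the URL unchanged.
    printf '%s\n' "$url"
  fi
}
```

Usage: `redact_db_url "$DATABASE_URL"`. A password containing a literal `@` would need to be percent-encoded for this pattern to match correctly.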
# Environment variable
echo $DATABASE_URL
# Should be:
# postgres://cmdop:password@localhost:5432/cmdop?sslmode=disable
Verify PostgreSQL Accepts Connections
# /etc/postgresql/15/main/pg_hba.conf
# Add line for local connections:
host cmdop cmdop 127.0.0.1/32 scram-sha-256
# Reload configuration
sudo systemctl reload postgresql
Check Firewall
# Allow local PostgreSQL
sudo ufw allow from 127.0.0.1 to any port 5432
Reset Password
sudo -u postgres psql -c "ALTER USER cmdop PASSWORD 'newpassword';"
How do I fix Redis issues?
What are the symptoms?
Error: ECONNREFUSED 127.0.0.1:6379
How do I diagnose Redis problems?
# Test connection
redis-cli ping
# Should return: PONG
# Check Redis is running
systemctl status redis
How do I fix Redis connection failures?
Start Redis
sudo systemctl start redis
sudo systemctl enable redis
Check Memory
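The INFO memory output used in this step is long, and the two fields that matter for a "memory is full" diagnosis are `used_memory` and `maxmemory`. A rough sketch of a filter that extracts them is shown here; `redis_mem_summary` is a hypothetical helper name, and it reads the INFO text on stdin so it can be tried without a live server.

```shell
#!/usr/bin/env bash
# Sketch: summarize memory usage from `redis-cli INFO memory` output on stdin.
# redis_mem_summary is a hypothetical helper name.
redis_mem_summary() {
  # INFO lines end in CRLF, so strip \r before splitting on ":".
  tr -d '\r' | awk -F: '
    $1 == "used_memory" { used = $2 + 0 }
    $1 == "maxmemory"   { max = $2 + 0 }
    END {
      if (max > 0) printf "used=%.0f max=%.0f pct=%.0f\n", used, max, used * 100 / max
      else         printf "used=%.0f max=unlimited\n", used
    }'
}
```

Usage: `redis-cli INFO memory | redis_mem_summary`. A `maxmemory` of 0 means Redis has no configured limit, in which case a full-memory error is unlikely to come from Redis itself.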
redis-cli INFO memory
# If memory is full
redis-cli FLUSHDB  # WARNING: deletes every key in the current database
Check Configuration
# Redis should listen on localhost
grep "^bind" /etc/redis/redis.conf
# Should be: bind 127.0.0.1
How do I fix certificate errors?
What are the symptoms?
Error: certificate verify failed
or
Error: x509: certificate signed by unknown authority
How do I fix certificate verification failures?
Self-Signed Certificates
# Generate new certificate
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout /etc/cmdop/server.key \
  -out /etc/cmdop/server.crt \
  -subj "/CN=cmdop.company.com"
Agent Trusts Custom CA
# On agent machines
sudo cp company-ca.crt /usr/local/share/ca-certificates/
sudo update-ca-certificates
# Or set environment
export CMDOP_CA_FILE=/path/to/ca.crt
cmdop connect
Let’s Encrypt
# Install certbot
sudo apt install certbot
# Get certificate
sudo certbot certonly --standalone -d cmdop.company.com
# Certificate at:
# /etc/letsencrypt/live/cmdop.company.com/fullchain.pem
# /etc/letsencrypt/live/cmdop.company.com/privkey.pem
Certificate Renewal
# Auto-renew with cron
0 0 1 * * /usr/bin/certbot renew --quiet && systemctl restart cmdop
How do I fix upgrade failures?
What are the symptoms?
Error: migration failed
or the service won’t start after the upgrade.
How do I fix failed upgrades?
Rollback
# Docker
docker pull cmdop/control-plane:v1.2.3 # Previous version
# Make sure the image tag in docker-compose.yml points at that version
docker-compose down
docker-compose up -d
# Kubernetes
kubectl rollout undo deployment/cmdop-control-plane -n cmdop
Manual Migration
# Check pending migrations
cmdop-server migrate status
# Run migrations
cmdop-server migrate up
# If failed, check logs
journalctl -u cmdop -n 100
Database Backup Before Upgrade
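One way to enforce the backup-first rule in this section is a small guard that refuses to continue unless today's dump exists and is non-empty. This is a sketch only: `require_backup` is a hypothetical helper, and it assumes the `backup-YYYYMMDD.sql` file name produced by the pg_dump command in this section.

```shell
#!/usr/bin/env bash
# Sketch: refuse to upgrade unless today's dump exists and is non-empty.
# require_backup is a hypothetical helper; the file name follows the
# backup-YYYYMMDD.sql pattern used by the pg_dump command in this section.
require_backup() {
  local dir="${1:-.}"
  local file
  file="$dir/backup-$(date +%Y%m%d).sql"
  if [ -s "$file" ]; then
    echo "backup ok: $file"
  else
    echo "no non-empty backup at $file; run pg_dump first" >&2
    return 1
  fi
}
```

Usage: `require_backup . && docker-compose pull && docker-compose up -d`. The check only confirms that a file is present; it does not prove the dump can be restored.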
# Always backup first
pg_dump cmdop > backup-$(date +%Y%m%d).sql
# Then upgrade
docker-compose pull
docker-compose up -d
How do I troubleshoot Kubernetes issues?
What if a pod won’t start?
# Check pod status
kubectl get pods -n cmdop
# Check events
kubectl describe pod -n cmdop <pod-name>
# Check logs
kubectl logs -n cmdop <pod-name>
What are common Kubernetes errors?
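The fixes described in the subsections that follow can be summarized as a lookup from pod status to the first command worth running. The sketch below only prints the command taken from this guide; `next_step` is a hypothetical helper name and it never calls kubectl itself.

```shell
#!/usr/bin/env bash
# Sketch: map a pod status to the first diagnostic command from this guide.
# next_step is a hypothetical helper; it prints a command, it does not run it.
next_step() {
  local status="$1" pod="$2"
  case "$status" in
    ImagePullBackOff) echo "kubectl get secret regcred -n cmdop" ;;
    CrashLoopBackOff) echo "kubectl logs -n cmdop $pod --previous" ;;
    # Pending and anything unrecognized: start from the pod events.
    *)                echo "kubectl describe pod -n cmdop $pod" ;;
  esac
}
```

Usage: `next_step CrashLoopBackOff <pod-name>`, with the status read from the STATUS column of `kubectl get pods -n cmdop`.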
How do I fix ImagePullBackOff?
# Check image exists
docker pull cmdop/control-plane:latest
# Check secret
kubectl get secret regcred -n cmdop
How do I fix CrashLoopBackOff?
# Check logs for crash reason
kubectl logs -n cmdop <pod-name> --previous
How do I fix pods stuck in Pending?
# Check resources
kubectl describe pod -n cmdop <pod-name>
# Common: insufficient CPU/memory
# Scale down other workloads or increase node size
How do I fix Ingress issues?
# Check ingress
kubectl get ingress -n cmdop
kubectl describe ingress -n cmdop cmdop-ingress
# Check ingress controller logs
kubectl logs -n ingress-nginx <ingress-pod>
How do I configure a Service Mesh?
If using Istio/Linkerd (the DestinationRule below is an Istio resource):
# Allow gRPC
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: cmdop-grpc
spec:
  host: cmdop-grpc
  trafficPolicy:
    connectionPool:
      http:
        h2UpgradePolicy: UPGRADE
How do I troubleshoot Docker issues?
What if the container exits immediately?
# Check logs
docker logs cmdop-control-plane
# Run interactively
docker run -it --rm cmdop/control-plane:latest /bin/sh
How do I fix out of memory errors?
# Check memory usage
docker stats cmdop-control-plane
# Increase memory limit in docker-compose.yml:
services:
  control-plane:
    deploy:
      resources:
        limits:
          memory: 2G
How do I fix volume permission errors?
# Fix permissions
sudo chown -R 1000:1000 /data/cmdop
# Or in compose
user: "1000:1000"
How do I tune performance?
How do I optimize PostgreSQL?
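-- Check slow queries (requires the pg_stat_statements extension)
-- PostgreSQL 13+ names this column mean_exec_time; 12 and earlier use mean_time
SELECT query, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
# /etc/postgresql/15/main/postgresql.conf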
shared_buffers = 256MB
effective_cache_size = 768MB
work_mem = 4MB
maintenance_work_mem = 64MB
How do I optimize Redis?
# /etc/redis/redis.conf
maxmemory 256mb
maxmemory-policy allkeys-lru
How do I set up connection pooling?
Use PgBouncer for high connection counts:
# /etc/pgbouncer/pgbouncer.ini
[databases]
cmdop = host=localhost dbname=cmdop
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
How do I set up logging?
How do I enable debug logs?
# Environment variable
export LOG_LEVEL=debug
# Or in config
log:
  level: debug
  format: json
How do I set up centralized logging?
# docker-compose.yml
logging:
  driver: "json-file"
  options:
    max-size: "100m"
    max-file: "3"
How do I aggregate logs with ELK or Loki?
Forward to ELK/Loki:
# Filebeat config
- type: container
  paths:
    - /var/lib/docker/containers/*/*.log
  processors:
    - add_kubernetes_metadata:
How do I back up and recover?
How do I create backups?
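#!/bin/bash
# backup.sh
DATE=$(date +%Y%m%d)
pg_dump cmdop | gzip > /backups/cmdop-$DATE.sql.gz
# BGSAVE runs in the background: wait for LASTSAVE to change before copying
LAST=$(redis-cli LASTSAVE)
redis-cli BGSAVE
while [ "$(redis-cli LASTSAVE)" = "$LAST" ]; do sleep 1; done
cp /var/lib/redis/dump.rdb /backups/redis-$DATE.rdb
How do I restore from a backup?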
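Before restoring, it helps to pick the most recent archive rather than typing a file name by hand. A minimal sketch, assuming the `cmdop-YYYYMMDD.sql.gz` names written by backup.sh (which sort chronologically); `latest_backup` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print the newest PostgreSQL archive in a backup directory.
# latest_backup is a hypothetical helper; it relies on the
# cmdop-YYYYMMDD.sql.gz names from backup.sh, which sort by date.
latest_backup() {
  local dir="${1:-/backups}"
  local files=("$dir"/cmdop-*.sql.gz)
  # An unmatched glob stays literal, so check that the first entry exists.
  [ -e "${files[0]}" ] || return 1
  # Glob results are sorted, so the last element has the latest date.
  printf '%s\n' "${files[${#files[@]}-1]}"
}
```

Usage: `gunzip -c "$(latest_backup /backups)" | psql cmdop`, which is the PostgreSQL restore command from this section with the file name filled in.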
# Restore PostgreSQL
gunzip -c backup.sql.gz | psql cmdop
# Restore Redis
sudo systemctl stop redis
sudo cp redis-backup.rdb /var/lib/redis/dump.rdb
sudo systemctl start redis
How do I test backup recovery?
Test backups regularly by restoring them to a staging environment.
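A full staging restore is the real test, but a cheap integrity check can catch truncated or corrupt archives earlier. The following is a sketch under that assumption; `verify_backup` is a hypothetical helper name, and passing it says nothing about whether the SQL inside is complete.

```shell
#!/usr/bin/env bash
# Sketch: cheap integrity checks to run before a staging restore.
# verify_backup is a hypothetical helper name.
verify_backup() {
  local file="$1"
  # Reject missing or empty archives.
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  # gzip -t tests the compressed stream without extracting it.
  gzip -t "$file" 2>/dev/null || { echo "corrupt gzip: $file" >&2; return 1; }
  echo "ok: $file"
}
```

Usage: `verify_backup /backups/cmdop-20240315.sql.gz` (the date here is illustrative), followed by the restore command against the staging database.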
How do I check system health?
How do I check API health?
curl http://localhost:8080/health
How do I check gRPC health?
grpcurl -plaintext localhost:50051 grpc.health.v1.Health/Check
How do I run a full health check?
#!/bin/bash
# Combined health check script
curl -sf http://localhost:8080/health || exit 1
grpcurl -plaintext localhost:50051 grpc.health.v1.Health/Check || exit 1
psql -h localhost -U cmdop -d cmdop -c "SELECT 1" || exit 1
redis-cli ping || exit 1
echo "All checks passed"
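A single failed check right after a restart is not necessarily conclusive, since a service may still be starting. As a hedged sketch, a small retry wrapper can re-run any of the checks above a few times before giving up; `retry` is a hypothetical helper, and the attempt count and delay are illustrative values.

```shell
#!/usr/bin/env bash
# Sketch: retry a check command until it succeeds or attempts run out.
# retry is a hypothetical helper; attempts and delay are illustrative.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    # Run the remaining arguments as the command; stop on first success.
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

Usage: `retry 10 3 curl -sf http://localhost:8080/health` tries the API check up to ten times, three seconds apart, and returns non-zero only if every attempt fails.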