Deployment Guide

Deploy your AI agent to Kubernetes with enterprise security, monitoring, and scaling.

This guide walks through building the agent container image, configuring secrets, deploying to a Kubernetes cluster, and operating the agent in production with monitoring and scaling.

Prerequisites

Before deploying to production:

  • Kubernetes cluster access
  • Container registry access
  • MCP servers deployed and accessible
  • LLM API keys configured
  • Agent tested locally
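
A quick sanity check for the first two items, assuming kubectl and podman are already installed and configured:

# Confirm cluster access and permissions
kubectl cluster-info
kubectl auth can-i create deployments -n ai-agents

# Confirm you can authenticate to your registry
podman login your-registry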

Step 1: Build Container Image

# Build the agent container
podman build -t your-registry/ai-agent:v1.0 .

# Push to registry
podman push your-registry/ai-agent:v1.0
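
The commands above assume a Containerfile (or Dockerfile) at the project root. If the template does not provide one, a minimal sketch for a Python-based agent might look like this; the base image, entrypoint module, and port are assumptions to adapt to your project:

# Containerfile (minimal sketch; entrypoint and port are assumptions)
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "-m", "agent"]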

Step 2: Configure Secrets

Create Kubernetes secrets for sensitive data:

# Create the namespace (skip if it already exists)
kubectl create namespace ai-agents

# Create secret for LLM API key and MCP server token
kubectl create secret generic agent-secrets \
  --from-literal=llm-api-key=your-key-here \
  --from-literal=mcp-server-token=your-token \
  -n ai-agents

# Create configmap for MCP server URLs
kubectl create configmap agent-config \
  --from-literal=mcp-servers=http://mcp-server.default.svc:3000/mcp \
  -n ai-agents

Step 3: Deploy to Kubernetes

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-agent
  namespace: ai-agents
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ai-agent
  template:
    metadata:
      labels:
        app: ai-agent
    spec:
      containers:
      - name: agent
        image: your-registry/ai-agent:v1.0
        ports:
        - containerPort: 8000  # port used by the probes and the Service
        env:
        - name: LLM_API_KEY
          valueFrom:
            secretKeyRef:
              name: agent-secrets
              key: llm-api-key
        # Consumes the token created in Step 2; the env var name is an assumption
        - name: MCP_SERVER_TOKEN
          valueFrom:
            secretKeyRef:
              name: agent-secrets
              key: mcp-server-token
        - name: MCP_SERVERS
          valueFrom:
            configMapKeyRef:
              name: agent-config
              key: mcp-servers
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 30
        readinessProbe:
          httpGet:
            path: /ready
            port: 8000

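The apply step below also references a kubernetes/service.yaml that is not shown above. A minimal ClusterIP Service matching the deployment might look like this; the label and port name are choices made here, reused by the monitoring sketch later:

# service.yaml (minimal sketch)
apiVersion: v1
kind: Service
metadata:
  name: ai-agent
  namespace: ai-agents
  labels:
    app: ai-agent
spec:
  selector:
    app: ai-agent
  ports:
  - name: http
    port: 8000
    targetPort: 8000
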
Apply the manifests:

kubectl apply -f kubernetes/deployment.yaml
kubectl apply -f kubernetes/service.yaml

Step 4: Monitoring

Health Checks

Monitor agent health:

# Check pod status
kubectl get pods -n ai-agents

# View logs
kubectl logs -f deployment/ai-agent -n ai-agents

# Check health endpoint
kubectl port-forward svc/ai-agent 8000:8000 -n ai-agents
curl http://localhost:8000/health

Metrics

The template includes Prometheus metrics (a scraping sketch follows the list):

  • Request count and latency
  • MCP tool usage
  • LLM token consumption
  • Error rates
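
How these metrics get scraped depends on your monitoring stack. If you run the Prometheus Operator, a ServiceMonitor along these lines would pick up the Service from Step 3; the /metrics path is an assumption about the template, so check what the agent actually exposes:

# servicemonitor.yaml (sketch; requires the Prometheus Operator CRDs)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ai-agent
  namespace: ai-agents
spec:
  selector:
    matchLabels:
      app: ai-agent
  endpoints:
  - port: http
    path: /metrics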

Step 5: Scaling

Horizontal Scaling

# Scale manually
kubectl scale deployment ai-agent --replicas=5 -n ai-agents

# Or use HPA (Horizontal Pod Autoscaler)
kubectl autoscale deployment ai-agent \
  --cpu-percent=70 \
  --min=2 \
  --max=10 \
  -n ai-agents
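
If you prefer version-controlled configuration, the same autoscaler can be written declaratively with the autoscaling/v2 API:

# hpa.yaml (equivalent to the kubectl autoscale command above)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-agent
  namespace: ai-agents
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-agent
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70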

Performance Tuning

Adjust based on load (an example command follows the list):

  • Memory: Increase for large context windows
  • CPU: Increase for CPU-intensive processing
  • Replicas: Scale based on request volume
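
For example, to raise the memory allocation without editing the manifest (the values here are illustrative):

# Bump memory for larger context windows
kubectl set resources deployment ai-agent \
  --requests=memory=1Gi \
  --limits=memory=2Gi \
  -n ai-agents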

Security Considerations

  • API Keys: Always use Kubernetes secrets
  • Network Policies: Restrict agent-to-MCP communication (see the sketch after this list)
  • RBAC: Limit service account permissions
  • Audit Logging: Enable for compliance
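
As a starting point for the network-policy item, the sketch below restricts the agent pods' egress to DNS and the MCP server referenced in Step 2. The MCP pod labels are assumptions, enforcement requires a CNI plugin that supports NetworkPolicy, and you would also need an egress rule for your LLM provider's API (omitted here because it is provider-specific):

# networkpolicy.yaml (sketch; MCP pod labels are an assumption)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ai-agent-egress
  namespace: ai-agents
spec:
  podSelector:
    matchLabels:
      app: ai-agent
  policyTypes:
  - Egress
  egress:
  # Allow DNS resolution
  - ports:
    - protocol: UDP
      port: 53
  # Allow traffic to the MCP server in the default namespace
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: default
      podSelector:
        matchLabels:
          app: mcp-server
    ports:
    - protocol: TCP
      port: 3000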

Next Steps

  • Monitor performance: Set up dashboards
  • Configure alerts: Get notified of issues
  • Scale as needed: Adjust resources based on usage
  • Update regularly: Deploy new versions safely (rollout example below)
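
For the last item, a standard Kubernetes rolling update ships a new image with zero downtime; v1.1 is an example tag:

# Roll out a new version and watch its progress
kubectl set image deployment/ai-agent agent=your-registry/ai-agent:v1.1 -n ai-agents
kubectl rollout status deployment/ai-agent -n ai-agents

# Roll back if something goes wrong
kubectl rollout undo deployment/ai-agent -n ai-agents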

For deployment questions, visit GitHub Discussions.