Deployment Guide

Deploy your AI agent to Kubernetes with enterprise security, monitoring, and scaling.

This guide covers deploying your AI agent to production on Kubernetes, including configuration, monitoring, and scaling considerations.

Prerequisites

Before deploying to production:

Kubernetes cluster access
Container registry access
MCP servers deployed and accessible
LLM API keys configured
Agent tested locally

Step 1: Build Container Image

# Build the agent container
podman build -t your-registry/ai-agent:v1.0 .

# Push to registry
podman push your-registry/ai-agent:v1.0

Step 2: Configure Secrets

Create Kubernetes secrets for sensitive data:

# Create secret for LLM API key
kubectl create secret generic agent-secrets \
  --from-literal=llm-api-key=your-key-here \
  --from-literal=mcp-server-token=your-token \
  -n ai-agents

# Create configmap for MCP server URLs
kubectl create configmap agent-config \
  --from-literal=mcp-servers=http://mcp-server.default.svc:3000/mcp \
  -n ai-agents

Step 3: Deploy to Kubernetes

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-agent
  namespace: ai-agents
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ai-agent
  template:
    metadata:
      labels:
        app: ai-agent
    spec:
      containers:
      - name: agent
        image: your-registry/ai-agent:v1.0
        env:
        - name: LLM_API_KEY
          valueFrom:
            secretKeyRef:
              name: agent-secrets
              key: llm-api-key
        - name: MCP_SERVERS
          valueFrom:
            configMapKeyRef:
              name: agent-config
              key: mcp-servers
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 30
        readinessProbe:
          httpGet:
            path: /ready
            port: 8000

Apply the deployment:

kubectl apply -f kubernetes/deployment.yaml
kubectl apply -f kubernetes/service.yaml

Step 4: Monitoring

Health Checks

Monitor agent health:

# Check pod status
kubectl get pods -n ai-agents

# View logs
kubectl logs -f deployment/ai-agent -n ai-agents

# Check health endpoint
kubectl port-forward svc/ai-agent 8000:8000 -n ai-agents
curl http://localhost:8000/health

Metrics

The template includes Prometheus metrics:

Request count and latency
MCP tool usage
LLM token consumption
Error rates

Step 5: Scaling

Horizontal Scaling

# Scale manually
kubectl scale deployment ai-agent --replicas=5 -n ai-agents

# Or use HPA (Horizontal Pod Autoscaler)
kubectl autoscale deployment ai-agent \
  --cpu-percent=70 \
  --min=2 \
  --max=10 \
  -n ai-agents

Performance Tuning

Adjust based on load:

Memory: Increase for large context windows
CPU: Increase for CPU-intensive processing
Replicas: Scale based on request volume

Security Considerations

API Keys: Always use Kubernetes secrets
Network Policies: Restrict agent-to-MCP communication
RBAC: Limit service account permissions
Audit Logging: Enable for compliance

Next Steps

Monitor performance: Set up dashboards
Configure alerts: Get notified of issues
Scale as needed: Adjust resources based on usage
Update regularly: Deploy new versions safely

For deployment questions, visit GitHub Discussions.

Edit this page Open a documentation issue