Manual scaling is reactive and slow. The Horizontal Pod Autoscaler (HPA) automatically adjusts the number of pod replicas based on observed resource usage, such as CPU.
Create HPA:
kubectl autoscale deployment myapp \
  --cpu-percent=70 \
  --min=2 \
  --max=10
# When average CPU utilization exceeds 70% of the pods' CPU requests, adds pods (up to 10)
# When average CPU utilization drops below 70%, removes pods (down to 2)
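To make the scaling rule concrete, here is a rough sketch of the replica calculation as the Kubernetes documentation describes it: desired = ceil(current replicas × current utilization / target utilization), skipped when utilization is within the default 10% tolerance of the target, then clamped to the min/max bounds. The function name and argument order are made up for illustration, and the real controller also handles stabilization windows, missing metrics, and not-yet-ready pods, which this ignores.

```shell
#!/usr/bin/env bash
# Illustrative only: desired_replicas is a hypothetical helper, not a kubectl feature.
# usage: desired_replicas CURRENT_REPLICAS CURRENT_UTIL_PCT TARGET_UTIL_PCT MIN MAX
desired_replicas() {
  local current=$1 util=$2 target=$3 min=$4 max=$5
  local desired=$current

  # Default tolerance is 10%: leave the replica count alone when
  # utilization is between 0.9x and 1.1x of the target.
  if ! (( util * 10 >= target * 9 && util * 10 <= target * 11 )); then
    # Integer ceiling of current * util / target
    desired=$(( (current * util + target - 1) / target ))
  fi

  # Clamp to the configured bounds
  if (( desired < min )); then desired=$min; fi
  if (( desired > max )); then desired=$max; fi
  echo "$desired"
}

desired_replicas 4 90 70 2 10    # prints 6  (scale up)
desired_replicas 4 72 70 2 10    # prints 4  (within tolerance, no change)
desired_replicas 4 30 70 2 10    # prints 2  (scale down, floor at min)
desired_replicas 8 140 70 2 10   # prints 10 (capped at max)
```

Note that the result is proportional to how far utilization is from the target, so a large spike can add several pods in one step rather than one at a time.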
YAML Version:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
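The averageUtilization value is a percentage of each pod's CPU request, not of the node's CPU, so the target pods need CPU requests set for this metric to work. As a minimal sketch of that arithmetic, assuming every pod has the same request, the hypothetical helper below divides total usage by total requested CPU (both in millicores). The name avg_cpu_utilization and the sample numbers are illustrative only.

```shell
#!/usr/bin/env bash
# Illustrative only: avg_cpu_utilization is a hypothetical helper.
# usage: avg_cpu_utilization REQUEST_MILLICORES USAGE_MILLICORES...
avg_cpu_utilization() {
  local request=$1
  shift
  local count=$# total=0 usage

  # Need at least one pod and a non-zero request to compute a percentage
  if (( count == 0 || request <= 0 )); then
    echo "need a positive request and at least one usage value" >&2
    return 1
  fi

  for usage in "$@"; do
    total=$(( total + usage ))
  done

  # Total usage as a percentage of total requested CPU (integer division)
  echo $(( total * 100 / (request * count) ))
}

# Three pods, each requesting 500m CPU
avg_cpu_utilization 500 300 400 350   # prints 70 (right at the target)
avg_cpu_utilization 500 400 450 500   # prints 90 (above the 70% target)
```

With a 70% target, the second case is where the autoscaler would start adding pods.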
Check Status:
kubectl get hpa
Automatic scaling saves money by removing idle pods when load is low, and it handles traffic spikes by adding pods when load rises.
