
KEDA: Kubernetes Event-Driven Autoscaling
Stewart Moreland

KEDA (Kubernetes-based Event Driven Autoscaler) revolutionizes how we approach autoscaling in Kubernetes by extending beyond traditional CPU and memory metrics. Created by Microsoft and Red Hat, this powerful tool enables workload scaling based on events from databases, message queues, monitoring systems, and cloud services, providing unprecedented flexibility for modern applications.
What is KEDA?
KEDA is a lightweight tool that works alongside Kubernetes components like the Horizontal Pod Autoscaler (HPA). It doesn't replace anything but adds more functionality, allowing you to choose which apps to scale with KEDA while leaving others untouched. This makes it flexible and easy to integrate with your existing setup.
Table of Contents
- Understanding ScaledObjects
- Configuration Components
- Practical Implementation Examples
- Advanced Configuration Patterns
- Production Best Practices
- Monitoring & Troubleshooting
- Getting Started with KEDA 2.17
Understanding ScaledObjects
KEDA operates through custom resource definitions called ScaledObject resources. These objects define both what to scale and when to scale it, providing a declarative approach to event-driven autoscaling.
KEDA 2.17 monitors external event sources and adjusts your app's resources based on demand. Its main components include: KEDA Operator (tracks event sources), Metrics Server (provides external metrics to HPA), Scalers (connect to event sources), and Custom Resource Definitions (define scaling behavior).
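Because KEDA feeds external metrics into a standard HPA, the replica count it converges on follows the usual HPA arithmetic. A quick sketch of that formula with hypothetical numbers (queue messages per pod against a per-pod target):

```python
import math

# Standard HPA formula for AverageValue metrics, as used by the HPA KEDA manages:
#   desiredReplicas = ceil(currentReplicas * currentMetricValue / targetValue)
def desired_replicas(current_replicas: int, metric_value: float, target: float) -> int:
    return math.ceil(current_replicas * metric_value / target)

# 4 pods averaging 62.5 queued messages each, target of 50 per pod -> 5 pods
print(desired_replicas(4, 62.5, 50))  # 5
```

This is why threshold choice matters so much: the threshold is the denominator of every scaling decision.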
KEDA 2.17 Custom Resources (CRDs)
KEDA 2.17 uses Custom Resource Definitions (CRDs) to manage scaling behavior:
- ScaledObject - Links your app (Deployment, StatefulSet, or Custom Resource) to external event sources, defining how scaling works
- ScaledJob - Handles batch processing tasks by scaling Kubernetes Jobs based on external metrics
- TriggerAuthentication - Provides secure ways to access event sources, supporting methods like environment variables, secrets, or cloud-specific credentials
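A TriggerAuthentication is referenced from a trigger's authenticationRef. As a minimal sketch (the Secret name and key here are placeholders, not from any real deployment), one backed by a Kubernetes Secret might look like:

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: prometheus-auth
spec:
  secretTargetRef:
    - parameter: bearerToken        # parameter name the scaler expects
      name: prometheus-credentials  # hypothetical Secret in the same namespace
      key: token
```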
Core ScaledObject Components
Every ScaledObject contains two essential components:
- scaleTargetRef - Defines the Kubernetes resource to be scaled (Deployments, StatefulSets, Custom Resources)
- triggers - Defines the events and metrics that trigger scaling operations
Configuration Components
scaleTargetRef Configuration
The scaleTargetRef defines which Kubernetes resource KEDA should scale. While only the name is required, additional configuration options provide fine-grained control.
```yaml
spec:
  scaleTargetRef:
    apiVersion: apps/v1                     # Optional. Default: apps/v1
    kind: Deployment                        # Optional. Default: Deployment
    name: my-application                    # Mandatory. Must be in same namespace
    envSourceContainerName: app-container   # Optional. Default: first container
```
Namespace Requirement
The scaleTargetRef resource must be in the same namespace as your ScaledObject. This is a security feature that prevents cross-namespace scaling operations and maintains proper access control.
Trigger Types and Scalers
KEDA's true power lies in its extensive collection of scalers. Each scaler connects to different external systems and metrics sources, enabling real-time scaling based on actual workload demands.
KEDA 2.17 Scaler Categories:
- Messaging: Apache Kafka, RabbitMQ, Azure Service Bus, AWS SQS, Redis Streams, NATS JetStream, Apache Pulsar
- Data & Storage: AWS CloudWatch, Azure Monitor, Google Cloud Pub/Sub, Azure Blob Storage, AWS DynamoDB
- Metrics: Prometheus, Datadog, New Relic, Dynatrace, InfluxDB, Graphite, Splunk
- Datastore: PostgreSQL, MySQL, MongoDB, Elasticsearch, CouchDB, Cassandra, MSSQL
- Apps & CI/CD: GitHub Runner, Azure Pipelines, Temporal
- Testing: Selenium Grid
KEDA 2.17 Scaler Ecosystem
KEDA 2.17 includes more than 60 built-in scalers across multiple categories, maintained by an active community and enterprise contributors. Check the official scalers documentation for the complete list and latest additions.
Complete ScaledObject Specification
Here's a comprehensive KEDA 2.17 ScaledObject configuration showcasing all available parameters:
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: production-autoscaler
  namespace: default
  labels:
    app: my-application
    version: v2.17.0
spec:
  # Target Configuration (supports Deployments, StatefulSets, Custom Resources)
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-application
    envSourceContainerName: app-container

  # Scaling Behavior
  pollingInterval: 30     # Metric check frequency (seconds)
  cooldownPeriod: 300     # Wait time after scale down (seconds)
  idleReplicaCount: 0     # Scale to zero when idle (optional)
  minReplicaCount: 2      # Minimum replicas for availability
  maxReplicaCount: 100    # Maximum replicas for cost control

  # Fallback Strategy (KEDA 2.17 enhanced)
  fallback:
    failureThreshold: 3   # Failed checks before fallback
    replicas: 6           # Fallback replica count

  # Advanced HPA Configuration (KEDA 2.17 enhanced)
  advanced:
    restoreToOriginalReplicaCount: false
    horizontalPodAutoscalerConfig:
      name: custom-hpa-name
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
            - type: Percent
              value: 50
              periodSeconds: 60
            - type: Pods
              value: 2
              periodSeconds: 60
        scaleUp:
          stabilizationWindowSeconds: 60
          policies:
            - type: Percent
              value: 100
              periodSeconds: 15
            - type: Pods
              value: 4
              periodSeconds: 60

  # KEDA 2.17 Triggers (examples below)
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.default.svc.cluster.local:9090
        metricName: http_requests_per_second
        threshold: '100'
        query: sum(rate(http_requests_total{app="my-application"}[1m]))
      # Optional: Authentication reference
      authenticationRef:
        name: prometheus-auth
```
Practical Implementation Examples
AWS CloudWatch SQS Queue Scaling
One of the most common use cases is scaling based on message queue depth. Here's a comprehensive implementation for AWS SQS:
```yaml
triggers:
  - type: aws-cloudwatch
    metadata:
      # SQS Specific Configuration
      namespace: AWS/SQS
      dimensionName: QueueName
      dimensionValue: user-processing-queue
      metricName: ApproximateNumberOfMessagesVisible
      # Scaling Thresholds
      targetMetricValue: "5"        # Scale up when >5 messages
      minMetricValue: "0"           # Scale down when 0 messages
      # AWS Configuration
      awsRegion: "us-east-1"
      identityOwner: operator       # Use pod identity
      # Metric Collection
      metricCollectionTime: "300"   # 5 minute collection window
      metricStat: "Average"         # Statistical method
      metricStatPeriod: "300"       # 5 minute period
```
Application Load Balancer Response Time Scaling
This advanced example demonstrates scaling based on Application Load Balancer metrics with dynamic target group discovery using Helm:
```yaml
{{- $root := . }}
{{- if .Values.autoscaler.enabled }}
{{- if $root.Capabilities.APIVersions.Has "keda.sh/v1alpha1" }}
# Dynamic Target Group Discovery
{{- $targetGroups := (lookup "elbv2.k8s.aws/v1beta1" "TargetGroupBinding" "" "").items }}
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ $root.Release.Name }}-response-time-autoscaler
  namespace: {{ $root.Release.Namespace }}
  labels:
    {{- include "my-chart.labels" . | nindent 4 }}
    component: autoscaler
    scaling-type: response-time
spec:
  scaleTargetRef:
    name: {{ include "my-chart.fullname" . }}
    kind: Deployment

  # Optimized for response time scaling
  pollingInterval: 15    # Frequent checks for responsiveness
  cooldownPeriod: 180    # Shorter cooldown for web workloads
  minReplicaCount: 3     # Ensure availability
  maxReplicaCount: 50    # Control maximum scale

  # Advanced scaling behavior
  advanced:
    restoreToOriginalReplicaCount: true
    horizontalPodAutoscalerConfig:
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
            - type: Percent
              value: 50          # Conservative scale down
              periodSeconds: 60
        scaleUp:
          stabilizationWindowSeconds: 30
          policies:
            - type: Percent
              value: 100         # Aggressive scale up
              periodSeconds: 15
            - type: Pods
              value: 5           # Add up to 5 pods quickly
              periodSeconds: 30

  triggers:
    {{- range $index, $group := $targetGroups }}
    {{- if eq .spec.serviceRef.name (include "my-chart.fullname" $root) }}
    # Response Time Trigger
    - type: aws-cloudwatch
      metadata:
        namespace: AWS/ApplicationELB
        dimensionName: TargetGroup
        dimensionValue: {{ .spec.targetGroupARN }}
        metricName: TargetResponseTime
        targetMetricValue: "0.5"   # 500ms threshold
        minMetricValue: "0.1"      # 100ms minimum
        awsRegion: {{ $.Values.aws.region | default "us-east-1" }}
        identityOwner: operator
        metricCollectionTime: "60"
        metricStat: "Average"
        metricStatPeriod: "60"
    # Request Count Trigger (secondary)
    - type: aws-cloudwatch
      metadata:
        namespace: AWS/ApplicationELB
        dimensionName: TargetGroup
        dimensionValue: {{ .spec.targetGroupARN }}
        metricName: RequestCountPerTarget
        targetMetricValue: "100"   # 100 requests per target
        awsRegion: {{ $.Values.aws.region | default "us-east-1" }}
        identityOwner: operator
        metricCollectionTime: "60"
        metricStat: "Sum"
        metricStatPeriod: "60"
    {{- end }}
    {{- end }}
{{- end }}
{{- end }}
```
Helm Lookup Limitation
The lookup function in Helm only works during helm upgrade operations, not during initial helm install. For initial deployments, consider using static target group ARNs or implement a post-install hook to update the ScaledObject.
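As a workaround for fresh installs, the trigger can be pinned to a value supplied at deploy time rather than discovered via lookup. A sketch assuming a hypothetical autoscaler.targetGroupArn chart value set per environment:

```yaml
- type: aws-cloudwatch
  metadata:
    namespace: AWS/ApplicationELB
    dimensionName: TargetGroup
    dimensionValue: {{ .Values.autoscaler.targetGroupArn }}  # supplied via values, not lookup
    metricName: TargetResponseTime
    targetMetricValue: "0.5"
    awsRegion: {{ .Values.aws.region | default "us-east-1" }}
    identityOwner: operator
```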
Advanced Configuration Patterns
Multi-Trigger Scaling Strategy
Combine multiple triggers to create sophisticated scaling logic that responds to different application conditions:
```yaml
triggers:
  # Primary: Application Performance
  - type: prometheus
    metadata:
      serverAddress: http://prometheus:9090
      metricName: request_rate
      threshold: '100'
      query: |
        rate(http_requests_total{app="my-application",status!~"5.."}[2m])
  # Secondary: Queue Depth
  - type: aws-cloudwatch
    metadata:
      namespace: AWS/SQS
      dimensionName: QueueName
      dimensionValue: background-jobs
      metricName: ApproximateNumberOfMessagesVisible
      targetMetricValue: "20"
      awsRegion: "us-east-1"
      identityOwner: operator
  # Tertiary: Resource Utilization
  - type: memory
    metadata:
      type: Utilization
      value: "80"
  # Quaternary: Custom Business Metric
  - type: external
    metadata:
      scalerAddress: business-metrics-scaler.monitoring.svc.cluster.local:8080
      metricName: active_user_sessions
      targetValue: "1000"
    authenticationRef:
      name: business-metrics-auth
```
Custom External Scaler Implementation
Create custom scalers for business-specific metrics:
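KEDA's external scaler contract is a small gRPC service (defined in KEDA's externalscaler.proto) whose core RPCs are IsActive, GetMetricSpec, and GetMetrics. A minimal Python sketch of that logic, with the gRPC plumbing omitted and a stubbed callable standing in for a real business-metrics backend:

```python
from dataclasses import dataclass

@dataclass
class MetricSpec:
    metric_name: str
    target_size: int

class SessionScaler:
    """Sketch of the three external-scaler handlers (gRPC wiring omitted)."""

    def __init__(self, fetch_sessions, target_per_pod: int = 1000):
        self._fetch = fetch_sessions   # callable returning current session count
        self._target = target_per_pod

    def is_active(self) -> bool:
        # Drives scale-from-zero: activate whenever any sessions exist.
        return self._fetch() > 0

    def get_metric_spec(self) -> MetricSpec:
        # Tells KEDA the metric name and per-pod target to hand to the HPA.
        return MetricSpec("active_user_sessions", self._target)

    def get_metrics(self) -> int:
        # Returns the current metric value on each polling cycle.
        return self._fetch()

scaler = SessionScaler(fetch_sessions=lambda: 2500)  # stubbed metric source
print(scaler.is_active(), scaler.get_metrics())      # True 2500
```

A production implementation would serve these handlers over gRPC and be referenced from a trigger of type external via its scalerAddress, as in the multi-trigger example above.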
Scaling Based on Time Patterns
Implement predictive scaling using cron-based triggers:
```yaml
# Cron-based scaling for predictable traffic patterns
triggers:
  # Business hours scaling
  - type: cron
    metadata:
      timezone: America/New_York
      start: "0 8 * * 1-5"     # 8 AM weekdays
      end: "0 18 * * 1-5"      # 6 PM weekdays
      desiredReplicas: "10"
  # Peak hours scaling
  - type: cron
    metadata:
      timezone: America/New_York
      start: "0 12 * * 1-5"    # 12 PM weekdays
      end: "0 14 * * 1-5"      # 2 PM weekdays
      desiredReplicas: "20"
  # Weekend maintenance scaling
  - type: cron
    metadata:
      timezone: America/New_York
      start: "0 2 * * 0"       # 2 AM Sunday
      end: "0 4 * * 0"         # 4 AM Sunday
      desiredReplicas: "1"
```
Production Best Practices
- Start Conservative: Begin with higher thresholds and longer cooldown periods
- Monitor Continuously: Use comprehensive observability tools
- Test Thoroughly: Validate scaling behavior in staging environments
- Set Boundaries: Always configure maxReplicaCount and minReplicaCount
- Plan for Failures: Configure fallback replicas for metric collection failures
Security and Authentication
Implement robust security practices for production KEDA deployments:
```yaml
# RBAC for KEDA Operator
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: keda-scaledobject-controller
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "endpoints"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["*"]
---
# Service Account for Applications
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keda-application-sa
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/KedaApplicationRole
```
Performance Optimization
Optimization Strategies:
- Polling Frequency: Balance between responsiveness and resource usage
- Cooldown Periods: Prevent scaling oscillation while maintaining responsiveness
- Stabilization Windows: Use HPA behavior policies for smoother scaling
- Metric Collection: Optimize collection windows for your use case
Monitoring and Observability
Implement comprehensive monitoring for your KEDA deployments:
Key Monitoring Metrics:
- Scaling Events: Track scale up/down frequency and timing
- Metric Collection: Monitor scaler health and response times
- Resource Usage: Track KEDA operator resource consumption
- Application Performance: Correlate scaling with application metrics
```yaml
# Prometheus ServiceMonitor for KEDA
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: keda-operator-metrics
spec:
  selector:
    matchLabels:
      app: keda-operator
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
---
# Grafana Dashboard ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: keda-dashboard
data:
  dashboard.json: |
    {
      "dashboard": {
        "title": "KEDA Scaling Metrics",
        "panels": [
          {
            "title": "Scaling Events",
            "type": "graph",
            "targets": [
              { "expr": "rate(keda_scaled_object_scaling_total[5m])" }
            ]
          }
        ]
      }
    }
```
Troubleshooting Common Issues
Common KEDA Problems & Solutions
Metrics Not Found: Verify scaler configuration, authentication, and network connectivity to metric sources
Rapid Scaling Oscillation: Increase cooldown periods, implement stabilization windows, and review threshold values
Authentication Failures: Check TriggerAuthentication resources, secret references, and IAM permissions
Performance Impact: Monitor KEDA operator resource usage and consider adjusting polling intervals
Scale-to-Zero Issues: Verify idleReplicaCount configuration and ensure proper health checks
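For scale-to-zero specifically, keep in mind that idleReplicaCount must be lower than minReplicaCount, and (per the KEDA documentation) 0 is currently the only supported value. A minimal sketch of the relevant fields:

```yaml
spec:
  idleReplicaCount: 0   # only 0 is currently supported
  minReplicaCount: 1    # floor once any trigger reports activity
  maxReplicaCount: 10
```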
Diagnostic Commands:
```shell
# Check KEDA operator status
kubectl get pods -n keda-system

# Examine ScaledObject status
kubectl describe scaledobject my-scaledobject

# View KEDA operator logs
kubectl logs -n keda-system deployment/keda-operator

# Check HPA created by KEDA
kubectl get hpa

# Monitor scaling events
kubectl get events --field-selector involvedObject.kind=ScaledObject

# Debug metric collection
kubectl logs -n keda-system deployment/keda-metrics-apiserver
```
Getting Started with KEDA 2.17
Installation Options
KEDA 2.17 provides multiple deployment methods based on the official deployment guide:
```shell
# Add KEDA Helm repository
helm repo add kedacore https://kedacore.github.io/charts
helm repo update

# Install KEDA 2.17
helm install keda kedacore/keda \
  --namespace keda-system \
  --create-namespace \
  --version 2.17.0

# Verify installation
kubectl get pods -n keda-system
```
Simple KEDA 2.17 Example
Based on the KEDA 2.17 getting started guide, here's a complete example:
```yaml
# 1. Sample Application Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: http-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: http-app
  template:
    metadata:
      labels:
        app: http-app
    spec:
      containers:
        - name: http-server
          image: nginx:latest
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
---
# 2. Service for the Application
apiVersion: v1
kind: Service
metadata:
  name: http-app-service
spec:
  selector:
    app: http-app
  ports:
    - port: 80
      targetPort: 80
  type: LoadBalancer
---
# 3. KEDA 2.17 ScaledObject
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: http-app-scaledobject
spec:
  scaleTargetRef:
    name: http-app
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.default.svc.cluster.local:9090
        metricName: http_requests_total
        threshold: '5'
        query: sum(rate(http_requests_total[1m]))
```
Conclusion
KEDA 2.17 transforms Kubernetes autoscaling from reactive resource-based scaling to proactive event-driven scaling. By leveraging external metrics and sophisticated scaling strategies, you can achieve:
- Better Performance: Proactive scaling based on leading indicators from 60+ built-in scalers
- Cost Efficiency: Scale-to-zero capabilities and precise resource allocation
- Operational Excellence: Reduced manual intervention with enhanced monitoring and admission webhooks
- Business Alignment: Scaling based on business metrics and user demand with custom scalers
- Enhanced Security: Advanced authentication providers including AWS IRSA, Azure Workload Identity, and GCP Workload Identity
The combination of KEDA 2.17's robust configuration options, security practices, and comprehensive monitoring creates a production-ready autoscaling solution that adapts to your application's unique requirements while maintaining reliability and performance.
Start with simple triggers like Prometheus or CloudWatch metrics using the examples above, then gradually add complexity with multi-trigger configurations and custom scalers. KEDA 2.17's enhanced admission webhooks will help validate your configurations before deployment.