Configure horizontal pod autoscaling to automatically adjust the number of replicas based on resource utilization.
Field Reference
| Field | Type | Description |
|---|---|---|
| `enabled` | boolean | Enable autoscaling |
| `minInstances` | integer | Minimum number of replicas |
| `maxInstances` | integer | Maximum number of replicas |
| `cpuThresholdPercent` | integer | CPU usage threshold (0-100) |
| `memoryThresholdPercent` | integer | Memory usage threshold (0-100) |
Basic Configuration
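A minimal sketch of these fields in an app's YAML configuration. The top-level `autoscaling` key is an assumption based on the field names in the reference table above; check your Porter version's schema for the exact nesting.

```yaml
# Sketch only -- field names come from the reference table above;
# the `autoscaling` nesting is assumed.
autoscaling:
  enabled: true
  minInstances: 1
  maxInstances: 5
  cpuThresholdPercent: 70
  memoryThresholdPercent: 70
```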
When autoscaling is enabled, the `instances` field is ignored; the autoscaler manages the replica count automatically.
How It Works
When either CPU or memory usage exceeds its configured threshold, Porter automatically adds replicas. When usage drops, replicas are removed (down to your minimum).
Example: Autoscaling in Action
Consider an API service configured with `minInstances` 2, `maxInstances` 10, a CPU threshold of 60%, and a memory threshold of 80%:
| Time | Avg CPU | Avg Memory | Replicas | What Happens |
|---|---|---|---|---|
| t=0 | 30% | 40% | 2 | Baseline: both metrics below thresholds |
| t=1 | 75% | 50% | 4 | CPU (75%) exceeds 60% threshold → scale up |
| t=2 | 90% | 60% | 6 | CPU still high → continue scaling up |
| t=3 | 55% | 85% | 8 | CPU stabilized, but memory (85%) exceeds 80% → scale up |
| t=4 | 45% | 70% | 8 | Both metrics below thresholds → no change (cooldown period) |
| t=5 | 40% | 50% | 5 | Sustained low usage → scale down |
| t=6 | 35% | 45% | 2 | Continue scaling down to minimum |
- Either metric triggers scaling: If CPU or memory exceeds its threshold, replicas are added
- Both must be low to scale down: Replicas are only removed when both CPU and memory are below their thresholds
- Respects bounds: Replicas never drop below `minInstances` (2) or exceed `maxInstances` (10)
- Gradual changes: The autoscaler adjusts incrementally, not all at once, to avoid oscillation
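The walkthrough above is consistent with a configuration like the following (values are taken from the table and bullets; the `autoscaling` nesting is an assumption):

```yaml
autoscaling:
  enabled: true
  minInstances: 2             # replicas never drop below 2
  maxInstances: 10            # replicas never exceed 10
  cpuThresholdPercent: 60     # scale up when average CPU exceeds 60%
  memoryThresholdPercent: 80  # scale up when average memory exceeds 80%
```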
Custom Metrics Autoscaling (Prometheus)
Scale based on application-specific metrics like queue length, request latency, or custom business metrics.
| Field | Type | Description |
|---|---|---|
| `customAutoscaling.prometheusMetricCustomAutoscaling.metricName` | string | Prometheus metric name |
| `customAutoscaling.prometheusMetricCustomAutoscaling.threshold` | number | Threshold value to trigger scaling |
| `customAutoscaling.prometheusMetricCustomAutoscaling.query` | string | Custom PromQL query (optional; defaults to the metric name) |
Custom metrics autoscaling requires Prometheus to be accessible in your cluster. See Custom Metrics and Autoscaling for setup details.
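As a sketch, a queue-length-based setup might look like this. Only the field names come from the table above; the nesting, the metric name `worker_queue_depth`, and the query are hypothetical placeholders for your own metric.

```yaml
customAutoscaling:
  prometheusMetricCustomAutoscaling:
    metricName: worker_queue_depth    # hypothetical application metric
    threshold: 50                     # scale up when the metric exceeds 50
    # Optional; defaults to the metric name if omitted
    query: sum(worker_queue_depth{queue="default"})
```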
Temporal Autoscaling
Scale Temporal workflow workers based on task queue depth. Porter monitors your Temporal task queues and automatically adjusts worker count.
Temporal autoscaling requires a Temporal integration to be configured. See Temporal Autoscaling for setup details.
| Field | Type | Description |
|---|---|---|
| `temporalAutoscaling.temporalIntegrationId` | string | UUID of the Temporal integration |
| `temporalAutoscaling.taskQueue` | string | Name of the Temporal task queue to monitor |
| `temporalAutoscaling.targetQueueSize` | integer | Number of queued tasks each replica should handle (e.g., a target of 10 with 100 tasks queued → 10 replicas) |
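Putting the fields together in a sketch (the `temporalAutoscaling` nesting is assumed from the field names above; the UUID and task queue name are illustrative placeholders):

```yaml
temporalAutoscaling:
  temporalIntegrationId: 123e4567-e89b-12d3-a456-426614174000  # placeholder UUID
  taskQueue: billing-tasks   # hypothetical task queue name
  targetQueueSize: 10        # e.g., 100 queued tasks / 10 per replica -> 10 replicas
```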
Related Documentation
- Autoscaling Overview - UI-based configuration and concepts
- Web Services - Web service configuration
- Worker Services - Worker service configuration

