Knative Revisions: Scaling Issues With InitialScale > 1

Nov 25, 2025 by Alex Johnson

In Knative, a serverless workload management platform, revisions are the unit through which application deployments are managed. A problem arises when a revision is configured with initialScale > 1 and is no longer referenced by any Route: it can fail to scale down to 0 and keep holding resources it does not need. This article covers the expected behavior, the actual behavior, the root cause, and a proposed solution.

Understanding the Problem: Revisions Stuck in Limbo

When a Knative Service is updated, a new revision is created to reflect the change. As traffic shifts to the new revision, the older ones become obsolete and should scale down to 0, freeing their resources. However, when a revision is configured with an initialScale greater than 1 and is no longer serving traffic (i.e., its routingState is set to "reserve"), it can remain at its initialScale pod count instead. The revision is idle yet continues to consume resources, which defeats the resource efficiency a serverless environment is supposed to provide.

Expected Behavior: Scaling Down to Zero

In a well-functioning Knative system, the expected behavior is that when a revision's routingState transitions to "reserve" (indicating it's no longer referenced by any Route), it should promptly scale down to 0. This behavior is critical for several reasons:

Resource Optimization: Scaling down idle revisions ensures that resources are not unnecessarily tied up, leading to better resource utilization and cost savings.
Rapid Resource Release: When a new revision becomes ready and replaces an older one, the old revision should quickly release its resources by scaling down to 0. This ensures that the system can efficiently accommodate new workloads.
Adherence to Serverless Principles: Serverless platforms are designed to scale resources based on demand. Revisions that are not serving traffic should not consume resources, aligning with the core principles of serverless computing.
Constraint Override: The initialScale setting, which controls how many pods a revision starts with when it is created, should not block scale-down once the revision is no longer in use. Scaling should follow actual traffic demand.

Actual Behavior: The Scaling Obstacle

Unfortunately, the actual behavior deviates from the expected behavior. Revisions with routingState = "reserve" and initialScale > 1 fail to scale down to 0 due to a combination of factors:

initialScale Logic: The scaler.go component within Knative's autoscaling mechanism raises the minimum scale (min) to at least initialScale for as long as the scale target has not been initialized, even when the revision is no longer referenced by any Route. This prevents the autoscaler from scaling below the initialScale setting.
Overriding ScaleBounds(): The ScaleBounds() function, which determines the minimum and maximum scale for a revision, correctly returns min=0 for unreachable revisions, including those with routingState = "reserve". However, the initialScale logic overrides this value, effectively blocking the scale-down operation.

This behavior results in resource wastage, as idle revisions continue to consume resources despite not serving any traffic. This contradicts the fundamental principles of serverless computing and can lead to increased operational costs.

Reproducing the Issue: A Step-by-Step Guide

To illustrate the issue, let's walk through a step-by-step scenario:

Deploy a Knative Service with minScale=1 and initialScale=2:

First, deploy a Knative Service with the following configuration:

```
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/initialScale: "2"
    spec:
      containers:
      - image: gcr.io/knative-samples/helloworld-go
```

This configuration sets the minimum scale to 1 and the initial scale to 2, meaning that the revision will initially have two pods.

Deploy a New Revision with an Invalid Image:

Next, deploy a new revision with an invalid image, such as invalid-image:latest. This will cause the pods to enter the ImagePullBackOff state:
```
spec:
  template:
    spec:
      containers:
      - image: invalid-image:latest
```
The new revision will have two pods in the ImagePullBackOff state and will be marked as Unreachable because the image cannot be pulled.

Deploy a Third Revision with a Valid Image:

Now, deploy a third revision with a valid image. This revision will become the active revision, serving traffic.

Observe the Issue:

Observe that the second revision (with the ImagePullBackOff pods) remains at two pods and cannot scale down to 0. This occurs even though the old revision (helloworld-00002) has routingState = "reserve", indicating that it's no longer referenced by any Route.

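You can verify this by listing the pods. The output below was captured from a Service named helloworld-nodejs in the paas-uat namespace, so the names differ from the manifest above: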
```
$ kubectl get po -n paas-uat
NAME                                                  READY   STATUS             RESTARTS   AGE
helloworld-nodejs-00002-deployment-564896c9fc-v7ntx   0/2     ImagePullBackOff   0          3h
helloworld-nodejs-00002-deployment-564896c9fc-vsqrh   0/2     ImagePullBackOff   0          3h
helloworld-nodejs-00003-deployment-847f88dbd8-6vfll   2/2     Running            0          168m
```
This output shows that the second revision still has two pods, despite being in an error state and not serving traffic.

Root Cause Analysis: Diving into the Code

To understand the root cause, let's examine the relevant code snippet from serving/pkg/reconciler/autoscaling/kpa/scaler.go, specifically the scale() method:

```
// Line 343-349
if initialScale > 1 && !pa.Status.IsScaleTargetInitialized() {
    // Ignore initial scale if minScale >= initialScale.
    if min < initialScale {
        logger.Debugf("Adjusting min to meet the initial scale: %d -> %d", min, initialScale)
    }
    min = intMax(initialScale, min)
}
```

This code block enforces the initialScale setting. It checks whether initialScale is greater than 1 and whether the Pod Autoscaler (PA) status reports that the scale target is not yet initialized. If both conditions hold, it raises the minimum scale (min) to the larger of initialScale and the current min.

The problem is that this logic is applied regardless of the revision's routingState. Even when a revision's routingState is "reserve", indicating that it's no longer referenced by any Route, this block still raises the minimum scale to initialScale, which prevents the autoscaler from scaling the revision down to 0. In the reproduction above, the pods of the second revision never start, so the scale target is presumably never marked as initialized and the override stays in effect indefinitely.
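The effect of the check can be reproduced in isolation. The following is a minimal sketch, not the Knative implementation: effectiveMin is a hypothetical helper, lowerBound stands in for min, and the initialized flag stands in for pa.Status.IsScaleTargetInitialized().

```go
package main

import "fmt"

// effectiveMin mirrors the shape of the initialScale check quoted above.
// It is a hypothetical stand-in: lowerBound plays the role of min, and
// initialized plays the role of pa.Status.IsScaleTargetInitialized().
func effectiveMin(lowerBound, initialScale int32, initialized bool) int32 {
	if initialScale > 1 && !initialized {
		// Same result as intMax(initialScale, min) in the quoted snippet.
		if initialScale > lowerBound {
			return initialScale
		}
	}
	return lowerBound
}

func main() {
	// Lower bound 0 with initialScale 2, scale target never initialized:
	// the floor is raised to 2, so scale-to-zero is blocked.
	fmt.Println(effectiveMin(0, 2, false)) // 2

	// Same revision once the scale target is initialized: floor stays 0.
	fmt.Println(effectiveMin(0, 2, true)) // 0

	// minScale >= initialScale: the initial scale has no effect.
	fmt.Println(effectiveMin(3, 2, false)) // 3
}
```

The first case corresponds to the stuck revision: a lower bound of 0 is raised back to 2 for as long as the initialized flag stays false.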

The Disconnect Between `routingState` and `Reachability`

The issue is further compounded by the relationship between routingState and Reachability, as defined in serving/pkg/reconciler/revision/resources/pa.go:

routingState = "active" maps to Reachability = Reachable
routingState = "reserve" maps to Reachability = Unreachable
routingState = "pending" or unset maps to Reachability = Unknown

This mapping implies that a routingState of "reserve" is equivalent to Reachability = Unreachable for the purpose of determining whether a revision should be allowed to scale down. However, the initialScale logic in scaler.go doesn't take this equivalence into account, leading to the scaling issue.
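The mapping listed above can be written as a small switch. This is an illustrative sketch only: the Reachability type, its string values, and the reachabilityFor helper are local stand-ins invented for the example, not the types or functions defined in pa.go.

```go
package main

import "fmt"

// Reachability is a local stand-in for the autoscaling API's reachability
// type. The string values are illustrative, not the real constants.
type Reachability string

const (
	Reachable   Reachability = "Reachable"
	Unreachable Reachability = "Unreachable"
	Unknown     Reachability = "Unknown"
)

// reachabilityFor is a hypothetical helper encoding the mapping from the
// list above: active -> Reachable, reserve -> Unreachable, else Unknown.
func reachabilityFor(routingState string) Reachability {
	switch routingState {
	case "active":
		return Reachable
	case "reserve":
		return Unreachable
	default: // "pending" or unset
		return Unknown
	}
}

func main() {
	for _, state := range []string{"active", "reserve", "pending", ""} {
		fmt.Printf("routingState=%q -> %s\n", state, reachabilityFor(state))
	}
}
```

Only "reserve" yields Unreachable, so a check against Unreachable is enough to single out revisions that no Route references.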

`ScaleBounds()` and the Overridden `min` Value

It's important to note that the ScaleBounds() function already returns min=0 for unreachable revisions (as seen in pa_lifecycle.go:90). This indicates that the system is aware that unreachable revisions should be allowed to scale down to 0. However, the initialScale logic in scaler.go overrides this min=0 value, effectively negating the intended behavior.

Proposed Solution: Respecting `routingState`

To address this issue, we propose modifying the initialScale check in scaler.go to consider the revision's routingState. Specifically, the initialScale should be ignored when the revision's routingState is "reserve" (i.e., when pa.Spec.Reachability == autoscalingv1alpha1.ReachabilityUnreachable).

The proposed modification is as follows:

```
if initialScale > 1 && !pa.Status.IsScaleTargetInitialized() && pa.Spec.Reachability != autoscalingv1alpha1.ReachabilityUnreachable {
    // Ignore initial scale if minScale >= initialScale.
    if min < initialScale {
        logger.Debugf("Adjusting min to meet the initial scale: %d -> %d", min, initialScale)
    }
    min = intMax(initialScale, min)
}
```

By adding the condition pa.Spec.Reachability != autoscalingv1alpha1.ReachabilityUnreachable, we ensure that the initialScale logic is skipped only for unreachable revisions (routingState = "reserve"). Revisions whose reachability is Reachable or Unknown (routingState "active", "pending", or unset) keep the existing behavior. This modification achieves the following:

Enables Scaling Down to 0: Revisions with routingState = "reserve" (no longer referenced by any Route) can scale down to 0 immediately, freeing up resources.
Preserves initialScale Logic for Other Revisions: The initialScale logic remains in effect for revisions that are reachable or whose reachability is not yet known, so they still reach the desired initial scale.
Optimizes Resource Utilization: Resources are promptly freed when old revisions are replaced by new ones, leading to better resource utilization and cost efficiency.
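To compare the two conditions side by side, the sketch below evaluates both against each reachability value for an uninitialized revision with initialScale = 2. Everything here is an assumption made for illustration: originalMin and patchedMin are hypothetical helpers, the Reachability type is a local stand-in, and the lower bound is fixed at 0 to isolate the effect of the extra condition.

```go
package main

import "fmt"

// Reachability is a local stand-in for the autoscaling API's reachability type.
type Reachability string

const (
	Reachable   Reachability = "Reachable"
	Unreachable Reachability = "Unreachable"
	Unknown     Reachability = "Unknown"
)

// originalMin follows the shape of the existing check (hypothetical helper).
func originalMin(lowerBound, initialScale int32, initialized bool) int32 {
	if initialScale > 1 && !initialized && initialScale > lowerBound {
		return initialScale
	}
	return lowerBound
}

// patchedMin adds the proposed reachability condition (hypothetical helper).
func patchedMin(lowerBound, initialScale int32, initialized bool, r Reachability) int32 {
	if r == Unreachable {
		// Proposed change: never apply initialScale to unreachable revisions.
		return lowerBound
	}
	return originalMin(lowerBound, initialScale, initialized)
}

func main() {
	const lowerBound, initialScale = int32(0), int32(2)
	for _, r := range []Reachability{Reachable, Unknown, Unreachable} {
		fmt.Printf("%-11s original=%d patched=%d\n",
			r,
			originalMin(lowerBound, initialScale, false),
			patchedMin(lowerBound, initialScale, false, r))
	}
}
```

Under these assumptions the two versions differ only for the Unreachable case, where the patched floor is 0 instead of 2.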

Conclusion: Towards Efficient Resource Management in Knative

The issue of revisions with initialScale > 1 failing to scale down to 0 when no longer referenced by any Route highlights a critical aspect of resource management in Knative. By understanding the root cause and implementing the proposed solution, we can ensure that Knative services efficiently utilize resources, aligning with the core principles of serverless computing.

With this change, idle revisions would stop holding pods they no longer need, making Knative deployments more cost-effective.

For more information on Knative and its features, please visit the official Knative Documentation.