
Wait for Pods to finish before considering Failed in Job #113860

Merged
merged 2 commits into kubernetes:master on Nov 15, 2022

Conversation

@alculquicondor (Member) commented Nov 11, 2022

What type of PR is this?

/kind bug

What this PR does / why we need it:

Wait for Pods to finish before considering them failed, provided that the feature gates PodDisruptionConditions and JobPodFailurePolicy are enabled and the Job has a podFailurePolicy.

The behavior shouldn't be limited to Jobs with a podFailurePolicy, but we do so for now to make the feature opt-in while PodDisruptionConditions graduates to stable.
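
Roughly, the opt-in condition amounts to the following check (a sketch, not the exact helper from this PR; the function and parameter names are illustrative):

```go
package jobsketch

import batchv1 "k8s.io/api/batch/v1"

// usesNewFailureHandling sketches when the Job controller waits for a
// terminating Pod to reach a terminal phase before counting it as failed:
// both feature gates are enabled and the Job defines a podFailurePolicy.
func usesNewFailureHandling(job *batchv1.Job, podDisruptionConditionsEnabled, jobPodFailurePolicyEnabled bool) bool {
	return podDisruptionConditionsEnabled &&
		jobPodFailurePolicyEnabled &&
		job.Spec.PodFailurePolicy != nil
}
```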

Wait for Pods to finish before considering Failed

Which issue(s) this PR fixes:

Part of #113855

The next step would be to lift the restriction to Jobs with podFailurePolicy, possibly in v1.28.

Special notes for your reviewer:

Does this PR introduce a user-facing change?

When the feature gates `PodDisruptionConditions` and `JobPodFailurePolicy` are both enabled,
the Job controller now does not consider a terminating Pod (a pod that has a `.metadata.deletionTimestamp`)
as a failure until that Pod is terminal (its `.status.phase` is `Failed` or `Succeeded`).

However, the Job controller creates a replacement Pod as soon as the termination becomes apparent.
Once the pod terminates, the Job controller evaluates `.backoffLimit` and `.podFailurePolicy` for the
relevant Job, taking this now-terminated Pod into consideration.

This behavior is limited to Jobs with `.spec.podFailurePolicy` set, and only when those two feature
gates are both enabled.
If either of these requirements is not satisfied, the Job controller counts a terminating Pod as an immediate
failure, even if that Pod later terminates with `phase: "Succeeded"`.
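
For context, a minimal sketch of a Job that opts into the new handling by setting `.spec.podFailurePolicy`. This is not part of the PR; the Job name, image, and rule values are placeholders, and the constant names assume the batch/v1 and core/v1 APIs as of v1.26:

```go
package main

import (
	"fmt"

	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// exampleJob builds a Job whose podFailurePolicy opts it into the new
// terminating-Pod handling (when PodDisruptionConditions and
// JobPodFailurePolicy are both enabled).
func exampleJob() *batchv1.Job {
	backoffLimit := int32(3)
	return &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{Name: "example-job"}, // placeholder name
		Spec: batchv1.JobSpec{
			BackoffLimit: &backoffLimit,
			PodFailurePolicy: &batchv1.PodFailurePolicy{
				Rules: []batchv1.PodFailurePolicyRule{{
					// Do not count disruptions (e.g. evictions) towards backoffLimit.
					Action: batchv1.PodFailurePolicyActionIgnore,
					OnPodConditions: []batchv1.PodFailurePolicyOnPodConditionsPattern{{
						Type:   corev1.DisruptionTarget,
						Status: corev1.ConditionTrue,
					}},
				}},
			},
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					RestartPolicy: corev1.RestartPolicyNever,
					Containers: []corev1.Container{{
						Name:    "main",
						Image:   "busybox", // placeholder image
						Command: []string{"sleep", "30"},
					}},
				},
			},
		},
	}
}

func main() {
	fmt.Println(exampleJob().Name)
}
```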

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 11, 2022
@alculquicondor (Member Author)

/assign @liggitt

@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 11, 2022
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 11, 2022
@alculquicondor changed the title from "WIP Wait for Pods to finish before considering Failed in Job" to "Wait for Pods to finish before considering Failed in Job" on Nov 11, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 11, 2022
@alculquicondor (Member Author)

/assign @mimowo

@alculquicondor (Member Author)

/priority important-background

@k8s-ci-robot (Contributor)

@alculquicondor: The label(s) priority/important-background cannot be applied, because the repository doesn't have them.

In response to this:

/priority important-background

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@alculquicondor (Member Author)

/priority important-longterm

@k8s-ci-robot k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 11, 2022
Limit behavior to feature gates PodDisruptionConditions and
JobPodFailurePolicy and jobs with a podFailurePolicy.

Change-Id: I926391cc2521b389c8e52962afb0d4a6a845ab8f
@mimowo (Contributor) left a comment

Thanks for fixing. I have three main comments:

  1. I suggest adding an e2e test for the main issue, it could be similar to the already existing e2e test where we evict a running pod, but instead of checking for the DisruptionTarget condition we could check for the 137 exit code. On master such podFailurePolicy would not work properly as the pod would be running at the moment of checking (or at least is very likely to be still running).
  2. is the change to look for never scheduled terminating pods required? There is already code in PodGC to move unscheduled terminating pods to Failed phase (code path under:
    func (gcc *PodGCController) gcUnscheduledTerminating(ctx context.Context, pods []*v1.Pod) {
    )
  3. similarly, is the change to consider all pods as terminated when finishedCond != nil required?

If changes 2. and 3. are performance optimizations then I would suggest extracting them into a dedicated PR.

// Terminating pods are counted as failed. This guarantees that orphan Pods
// count as failures.
// Active pods are terminated when the job has completed, thus they count as
// failures as well.
podTerminating := pod.DeletionTimestamp != nil || finishedCond != nil
if podFinished || podTerminating || job.DeletionTimestamp != nil {
considerTerminated := pod.DeletionTimestamp != nil || finishedCond != nil
Contributor:

I suggest simplifying the semantics of considerTerminated. In particular, on the main branch succeeded pods are excluded, while inside the if statement (when PodDisruptionConditions is enabled) succeeded pods are counted via the OR with IsPodTerminal. Similarly, on the main branch considerTerminated would be false for a Failed pod (if DeletionTimestamp=nil), whereas it is true in the if-ed code.

I suggest computing it as
considerTerminated := podutil.IsPodTerminal(pod) || job.DeletionTimestamp != nil || finishedCond != nil
and only additionally OR-ing it inside the if with isPodFailed(pod, job, true),
or introducing considerFailed := pod.Phase == v1.PodFailed || finishedCond != nil and additionally OR-ing it inside the loop with isPodFailed.
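
Roughly, a standalone sketch of the suggested computation (my paraphrase of the suggestion, not code from this PR; `jobFinished` stands in for `finishedCond != nil`):

```go
package jobsketch

import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
)

// isPodTerminal mirrors podutil.IsPodTerminal: the pod has reached a
// terminal phase (Failed or Succeeded).
func isPodTerminal(pod *corev1.Pod) bool {
	return pod.Status.Phase == corev1.PodFailed || pod.Status.Phase == corev1.PodSucceeded
}

// considerTerminated is the simplified form suggested above: a pod is treated
// as terminated when it is already terminal, when the Job is being deleted,
// or when the Job already has a Complete/Failed condition (jobFinished), in
// which case any still-running pod counts as failed.
func considerTerminated(pod *corev1.Pod, job *batchv1.Job, jobFinished bool) bool {
	return isPodTerminal(pod) || job.DeletionTimestamp != nil || jobFinished
}
```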

Member Author:

I'm just preserving the existing logic (except for the variable name). The new logic (protected by the feature gates) is more similar to what you describe.

@@ -1054,7 +1065,7 @@ func (jm *Controller) trackJobStatusAndRemoveFinalizers(ctx context.Context, job
needsFlush = true
uncountedStatus.Succeeded = append(uncountedStatus.Succeeded, pod.UID)
}
} else if pod.Status.Phase == v1.PodFailed || podTerminating {
} else if pod.Status.Phase == v1.PodFailed || considerTerminated {
Contributor:

If we include terminated pods in the considerTerminated variable consistently (see above), then we can drop the pod.Status.Phase == v1.PodFailed condition.

Member Author:

We can do it when the new logic is GA

// For now, we do so to avoid affecting all running Jobs without the
// availability to opt-out into the old behavior.
return p.Status.Phase == v1.PodFailed ||
podutil.IsNeverScheduled(p) // The Pod will not be marked as Failed, because it was never scheduled.
@mimowo (Contributor) Nov 14, 2022

Wondering if the logic here does not introduce unnecessary complication given that unscheduled terminating pods will be marked failed by the GC:

gcc.gcUnscheduledTerminating(ctx, pods). Maybe it is worth it as a performance optimization though; what is the main reason?

Member Author:

No, I just forgot that we had that logic. Removing and updating comment.

// TODO(#113855): Stop limiting this behavior to Jobs with podFailurePolicy.
// For now, we do so to avoid affecting all running Jobs without the
// availability to opt-out into the old behavior.
return p.Status.Phase == v1.PodFailed ||
Contributor:

I think it would be good to add an e2e test to show that the PR fixes the issue. I think it could be a similar test to this one:

ginkgo.It("should allow to use the pod failure policy to not count pod disruption towards the backoffLimit", func() {
Instead of looking for the DisruptionTarget condition, it would look for exit code 137. On master the test would fail, as the pod is matched against the policy before it terminates (and thus before exit code 137 is set). Such an e2e test would also prevent a similar regression in the future.
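
For illustration only, a sketch (not the actual test; constant names assumed from `k8s.io/api/batch/v1`) of the kind of podFailurePolicy rule such a test could match on, which only works once the evicted pod actually reaches the Failed phase and records exit code 137:

```go
package e2esketch

import batchv1 "k8s.io/api/batch/v1"

// ignoreEvictionExitCode is the kind of rule the suggested e2e test could use:
// ignore container failures with exit code 137 (SIGKILL during eviction) so
// they are not counted towards .spec.backoffLimit. Before this PR, the pod
// could be matched against the policy while still running, before the exit
// code is recorded.
var ignoreEvictionExitCode = batchv1.PodFailurePolicyRule{
	Action: batchv1.PodFailurePolicyActionIgnore,
	OnExitCodes: &batchv1.PodFailurePolicyOnExitCodesRequirement{
		Operator: batchv1.PodFailurePolicyOnExitCodesOpIn,
		Values:   []int32{137},
	},
}
```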

// We can also simplify the check to remove finalizers to:
// considerTerminated || job.DeletionTimestamp != nil
considerTerminated = podutil.IsPodTerminal(pod) ||
finishedCond != nil || // The Job is terminating. Any running Pod is considered failed.
Contributor:

IIUC this change is only loosely related to the issue. Once we start to consider pods as terminated when finishedCond!=nil then they will be checked against the pod failure policy. This seems to make this comment obsolete:

// TODO(#113855): Remove this check when it's guaranteed that the

Member Author:

It is related, as it is preserving the logic that was already there. But you are right, this implies that the comment in pod_failure_policy is not true. Removed.

Contributor:

I see. Still, inside the if we can skip the OR with podutil.IsPodTerminal(pod) to be more consistent with what is outside, although then the variable name would not be very descriptive. I'm fine to keep it as is or remove the extra condition. It should not impact the behavior anyway.

Member Author:

Eventually, I would like this logic inside the conditional to be the logic for all jobs.
Keeping it like this seems like a good way to soak the logic.

Member Author:

Also note the comment in line 1046

@alculquicondor (Member Author)

  1. I suggest adding an e2e test for the main issue, it could be similar to the already existing e2e test where we evict a running pod, but instead of checking for the DisruptionTarget condition we could check for the 137 exit code. On master such podFailurePolicy would not work properly as the pod would be running at the moment of checking (or at least is very likely to be still running).

Good idea. Will do.

  2. is the change to look for never scheduled terminating pods required? There is already code in PodGC to move unscheduled terminating pods to Failed phase.

Ah great, I'll revert that bit

  3. similarly, is the change to consider all pods as terminated when finishedCond != nil required?

If changes 2. and 3. are performance optimizations then I would suggest extracting them into a dedicated PR.

When the job is terminated we don't care much anymore. We should clean the running pods ASAP.

@mimowo (Contributor) commented Nov 14, 2022

When the job is terminated we don't care much anymore. We should clean the running pods ASAP.

I see, still the question remains whether it's better to do this in a separate PR.
I suggest making the PR minimal to fix the issue so that it gets cherry-picked for 1.26, and ticketing the other improvement independently, as it's not a recent regression.

Remove check for unscheduled terminating pod

Change-Id: I3dc05bb4ea3738604f01bf8cb5fc8cc0f6ea54ec
@alculquicondor (Member Author)

/assign @soltysh
WDYT about including this in 1.26.0? We could also wait for 1.26.1.

@mimowo (Contributor) commented Nov 15, 2022

generally lgtm, two comments:

  • left one note about possibly dropping the unnecessary condition podutil.IsPodTerminal(pod) inside the if, please consider
  • please add an e2e test; alternatively I could take it over in a follow-up PR

@soltysh (Contributor) left a comment

I spoke with Aldo at length on Slack about it; let's land it as is.
/lgtm
/approve
/milestone v1.26

@k8s-ci-robot k8s-ci-robot added this to the v1.26 milestone Nov 15, 2022
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 15, 2022
@soltysh (Contributor) commented Nov 15, 2022

/triage accepted
/priority important-longterm
/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. labels Nov 15, 2022
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 15, 2022
@alculquicondor (Member Author)

/priority important-soon

@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Nov 15, 2022
@k8s-ci-robot k8s-ci-robot merged commit 7dc36bd into kubernetes:master Nov 15, 2022
@sftim (Contributor) commented Nov 23, 2022

My suggestion for a tweaked changelog entry:

-The job controller does not consider a terminating Pod (A pod with a .metadata.deletionTimestamp) until it is terminal (not scheduled or .status.phase=Failed/Succeeded).
+When the feature gates `PodDisruptionConditions` and `JobPodFailurePolicy` are both enabled, the Job controller now does not consider a terminating Pod (a pod that has a `.metadata.deletionTimestamp`) as a failure until that Pod is terminal (either the Pod was never scheduled to a node, or its `.status.phase` is `Failed` or `Succeeded`).
-However, a replacement Pod is immediately created. Once the pod terminates, the job controller evaluates .backoffLimit and .podFailurePolicy considering this Pod.
+However, the Job controller creates a replacement Pod as soon as the termination becomes apparent. Once the pod terminates, the Job controller evaluates `.backoffLimit` and `.podFailurePolicy` for the relevant Job, taking this now-terminated Pod into consideration.
-This behavior is limited to Jobs with .spec.podFailurePolicy and when the feature gates PodDisruptionConditions and JobPodFailurePolicy are enabled.
+This behavior is limited to Jobs with `.spec.podFailurePolicy` set, and only when those two feature gates are both enabled.
-If either of these requirements is not satisfied, the job controller counts a terminating Pod as an immediate failure, regardless if the Pod later terminates with phase=Succeeded.
+If either of these requirements is not satisfied, the Job controller counts a terminating Pod as an immediate failure, even if that Pod later terminates with `phase: Succeeded`.

The same text as Markdown:

When the feature gates `PodDisruptionConditions` and `JobPodFailurePolicy` are both enabled, the Job controller now does not consider a terminating Pod (a pod that has a `.metadata.deletionTimestamp`) as a failure until that Pod is terminal (either the Pod was never scheduled to a node, or its `.status.phase` is `Failed` or `Succeeded`).
However, the Job controller creates a replacement Pod as soon as the termination becomes apparent. Once the pod terminates, the Job controller evaluates `.backoffLimit` and `.podFailurePolicy` for the relevant Job, taking this now-terminated Pod into consideration.
This behavior is limited to Jobs with `.spec.podFailurePolicy` set, and only when those two feature gates are both enabled.
If either of these requirements is not satisfied, the Job controller counts a terminating Pod as an immediate failure, even if that Pod later terminates with `phase: Succeeded`.

I have rewritten it based on my understanding of the intended meaning. I'm not sure I 100% understood what was meant. If making changes, please check that what I have put is right.

Is there a k/website PR to update the docs in light of this change?

@alculquicondor (Member Author)

It looks like it wasn't included as part of kubernetes/website#37242.
@mimowo can you send a PR?

@alculquicondor (Member Author)

Release note updated, with a tweak

@sftim (Contributor) commented Nov 23, 2022

Oops. One more change I recommend

- `phase: Succeeded`
+ `phase: "Succeeded"`

@mimowo (Contributor) commented Nov 24, 2022

It looks like it wasn't included as part of kubernetes/website#37242. @mimowo can you send a PR?

Here: kubernetes/website#38040, @alculquicondor, @sftim please review

jaehnri pushed a commit to jaehnri/kubernetes that referenced this pull request Jan 3, 2023
…113860)

* Wait for Pods to finish before considering Failed

Limit behavior to feature gates PodDisruptionConditions and
JobPodFailurePolicy and jobs with a podFailurePolicy.

Change-Id: I926391cc2521b389c8e52962afb0d4a6a845ab8f

* Remove check for unscheduled terminating pod

Change-Id: I3dc05bb4ea3738604f01bf8cb5fc8cc0f6ea54ec