Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kms: use different context for server lifecycle and initial load #113955

Merged
merged 1 commit into from Nov 16, 2022

Conversation

enj
Copy link
Member

@enj enj commented Nov 16, 2022

Signed-off-by: Monis Khan mok@microsoft.com

/kind bug

Fixed a bug that resulted in "grpc: the client connection is closing" errors shortly after the Kubernetes API server automatically reloaded its encryption-at-rest config due to an observed change to the file.  This bug was only encountered when the --encryption-provider-config-automatic-reload flag was set to true.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Nov 16, 2022
@k8s-ci-robot
Copy link
Contributor

Please note that we're already in Test Freeze for the release-1.26 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.26.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Wed Nov 16 15:22:24 UTC 2022.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Nov 16, 2022
@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 16, 2022
}

// make sure things still work at a "later" time
time.Sleep(2 * time.Minute)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is going to add a big penalty 8 test-cases = 16 min, can't we do better?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't intend to merge this as-is, I am trying to root cause an issue I am seeing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I was surprised , that explains it XD

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Nov 16, 2022
@@ -706,6 +710,20 @@ resources:
if !bytes.HasPrefix(rawEnvelope, []byte(wantPrefix)) {
t.Fatalf("expected secret to be prefixed with %s, but got %s", wantPrefix, rawEnvelope)
}

// make sure things still work at a "later" time
time.Sleep(20 * time.Second)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails against master as expected:

    kms_transformation_test.go:718: Failed to create test secret, error: error while writing secret: Internal error occurred: rpc error: code = Canceled desc = grpc: the client connection is closing
I1116 21:02:22.232689 2038072 controller.go:211] Shutting down kubernetes service endpoint reconciler
E1116 21:02:22.233190 2038072 controller.go:214] Unable to remove endpoints from kubernetes service: StorageError: key not found, Code: 1, Key: /b36cdd15-dfb9-493c-b494-58d6e2f9667e/registry/masterleases//127.0.0.1, ResourceVersion: 0, AdditionalErrorMsg:
I1116 21:02:22.234391 2038072 cluster_authentication_trust_controller.go:463] Shutting down cluster_authentication_trust_controller controller
I1116 21:02:22.233642 2038072 controller.go:86] Shutting down OpenAPI V3 AggregationController
--- FAIL: TestEncryptionConfigHotReloadFileWatch (101.85s)
    --- FAIL: TestEncryptionConfigHotReloadFileWatch/truncate (25.35s)
    --- FAIL: TestEncryptionConfigHotReloadFileWatch/deleteAndCreate (25.58s)
    --- FAIL: TestEncryptionConfigHotReloadFileWatch/move (25.46s)
    --- FAIL: TestEncryptionConfigHotReloadFileWatch/symLink (25.45s)
FAIL
FAIL	k8s.io/kubernetes/test/integration/controlplane/transformation	102.041s
FAIL

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we limit this delay/check to one of the loop iterations (maybe the first one or the last one) so we don't expand the test runtime by 20seconds*testcases?

Copy link
Member

@liggitt liggitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix looks good, one request on the test

@@ -706,6 +710,20 @@ resources:
if !bytes.HasPrefix(rawEnvelope, []byte(wantPrefix)) {
t.Fatalf("expected secret to be prefixed with %s, but got %s", wantPrefix, rawEnvelope)
}

// make sure things still work at a "later" time
time.Sleep(20 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we limit this delay/check to one of the loop iterations (maybe the first one or the last one) so we don't expand the test runtime by 20seconds*testcases?

@enj enj changed the title kms: exercise connection close on reload kms: use different context for server lifecycle and initial load Nov 16, 2022
@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Nov 16, 2022
@enj
Copy link
Member Author

enj commented Nov 16, 2022

/kind bug
/priority important-soon
/milestone v1.26
/sig auth

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Nov 16, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.26 milestone Nov 16, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. label Nov 16, 2022
@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/auth Categorizes an issue or PR as relevant to SIG Auth. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 16, 2022
@enj
Copy link
Member Author

enj commented Nov 16, 2022

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 16, 2022
Signed-off-by: Monis Khan <mok@microsoft.com>
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Nov 16, 2022
@liggitt
Copy link
Member

liggitt commented Nov 16, 2022

/lgtm
/approve

since this will show up as a diff between rc0 and rc1, please add a release note indicating what was fixed

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 16, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enj, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Nov 16, 2022
@k8s-ci-robot k8s-ci-robot merged commit 3f823c0 into kubernetes:master Nov 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants