Key Storage Provider not registered in Docker Image #89
Comments
I have observed the exact same class of behavior with my company's KSP (not the same KSP as above). |
This is not Venafi specific. I have also verified this issue is the same if you use the SafeNet Luna client. This issue will be seen if the KSP is not installed on the host machine. If I install the KSP on the host node, it is always available in the container. The work-around of registering the KSP post load of the container should not be necessary. |
This issue has been open for 30 days with no updates. |
I am honestly surprised Microsoft is not actively looking at this issue. Seems like the crypto layer is not being initialized properly. |
Sorry for the delay on this. We do have an internal bug created to investigate, but it has stalled for some reason. I just pinged the teams that own the internal bug to follow up. I will let this thread know once I hear back. |
For internal MS tracking, we have a bug for investigation and fix: 31877422. |
Unfortunately, this is still in the works. We'll update as soon as we have any news. |
This issue has been open for 90 days with no updates. |
As this has been open more than a year, is there any timescale for a fix on this? This impacts signing infrastructure & the ability to bundle providers as part of images without compromising container security. In addition, this issue is tagged as |
Agreed, that is confusing. I will look into this. |
Update: re-engaged the dev team who owns the code associated with this area. If and when there is an ETA I will share here. |
Is there a public roadmap you could link us to? |
Hey @iamjplant, there is not. Ideally it would be the roadmap on this GitHub, but the team needs to improve hygiene, as it's very out of date. |
I jumped in on this issue and sat down to reproduce it. • Took an up-to-date Windows Server 2022 host VM that It appears to be working. I know other folks had run into this, so it would be helpful to know if anyone is still seeing this issue and could help me reproduce it. Thanks. |
@iamjplant Just double checking - what host OS (edition and version) are you building from? The original issue above has a big, bold sentence about running in hyper-v isolation, so I assumed that was relevant. |
Thank you. I'll need to see if I can reproduce it with that host/guest combination. |
It's been a busy few weeks with people-management stuff for me. I did update my dockerfile with precisely the FROM statement that @iamjplant provided: FROM mcr.microsoft.com/windows/servercore:ltsc2019-amd64 And it still works for me. I'm starting to run out of avenues of investigation here. |
Bummer... |
@iamjplant the KSP still needs to be installed in the Docker container as well for my host comment to work. |
@jwbrothers agreed. |
Thank you, @jwbrothers - I can finally reproduce this on a Server 2022 host / Server 2019 guest. In fairness, this is pretty subtle; I appreciate your help in clarifying expectations. Now I can see about what might be causing this. I am looping in the appropriate feature teams to investigate further. |
Thanks again for all the help on this. We made some good progress on this, in partnership with the subject matter experts for Windows Crypto. @jwbrothers's hypothesis was half-right: the crypto layer is initialized properly, but it is not persisting through container shutdown properly. We understand why the KSP registration is not persisting through container shutdown, but it is not trivial to fix. We need to plan and cost the engineering work, so we do not have a timeline yet. We recognize this is a long-running issue. In the meantime, the Crypto engineers confirmed that re-registering the KSP’s DLL is a reasonable workaround and is a safe way of keeping your workload functioning properly in Windows Containers. |
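For anyone who wants to script the workaround above, here is a minimal sketch in Python, not an official recommendation. It assumes Python is available in the image and that the wrapper runs at container start rather than at build time, since the original report notes that adding regsvr32 to the Dockerfile had no effect. The function names `build_register_command` and `register_then_run` are made up for illustration; `regsvr32 /s` is the standard silent-registration switch.

```python
import subprocess
from typing import Callable, List, Sequence


def build_register_command(dll_path: str) -> List[str]:
    """Return the regsvr32 command line that registers a provider DLL.

    /s suppresses the confirmation dialog, which matters in a container
    where nobody is around to click OK.
    """
    return ["regsvr32", "/s", dll_path]


def register_then_run(
    dll_path: str,
    workload: Sequence[str],
    run: Callable = subprocess.run,
) -> int:
    """Re-register the KSP DLL, then start the real workload.

    `run` is injectable so the logic can be exercised without Windows;
    by default it is subprocess.run. Returns the exit code of the step
    that failed, or of the workload if registration succeeded.
    """
    result = run(build_register_command(dll_path), check=False)
    if result.returncode != 0:
        # Registration failed: do not start a workload that needs the KSP.
        return result.returncode
    return run(list(workload), check=False).returncode
```

As a usage example, a container entrypoint could call `register_then_run(r"c:\windows\system32\venaficsp.dll", ["myapp.exe"])`, where the DLL path is the one from the original report and `myapp.exe` is a placeholder for your own workload.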
@michbern-ms Just a quick question, as we, too, are running into this issue. I understand that you've identified a problem where the KSP registration is not correctly persisted when performed within a container (e.g., while building the container), so information is lost and the KSP will not work when the container is later restarted. However, I see a second problem here. If I understand correctly, the issue seems to go away if the KSP is registered on the host. AFAIK, a container should not behave differently depending on what is installed/registered on the host; it should have isolated surroundings. Does that imply that there is a bug in the isolation? And could a container thus maybe corrupt its host's crypto layer configuration in some way? |
@fschmied I have not been able to duplicate that with my setup. The KSP is not registered in the container regardless of the host registration. |
Okay, that's interesting, as @jwbrothers mentioned that above, and my team has also reported seeing this. However, we were using process isolation, whereas the original issue was reported with Hyper-V isolation; maybe that's the difference. @iamjplant Could you check whether you can see what @jwbrothers was reporting above, i.e., different behavior depending on what's installed on the host when using process isolation? |
Sadly as soon as I add a |
@fschmied, you raise an interesting question. Process-isolated containers are implemented with a shared kernel, whereas a Hyper-V-isolated container has its own kernel. So, I'm also curious to see if the results are consistently different in those two cases. As a reminder, there is not really a security boundary between a process-isolated container and its host. See: |
@michbern-ms We've been able to reproduce that there is a bug in process isolation regarding KSP registrations. See #263 for the full description. |
Wanted to provide an update - the engineering work to fix this issue was painstaking, but it is complete and available in builds for the Windows Server Insider Program: https://learn.microsoft.com/en-us/windows-insider/business/server-get-started Meanwhile, we have started on the process to make this change available on Windows Server 2022. |
Thanks for the update. |
Also thanks for the update! For the time being, we're still on Windows Server 2019, so we'll need to keep working around the issues and hoping none of our customers will find a situation that we can't work around any more. We're planning to move on to Server 2022 in a couple of months, so I'm keeping my fingers crossed. |
Hi, I'm also encountering the same issues and was curious if there were any new updates? |
We are very close. I can't give you a precise date, but we're planning to service the fix to both Server 2019 and Server 2022 rather soon. @zosocanuck Are you also using the Venafi CSP, or a different one? |
@michbern-ms Yes I'm using the Venafi CSP |
This is a humbling update to write, but here goes.
The good news: We completed our work to backport the crypto-stack isolation feature to Server 2019 and Server 2022. It is available in the servicing releases that became generally available today (September 12, 2023). So, the support for having different crypto/KSP configs in a container than exist in the host machine is complete and available. Microsoft is very careful with the millions of existing Windows Server instances in production today, so authoring and verification of these two backports was time-consuming, but we're proud that the work is available now. Moreover, the whole scenario works end-to-end in the Windows Server 23H2 release, which will become available later this fall:
The bad news: In final testing, we found that there is an issue that only manifests when running with the Venafi provider. This issue manifests as the container failing to start after the Venafi provider is installed. We are investigating this new issue with high priority.
This is clearly not the update I was hoping to post today, but I felt that the transparency was important for all of you who have been paying attention to this investigation.
Thanks, |
We asked our colleagues in the OS crypto and security team for a simple way to demonstrate the new functionality in the September servicing release, and they helpfully sent us the following: some PowerShell commands you can use to see the new crypto-config isolation support. Calling Get-TlsCipherSuite on both the host and the container should show the current state. Note: Enable-TlsCipherSuite and Disable-TlsCipherSuite need to be run from a PowerShell session started as Administrator (this works by default in a container but needs to be done explicitly on the host). |
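One way to make that comparison less error-prone than eyeballing two terminals is to diff the two lists. The sketch below is only an illustration. It assumes you have already captured the suite names as plain text, one per line, on the host and in the container (for example with `(Get-TlsCipherSuite).Name`, which is an assumption about how you collect the output, not part of the instructions above). The helper names are hypothetical.

```python
from typing import Dict, Set


def parse_suite_names(output: str) -> Set[str]:
    """Turn captured text (one cipher suite name per line) into a set."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def compare_suites(host_output: str, container_output: str) -> Dict[str, Set[str]]:
    """Report which cipher suites are enabled on only one side.

    With crypto-config isolation working, disabling a suite inside the
    container should make it show up under "host_only" without the
    host's own list changing.
    """
    host = parse_suite_names(host_output)
    container = parse_suite_names(container_output)
    return {
        "host_only": host - container,
        "container_only": container - host,
    }
```

If both sets come back empty after you disable a suite in the container, the two configurations are still identical, which would suggest the isolation support is not active on that host/image combination.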
I'm happy to say that the new issue identified above on 9/12 is fixed in the Windows Server 11B release, which is available publicly today. We retested the original scenario with the Venafi provider and confirmed that it is working as expected now. I'd love to get an independent confirmation of that, and then we can close out this thread. |
That's wonderful news. We'd be happy to validate as well, and report back.
Thanks,
Micah
|
@michbern-ms Is there a specific image tag that you recommend we can test against? |
I believe the fix impacts both the container host and the container image. So, for the container, you can use: mcr.microsoft.com/windows/servercore:ltsc2022-amd64 For the host, please be sure you've received the 2023-11 cumulative update. Thanks, |
@michbern-ms We were also able to validate the latest servercore images fixed the issues we were seeing with the Venafi provider. |
Fantastic! After quite a long journey, I'm happy to see this one closed. |
Venafi created a Key Storage Provider and is running into the following problem:
The Key Storage Provider (KSP) is not available after loading a container image with it installed. If the Docker host also has the KSP installed, then there are no issues, but I should not need to install the KSP on the Docker host. If I register the KSP manually after the Docker image is loaded, the KSP works without any issues.
What assistance are you requesting from Microsoft?
My belief is there is a bug or logic bug with how containers are handling KSPs or perhaps bcrypt in general. I would like to see this issue investigated and resolved. I do not believe there is an issue with the KSP developed by Venafi (my company) for the following reasons:
TROUBLESHOOTING INFORMATION:
I have a Dockerfile and an MSI of Venafi’s KSP that can be run to reproduce the issue.
High-level overview:
Notice only Venafi’s CSP is available, the KSP is not available.
If I run the following command:
i. Regsvr32 c:\windows\system32\venaficsp.dll
ii. Then re-run certutil -csplist
iii. The KSP is then available
All of this was performed with isolation mode set to hyperv.
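The check in these steps can be automated so a build or startup script fails fast when the KSP is missing. The following is a rough sketch only: it assumes `certutil -csplist` prints one `Provider Name: ...` line per provider, and the provider name used in the usage note is a placeholder, since the report does not give the exact name of the Venafi KSP. `list_providers` and `provider_registered` are illustrative names.

```python
from typing import List

PREFIX = "Provider Name:"  # assumed line prefix in `certutil -csplist` output


def list_providers(csplist_output: str) -> List[str]:
    """Extract provider names from captured `certutil -csplist` text."""
    names = []
    for line in csplist_output.splitlines():
        stripped = line.strip()
        if stripped.startswith(PREFIX):
            names.append(stripped[len(PREFIX):].strip())
    return names


def provider_registered(csplist_output: str, provider_name: str) -> bool:
    """Case-insensitive check that a given provider appears in the list."""
    wanted = provider_name.strip().lower()
    return any(name.lower() == wanted for name in list_providers(csplist_output))
```

For example, capture the output with `subprocess.run(["certutil", "-csplist"], capture_output=True, text=True).stdout` inside the container and pass it to `provider_registered(output, "Example Key Storage Provider")`, substituting the real provider name. Running it before and after the regsvr32 step should flip the result from False to True if you are hitting this issue.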
I can reproduce Venafi's issue. I even added the regsvr32 command to the Dockerfile and that had no effect. So it looks like, for whatever reason, the registry changes that a KSP makes are not being persisted.
Can someone help investigate?