<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Parag Mali - tag: confidential-computing</title><description>Posts tagged confidential-computing.</description><link>https://paragmali.com/</link><language>en-US</language><lastBuildDate>Sun, 07 Jun 2026 04:13:14 GMT</lastBuildDate><atom:link href="https://paragmali.com/tags/confidential-computing/rss.xml" rel="self" type="application/rss+xml"/><item><title>Verify Me, Don&apos;t Trust Me: Apple PCC, Azure Confidential AI, and the Architecture of the Modern AI Cloud</title><link>https://paragmali.com/blog/verify-me-dont-trust-me-apple-pcc-azure-confidential-ai-and-/</link><guid isPermaLink="true">https://paragmali.com/blog/verify-me-dont-trust-me-apple-pcc-azure-confidential-ai-and-/</guid><description>Apple Private Cloud Compute and Azure confidential AI ship the same promise through unrecognisably different machinery. On five axes they differ in degree. On one axis -- verifiable transparency of the production fleet -- they differ in kind.</description><pubDate>Mon, 01 Jun 2026 00:00:00 GMT</pubDate><content:encoded>
Apple and Microsoft now ship the same user-facing promise -- &quot;the cloud cannot see your AI prompt&quot; -- through completely different machinery. Apple&apos;s **Private Cloud Compute** (announced June 10, 2024 [@apple-pcc-blog]; source release October 24, 2024 [@apple-pcc-research]) runs custom Apple-Silicon servers with a per-node Secure Enclave Processor and publishes every production image hash to a public, append-only **Transparency Log** that the user&apos;s device cryptographically refuses to bypass. Microsoft&apos;s Azure confidential AI substrate (`NCCads_H100_v5`, GA September 24, 2024 [@ms-h100-ga]) composes AMD SEV-SNP confidential VMs with NVIDIA H100 GPUs in CC-On mode, verifies the composed attestation through Microsoft Azure Attestation, and gates customer-managed keys through Secure Key Release from Azure Key Vault. On five of six architectural axes the two designs differ in *degree*. On the sixth -- verifiable transparency of the production fleet -- they differ in *kind*.
&lt;h2&gt;1. Same Promise, Opposite Architectures&lt;/h2&gt;
&lt;p&gt;On June 10, 2024, Apple announced Private Cloud Compute and promised that &quot;personal user data sent to PCC isn&apos;t accessible to anyone other than the user -- not even to Apple&quot; [@apple-pcc-blog]. On September 24, 2024, Microsoft brought its first confidential GPU SKU to general availability. NVIDIA&apos;s companion blog called Azure &quot;the first cloud provider to offer confidential computing with NVIDIA H100 GPUs&quot; [@nvidia-h100-ga]. Microsoft&apos;s coordinated Trustworthy AI post framed the same architectural commitment: Microsoft itself cannot view or tamper with the data or the model inference process [@ms-h100-ga] [@ms-trustworthy-ai]. Two vendors. The same user-facing contract. Less than four months apart.&lt;/p&gt;
&lt;p&gt;Open the lid on either one and the machinery is unrecognisable.&lt;/p&gt;
&lt;p&gt;Apple PCC runs on custom Apple-Silicon servers, each with a &lt;a href=&quot;https://paragmali.com/blog/apple-secure-enclave-vs-microsoft-pluton-two-roads-to-hardwa/&quot; rel=&quot;noopener&quot;&gt;Secure Enclave Processor&lt;/a&gt; wired into a vendor-controlled certificate chain. Every production node image hash is published to an append-only public log that the user&apos;s device cryptographically refuses to bypass [@apple-pcc-blog] [@apple-pcc-release-transparency].&lt;/p&gt;
&lt;p&gt;Azure&apos;s confidential-AI substrate runs on the &lt;code&gt;Standard_NCC40ads_H100_v5&lt;/code&gt; SKU: 40 non-multithreaded 4th-Gen AMD EPYC Genoa vCPUs, 320 GiB of RAM, one NVIDIA H100 NVL GPU with 94 GB of high-bandwidth memory, with the Trusted Execution Environment &quot;spanning confidential VM on the CPU and attached GPU&quot; [@ms-sku-nccads]. Trust is rooted in AMD&apos;s per-chip signing key, Intel&apos;s TDX module on the alternative SKU family, NVIDIA&apos;s on-die hardware root of trust on the GPU, and a Microsoft-operated verifier service called Microsoft Azure Attestation [@ms-maa-overview]. None of those signers are Apple, and Apple&apos;s signer is none of them.&lt;/p&gt;
&lt;p&gt;That is not a difference of brand preference. It is a difference about &lt;em&gt;who you are trusting and how you can check&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This article is a side-by-side architectural treatment of the two designs. It will compare them on six axes you will be able to recite at the end:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Silicon control&lt;/strong&gt; -- who controls the chip, the firmware, the OS, and the inference runtime.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hardware root of trust&lt;/strong&gt; -- which signing keys anchor the attestation chain.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Attestation surface&lt;/strong&gt; -- what cryptographic artefact the relying party actually consumes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Key release and state model&lt;/strong&gt; -- whether the customer holds keys, and how those keys are released to the workload.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GPU TEE&lt;/strong&gt; -- how confidential compute extends from the CPU into the GPU.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Network anonymization&lt;/strong&gt; -- whether the operator can correlate requests with their originating client.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;By the end you should be able to read a Microsoft Azure Attestation JSON Web Token and an Apple PCC attestation envelope at the same level of fluency, and explain to a non-specialist what each cryptographic artefact actually proves. You should be able to name the threat each architecture defends against, and the threats neither closes by construction.&lt;/p&gt;
&lt;p&gt;When the user-facing promise is the same, the architectural divergence is the entire story. To understand what that divergence means, we first have to see where each architecture came from. The two designs did not converge on the same problem by coincidence. They descended from two different ancestor problems that took until 2024 to meet.&lt;/p&gt;
&lt;h2&gt;2. Confidential Computing&apos;s Two Parents&lt;/h2&gt;
&lt;p&gt;September 14, 2017. Mark Russinovich, Azure CTO, publishes &quot;Introducing Azure confidential computing.&quot; Microsoft, he writes, is &quot;the first cloud to offer new data security capabilities with a collection of features and services called Azure confidential computing,&quot; and the point of the announcement is &quot;encryption of data while in use&quot; [@ms-russinovich-2017]. Russinovich names &quot;data in use&quot; as the third protection state, the missing companion to &quot;at rest&quot; and &quot;in transit.&quot; Five years later the Confidential Computing Consortium publishes &quot;A Technical Analysis of Confidential Computing&quot; v1.3, the vendor-neutral document both Apple and Microsoft now anchor on, which defines the field formally and gives the lower bounds explicitly [@ccc-technical-analysis] [@ccc-about].&lt;/p&gt;
&lt;p&gt;Russinovich&apos;s framing did not appear from nowhere. It was the cloud-operator-side voice of a conversation that had two parents in the underlying hardware.&lt;/p&gt;
&lt;h3&gt;Parent one: the hardware TEE lineage&lt;/h3&gt;
&lt;p&gt;A &lt;strong&gt;Trusted Execution Environment&lt;/strong&gt; is a hardware-isolated execution context inside a system whose own host operating system or hypervisor is &lt;em&gt;not&lt;/em&gt; trusted to look in. The lineage starts in the early 2000s with ARM TrustZone&apos;s split-world NS-bit, then Intel TXT (Trusted Execution Technology) for measured launch on the CPU side -- originally announced as &lt;strong&gt;LaGrande Technology&lt;/strong&gt; at IDF 2003 and rebranded as TXT around 2007 with the vPro / Q35-Q45 chipset rollout. Apple shipped its first &lt;strong&gt;Secure Enclave Processor&lt;/strong&gt; -- a separate Apple-designed processor core on the same SoC as the main application processor, with its own boot ROM, AES engine, and protected memory -- on the iPhone 5s in September 2013 [@apple-sep-guide].&lt;/p&gt;

**Trusted Execution Environment (TEE):** A hardware-isolated execution context inside a larger system in which code can run with cryptographic guarantees of confidentiality and integrity even when the system&apos;s own operating system, hypervisor, or peripheral firmware is compromised or controlled by an adversary. TEEs include process-scope enclaves (Intel SGX), VM-scope confidential VMs (AMD SEV-SNP, Intel TDX), and on-die separate-processor designs (Apple Secure Enclave Processor, Microsoft Pluton).
&lt;p&gt;Intel SGX (Software Guard Extensions) arrived as the first widely-available general-purpose TEE on commodity x86 silicon, with the architectural model first described in the McKeen et al. HASP 2013 paper [@mckeen-sgx-hasp] and given general availability on Skylake-era Core CPUs in late 2015. Costan and Devadas&apos;s &quot;Intel SGX Explained&quot; (IACR ePrint 2016/086) became the canonical academic systematization [@costan-sgx]. SGX let an application author carve out an &lt;em&gt;enclave&lt;/em&gt; -- a slice of address space encrypted in DRAM by a per-CPU memory-encryption engine and measured at creation time -- and have a remote party verify, through an Intel-signed attestation report, that a specific code measurement was running before any secret was released to it.&lt;/p&gt;

**Confidential computing:** Per the Confidential Computing Consortium, the protection of data in use through computation in a hardware-based, attested Trusted Execution Environment. The CCC explicitly extends the protection state-pair (at rest, in transit) with a third state (in use) and treats hardware TEEs as the substrate that makes the third state cryptographically enforceable. The CCC v1.3 analysis is the vendor-neutral definitional document both Apple and Microsoft cite [@ccc-technical-analysis] [@ms-cc-overview].
&lt;h3&gt;Parent two: the cloud-operator-as-adversary lineage&lt;/h3&gt;
&lt;p&gt;The other parent was the cloud. Once enterprise workloads moved into public clouds, the &lt;em&gt;cloud operator itself&lt;/em&gt; became part of the threat model. AMD &lt;strong&gt;published the first SEV API specification&lt;/strong&gt; (&quot;Secure Encrypted Virtualization&quot;) in April 2016. SEV gives each guest VM its own memory-encryption key, and silicon support shipped in the EPYC 7001 &quot;Naples&quot; family in June 2017. SEV-ES, specified in February 2017, added encrypted register state on world switches. &lt;strong&gt;SEV-SNP&lt;/strong&gt; (Secure Nested Paging), described in an AMD whitepaper in January 2020 [@amd-sev-snp-wp], added integrity protection through the Reverse Map Table. Intel&apos;s parallel response was &lt;strong&gt;TDX&lt;/strong&gt; (Trust Domain Extensions), specified in September 2020.&lt;/p&gt;
&lt;p&gt;Both AMD and Intel framed the contribution the same way: protect the guest from a hypervisor that may itself be the adversary. That framing was exactly what Russinovich&apos;s 2017 post had been pointing at, three years earlier, on the cloud side [@ms-russinovich-2017].&lt;/p&gt;
&lt;h3&gt;Convergence&lt;/h3&gt;
&lt;p&gt;The two parents started speaking a common vocabulary around the turn of the 2020s. The Confidential Computing Consortium was founded in August 2019 as a Linux Foundation project community, with members across silicon vendors (AMD, Intel, NVIDIA, ARM), cloud providers (Microsoft, Google, Oracle), and OS / runtime vendors (Red Hat, Canonical, IBM) [@ccc-about].&lt;/p&gt;
&lt;p&gt;In January 2023 the IETF Remote ATtestation procedureS (RATS) Working Group published RFC 9334, &quot;Remote ATtestation procedureS (RATS) Architecture,&quot; giving the field a shared vocabulary for the roles in an attestation flow. Four of those roles matter here: the &lt;strong&gt;Attester&lt;/strong&gt; (the workload making the claim), the &lt;strong&gt;Verifier&lt;/strong&gt; (the party that checks the cryptographic evidence), the &lt;strong&gt;Relying Party&lt;/strong&gt; (the party that makes a decision based on the verified result), and the &lt;strong&gt;Endorser&lt;/strong&gt; (the party that vouches for the Attester&apos;s identity, typically the silicon vendor) [@ietf-rfc9334].&lt;/p&gt;
&lt;p&gt;Both Apple PCC and Microsoft Azure Attestation map cleanly onto RFC 9334&apos;s vocabulary. They use the same words for the same roles. The architectures that fill those roles are different.&lt;/p&gt;

timeline
    title TEE and confidential-computing milestones (2003-2024)
    section Hardware TEE lineage
        2003 : ARM TrustZone (mobile split-world)
        2007 : Intel TXT / LaGrande (measured launch)
        2013 : Apple Secure Enclave on iPhone 5s
        2015 : Intel SGX general availability (Skylake)
        2016 : Costan and Devadas SGX Explained
    section Cloud operator as adversary
        2016 : AMD SEV (memory encryption)
        2017 : AMD SEV-ES (encrypted register state)
        2017 : Azure CC introduced (Russinovich)
        2020 : AMD SEV-SNP whitepaper (integrity via RMP)
        2020 : Intel TDX specification
    section Vocabulary and standards
        2019 : Confidential Computing Consortium founded
        2022 : CCC Technical Analysis v1.3
        2023 : IETF RFC 9334 RATS Architecture
        2024 : Apple PCC and Azure H100 CC-On GA
&lt;p&gt;Apple&apos;s lineage is a third tributary the other two largely overlook. The iPhone Data Protection model, anchored in the SEP since 2013, and iCloud Private Relay&apos;s two-hop architecture from 2021 onward both fed into PCC. PCC is the only major-vendor confidential-AI substrate descended from a &lt;em&gt;device-side&lt;/em&gt; TEE origin rather than a &lt;em&gt;cloud-side&lt;/em&gt; one [@apple-sep-guide] [@apple-pcc-blog].&lt;/p&gt;
&lt;p&gt;Both parents converged on the same vocabulary by 2023. But the first attempts at putting that vocabulary into production hit walls neither parent had predicted -- starting with the 128 MB enclave that broke deep learning before it began.&lt;/p&gt;
&lt;h2&gt;3. Process Enclaves and the Operator-Honesty Assumption&lt;/h2&gt;
&lt;p&gt;August 2018, USENIX Security. Jo Van Bulck and nine co-authors publish &quot;Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-of-Order Execution&quot; [@foreshadow]. The attack reads L1-cached enclave memory transiently and -- this is the load-bearing detail -- recovers the SGX EPID attestation-signing key for the targeted CPU generation. Once an attestation key leaks, every attestation that platform produces is forgeable to the attacker until microcode is updated and the EPID group is revoked. The whole &quot;the enclave really is what it says it is&quot; property collapses for that CPU generation overnight.&lt;/p&gt;
&lt;p&gt;To understand what Foreshadow was attacking, it helps to walk SGX&apos;s enclave lifecycle. Privileged software acting for the application invokes &lt;code&gt;ECREATE&lt;/code&gt; to reserve an enclave address range; pages are added with &lt;code&gt;EADD&lt;/code&gt; and their contents measured with &lt;code&gt;EEXTEND&lt;/code&gt;, each step extending a SHA-256 chain that becomes the enclave&apos;s &lt;code&gt;MRENCLAVE&lt;/code&gt; measurement; &lt;code&gt;EINIT&lt;/code&gt; finalises the chain and locks the enclave; &lt;code&gt;EENTER&lt;/code&gt; is then the only legal entry point [@mckeen-sgx-hasp] [@costan-sgx]. When a remote party asks the enclave to prove its identity, the Quoting Enclave -- a small Intel-signed enclave on every SGX-enabled CPU -- converts the enclave&apos;s local &lt;code&gt;REPORT&lt;/code&gt; structure into a quote signed with the EPID key. The remote party submits that quote to the Intel Attestation Service for verification and learns &lt;em&gt;which&lt;/em&gt; code measurement the enclave is running.&lt;/p&gt;
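&lt;p&gt;The measurement idea can be sketched in a few lines. This is a deliberately simplified model, not the SGX record format: real hardware hashes fixed-layout records for each instruction and measures page contents in 256-byte chunks. The tags, the &lt;code&gt;measure_enclave&lt;/code&gt; name, and the page layout below are assumptions made for illustration. What the sketch does show is the property the relying party depends on: the final digest changes if any page, or the order of pages, changes.&lt;/p&gt;

```python
import hashlib

PAGE_SIZE = 4096  # SGX enclave pages are 4 KiB


def measure_enclave(pages):
    """Fold (offset, contents) pairs into one SHA-256 digest, in order."""
    chain = hashlib.sha256()
    chain.update(b"ECREATE")  # hypothetical tag marking the start of the build
    for offset, contents in pages:
        if len(contents) != PAGE_SIZE:
            raise ValueError("each page must be exactly PAGE_SIZE bytes")
        # Record where the page sits, then what it contains.
        chain.update(b"EADD" + offset.to_bytes(8, "little"))
        chain.update(contents)
    # Finalising corresponds to EINIT: nothing can be added afterwards.
    return chain.hexdigest()


code_page = b"\x90" * PAGE_SIZE
data_page = b"\x00" * PAGE_SIZE

expected = measure_enclave([(0, code_page), (PAGE_SIZE, data_page)])
same = measure_enclave([(0, code_page), (PAGE_SIZE, data_page)])
tampered = measure_enclave([(0, code_page), (PAGE_SIZE, b"\x01" + data_page[1:])])
reordered = measure_enclave([(0, data_page), (PAGE_SIZE, code_page)])

print("rebuild matches:", expected == same)
print("one flipped byte matches:", expected == tampered)
print("reordered pages match:", expected == reordered)
```

&lt;p&gt;A relying party that pins &lt;code&gt;expected&lt;/code&gt; will accept the identical rebuild and reject both the tampered and the reordered variant.&lt;/p&gt;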

sequenceDiagram
    participant App as Untrusted app
    participant CPU as SGX hardware
    participant QE as Quoting Enclave
    participant IAS as Intel Attestation Service
    participant RP as Relying Party
    App-&amp;gt;&amp;gt;CPU: ECREATE (reserve enclave)
    App-&amp;gt;&amp;gt;CPU: EADD pages (measured into MRENCLAVE)
    App-&amp;gt;&amp;gt;CPU: EINIT (finalise measurement)
    App-&amp;gt;&amp;gt;CPU: EENTER (transfer control)
    CPU-&amp;gt;&amp;gt;QE: produce local REPORT
    QE-&amp;gt;&amp;gt;IAS: sign REPORT with EPID key
    IAS-&amp;gt;&amp;gt;RP: verify quote, return result
    RP-&amp;gt;&amp;gt;App: release secret if measurement matches

**Secure Enclave Processor (SEP):** A dedicated secure subsystem integrated into Apple Silicon, isolated from the main application processor with its own boot ROM, AES Engine, and protected memory. The SEP runs an L4-derived microkernel and was first shipped on the iPhone 5s in 2013. It is not a TPM, not the NFC Secure Element used for Apple Pay, and not architecturally related to Intel SGX. It is the per-node hardware root of trust on every Apple Private Cloud Compute server [@apple-sep-guide] [@apple-pcc-blog].
&lt;p&gt;SGX scaled to a billion CPUs in three or four years, but it never scaled to deep learning. Three killer constraints stopped it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Constraint one: the Enclave Page Cache ceiling.&lt;/strong&gt; On Skylake-class client and Xeon E-2100 / E-2200 (Coffee Lake-based) server SKUs the Enclave Page Cache (EPC) was capped at 128 MB total per socket, of which only ~96 MB was usable for application data after Intel&apos;s bookkeeping overhead. That is orders of magnitude too small for any modern deep-learning workload: a single set of weights for even a small model could easily exceed the EPC by a factor of 100 or more. (Skylake-SP and Cascade Lake-SP server Xeons did not ship SGX at all; SGX at server scale only arrived with Ice Lake-SP in 2021, by which point the cloud-AI story had moved past process-scope enclaves.)&lt;/p&gt;
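&lt;p&gt;A back-of-the-envelope calculation makes the gap concrete. The parameter counts and the two-bytes-per-weight (fp16) figure below are illustrative assumptions, not numbers from Intel&apos;s documentation; only the ~96 MB usable EPC comes from the paragraph above.&lt;/p&gt;

```python
MIB = 1024 ** 2

usable_epc_bytes = 96 * MIB  # approximate usable EPC quoted above

# Hypothetical model sizes: parameter count times 2 bytes per fp16 weight.
models = {
    "100M parameters": 100_000_000 * 2,
    "1B parameters": 1_000_000_000 * 2,
    "7B parameters": 7_000_000_000 * 2,
}

# How many times over the weights alone would fill the usable EPC.
ratios = {name: size / usable_epc_bytes for name, size in models.items()}

for name, ratio in ratios.items():
    print(f"{name}: weights are about {ratio:.0f}x the usable EPC")
```

&lt;p&gt;Under these assumptions a 7-billion-parameter model overshoots the usable EPC by well over a hundred times before activations or runtime code are counted.&lt;/p&gt;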
&lt;p&gt;&lt;strong&gt;Constraint two: the programming model.&lt;/strong&gt; SGX required the application author to split the codebase into a trusted (in-enclave) and untrusted (outside-enclave) half, with explicit &lt;code&gt;ECALL&lt;/code&gt; and &lt;code&gt;OCALL&lt;/code&gt; transitions and a fixed serialised data interface across the trust boundary. Production codebases written before SGX existed simply refused to be partitioned that way. The handful of teams that tried -- mainly Intel internal proof-of-concepts -- produced systems that worked but did not generalise.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Constraint three: the side-channel cascade.&lt;/strong&gt; Foreshadow / L1TF in August 2018 [@foreshadow]; SgxPectre at IEEE EuroS&amp;amp;P 2019, demonstrating Spectre-v1-style transient-execution attacks inside SGX enclaves [@sgxpectre]; Plundervolt at IEEE S&amp;amp;P 2020, a software-based fault-injection attack via Intel&apos;s privileged voltage-control interface, assigned CVE-2019-11157 [@plundervolt]. Each exposed a different residual surface that Intel&apos;s threat model had not named. The generalisation -- that any TEE on shared silicon inherits a microarchitectural side-channel surface that the architectural threat model does not cover -- became the field&apos;s unspoken second axiom.&lt;/p&gt;
&lt;p&gt;SGX&apos;s attestation chain itself went through a generational turnover. The original EPID (Enhanced Privacy ID) scheme tied attestation verification to the Intel Attestation Service as a centralised verifier. By 2018 Intel had begun the transition to DCAP (Data Center Attestation Primitives), letting cloud operators host their own attestation infrastructure. The transition happened precisely because pinning EPID verification to IAS was incompatible with how cloud providers wanted to verify attestations at fleet scale.&lt;/p&gt;
&lt;p&gt;AMD&apos;s first-generation SEV and SEV-ES belong to the same era. They encrypted guest memory and (in SEV-ES) the saved register state on world switches, but they did not yet have the integrity check that would make a malicious hypervisor architecturally unable to mount remap-style attacks. That defence had to wait for SEV-SNP and a different failure that demonstrated, on the other side of the trust boundary, exactly the same lesson Foreshadow had taught on the Intel side.&lt;/p&gt;
&lt;p&gt;Process-scope enclaves were the wrong granularity. The fix had to come from somewhere else. What if you encrypted whole virtual machines instead?&lt;/p&gt;
&lt;h2&gt;4. Three Architectural Waves That Made Cloud Confidential AI Feasible&lt;/h2&gt;
&lt;p&gt;EuroSec 2018. Mathias Morbitzer, Manuel Huber, Julian Horsch, and Sascha Wessel publish &quot;SEVered: Subverting AMD&apos;s Virtual Machine Encryption&quot; [@severed]. A malicious hypervisor remaps a guest&apos;s network-facing service to point at &lt;em&gt;other&lt;/em&gt; guest physical pages; the service unwittingly serves the contents of those pages -- still inside the guest, still nominally encrypted at the memory controller -- as plaintext over the network. The encryption did not break. The attack did not need it to.&lt;/p&gt;
&lt;p&gt;This is the architectural insight every Generation-3-and-later confidential VM design is built on.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Confidentiality without integrity is not isolation. A confidential VM that encrypts memory but leaves the guest-to-host page mapping under unchecked hypervisor control can be tricked into decrypting and then leaking its own contents on the operator&apos;s behalf. Every TEE design from 2020 onward is haunted by the SEVered failure.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Wave 1 (~2020-2022): VM-level TEEs with hardware-enforced page ownership&lt;/h3&gt;
&lt;p&gt;AMD&apos;s response was SEV-SNP and the &lt;strong&gt;Reverse Map Table (RMP)&lt;/strong&gt;: one entry per 4 KB physical page in the system, tracking ownership, validation state, and the permitted size class for that page. Guest pages transition from &lt;code&gt;INVALID&lt;/code&gt; to &lt;code&gt;VALIDATED&lt;/code&gt; only via a guest-initiated &lt;code&gt;PVALIDATE&lt;/code&gt; instruction; subsequent hypervisor remap attempts that would violate the RMP fault out at the hardware level. Intel TDX took a parallel architectural path: a new CPU mode more privileged than the hypervisor, called &lt;strong&gt;SEAM&lt;/strong&gt; (Secure Arbitration Mode), which runs the Intel-signed TDX Module, with per-VM trust-domain encryption keys managed through MK-TME (Multi-Key Total Memory Encryption).&lt;/p&gt;

**Reverse Map Table (RMP):** A hardware-managed table maintained by AMD SEV-SNP processors with one entry per 4 KB physical page in the system. Each entry records the page&apos;s owner (which guest, if any), its validation state (`VALIDATED` or not), and the permitted size class. The hypervisor cannot remap a guest-owned page into a different guest without triggering a fault. The RMP is AMD&apos;s architectural response to SEVered: it makes the SEVered class of attacks impossible by construction.
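&lt;p&gt;The ownership check can be modelled as a small lookup table. This is a toy sketch, not AMD&apos;s entry layout: the class and method names are hypothetical, size classes are omitted, and recording the expected guest address in each entry is an assumption of the sketch that lets it show why a SEVered-style remap faults.&lt;/p&gt;

```python
class RmpFault(Exception):
    """Raised when an access would violate recorded page ownership."""


class ReverseMapTable:
    def __init__(self):
        # Maps a physical page number to its owner, guest address, and state.
        self.entries = {}

    def assign(self, page, guest, gpa):
        # The hypervisor hands a page to a guest; it starts out not validated.
        self.entries[page] = {"owner": guest, "gpa": gpa, "validated": False}

    def pvalidate(self, page, guest):
        # Only the owning guest may move the page to the validated state.
        entry = self.entries.get(page)
        if entry is None or entry["owner"] != guest:
            raise RmpFault("PVALIDATE by non-owner")
        entry["validated"] = True

    def check_access(self, page, guest, gpa):
        # Check applied to the result of a nested page-table walk.
        entry = self.entries.get(page)
        if entry is None or entry["owner"] != guest:
            raise RmpFault("page not owned by this guest")
        if entry["gpa"] != gpa:
            raise RmpFault("page mapped at an unexpected guest address")
        if not entry["validated"]:
            raise RmpFault("page not validated by the guest")
        return True


def faults(fn, *args):
    # Helper: report whether a call is stopped by an RMP fault.
    try:
        fn(*args)
    except RmpFault:
        return True
    return False


rmp = ReverseMapTable()
rmp.assign(page=0x1000, guest="vm-a", gpa=0x10)
rmp.pvalidate(page=0x1000, guest="vm-a")

ok = rmp.check_access(0x1000, "vm-a", 0x10)
remap_blocked = faults(rmp.check_access, 0x1000, "vm-a", 0x20)
theft_blocked = faults(rmp.check_access, 0x1000, "vm-b", 0x10)
print(ok, remap_blocked, theft_blocked)
```

&lt;p&gt;The second call is the SEVered move (same page, different guest address) and the third is a cross-guest grab; in this model both stop at the table rather than at the guest&apos;s good behaviour.&lt;/p&gt;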
&lt;p&gt;Azure brought the SEV-SNP substrate to general availability in 2022 with &lt;a href=&quot;https://paragmali.com/blog/inside-azure-confidential-vms-sev-snp-intel-tdx-and-the-para/&quot; rel=&quot;noopener&quot;&gt;the &lt;code&gt;DCasv5&lt;/code&gt; and &lt;code&gt;ECasv5&lt;/code&gt; confidential VM families&lt;/a&gt; (the &lt;code&gt;a&lt;/code&gt; denotes AMD silicon, the &lt;code&gt;s&lt;/code&gt; denotes premium storage) [@ms-cc-overview]. Intel TDX entered public preview on Azure in December 2023. Full general availability of the next-generation Intel TDX confidential VMs on 5th-Gen Intel Xeon Scalable Emerald Rapids -- the &lt;code&gt;DCesv6&lt;/code&gt;, &lt;code&gt;DCedsv6&lt;/code&gt;, &lt;code&gt;ECesv6&lt;/code&gt;, and &lt;code&gt;ECedsv6&lt;/code&gt; families -- followed on February 26, 2026 [@ms-tdx-v6-ga] [@ms-dcesv6].&lt;/p&gt;
&lt;p&gt;The earlier SEV and SEV-ES generations were not free of side channels either. Li, Zhang, Wang, Li, and Cheng&apos;s &quot;CipherLeaks&quot; (USENIX Security 2021) showed a deterministic-ciphertext side channel against SEV-ES: identical plaintext at the same physical address produced identical ciphertext, letting a hypervisor observe constant-time cryptographic implementations and recover keys without ever breaking the encryption [@cipherleaks]. SEV-SNP&apos;s tweakable ciphertext mode addressed this, but the architectural lesson -- that &quot;the encryption is intact&quot; is not the same as &quot;the operator learns nothing&quot; -- repeats.&lt;/p&gt;
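&lt;p&gt;The leak is easy to reproduce in miniature. The cipher below is a toy stand-in (an address-derived pad XORed onto the data), chosen only because it is deterministic per address in the way the paragraph describes; it is not the memory-encryption mode AMD uses, and the key, addresses, and values are made up.&lt;/p&gt;

```python
import hashlib
import hmac

KEY = b"per-vm memory key (demo only)"


def toy_encrypt(address, block):
    # Deterministic: the pad depends only on the key and the physical address.
    pad = hmac.new(KEY, address.to_bytes(8, "little"), hashlib.sha256).digest()
    return bytes(p ^ k for p, k in zip(block, pad))


SAVE_AREA = 0x7F000  # hypothetical address of a saved-register slot

# The guest writes three successive values to the same slot.
values = [(0).to_bytes(16, "little"),
          (1).to_bytes(16, "little"),
          (0).to_bytes(16, "little")]

# The hypervisor never sees plaintext, only a ciphertext snapshot per exit.
c0, c1, c2 = [toy_encrypt(SAVE_AREA, v) for v in values]

changed_then_restored = (c0 != c1) and (c0 == c2)
elsewhere = toy_encrypt(SAVE_AREA + 16, values[0])

print("observer learns the value changed and came back:", changed_then_restored)
print("same plaintext at another address looks different:", elsewhere != c0)
```

&lt;p&gt;Without touching the key, the observer learns that the slot held the same value at the first and third snapshot. Against a constant-time cryptographic routine, a dictionary of such equalities is what turns into key recovery.&lt;/p&gt;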
&lt;h3&gt;Wave 2 (~2022-2024): Attestation and key release as managed services&lt;/h3&gt;
&lt;p&gt;The second wave was less spectacular but more consequential for procurement. &lt;strong&gt;Microsoft Azure Attestation&lt;/strong&gt; (MAA) is a managed verifier that consumes SEV-SNP attestation reports, TDX quotes, SGX quotes, VBS enclave reports, &lt;a href=&quot;https://paragmali.com/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/&quot; rel=&quot;noopener&quot;&gt;vTPM&lt;/a&gt; event logs, and Trusted Launch evidence, and issues a JSON Web Token (JWT) with documented &lt;code&gt;x-ms-isolation-tee&lt;/code&gt;, &lt;code&gt;x-ms-compliance-status&lt;/code&gt;, &lt;code&gt;x-ms-sevsnpvm-*&lt;/code&gt;, and &lt;code&gt;x-ms-runtime&lt;/code&gt; claims [@ms-maa-overview]. In the words of the MAA overview: &quot;Azure Attestation supports both platform- and guest-attestation of AMD SEV-SNP based Confidential VMs (CVMs)&quot; [@ms-maa-overview]. The JWT can then drive &lt;strong&gt;Secure Key Release&lt;/strong&gt; from Azure Key Vault Premium or Azure Managed HSM: the encrypted customer key carries a &lt;em&gt;release policy&lt;/em&gt; written against MAA-issued claims, and the HSM unwraps the key only when the policy is satisfied [@ms-cc-overview].&lt;/p&gt;

**Microsoft Azure Attestation (MAA):** A managed Microsoft cloud service that acts as the Verifier (in the IETF RFC 9334 sense) for confidential workloads on Azure. MAA consumes hardware-vendor attestation evidence (SGX quotes, SEV-SNP attestation reports, Intel TDX quotes, vTPM event logs) and produces a signed JSON Web Token whose `x-ms-*` claims describe the attested TEE state. The JWT is the artefact that downstream relying parties -- including Azure Key Vault&apos;s Secure Key Release flow -- consume to decide whether to release a secret to the workload [@ms-maa-overview].
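&lt;p&gt;For a feel of what a relying party reads, the sketch below decodes the payload segment of a JWT with the Python standard library. It builds a stand-in token locally rather than calling any Azure API, and it does not verify the signature, which a real relying party must do against the attestation provider&apos;s published signing keys. The nesting of the claims, the &lt;code&gt;azure-compliant-cvm&lt;/code&gt; value, and the &lt;code&gt;x-ms-sevsnpvm-is-debuggable&lt;/code&gt; name are assumptions for illustration.&lt;/p&gt;

```python
import base64
import json


def b64url_decode(segment):
    # JWT segments drop base64 padding; restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def unverified_claims(token):
    """Return the payload of a JWT WITHOUT checking its signature."""
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(payload_b64))


# Build a stand-in token locally; a real one comes from the attestation service.
sample_payload = {
    "x-ms-isolation-tee": {
        "x-ms-compliance-status": "azure-compliant-cvm",  # assumed value
        "x-ms-sevsnpvm-is-debuggable": False,  # assumed claim name
    },
}
sample_token = ".".join([
    b64url_encode(json.dumps({"alg": "none"}).encode()),
    b64url_encode(json.dumps(sample_payload).encode()),
    "",  # empty signature segment: this token proves nothing
])

claims = unverified_claims(sample_token)
tee = claims["x-ms-isolation-tee"]
print(tee["x-ms-compliance-status"], tee["x-ms-sevsnpvm-is-debuggable"])
```

&lt;p&gt;Decoding is the easy half. The claims only mean something once the token&apos;s signature has been checked, which is why the verifier&apos;s signing keys matter as much as the hardware keys underneath them.&lt;/p&gt;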

**Secure Key Release (SKR):** An Azure Key Vault Premium and Azure Managed HSM capability that gates release of a wrapped key on a successful attestation. The customer attaches a *release policy* to the key at creation time; the policy is evaluated against the claims of an MAA-issued JWT presented at unwrap time. The key is released to the workload only when the MAA token&apos;s claims match the policy. SKR makes customer-managed key material a first-class architectural primitive for Azure confidential workloads [@ms-cc-overview] [@ms-maa-overview].
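&lt;p&gt;The gating logic reduces to &quot;every condition in the policy must match a claim in the token.&quot; The sketch below uses a simplified, hypothetical policy shape (&lt;code&gt;allOf&lt;/code&gt; with &lt;code&gt;claim&lt;/code&gt; / &lt;code&gt;equals&lt;/code&gt; pairs) and assumed claim values; it is not the Key Vault policy grammar and it skips signature and issuer checks entirely.&lt;/p&gt;

```python
def lookup(claims, dotted_path):
    # Walk a dotted claim path such as "x-ms-isolation-tee.x-ms-compliance-status".
    node = claims
    for part in dotted_path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def policy_allows(policy, claims):
    """True when every condition in the (hypothetical) policy matches."""
    return all(
        lookup(claims, cond["claim"]) == cond["equals"]
        for cond in policy["allOf"]
    )


def release_key(wrapped_key, policy, claims):
    # Stand-in for the HSM: hand the key over only if the policy is satisfied.
    if not policy_allows(policy, claims):
        raise PermissionError("attestation claims do not satisfy release policy")
    return wrapped_key


release_policy = {
    "allOf": [
        {"claim": "x-ms-isolation-tee.x-ms-compliance-status",
         "equals": "azure-compliant-cvm"},  # assumed value
        {"claim": "x-ms-isolation-tee.x-ms-sevsnpvm-is-debuggable",
         "equals": False},  # assumed claim name
    ]
}

good_claims = {"x-ms-isolation-tee": {
    "x-ms-compliance-status": "azure-compliant-cvm",
    "x-ms-sevsnpvm-is-debuggable": False,
}}
debug_claims = {"x-ms-isolation-tee": {
    "x-ms-compliance-status": "azure-compliant-cvm",
    "x-ms-sevsnpvm-is-debuggable": True,
}}

released = release_key(b"customer-key", release_policy, good_claims)
debug_allowed = policy_allows(release_policy, debug_claims)
print(released, debug_allowed)
```

&lt;p&gt;A VM launched with debugging enabled presents otherwise identical claims and still fails the policy, so the key never leaves the HSM.&lt;/p&gt;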
&lt;p&gt;This is the implementation of what RFC 9334 calls the &lt;strong&gt;Passport&lt;/strong&gt; topological pattern: the Attester collects evidence once, hands it to the Verifier, gets back an Attestation Result (the MAA JWT), and then carries that Result to any Relying Party (the HSM, an external policy engine, an audit log) for the rest of the session [@ietf-rfc9334].&lt;/p&gt;
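&lt;p&gt;The shape of the Passport flow can be sketched with three functions, one per role. To stay within the standard library the sketch authenticates the Attestation Result with an HMAC, which forces the relying party to share the verifier&apos;s key; a real deployment signs the result asymmetrically so relying parties need only a public key. All names and values here are hypothetical.&lt;/p&gt;

```python
import hashlib
import hmac
import json

VERIFIER_KEY = b"demo-only-key"  # stand-in for the verifier's signing key
KNOWN_GOOD = {hashlib.sha256(b"approved workload image").hexdigest()}


def attester_evidence(image):
    # Attester: measure the workload and present the measurement as evidence.
    return {"measurement": hashlib.sha256(image).hexdigest()}


def verifier_appraise(evidence):
    # Verifier: appraise evidence once and issue an authenticated result.
    result = {"measurement": evidence["measurement"],
              "trusted": evidence["measurement"] in KNOWN_GOOD}
    body = json.dumps(result, sort_keys=True)
    tag = hmac.new(VERIFIER_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def relying_party_accepts(passport):
    # Relying Party: check authenticity first, then act on the verdict.
    expected = hmac.new(VERIFIER_KEY, passport["body"].encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, passport["tag"]):
        return False
    return json.loads(passport["body"])["trusted"]


# One appraisal, reused by several relying parties.
passport = verifier_appraise(attester_evidence(b"approved workload image"))
decisions = [relying_party_accepts(passport) for _ in ("key vault", "audit log")]

# An unapproved workload gets a result too, but a negative one.
bad = verifier_appraise(attester_evidence(b"patched workload image"))
# Editing the verdict after the fact breaks the authentication tag.
forged = dict(bad, body=bad["body"].replace("false", "true"))

print(decisions, relying_party_accepts(bad), relying_party_accepts(forged))
```

&lt;p&gt;The evidence is appraised once and the result travels with the workload, which is also why whoever holds the verifier&apos;s key becomes a trust anchor for every relying party.&lt;/p&gt;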

The MAA-as-managed-service shift removed a substantial per-customer engineering burden: customers no longer have to write their own attestation-report parsers, certificate-chain validators, or revocation-list checkers. This is the practical reason confidential VMs moved from research artefact to procurement category in 2022-2024. The trade-off it carries is structural: MAA itself becomes a trust anchor. If MAA&apos;s signing infrastructure or its policy-evaluation code is compromised, every relying party that consumes a MAA JWT is exposed in the same breath. The verifier is now a control point.
&lt;h3&gt;Wave 3 (June-October 2024): GPU TEEs, vendor-controlled fleets, and the public arrival of confidential AI&lt;/h3&gt;
&lt;p&gt;The third wave landed in five months in 2024 and changed what &quot;confidential AI&quot; could mean in production.&lt;/p&gt;
&lt;p&gt;The NVIDIA Hopper H100 confidential-computing whitepaper (WP-11459-001) had landed in July 2023 [@nvidia-whitepaper], and the NVIDIA Developer Blog technical post that accompanied it described the architecture in detail: an on-die hardware root of trust, secure measured boot of the GPU firmware, an SPDM (Security Protocol and Data Model) session connecting the CPU TEE driver to the GPU with mutual authentication, and encrypted bounce-buffer data movement between CPU encrypted memory and GPU encrypted HBM [@nvidia-dev-blog]. The blog states the architectural fact verbatim: &quot;The NVIDIA H100 Tensor Core GPU is the first ever GPU to introduce support for confidential computing&quot; [@nvidia-dev-blog].&lt;/p&gt;
&lt;p&gt;Apple announced Private Cloud Compute on June 10, 2024 at WWDC, in a post titled &quot;Private Cloud Compute: A new frontier for AI privacy in the cloud&quot; [@apple-pcc-blog]. About three weeks earlier, at Microsoft Build 2024 (May 21, 2024), Microsoft had announced confidential inferencing, not for GPT-4 but for the Azure OpenAI &lt;strong&gt;Whisper&lt;/strong&gt; speech-to-text model [@ms-workshop-whisper].&lt;/p&gt;
&lt;p&gt;Microsoft&apos;s &lt;code&gt;NCCads_H100_v5&lt;/code&gt; confidential GPU VM family -- 4th-Gen AMD EPYC Genoa CPU plus one NVIDIA H100 NVL GPU per VM, with the TEE spanning both [@ms-sku-nccads] -- reached general availability on September 24, 2024 [@ms-h100-ga]. The companion Microsoft Trustworthy AI post made the same architectural commitment: customer data and models remain inaccessible to Microsoft itself [@ms-trustworthy-ai] [@ms-h100-ga]. NVIDIA&apos;s parallel announcement underscored the same fact verbatim: &quot;Azure is the first cloud provider to offer confidential computing with NVIDIA H100 GPUs&quot; [@nvidia-h100-ga].&lt;/p&gt;
&lt;p&gt;Then on October 24, 2024 Apple published the supporting source code at &lt;code&gt;github.com/apple/security-pcc&lt;/code&gt;, shipped the Virtual Research Environment with macOS Sequoia 15.1 Developer Preview, and extended the Apple Security Bounty to PCC with rewards up to $1,000,000 [@apple-pcc-research] [@apple-pcc-github]. By end of October the substrate for cloud-scale confidential AI existed in two parallel forms. But &quot;shipping&quot; does not mean &quot;settling on one architecture.&quot; Two distinct breakthroughs landed within five months of each other and took the substrate in opposite directions.&lt;/p&gt;

flowchart LR
    A[Attacker&lt;br /&gt;controls hypervisor] --&amp;gt;|Remaps guest GPA tables| B[SEV guest&lt;br /&gt;network service]
    B --&amp;gt;|Reads memory under remapped pages| C[Other guest memory&lt;br /&gt;still under encryption]
    B --&amp;gt;|Serves bytes over network| D[Attacker collects&lt;br /&gt;plaintext]
    style A fill:#fee,stroke:#c33,color:#7f1d1d
    style D fill:#fee,stroke:#c33,color:#7f1d1d
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; SEVered did not recover an encryption key. It did not need to. By remapping page tables the malicious hypervisor convinced the guest to serve its own encrypted contents as plaintext. The fix -- per-page ownership tracking in hardware via the AMD Reverse Map Table and analogous mechanisms in Intel TDX -- defines what a Generation-3 confidential VM is. Earlier generations encrypted memory but did not authenticate ownership. They were not isolation; they were just encryption.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;5. Two Distinct 2024 Designs&lt;/h2&gt;
&lt;p&gt;June 10, 2024, WWDC. Apple Security Engineering and Architecture -- the institutional author block of the post, along with User Privacy, Core OS, Services Engineering, and Machine Learning and AI -- publishes &quot;Private Cloud Compute: A new frontier for AI privacy in the cloud&quot; [@apple-pcc-blog]. The post names five core requirements: &lt;em&gt;stateless computation on personal user data, enforceable guarantees, no privileged runtime access, non-targetability, and verifiable transparency&lt;/em&gt; [@apple-pcc-blog]. The fifth is the one no one in the field had shipped at this scale.&lt;/p&gt;
&lt;h3&gt;(a) Apple&apos;s Verifiable Transparency model&lt;/h3&gt;
&lt;p&gt;Every production PCC node software image hash is published to an append-only &lt;strong&gt;Transparency Log&lt;/strong&gt;. Apple&apos;s canonical terminology is &quot;Transparency Log&quot; and &quot;Release Transparency&quot; -- both are reflected in the URL path of the Apple documentation page that defines the model [@apple-pcc-release-transparency] [@apple-pcc-doc]. The user&apos;s device cryptographically refuses to forward a request to a node whose image hash is not in the log; in Apple&apos;s words, &quot;your device won&apos;t issue requests to PCC unless the OS image running in PCC is logged for inspection&quot; [@apple-pcc-blog].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Transparency Log.&lt;/strong&gt; An append-only public log of every production Private Cloud Compute node software image hash. The log is structured along the lines of RFC 6962 Certificate Transparency -- a Merkle tree of measurement entries that can be audited end-to-end without trusting any single party. Apple&apos;s primary documentation uses the terms &quot;Transparency Log&quot; and &quot;Release Transparency&quot;; &quot;Verifiable Image Catalog&quot; is not Apple terminology. The user&apos;s device refuses to forward a request to a PCC node whose image hash is not in the log, making the log a precondition for any data flow [@apple-pcc-blog] [@apple-pcc-release-transparency].&lt;/p&gt;
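&lt;p&gt;The audit property of an RFC 6962-style log comes down to one check: given a leaf and a short list of sibling hashes, anyone can recompute the signed root. The sketch below implements that inclusion check in the RFC 6962 style. It is an illustration only. Apple&apos;s actual log entry encoding and proof format are not described in this article, and the entries, function names, and three-leaf tree are placeholders.&lt;/p&gt;

```javascript
const { createHash } = require('node:crypto');

// RFC 6962 domain separation: 0x00 prefixes leaf hashes, 0x01 interior nodes.
function leafHash(entry) {
  return createHash('sha256').update(Buffer.from([0x00])).update(entry).digest();
}

function nodeHash(left, right) {
  return createHash('sha256')
    .update(Buffer.from([0x01])).update(left).update(right).digest();
}

// Recompute the root from a leaf and its audit path, then compare it with the
// root the log signed. leafIndex is zero-based; auditPath runs leaf to root.
function verifyInclusion(entry, leafIndex, treeSize, auditPath, expectedRoot) {
  if (leafIndex >= treeSize) return false;
  let fn = leafIndex;
  let sn = treeSize - 1;
  let running = leafHash(entry);
  for (const sibling of auditPath) {
    if (sn === 0) return false; // path is longer than the tree is deep
    if (fn % 2 === 1 || fn === sn) {
      // The sibling sits on the left of the running hash.
      running = nodeHash(sibling, running);
      // Skip levels where this node has no sibling (ragged right edge).
      while (!(fn % 2 === 1 || fn === 0)) {
        fn = Math.floor(fn / 2);
        sn = Math.floor(sn / 2);
      }
    } else {
      running = nodeHash(running, sibling);
    }
    fn = Math.floor(fn / 2);
    sn = Math.floor(sn / 2);
  }
  return sn === 0 ? running.equals(expectedRoot) : false;
}

// Usage: a three-entry log built from placeholder strings.
const entries = ['image-a', 'image-b', 'image-c'].map((s) => Buffer.from(s));
const [h0, h1, h2] = entries.map(leafHash);
const root = nodeHash(nodeHash(h0, h1), h2);
console.log(verifyInclusion(entries[2], 2, 3, [nodeHash(h0, h1)], root)); // true
console.log(verifyInclusion(Buffer.from('image-x'), 2, 3, [nodeHash(h0, h1)], root)); // false
```

&lt;p&gt;The proof is logarithmic in the size of the log, which is why a client can afford to run the check before every request rather than trusting a cached answer.&lt;/p&gt;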
&lt;p&gt;On October 24, 2024 Apple released the supporting source code at &lt;code&gt;github.com/apple/security-pcc&lt;/code&gt;, shipped the &lt;strong&gt;Virtual Research Environment&lt;/strong&gt; (VRE) with macOS Sequoia 15.1 Developer Preview to let researchers run the PCC software stack (including a virtual Secure Enclave Processor) inside a Mac, and extended the Apple Security Bounty to PCC with rewards up to $1,000,000 [@apple-pcc-research] [@apple-pcc-github]. The README on the source release states the scope plainly: &quot;The publication of this code is intended for security research and verification purposes only&quot; [@apple-pcc-github]. The components in the release include &lt;code&gt;CloudAttestation&lt;/code&gt; (the attestation envelope library), &lt;code&gt;Thimble&lt;/code&gt; (the on-device PCC client), &lt;code&gt;splunkloggingd&lt;/code&gt; (the audited logging path), and &lt;code&gt;srd_tools&lt;/code&gt; (security-research tooling).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&quot;Personal user data sent to PCC isn&apos;t accessible to anyone other than the user -- not even to Apple.&quot; -- Apple Security Engineering and Architecture, June 10, 2024 [@apple-pcc-blog]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The network ingress path to PCC reinforces the non-targetability requirement. Client requests are routed through an &lt;strong&gt;Oblivious HTTP&lt;/strong&gt; relay, operated by an independent third party rather than by Apple, that strips the client IP address before forwarding the request to the PCC cluster. OHTTP is standardised in IETF RFC 9458 by Martin Thomson and Christopher A. Wood, January 2024, with the explicit goal of letting &quot;a client make multiple requests to an origin server without that server being able to link those requests to the client or to identify the requests as having come from the same client&quot; [@ietf-rfc9458].&lt;/p&gt;
&lt;p&gt;Apple&apos;s Target Diffusion design layers an &lt;a href=&quot;https://paragmali.com/blog/the-age-gate-that-doesnt-know-your-age-how-anonymous-credent/&quot; rel=&quot;noopener&quot;&gt;RSA Blind Signatures&lt;/a&gt; protocol -- RFC 9474 [@ietf-rfc9474] -- on top of the OHTTP path to issue single-use credentials, so even the relay cannot link two requests as having come from the same client.&lt;/p&gt;
&lt;p&gt;The OHTTP relay is third-party operated -- not Apple-operated. This is the architectural detail that makes non-targetability work. If Apple operated both the relay and the PCC cluster, Apple would observe the client IP at the relay and the request payload at the cluster and could correlate them. By splitting the two roles across two organizations whose business interests are not aligned, Apple can argue (and the architecture can enforce) that no single organization holds both halves of the correlation.&lt;/p&gt;

sequenceDiagram
    participant Dev as User device
    participant Log as Transparency Log
    participant Relay as OHTTP relay (third party)
    participant Node as PCC node (SEP-rooted)
    Dev-&amp;gt;&amp;gt;Log: fetch current log root
    Log--&amp;gt;&amp;gt;Dev: signed root, inclusion proofs
    Dev-&amp;gt;&amp;gt;Dev: verify target image hash is in log
    Dev-&amp;gt;&amp;gt;Relay: encrypted request (no client IP at origin)
    Relay-&amp;gt;&amp;gt;Node: forwarded request (relay IP only)
    Node-&amp;gt;&amp;gt;Node: enforce stateless processing
    Node--&amp;gt;&amp;gt;Relay: response, SEP-signed attestation envelope
    Relay--&amp;gt;&amp;gt;Dev: response delivered
    Dev-&amp;gt;&amp;gt;Dev: verify SEP attestation matches logged image
&lt;h3&gt;(b) Microsoft and NVIDIA&apos;s cross-vendor CPU+GPU TEE composition&lt;/h3&gt;
&lt;p&gt;The other 2024 breakthrough was a composition. The &lt;code&gt;Standard_NCC40ads_H100_v5&lt;/code&gt; SKU is a confidential VM whose Trusted Execution Environment &quot;spans confidential VM on the CPU and attached GPU, enabling secure offload of data, models, and computation to the GPU&quot; [@ms-sku-nccads]. The substrate is an AMD SEV-SNP confidential VM on a 4th-Gen AMD EPYC Genoa CPU. The accelerator is an NVIDIA H100 NVL GPU with 94 GB of high-bandwidth memory, operating in &lt;strong&gt;CC-On mode&lt;/strong&gt; [@ms-sku-nccads] [@nvidia-dev-blog].&lt;/p&gt;
&lt;p&gt;The H100 in CC-On mode performs secure measured boot of its firmware against an on-die hardware root of trust, then establishes mutually-authenticated SPDM (Security Protocol and Data Model) sessions with the CPU TEE driver, and routes all data movement between CPU encrypted memory and GPU encrypted HBM through an encrypted bounce buffer. The NVIDIA Developer Blog states it verbatim: &quot;a chain of trust is established through ... a security protocols and data models (SPDM) session to securely connect to the driver in a CPU TEE&quot; [@nvidia-dev-blog]. The GPU&apos;s attestation report is signed against NVIDIA&apos;s on-die root of trust and consumable through NVIDIA&apos;s NRAS (NVIDIA Remote Attestation Service) and the open-source nvtrust SDK [@nvidia-nvtrust].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Oblivious HTTP (OHTTP).&lt;/strong&gt; An IETF protocol for forwarding HTTP requests through an intermediary in a way that prevents either the intermediary or the target from linking requests to a single client. Per RFC 9458 verbatim: &quot;Oblivious HTTP allows a client to make multiple requests to an origin server without that server being able to link those requests to the client or to identify the requests as having come from the same client, while placing only limited trust in the nodes used to forward the messages&quot; [@ietf-rfc9458]. Apple Private Cloud Compute uses an OHTTP relay operated by an independent third party to enforce non-targetability.&lt;/p&gt;
&lt;p&gt;The CPU-to-GPU interconnect throughput in H100 CC-On is bounded by CPU encryption performance, not by raw PCIe or NVLink bandwidth. The NVIDIA Developer Blog measures it verbatim: &quot;It is limited by CPU encryption performance, which we currently measure at roughly 4 GBytes/sec&quot; [@nvidia-dev-blog]. Practitioners sizing throughput around H100 NVL&apos;s 94 GB HBM3 capacity should reason about the ~4 GB/s encryption ceiling, not the headline NVLink rate. Because the ceiling applies only to data crossing the CPU-GPU boundary, large-model long-sequence workloads amortise it over more on-GPU compute, while small-model short-prompt workloads pay a higher relative cost.&lt;/p&gt;
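&lt;p&gt;A back-of-envelope calculation makes the ceiling concrete. The sketch below divides a transfer size by the quoted ~4 GB/s rate. The 16 GB weight size (roughly an 8B-parameter model at two bytes per parameter) is an assumption chosen for illustration, not a measured figure, and the function name is hypothetical.&lt;/p&gt;

```javascript
// Seconds needed to move `bytes` across the encrypted bounce buffer at a
// sustained rate. The default is the roughly 4 GB/s figure quoted above.
function bounceBufferSeconds(bytes, bytesPerSecond = 4e9) {
  if (!(bytes >= 0) || !(bytesPerSecond > 0)) {
    throw new RangeError('bytes must be non-negative and rate must be positive');
  }
  return bytes / bytesPerSecond;
}

// Assumed size: about 16 GB of weights (8B parameters at 2 bytes each).
const weightLoadSeconds = bounceBufferSeconds(16e9);
console.log(weightLoadSeconds); // 4

// The same transfer if the ceiling were twice as high, for comparison.
console.log(bounceBufferSeconds(16e9, 8e9)); // 2
```

&lt;p&gt;This only bounds bulk data movement. It says nothing about per-call latency, which is why the benchmark figures later in the section depend on sequence length and batch size.&lt;/p&gt;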

&lt;p&gt;&lt;strong&gt;SPDM (Security Protocol and Data Model).&lt;/strong&gt; A DMTF standard (DSP0274) that defines a mutually-authenticated message-exchange protocol between a Requester and a Responder, used in the NVIDIA H100 CC-On architecture over PCIe to establish a secure session between the host CPU TEE driver and the GPU. The session protects all subsequent control-plane and data-plane traffic and lets each endpoint verify the other&apos;s identity and measurements before any sensitive data crosses the PCIe link [@dmtf-spdm] [@nvidia-dev-blog] [@nvidia-nvtrust].&lt;/p&gt;
&lt;p&gt;The SPDM handshake itself is specified by &lt;strong&gt;DMTF DSP0274 v1.1.0&lt;/strong&gt; [@dmtf-spdm] and walks a precise message sequence the relying-party implementer needs to know exists: &lt;code&gt;GET_VERSION&lt;/code&gt; (§10.2) negotiates the protocol version; &lt;code&gt;GET_CAPABILITIES&lt;/code&gt; (§10.3) negotiates supported capabilities; &lt;code&gt;NEGOTIATE_ALGORITHMS&lt;/code&gt; (§10.4) negotiates the cryptographic algorithm family; &lt;code&gt;GET_DIGESTS&lt;/code&gt; (§10.7) fetches device-certificate digests; &lt;code&gt;GET_CERTIFICATE&lt;/code&gt; (§10.8) retrieves the per-die device-identity certificate; &lt;code&gt;CHALLENGE&lt;/code&gt; (§10.9) sends a host-supplied nonce that the device signs in its &lt;code&gt;CHALLENGE_AUTH&lt;/code&gt; response; &lt;code&gt;GET_MEASUREMENTS&lt;/code&gt; (§10.11) retrieves the device&apos;s runtime measurement vector; and &lt;code&gt;KEY_EXCHANGE&lt;/code&gt; (§10.16) establishes the session key over ECDHE on P-384 [@dmtf-spdm]. The first three messages are an ordered prerequisite: per DSP0274 §10.6, no other request is valid until the three-step negotiation completes [@dmtf-spdm].&lt;/p&gt;
&lt;p&gt;The negotiated crypto family for the H100 in CC-On mode is SHA-384 / ECDSA-P384 / AES-256-GCM. The device-identity certificate is signed with a per-die ECC-384 hardware-bound key burned into H100 fuses, and revocation runs through the NVIDIA OCSP endpoint -- the GPU-side analogue of the AMD KDS CRL path described later [@nvidia-dev-blog].&lt;/p&gt;

sequenceDiagram
    participant Req as Host CVM (Requester)
    participant Resp as NVIDIA H100 (Responder)
    Req-&amp;gt;&amp;gt;Resp: GET_VERSION (DSP0274 10.2)
    Resp--&amp;gt;&amp;gt;Req: VERSION
    Req-&amp;gt;&amp;gt;Resp: GET_CAPABILITIES (10.3)
    Resp--&amp;gt;&amp;gt;Req: CAPABILITIES
    Req-&amp;gt;&amp;gt;Resp: NEGOTIATE_ALGORITHMS (10.4)
    Resp--&amp;gt;&amp;gt;Req: ALGORITHMS (SHA-384, ECDSA-P384, AES-256-GCM)
    Req-&amp;gt;&amp;gt;Resp: GET_DIGESTS (10.7)
    Resp--&amp;gt;&amp;gt;Req: DIGESTS
    Req-&amp;gt;&amp;gt;Resp: GET_CERTIFICATE (10.8)
    Resp--&amp;gt;&amp;gt;Req: CERTIFICATE (per-die ECC-384)
    Req-&amp;gt;&amp;gt;Resp: CHALLENGE (10.9)
    Resp--&amp;gt;&amp;gt;Req: CHALLENGE_AUTH (signature over nonce)
    Req-&amp;gt;&amp;gt;Resp: GET_MEASUREMENTS (10.11)
    Resp--&amp;gt;&amp;gt;Req: MEASUREMENTS
    Req-&amp;gt;&amp;gt;Resp: KEY_EXCHANGE (10.16, ECDHE P-384)
    Resp--&amp;gt;&amp;gt;Req: KEY_EXCHANGE_RSP
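&lt;p&gt;The ordering prerequisite is easy to state as code. The sketch below checks only that a list of request names starts with the three negotiation messages in order. It is a toy validator, not an SPDM implementation: it parses no wire format, checks no signatures, and the function name is hypothetical.&lt;/p&gt;

```javascript
// The three requests that must complete, in this order, before anything else.
const NEGOTIATION = ['GET_VERSION', 'GET_CAPABILITIES', 'NEGOTIATE_ALGORITHMS'];

// Returns the index of the first out-of-order request, or -1 if the sequence
// respects the negotiation prerequisite. Only ordering is checked here.
function firstOrderingViolation(requests) {
  let negotiated = 0; // number of negotiation steps completed so far
  for (const [i, name] of requests.entries()) {
    if (negotiated === NEGOTIATION.length) continue; // negotiation finished
    if (name !== NEGOTIATION[negotiated]) return i;
    negotiated += 1;
  }
  return -1;
}

// Usage: the sequence from the diagram above, then a truncated negotiation.
const handshake = [
  'GET_VERSION', 'GET_CAPABILITIES', 'NEGOTIATE_ALGORITHMS',
  'GET_DIGESTS', 'GET_CERTIFICATE', 'CHALLENGE',
  'GET_MEASUREMENTS', 'KEY_EXCHANGE',
];
console.log(firstOrderingViolation(handshake)); // -1
console.log(firstOrderingViolation(['GET_VERSION', 'GET_DIGESTS'])); // 1
```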
&lt;p&gt;The NVIDIA-side reference verifier changed generations recently: the Python SDK in &lt;code&gt;NVIDIA/nvtrust&lt;/code&gt; [@nvidia-nvtrust] is now superseded by &lt;code&gt;nv-attestation-sdk-cpp&lt;/code&gt; (also called &quot;NV Attest&quot;), which NVIDIA describes as &quot;a new and improved version of the NVIDIA nvtrust attestation SDK, redesigned to address key limitations&quot; [@nvidia-attest-sdk-cpp]. The C++ SDK is the current canonical reference; the older Python SDK still works but is deprecated. The NVIDIA CC documentation index links both [@nvidia-cc-docs].&lt;/p&gt;
&lt;p&gt;The composed attestation -- the AMD SEV-SNP attestation report from the host CVM, joined with the NVIDIA-signed GPU attestation report from the H100 -- is consumable by Microsoft Azure Attestation as a single policy decision [@ms-maa-overview]. Secure Key Release from Azure Key Vault Premium or Azure Managed HSM then gates customer key material on that composite attestation, so the model weights or the user&apos;s prompt encryption key are released to the workload only when the entire chain (AMD silicon, AMD firmware, Microsoft hypervisor, customer guest OS, NVIDIA GPU firmware, NVIDIA hardware root of trust) verifies [@ms-maa-overview] [@ms-cc-overview].&lt;/p&gt;

flowchart TD
    A[Customer workload] --&amp;gt; B[Host CVM&lt;br /&gt;AMD SEV-SNP + RMP]
    B --&amp;gt;|SPDM session, mutual auth| C[NVIDIA H100 NVL&lt;br /&gt;CC-On mode]
    C --&amp;gt;|Signed GPU attestation| D[NVIDIA NRAS]
    B --&amp;gt;|SEV-SNP attestation report| E[Microsoft Azure Attestation]
    D --&amp;gt; E
    E --&amp;gt;|MAA JWT, x-ms claims| F[Azure Key Vault Premium&lt;br /&gt;or Managed HSM]
    F --&amp;gt;|SKR release policy check| G[Customer key released&lt;br /&gt;to workload]
    style C fill:#e6f3ff,stroke:#36c,color:#1a365d
    style E fill:#fff3e6,stroke:#c63,color:#7b341e

&lt;blockquote&gt;
&lt;p&gt;&quot;The NVIDIA H100 Tensor Core GPU is the first ever GPU to introduce support for confidential computing.&quot; -- NVIDIA Developer Blog [@nvidia-dev-blog]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two breakthroughs. Two cryptographic envelopes. Both prove something about a workload. Both are signed by hardware. Both will satisfy a JWT verifier. And underneath that surface similarity sits a genuinely different epistemological model.&lt;/p&gt;
&lt;p&gt;Apple PCC commits, &lt;em&gt;publicly and in advance&lt;/em&gt;, to the exact image hash that will be served, and refuses to serve any other. Azure CC-AI does not publicly commit in advance to the bits the verifier runs against -- it produces a JWT that says &quot;I verified what I was given.&quot; Both are cryptographic; one is structurally auditable by an independent researcher, the other is a single vendor&apos;s word.&lt;/p&gt;
&lt;p&gt;This is the aha moment to mark with both hands. &quot;Verify me&quot; is architecturally different from &quot;trust me,&quot; even when both produce a JWT.&lt;/p&gt;
&lt;p&gt;To turn that distinction into something a reader can carry into procurement, we have to actually walk the six axes. On which do these architectures genuinely differ, and on which do they differ only in implementation strategy?&lt;/p&gt;
&lt;h2&gt;6. Six Axes, One Difference In Kind&lt;/h2&gt;
&lt;p&gt;Of the six architectural axes, five are differences in &lt;em&gt;degree&lt;/em&gt; -- both PCC and Azure CC-AI do similar things differently. Exactly one is a difference in &lt;em&gt;kind&lt;/em&gt;: verifiable transparency of the production fleet. Apple ships a public append-only log of every production node image hash; no other major-cloud confidential-AI substrate ships an architectural equivalent as of mid-2026. The rest of this section walks each axis with the trade-off named, the threat model spelled out, and the primary cited.&lt;/p&gt;
&lt;h3&gt;Axis 1: Silicon control&lt;/h3&gt;
&lt;p&gt;PCC is a single-vendor stack end to end. Apple controls the SoC, the SEP, the firmware, the OS, the Swift-based inference runtime, and the bug-bounty program [@apple-pcc-blog]. Apple has not publicly named the specific chip family used in PCC nodes; firmware identifiers and independent analyses point to M2-Ultra-class silicon at launch (firmware identifier &lt;code&gt;ComputeModule14,1&lt;/code&gt; [@appledb-cm14]) with a transition to M5-class silicon during 2026 (identifier &lt;code&gt;J226C&lt;/code&gt; [@nine-to-five-mac-m5] [@winbuzzer-m5]), and the Apple Machine Learning Research introduction confirms only that the cloud-side model runs on &quot;Apple silicon servers&quot; without naming a generation [@apple-foundation-models].&lt;/p&gt;
&lt;p&gt;Azure CC-AI is a multi-vendor commodity composition by design. AMD provides the EPYC CPU and the AMD Platform Security Processor; Intel provides the Xeon CPU and the TDX module on the alternate Intel SKU family; NVIDIA provides the H100 GPU and the on-die hardware root of trust; Microsoft provides the hypervisor and MAA; the customer chooses the guest OS [@ms-cc-overview] [@ms-sku-nccads] [@nvidia-dev-blog].&lt;/p&gt;
&lt;p&gt;The trade-off is direct. Apple&apos;s single-vendor stack is operationally simpler and the trust posture is internally consistent, but the trust root collapses to Apple. Azure&apos;s multi-vendor stack spreads trust across four independent signers, but none of them sees the entire system, and the composition itself is a source of complexity.&lt;/p&gt;
&lt;h3&gt;Axis 2: Hardware root of trust&lt;/h3&gt;
&lt;p&gt;PCC anchors per-node trust in the Secure Enclave Processor on each Apple-Silicon server. The SEP is bound to an Apple-controlled certificate authority; the SEP signs the node&apos;s attestation envelope; the Apple-controlled CA&apos;s chain is the root the user&apos;s device trusts [@apple-pcc-blog] [@apple-sep-guide].&lt;/p&gt;
&lt;p&gt;Azure&apos;s hardware root of trust is structurally distributed. A vTPM exposed to the CVM provides one anchor; the AMD Platform Security Processor signs SEV-SNP attestation reports with a per-chip &lt;strong&gt;Versioned Chip Endorsement Key (VCEK)&lt;/strong&gt; [@amd-kds] [@amd-sev-snp-wp]; the NVIDIA on-die RoT signs the GPU attestation; MAA operates as the verifier-of-record that joins these into a single decision artefact [@ms-maa-overview].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Versioned Chip Endorsement Key (VCEK).&lt;/strong&gt; A per-die ECDSA signing key derived inside the AMD Platform Security Processor (PSP) from a chip-specific secret fused into the silicon at manufacture. The VCEK signs SEV-SNP attestation reports; the certificate chain runs &lt;code&gt;VCEK -&amp;gt; AMD SEV signing key (ASK) -&amp;gt; AMD Root Key (ARK)&lt;/code&gt;, with the ARK pinned out-of-band against AMD&apos;s published fingerprint and the per-chip VCEK fetched from the AMD Key Distribution Service (KDS) at &lt;code&gt;kdsintf.amd.com&lt;/code&gt; keyed on the chip ID plus the four TCB-version-vector &lt;code&gt;*Spl&lt;/code&gt; parameters (&lt;code&gt;blSpl&lt;/code&gt;, &lt;code&gt;teeSpl&lt;/code&gt;, &lt;code&gt;snpSpl&lt;/code&gt;, &lt;code&gt;ucodeSpl&lt;/code&gt;) parsed out of the 1184-byte attestation report [@amd-kds] [@amd-sev-snp-wp].&lt;/p&gt;
&lt;p&gt;The chain itself is short and walkable. The ARK and ASK PEMs are served as a single bundle from the KDS endpoint &lt;code&gt;/vcek/v1/&amp;lt;family&amp;gt;/cert_chain&lt;/code&gt; on host &lt;code&gt;kdsintf.amd.com&lt;/code&gt; (returning, on the Milan family, an &lt;code&gt;ARK-Milan&lt;/code&gt; and &lt;code&gt;SEV-Milan&lt;/code&gt; certificate pair issued from AMD Engineering&apos;s Santa Clara CA with 25-year validity dated 2020-10-22 [@amd-kds]). The per-die VCEK is served from &lt;code&gt;/vcek/v1/&amp;lt;family&amp;gt;/&amp;lt;chip_id&amp;gt;?blSpl=..&amp;amp;teeSpl=..&amp;amp;snpSpl=..&amp;amp;ucodeSpl=..&lt;/code&gt; on the same KDS host, where the chip ID and the four &lt;code&gt;*Spl&lt;/code&gt; TCB-version-vector query parameters are parsed out of the SEV-SNP attestation report itself.&lt;/p&gt;
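&lt;p&gt;The two KDS paths above can be assembled mechanically once the report has been parsed. The sketch below builds both URLs. It assumes the chip ID is passed hex-encoded and that &lt;code&gt;Milan&lt;/code&gt; is an accepted family string; the chip ID and TCB numbers are placeholders rather than values from a real report, and the function names are hypothetical. It constructs URLs only and performs no network request.&lt;/p&gt;

```javascript
const KDS_BASE = 'https://kdsintf.amd.com/vcek/v1/';
const TCB_PARAMS = ['blSpl', 'teeSpl', 'snpSpl', 'ucodeSpl'];

// URL of the ARK + ASK bundle for one CPU family.
function certChainUrl(family) {
  return KDS_BASE + family + '/cert_chain';
}

// URL of the per-die VCEK, keyed on the chip ID and the four TCB values.
function vcekUrl(family, chipIdHex, tcb) {
  const url = new URL(KDS_BASE + family + '/' + chipIdHex);
  for (const name of TCB_PARAMS) {
    if (!Number.isInteger(tcb[name])) {
      throw new TypeError('missing TCB field: ' + name);
    }
    url.searchParams.set(name, String(tcb[name]));
  }
  return url.toString();
}

// Placeholder inputs: a made-up 64-byte chip ID and made-up TCB values.
const placeholderChipId = 'ab'.repeat(64);
const placeholderTcb = { blSpl: 3, teeSpl: 0, snpSpl: 8, ucodeSpl: 115 };
console.log(certChainUrl('Milan'));
console.log(vcekUrl('Milan', placeholderChipId, placeholderTcb));
```

&lt;p&gt;Rejecting a report with a missing TCB field before the URL is built keeps a malformed report from silently fetching a certificate for the wrong firmware level.&lt;/p&gt;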
&lt;p&gt;A relying party that wants to verify a SEV-SNP attestation &lt;em&gt;without&lt;/em&gt; trusting MAA fetches the chain from KDS, validates the chain against an out-of-band-pinned ARK fingerprint, and checks that the chip ID and TCB version in the report match the chain. The canonical open-source CLI for this is &lt;code&gt;virtee/snpguest&lt;/code&gt; [@virtee-snpguest], the active successor to the deprecated &lt;code&gt;AMDESE/sev-tool&lt;/code&gt; [@amd-sev-tool].&lt;/p&gt;
&lt;h3&gt;Axis 3: Attestation surface&lt;/h3&gt;
&lt;p&gt;PCC produces a per-device attestation envelope cross-checked against the public Transparency Log. The user&apos;s device does not just verify the SEP signature; it verifies that the image hash named in the envelope is included in the public log. If the hash is not in the log, the device refuses to forward the request [@apple-pcc-blog] [@apple-pcc-release-transparency].&lt;/p&gt;
&lt;p&gt;Azure produces an MAA-issued JWT. The customer&apos;s relying party parses the JWT and matches claims. The MAA overview documents the SEV-SNP-specific claims and the platform-vs-guest distinction explicitly [@ms-maa-overview]. For confidential GPU workloads, NVIDIA&apos;s NRAS claims about the H100 are joined into the same JWT.&lt;/p&gt;
&lt;p&gt;The procurement-grade payoff: a customer can verify SEV-SNP attestation &lt;em&gt;without&lt;/em&gt; trusting MAA by running the &lt;code&gt;snpguest&lt;/code&gt; workflow directly against the AMD KDS [@virtee-snpguest] [@amd-kds]. Or they can trust MAA&apos;s JWT and validate it against the MAA JWKS, trading one trust anchor (AMD&apos;s ARK fingerprint) for another (Microsoft&apos;s JWKS). Both paths are real; most production customers deploy the MAA path because it is operationally simpler, but the &lt;code&gt;snpguest&lt;/code&gt;-based path is what unlocks &quot;we do not have to trust MAA&quot; for a procurement audit.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-javascript&quot;&gt;// Demonstrates the structure of an MAA JWT for an AMD SEV-SNP confidential VM.
// In production the JWT is signed by an MAA tenant key and must be verified
// against the tenant&apos;s JWKS endpoint. This example only decodes a sample
// payload; it performs no signature verification.

// Sample x-ms claims. The launch measurement is a truncated sample value.
const sampleClaims = {
  &apos;x-ms-isolation-tee&apos;: &apos;sevsnpvm&apos;,
  &apos;x-ms-compliance-status&apos;: &apos;azure-compliant-cvm&apos;,
  &apos;x-ms-sevsnpvm-guestsvn&apos;: 8,
  &apos;x-ms-sevsnpvm-launchmeasurement&apos;: &apos;xDk...&apos;,
  &apos;x-ms-runtime&apos;: &apos;e30=&apos;
};

// JSON -&amp;gt; base64 -&amp;gt; base64url, dropping the padding as JWTs do
function encodeJwtPart(value) {
  return btoa(JSON.stringify(value))
    .replace(/\+/g, &apos;-&apos;).replace(/\//g, &apos;_&apos;).replace(/=+$/, &apos;&apos;);
}

const sampleMaaJwt = [
  // header (base64url)
  &apos;eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9&apos;,
  // payload (base64url) -- sample x-ms claims
  encodeJwtPart(sampleClaims),
  // signature placeholder
  &apos;signature&apos;
].join(&apos;.&apos;);

function decodeJwtPayload(jwt) {
  const [, payload] = jwt.split(&apos;.&apos;);
  // base64url -&amp;gt; base64, restoring the stripped padding
  const b64 = payload.replace(/-/g, &apos;+&apos;).replace(/_/g, &apos;/&apos;);
  const padded = b64 + &apos;=&apos;.repeat((4 - (b64.length % 4)) % 4);
  return JSON.parse(atob(padded));
}

const payload = decodeJwtPayload(sampleMaaJwt);
console.log(&apos;TEE family:        &apos;, payload[&apos;x-ms-isolation-tee&apos;]);
console.log(&apos;Compliance status: &apos;, payload[&apos;x-ms-compliance-status&apos;]);
console.log(&apos;Guest SVN:         &apos;, payload[&apos;x-ms-sevsnpvm-guestsvn&apos;]);
console.log(&apos;Launch measurement:&apos;, payload[&apos;x-ms-sevsnpvm-launchmeasurement&apos;]);

// A Secure Key Release policy would gate key release on claims like:
//   &quot;x-ms-isolation-tee&quot; == &quot;sevsnpvm&quot;
//   &quot;x-ms-compliance-status&quot; == &quot;azure-compliant-cvm&quot;
//   &quot;x-ms-sevsnpvm-guestsvn&quot; &amp;gt;= 8
// matched against the MAA-issued JWT.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The MAA path hides KDS fetching, certificate-chain validation, and TCB-rollback policy enforcement from the relying party by emitting a JWT whose &lt;code&gt;x-ms-attestation-type&lt;/code&gt; claim is &lt;code&gt;sevsnpvm&lt;/code&gt; and &lt;code&gt;x-ms-compliance-status&lt;/code&gt; claim is &lt;code&gt;azure-compliant-cvm&lt;/code&gt;. The relying party then validates against the MAA JWKS instead of pinning the AMD ARK fingerprint. Operationally simpler, but it trades trust in AMD for trust in MAA. A customer that wants a procurement-defensible &quot;we do not have to trust MAA&quot; posture runs the six-step &lt;code&gt;snpguest&lt;/code&gt; Regular Attestation Workflow directly against the AMD KDS [@virtee-snpguest].&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;snpguest verify certs&lt;/code&gt; step validates the VCEK -&amp;gt; ASK -&amp;gt; ARK chain but cannot detect a substituted ARK; the ARK fingerprint must be pinned out-of-band against AMD&apos;s published value before the chain is trusted. The other architectural delta: &lt;code&gt;snpguest verify attestation&lt;/code&gt; checks the TCB version vector in the attestation report against the version baked into the VCEK certificate, surfacing TCB rollback. Once both checks pass, the relying party has cryptographic evidence the workload is running on a specific physical AMD CPU at a specific firmware level -- without ever talking to Microsoft.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# The six-step Regular Attestation Workflow from the virtee/snpguest README.
# Steps 2-4 each map to a wire-level KDS GET. Step 1 talks to the SNP guest
# firmware device locally, and steps 5-6 run locally on the fetched files.
# Run this from inside an SEV-SNP guest VM on Azure (e.g. on a DCasv5 SKU),
# not from the host.

# Step 1: ask the guest firmware for a fresh attestation report bound to a
# 64-byte nonce. The report includes chip_id and the four *Spl TCB vector
# fields the next steps will use to fetch the per-die VCEK.
snpguest report attestation-report.bin request-data.bin --random

# Step 2: fetch the ARK + ASK PEM bundle for this CPU family from AMD KDS.
# Endpoint: GET /vcek/v1/&amp;lt;family&amp;gt;/cert_chain on host kdsintf.amd.com
snpguest fetch ca pem milan ./certs

# Step 3: fetch the per-die VCEK certificate from AMD KDS, keyed on chip_id
# and the four *Spl values parsed out of the attestation report.
# Endpoint: GET /vcek/v1/&amp;lt;family&amp;gt;/&amp;lt;chip_id&amp;gt;?blSpl=..&amp;amp;... on the KDS host
snpguest fetch vcek pem milan ./certs attestation-report.bin

# Step 4: fetch the current AMD CRL so revoked VCEKs can be rejected.
# Endpoint: GET /vcek/v1/&amp;lt;family&amp;gt;/crl on the KDS host
snpguest fetch crl pem milan ./certs

# Step 5: validate the chain locally (VCEK -&amp;gt; ASK -&amp;gt; ARK).
# IMPORTANT: snpguest cannot detect a substituted ARK. Before running this
# command, pin the ARK fingerprint out-of-band against AMD&apos;s published value.
snpguest verify certs ./certs

# Step 6: verify the attestation signature with the validated VCEK and check
# the TCB version vector in the report against the VCEK certificate.
# This is the step that surfaces TCB rollback.
snpguest verify attestation ./certs attestation-report.bin
&lt;/code&gt;&lt;/pre&gt;
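&lt;p&gt;The out-of-band pin that step 5 depends on is a fingerprint comparison. The sketch below hashes the DER bytes of a PEM certificate and compares the result with a pinned value supplied by the caller. It assumes the pin is a SHA-256 hex digest; no real AMD fingerprint appears here, the test input is a stand-in byte string rather than a certificate, and the function names are hypothetical.&lt;/p&gt;

```javascript
const { createHash } = require('node:crypto');

// SHA-256 over the DER bytes inside the first PEM certificate block.
function pemSha256(pem) {
  const match = /-----BEGIN CERTIFICATE-----([\s\S]*?)-----END CERTIFICATE-----/.exec(pem);
  if (match === null) throw new Error('no PEM certificate block found');
  const der = Buffer.from(match[1].replace(/\s+/g, ''), 'base64');
  return createHash('sha256').update(der).digest('hex');
}

// pinnedHex must come from a channel independent of the KDS download.
// Colons and upper-case hex are tolerated so common fingerprint formats match.
function arkMatchesPin(arkPem, pinnedHex) {
  return pemSha256(arkPem) === pinnedHex.toLowerCase().replace(/:/g, '');
}

// Stand-in bytes, not a real certificate: only the pinning logic is exercised.
const standInBytes = Buffer.from('not a real ARK');
const standInPem = '-----BEGIN CERTIFICATE-----\n'
  + standInBytes.toString('base64')
  + '\n-----END CERTIFICATE-----\n';
const standInPin = createHash('sha256').update(standInBytes).digest('hex');
console.log(arkMatchesPin(standInPem, standInPin)); // true
console.log(arkMatchesPin(standInPem, '00'.repeat(32))); // false
```

&lt;p&gt;The comparison is only as good as the source of the pin. A pin copied from the same download that produced the certificate proves nothing.&lt;/p&gt;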
&lt;h3&gt;Axis 4: Key release and state model&lt;/h3&gt;
&lt;p&gt;This is where the architectural philosophies diverge most visibly. PCC nodes are &lt;em&gt;stateless by design&lt;/em&gt;. There is no customer key material on the node, no key release ceremony, no HSM gating. Apple&apos;s first core requirement names this verbatim: &quot;stateless computation on personal user data&quot; [@apple-pcc-blog]. State that needs to persist across requests does so on the user&apos;s device, not on the PCC fleet.&lt;/p&gt;
&lt;p&gt;Azure treats stateful, customer-managed keys as a first-class architectural primitive. Secure Key Release from Azure Key Vault Premium or Azure Managed HSM gates key release on an MAA-issued JWT whose claims must match the release policy attached to the encrypted key [@ms-cc-overview]. The Microsoft reference confidential-LLM tutorial walks the SKR-from-AKV-Premium flow end to end on a &lt;code&gt;Standard_NCC40ads_H100_v5&lt;/code&gt; SKU [@ms-workshop-llm]. Customer-managed keys, customer-controlled HSMs, and customer audit logs are how regulated buyers reason about confidential workloads, and Azure&apos;s design accommodates that workflow directly.&lt;/p&gt;

&lt;p&gt;A minimal SKR release policy is a JSON document referencing MAA-issued claims. A simplified example for an SEV-SNP CVM target:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;{
  &quot;version&quot;: &quot;1.0.0&quot;,
  &quot;anyOf&quot;: [
    {
      &quot;authority&quot;: &quot;&amp;lt;your MAA tenant URL&amp;gt;&quot;,
      &quot;allOf&quot;: [
        { &quot;claim&quot;: &quot;x-ms-isolation-tee&quot;, &quot;equals&quot;: &quot;sevsnpvm&quot; },
        { &quot;claim&quot;: &quot;x-ms-compliance-status&quot;, &quot;equals&quot;: &quot;azure-compliant-cvm&quot; },
        { &quot;claim&quot;: &quot;x-ms-sevsnpvm-guestsvn&quot;, &quot;greater-than-or-equals&quot;: 8 }
      ]
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;At unwrap time the HSM evaluates the policy against the JWT the workload presents. Only if every condition is met is the key material released. The policy is bound to the key at creation time and, once marked immutable, cannot be modified after the fact without rewrapping under a fresh policy.&lt;/p&gt;
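&lt;p&gt;The evaluation logic implied by the simplified policy can be sketched in a few lines. The evaluator below supports only the two operators that appear in the example above and fails closed on anything else. It is a toy model for reasoning about policies, not Key Vault&apos;s evaluator; the authority URL is a placeholder and the function names are hypothetical.&lt;/p&gt;

```javascript
// Evaluate one condition from the simplified policy against decoded claims.
function conditionHolds(condition, claims) {
  const actual = claims[condition.claim];
  if ('equals' in condition) return actual === condition.equals;
  if ('greater-than-or-equals' in condition) {
    return typeof actual === 'number'
      ? actual >= condition['greater-than-or-equals']
      : false;
  }
  return false; // unknown operator: fail closed
}

// anyOf: some authority block matches the issuer; allOf: every condition holds.
function policyAllowsRelease(policy, issuer, claims) {
  return policy.anyOf.some((block) =>
    block.authority === issuer
      ? block.allOf.every((c) => conditionHolds(c, claims))
      : false);
}

const issuer = 'https://maa.example.invalid'; // placeholder tenant URL
const policy = {
  version: '1.0.0',
  anyOf: [{
    authority: issuer,
    allOf: [
      { claim: 'x-ms-isolation-tee', equals: 'sevsnpvm' },
      { claim: 'x-ms-compliance-status', equals: 'azure-compliant-cvm' },
      { claim: 'x-ms-sevsnpvm-guestsvn', 'greater-than-or-equals': 8 },
    ],
  }],
};
const claims = {
  'x-ms-isolation-tee': 'sevsnpvm',
  'x-ms-compliance-status': 'azure-compliant-cvm',
  'x-ms-sevsnpvm-guestsvn': 8,
};
console.log(policyAllowsRelease(policy, issuer, claims)); // true
console.log(policyAllowsRelease(policy, issuer, { ...claims, 'x-ms-sevsnpvm-guestsvn': 7 })); // false
```

&lt;p&gt;Failing closed on unknown operators and missing claims is the important design choice: a policy typo should block release rather than widen it.&lt;/p&gt;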
&lt;h3&gt;Axis 5: GPU TEE&lt;/h3&gt;
&lt;p&gt;PCC uses Apple GPUs that are integrated on the same SoC as the CPU and SEP. By construction they sit inside the same SEP-rooted attestation envelope -- there is no separate cross-vendor PCIe attestation handshake because there is no PCIe handshake to begin with [@apple-pcc-blog].&lt;/p&gt;
&lt;p&gt;Azure uses NVIDIA H100 NVL GPUs in CC-On mode, with the architecture described above: on-die RoT, SPDM session, encrypted bounce buffer, NRAS-signed attestation report joined to the SEV-SNP CVM attestation through MAA [@ms-sku-nccads] [@nvidia-dev-blog]. The NVIDIA H100 exposes &lt;em&gt;three&lt;/em&gt; confidential-computing modes: &lt;strong&gt;CC-Off&lt;/strong&gt; (the normal non-confidential default; no isolation, no encryption); &lt;strong&gt;CC-On&lt;/strong&gt; (full confidential mode, the only mode that should be used in production); and &lt;strong&gt;CC-DevTools&lt;/strong&gt; (per NVIDIA&apos;s developer blog, &quot;a partial CC mode that will match the workflows of CC-On mode, but with security protections disabled and performance counters enabled&quot; [@nvidia-dev-blog]) [@nvidia-cc-docs]. The three modes share a bring-up surface, but only CC-On enforces the full isolation contract.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; NVIDIA&apos;s documentation is explicit that CC-DevTools weakens isolation specifically so that profiling and debugging tools that need performance-counter access can work [@nvidia-cc-docs]. Production confidential-AI workloads must run in CC-On. Verification step for relying parties: the GPU attestation report includes a mode field; the MAA JWT and the NRAS attestation that compose into it both surface this. A release policy that does not check the GPU mode field can release customer key material to a workload running on a partially-protected GPU. Treat CC-DevTools as a bring-up state, not a deployment state.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;AMD&apos;s MI300X GPU ships as compute across multiple clouds (Oracle OCI, DigitalOcean, Vultr, Crusoe, TensorWave, Hot Aisle, Seeweb [@mi300x-cloud-list]) but has &lt;em&gt;no&lt;/em&gt; production-equivalent confidential-GPU mode at GA on a major commercial cloud as of mid-2026. PCIe TDISP and SEV-TIO Linux support is landing in 2025-2026 kernels, but the GA gap is the load-bearing fact for any procurement that prefers AMD over NVIDIA at the accelerator tier. Azure&apos;s confidential GPU offering is H100-only at GA.&lt;/p&gt;
&lt;p&gt;A subtle and procurement-critical detail: Microsoft Azure Attestation does not directly attest the GPU. The MAA overview documents the SEV-SNP path and the platform-vs-guest distinction, but the GPU attestation is produced and signed by NVIDIA NRAS, not MAA [@ms-maa-overview] [@nvidia-dev-blog]. The composed MAA JWT &lt;em&gt;carries&lt;/em&gt; the NVIDIA-signed GPU attestation as a nested claim. A customer&apos;s relying party that wants to verify the GPU attestation against NVIDIA&apos;s hardware root of trust must validate the NRAS signature, not the MAA signature, on that nested portion.&lt;/p&gt;
&lt;p&gt;This is the &lt;strong&gt;double attestation&lt;/strong&gt; pattern: the SEV-SNP CVM attestation is signed by AMD VCEK; the H100 GPU attestation is signed by NVIDIA&apos;s on-die root of trust; MAA composes them into one JWT, but the two signatures must be verified against two different roots. The Azure &lt;code&gt;confidential-computing-cvm-guest-attestation&lt;/code&gt; and &lt;code&gt;az-cgpu-onboarding&lt;/code&gt; repositories provide the reference patterns for both halves of this verification [@az-cgpu-onboarding].&lt;/p&gt;
&lt;p&gt;The double attestation is one place the &quot;MAA is the verifier of record&quot; framing oversimplifies. MAA is the verifier of record for the &lt;em&gt;composition&lt;/em&gt; -- but the underlying signatures still come from AMD and NVIDIA. A relying party that wants to refuse a workload running on a TCB-rolled-back AMD CPU plus a CC-DevTools-mode H100 needs to check the AMD TCB version vector against a TCB-version policy (snpguest can do this) and the NVIDIA GPU mode field against a &quot;CC-On only&quot; policy. MAA can be configured to enforce both of these in the release policy, but the customer has to actively write the policy; the defaults will not catch a CC-DevTools-mode H100.&lt;/p&gt;
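&lt;p&gt;The refusal rule in that paragraph has two independent halves, and a relying party can express it directly. The sketch below takes already-verified evidence and returns the reasons to refuse. The field names &lt;code&gt;cpuTcb&lt;/code&gt;, &lt;code&gt;gpuMode&lt;/code&gt;, and &lt;code&gt;minTcb&lt;/code&gt; are hypothetical and do not correspond to real MAA or NRAS claim names; the TCB numbers are placeholders. Signature verification against the AMD and NVIDIA roots is assumed to have happened before this function runs.&lt;/p&gt;

```javascript
const TCB_FIELDS = ['blSpl', 'teeSpl', 'snpSpl', 'ucodeSpl'];

// Returns a list of reasons to refuse; an empty list means both halves pass.
function refusalReasons(evidence, policy) {
  const reasons = [];
  for (const field of TCB_FIELDS) {
    // A missing field compares as false and is refused like a rollback.
    if (!(evidence.cpuTcb[field] >= policy.minTcb[field])) {
      reasons.push('CPU TCB below floor: ' + field);
    }
  }
  if (evidence.gpuMode !== 'CC-On') {
    reasons.push('GPU not in CC-On mode: ' + evidence.gpuMode);
  }
  return reasons;
}

// Placeholder floor and two placeholder evidence sets.
const floor = { minTcb: { blSpl: 3, teeSpl: 0, snpSpl: 8, ucodeSpl: 115 } };
const good = { cpuTcb: { blSpl: 3, teeSpl: 0, snpSpl: 8, ucodeSpl: 115 }, gpuMode: 'CC-On' };
const bad = { cpuTcb: { blSpl: 3, teeSpl: 0, snpSpl: 7, ucodeSpl: 115 }, gpuMode: 'CC-DevTools' };
console.log(refusalReasons(good, floor)); // []
console.log(refusalReasons(bad, floor).length); // 2
```

&lt;p&gt;Returning every reason rather than stopping at the first makes the audit trail more useful when both the CPU and the GPU half fail at once.&lt;/p&gt;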
&lt;p&gt;Performance overhead is small. Zhu, Yin, Deng, Almeida, and Zhou (Phala / Fudan / io.net), in arXiv 2409.03992 (v4, November 5, 2024), benchmarked H100 CC-On on vLLM v0.5.4 with the ShareGPT dataset on Llama-3.1-8B-Instruct and report that &quot;for the majority of typical LLM queries, the overhead remains below 7%, with larger models and longer sequences experiencing nearly zero overhead&quot; [@phala-benchmark]. The dominant overhead source is the PCIe encrypted bounce buffer, capped at the ~4 GB/s CPU-encryption ceiling discussed in §5(b); large models amortise that cost across many tokens.&lt;/p&gt;
&lt;p&gt;The &quot;below 7%&quot; overhead number is benchmarked on a specific stack (vLLM v0.5.4, ShareGPT dataset, Llama-3.1-8B-Instruct) and depends on sequence length and batch size in non-trivial ways [@phala-benchmark]. Smaller models with short prompts and high batch turnover spend a larger fraction of wall-clock time on the bounce-buffer crossings; larger models with long context windows amortise that cost. Quoting &quot;below 7%&quot; without the workload qualification is misleading.&lt;/p&gt;
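&lt;p&gt;The workload dependence can be made concrete with a toy model. The sketch below assumes the only added cost is the time bytes spend crossing the encrypted bounce buffer at the ~4 GB/s ceiling. The byte counts and compute times are hypothetical inputs, not measurements from the benchmark paper.&lt;/p&gt;

```python
def relative_overhead(bytes_crossed, compute_seconds, ceiling_bytes_per_s=4e9):
    """Toy model: extra time spent in the encrypted bounce buffer, as a
    fraction of the unprotected compute time for the same request."""
    transfer_seconds = bytes_crossed / ceiling_bytes_per_s
    return transfer_seconds / compute_seconds

# Hypothetical workloads: a short request with little compute per byte moved,
# and a long request that amortises its transfers over much more compute.
short_request = relative_overhead(bytes_crossed=2e6, compute_seconds=0.02)
long_request = relative_overhead(bytes_crossed=8e6, compute_seconds=2.0)

print(f"short request: {short_request:.2%}")
print(f"long request:  {long_request:.2%}")
```

&lt;p&gt;Under these assumptions the short request pays the larger relative overhead even though it moves fewer bytes, which is the qualitative pattern the benchmark describes.&lt;/p&gt;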
&lt;h3&gt;Axis 6: Network anonymization&lt;/h3&gt;
&lt;p&gt;This is an axis where the two architectures differ in kind.&lt;/p&gt;
&lt;p&gt;PCC routes client requests through a third-party-operated &lt;strong&gt;Oblivious HTTP&lt;/strong&gt; relay -- RFC 9458 [@ietf-rfc9458] -- that strips the client IP address before the request reaches the PCC cluster. This implements one of Apple&apos;s five named core requirements, non-targetability: an attacker who compromises the PCC fleet cannot single out a specific user&apos;s traffic because the fleet does not know which IP issued which request [@apple-pcc-blog]. Apple&apos;s Target Diffusion design layers RSA Blind Signatures (RFC 9474) [@ietf-rfc9474] on top to issue single-use credentials, so even the relay cannot link two requests from the same client.&lt;/p&gt;
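&lt;p&gt;The unlinkability property comes from the blinding step. The sketch below is textbook RSA blinding with toy-sized primes, shown only to illustrate the algebra. It is not the RSABSSA construction specified in RFC 9474 (which adds PSS message encoding) and is not Apple&apos;s implementation; the key, the token, and the blinding factor are all hypothetical.&lt;/p&gt;

```python
import hashlib
from math import gcd

# Toy RSA key (textbook-sized; real deployments use 2048-bit or larger keys).
p, q = 61, 53
n = p * q                # public modulus
e = 17                   # public exponent
d = 2753                 # private exponent: (e * d) % ((p - 1) * (q - 1)) == 1

def message_to_int(token):
    # Stand-in for message encoding; RFC 9474 uses a PSS encoding instead.
    return int.from_bytes(hashlib.sha256(token).digest(), "big") % n

def blind(m, r):
    # Client: hide m behind the blinding factor r before sending to the issuer.
    assert gcd(r, n) == 1
    return (m * pow(r, e, n)) % n

def sign(blinded):
    # Issuer: signs the blinded value without learning m.
    return pow(blinded, d, n)

def unblind(blind_sig, r):
    # Client: remove the blinding factor to get an ordinary RSA signature on m.
    return (blind_sig * pow(r, -1, n)) % n

def verify(m, sig):
    return pow(sig, e, n) == m

m = message_to_int(b"single-use-credential")
r = 7  # hypothetical blinding factor; must be random and secret in practice
sig = unblind(sign(blind(m, r)), r)
print(verify(m, sig))
```

&lt;p&gt;The issuer only ever sees the blinded value, so it cannot later match the unblinded signature to the signing request that produced it.&lt;/p&gt;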
&lt;p&gt;Azure has no equivalent operator-level anonymization layer. This is intentional in Azure&apos;s design: an enterprise customer who knows that traffic &lt;em&gt;originates from their own employees&lt;/em&gt; generally does not want to anonymize that traffic from their own audit logs. But it is an axis the two architectures differ on &lt;em&gt;in kind&lt;/em&gt; rather than in degree, and worth naming as such -- a procurement reader who needs operator-level anonymization will not get it from Azure CC-AI without building it themselves.&lt;/p&gt;
&lt;h3&gt;The six axes, side by side&lt;/h3&gt;
&lt;p&gt;The following table consolidates the comparison.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;Apple Private Cloud Compute&lt;/th&gt;
&lt;th&gt;Azure Confidential AI&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Silicon control&lt;/td&gt;
&lt;td&gt;Single-vendor end-to-end (Apple SoC, SEP, firmware, OS, runtime) [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;Multi-vendor commodity composition (AMD EPYC, Intel Xeon, NVIDIA H100, Microsoft hypervisor) [@ms-cc-overview] [@ms-sku-nccads]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hardware root of trust&lt;/td&gt;
&lt;td&gt;Per-node SEP bound to Apple-controlled CA [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;vTPM + AMD PSP / VCEK + NVIDIA on-die RoT + MAA as verifier-of-record [@ms-maa-overview] [@amd-kds]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Attestation surface&lt;/td&gt;
&lt;td&gt;Per-device envelope cross-checked against public Transparency Log [@apple-pcc-release-transparency]&lt;/td&gt;
&lt;td&gt;MAA-issued JWT with documented &lt;code&gt;x-ms-*&lt;/code&gt; claims [@ms-maa-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Key release / state&lt;/td&gt;
&lt;td&gt;Stateless nodes; no customer keys; no release ceremony [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;SKR from AKV Premium / Managed HSM gated on MAA JWT [@ms-cc-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPU TEE&lt;/td&gt;
&lt;td&gt;Integrated Apple GPU in same SEP-rooted envelope [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;NVIDIA H100 CC-On + SPDM + NRAS joined to MAA [@nvidia-dev-blog] [@ms-sku-nccads]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network anonymization&lt;/td&gt;
&lt;td&gt;Third-party OHTTP relay strips client IP [@ietf-rfc9458] [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;No equivalent operator-level anonymization layer&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

&lt;pre class=&quot;mermaid&quot;&gt;
flowchart LR
    subgraph PCC[&quot;Apple PCC stack&quot;]
        P1[Apple SoC + integrated GPU]
        P2[SEP per node&lt;br /&gt;Apple-controlled CA]
        P3[Transparency Log&lt;br /&gt;append-only public]
        P4[Stateless node&lt;br /&gt;no customer keys]
        P5[OHTTP relay&lt;br /&gt;third party]
    end
    subgraph AZ[&quot;Azure CC-AI stack&quot;]
        A1[AMD EPYC + NVIDIA H100&lt;br /&gt;multi-vendor]
        A2[AMD PSP + vTPM&lt;br /&gt;NVIDIA on-die RoT]
        A3[MAA JWT&lt;br /&gt;x-ms claims]
        A4[SKR from AKV Premium&lt;br /&gt;customer-managed keys]
        A5[no operator-level&lt;br /&gt;anonymization layer]
    end
&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Verifiable transparency:&lt;/strong&gt; An architectural property whereby every production software image actually serving customer requests is committed in advance to a public, append-only log accessible to any third party. The property requires both that the cryptographic log be publicly auditable (a Certificate-Transparency-style Merkle tree, for example) and that the system refuse to serve requests against images not present in the log. Apple Private Cloud Compute ships verifiable transparency as a first-class architectural primitive; no other major-cloud confidential-AI substrate ships an architectural equivalent as of mid-2026 [@apple-pcc-blog] [@apple-pcc-release-transparency].&lt;/p&gt;
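&lt;p&gt;A Certificate-Transparency-style log makes &quot;is this image in the log?&quot; checkable with a short Merkle inclusion proof. The sketch below follows the RFC 6962 hashing convention (a 0x00 prefix for leaves, 0x01 for interior nodes) but simplifies the proof format by carrying an explicit left/right flag with each sibling. The image names are hypothetical, and this is not Apple&apos;s log format.&lt;/p&gt;

```python
import hashlib

def leaf_hash(data):
    return hashlib.sha256(b"\x00" + data).digest()

def node_hash(left, right):
    return hashlib.sha256(b"\x01" + left + right).digest()

def split_point(n):
    # Largest power of two strictly smaller than n (requires n >= 2).
    return 2 ** ((n - 1).bit_length() - 1)

def merkle_root(leaves):
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split_point(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))

def inclusion_proof(leaves, index):
    # List of (sibling_hash, sibling_is_left), ordered from the leaf up to the root.
    if len(leaves) == 1:
        return []
    k = split_point(len(leaves))
    if index >= k:
        return inclusion_proof(leaves[k:], index - k) + [(merkle_root(leaves[:k]), True)]
    return inclusion_proof(leaves[:k], index) + [(merkle_root(leaves[k:]), False)]

def verify_inclusion(leaf, proof, root):
    # Recompute the path to the root; only the leaf, the proof, and the root are needed.
    h = leaf_hash(leaf)
    for sibling, sibling_is_left in proof:
        h = node_hash(sibling, h) if sibling_is_left else node_hash(h, sibling)
    return h == root

# Hypothetical log of five published image identifiers.
log = [b"image-a", b"image-b", b"image-c", b"image-d", b"image-e"]
root = merkle_root(log)
proof = inclusion_proof(log, 2)

print(verify_inclusion(b"image-c", proof, root))   # image present in the log
print(verify_inclusion(b"image-x", proof, root))   # image never published
```

&lt;p&gt;A client that holds only the signed root can run &lt;code&gt;verify_inclusion&lt;/code&gt; and refuse to talk to any node whose image fails the check, which is the &quot;refuse to serve&quot; half of the property.&lt;/p&gt;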
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The two architectures differ in &lt;em&gt;degree&lt;/em&gt; on five axes: silicon control, hardware root of trust, attestation surface, key release, and GPU TEE. On the sixth -- verifiable transparency of the production fleet -- they differ in &lt;em&gt;kind&lt;/em&gt;. Apple&apos;s Transparency Log is not a slightly-better MAA. It is an architectural primitive Microsoft does not ship.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A procurement assumption that PCC and Azure differ only in vendor preference misses the real architectural point. PCC&apos;s trust root collapses to Apple alone. Azure&apos;s trust root is spread across AMD, Intel, NVIDIA, and Microsoft as four independent signers. A single-vendor compromise on Azure (a leaked AMD VCEK signing key, an NVIDIA firmware bug, an MAA outage) does not collapse the whole stack the way an Apple-CA compromise would collapse PCC. This is a different security posture, not just a different brand. Whether trust diffusion is more valuable than verifiable transparency depends on the regulatory and threat-model context.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Six axes, two architectures, one axis where the divergence is in kind. But Apple PCC and Microsoft Azure are not the only games in town. Where do AWS Nitro Enclaves and Google Cloud Confidential Space fit on the same six axes?&lt;/p&gt;
&lt;h2&gt;7. Beyond the Two Headliners&lt;/h2&gt;
&lt;p&gt;If verifiable transparency is the architectural difference, the obvious question is why AWS and Google have not just shipped a Transparency Log too. The short answer is that the three other production substrates each chose a different epistemic model, and shifting any one of them to PCC&apos;s model would require rebuilding the trust root from scratch.&lt;/p&gt;
&lt;h3&gt;AWS Nitro Enclaves&lt;/h3&gt;
&lt;p&gt;AWS Nitro Enclaves does not anchor in a CPU-vendor TEE at all. Trust is rooted in AWS-as-signer through the Nitro Hypervisor and the Nitro Security Chip [@aws-nitro-hw]. The Nitro System &quot;provides enhanced security that continuously monitors, protects, and verifies the instance hardware and firmware&quot; and offloads virtualization resources to dedicated hardware [@aws-nitro-hw]. A Nitro Enclave is created from a parent EC2 instance and is &quot;isolated from the parent EC2 instance through the Nitro Hypervisor&quot;; per the AWS documentation verbatim, &quot;the Nitro Hypervisor ensures that the parent instance has no access to the isolated vCPUs and memory of the enclave&quot; [@aws-nitro-enclave].&lt;/p&gt;
&lt;p&gt;The trust model is different in kind from SGX, SEV, or TDX. Attestation is rooted in AWS&apos;s signing key, not in a CPU-vendor key. The Nitro architecture is processor-agnostic over Intel, AMD, and AWS Graviton, which is a different posture again -- the enclave&apos;s confidentiality does not depend on a specific silicon vendor&apos;s TEE primitive. There is also no published GPU confidential-computing extension for Nitro Enclaves as of mid-2026.&lt;/p&gt;
&lt;h3&gt;Google Cloud Confidential Space&lt;/h3&gt;
&lt;p&gt;Google Cloud Confidential Space combines Intel TDX (and AMD SEV / SEV-SNP) with Google Cloud Attestation and Workload Identity Federation. Per the GCA documentation: &quot;Google Cloud Attestation provides a unified solution for remotely verifying the trustworthiness of all Google confidential environments ... The service supports attestation of confidential environments backed by a Virtual Trusted Platform Module (vTPM) for SEV and the TDX Module for Intel TDX&quot; [@gcp-gca]. The overview page describes the multi-party-collaboration use case for PII, PHI, IP, and LLM-interaction data [@gcp-cs-overview].&lt;/p&gt;
&lt;p&gt;Google added an interesting wrinkle in 2025: an Intel Trust Authority integration that lets a GCP customer use ITA as a &lt;em&gt;second&lt;/em&gt; verifier alongside Google Cloud Attestation. Per the integration documentation: &quot;GCP Confidential Space provides a method for isolating a workload and sensitive data ensuring that data is released only to authorized workloads ... Intel Trust Authority is used to validate the evidence&quot; [@ita-gcp]. A second verifier is not the same architectural primitive as a public transparency log -- it provides cross-checking but not append-only public auditability -- but it is the closest move any other major-cloud confidential platform has made toward PCC&apos;s direction as of mid-2026.&lt;/p&gt;
&lt;h3&gt;Confidential Containers and the orchestration tier&lt;/h3&gt;
&lt;p&gt;Confidential Containers (CoCo) is a CNCF Sandbox project that wraps Kubernetes pods in confidential VMs running on AMD SEV-SNP, Intel TDX, or IBM Secure Execution [@coco-gh]. Per the project: &quot;Confidential Containers is an open source community working to enable cloud native confidential computing by ... Trusted Execution Environments to protect containers and data&quot; [@coco-gh]. CoCo composes &lt;em&gt;on top of&lt;/em&gt; the same Generation-3 silicon Azure CC-AI uses; it does not compete with PCC architecturally because it is at a different layer of the stack.&lt;/p&gt;
&lt;p&gt;Around CoCo and the underlying TEEs sits a small set of orchestration-tier vendors that take responsibility for what the raw SKUs do not. The procurement-relevant distinctions between them are sharper than the marketing copy suggests.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Anjuna Seaglass&lt;/strong&gt; is the cross-cloud unified confidential-deployment plane. It packages AWS Nitro Enclave, Azure CVM, and GCP Confidential Space behind a single command and a customer-supplied policy [@anjuna], with the explicit value proposition of &quot;any cloud, any region, with the only Universal Confidential Computing platform.&quot; Anjuna&apos;s Seaglass platform supplanted the older Anjuna Northstar nomenclature, but it reads the same way to a procurement audit: a single control plane spanning three different clouds&apos; TEE primitives, with a uniform policy DSL on top.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Edgeless Systems&apos; Contrast&lt;/strong&gt; is the runtime-and-runtime-encryption layer for confidential Kubernetes. Contrast runs confidential container deployments on Kubernetes at scale, built on Kata Containers and the Confidential Containers concept, and provides PKI, mTLS, and encrypted state disks across the deployment [@edgeless-contrast]. The architecture documentation is explicit that &quot;the Contrast Coordinator is the central remote-attestation service for a Contrast deployment&quot; and verifies the Contrast components inside a confidential VM [@contrast-arch] [@contrast-docs]. Contrast is the active successor to Edgeless Constellation, which is now archived (&quot;This repository has been archived ... Edgeless Systems has shifted focus to Contrast, our solution for confidential containers, which addresses the modern needs of confidential cloud workloads&quot; [@edgeless-constellation]). The procurement signal is that customers evaluating Constellation should be redirected to Contrast in any new deployment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fortanix&lt;/strong&gt; sells two distinct products that the marketing collapses into one. &lt;strong&gt;Fortanix Confidential Computing Manager (CCM)&lt;/strong&gt; is the orchestration and policy management layer that &quot;is used to securely deploy and manage confidential computing applications using Intel SGX, AMD SEV-SNP, and Intel TDX runtimes&quot; [@fortanix-ccm]. &lt;strong&gt;Fortanix Data Security Manager (DSM)&lt;/strong&gt; is the FIPS 140-2 Level 3 HSM that holds the keys; per Fortanix&apos;s DSM page, DSM &quot;delivers Cryptographic Services, Key Management Services, Secrets Management, Tokenization, Code Signing ... powered by Confidential Computing&quot; [@fortanix-dsm] and carries FIPS 140-2 Level 3 certification on the underlying platform [@fortanix-fips]. Procurement teams that need a customer-managed-keys story almost always need both: CCM to orchestrate the confidential-workload deployment, DSM to custody the keys.&lt;/p&gt;
&lt;p&gt;CCM is not DSM. CCM is the orchestration plane (which workload runs where, attested by what); DSM is the FIPS 140-2 Level 3 HSM (which holds the keys, releases them on attested workload verification, audits the access). A procurement that asks for &quot;Fortanix&quot; without specifying CCM or DSM is asking for two different products at two different price points with two different compliance postures. The two integrate but they are not the same SKU.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Pick when...&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Anjuna Seaglass&lt;/td&gt;
&lt;td&gt;Cross-cloud confidential deployment control plane [@anjuna]&lt;/td&gt;
&lt;td&gt;You run the same regulated workload on more than one cloud and need one policy DSL spanning AWS Nitro + Azure CVM + GCP Confidential Space&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edgeless Contrast&lt;/td&gt;
&lt;td&gt;Confidential Kubernetes runtime with mTLS and encrypted state [@contrast-arch] [@contrast-docs]&lt;/td&gt;
&lt;td&gt;You run confidential workloads as Kubernetes pods and want a remote-attestation Coordinator inside the deployment rather than an external SaaS verifier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fortanix CCM&lt;/td&gt;
&lt;td&gt;Confidential-app orchestration on SGX/SEV-SNP/TDX [@fortanix-ccm]&lt;/td&gt;
&lt;td&gt;You need centralized policy for which signed confidential workloads run on which TEEs, with audit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fortanix DSM&lt;/td&gt;
&lt;td&gt;FIPS 140-2 Level 3 HSM with attested key release [@fortanix-dsm] [@fortanix-fips]&lt;/td&gt;
&lt;td&gt;You need customer-managed keys, FIPS 140-2 L3 custody, and attested-workload-gated release as a single SKU&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The third-party tier exists because the raw cloud SKUs sell the &lt;em&gt;substrate&lt;/em&gt; but not the &lt;em&gt;operational pattern&lt;/em&gt;. Procurement decisions in this category typically pair a cloud SKU with one or two of these orchestration vendors to get something workable for a regulated workload.&lt;/p&gt;
&lt;h3&gt;Where these fit on the six axes&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Substrate&lt;/th&gt;
&lt;th&gt;Silicon&lt;/th&gt;
&lt;th&gt;Root of trust&lt;/th&gt;
&lt;th&gt;Transparency&lt;/th&gt;
&lt;th&gt;GPU TEE&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Apple PCC&lt;/td&gt;
&lt;td&gt;Apple end-to-end [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;SEP + Apple CA [@apple-sep-guide]&lt;/td&gt;
&lt;td&gt;Public Transparency Log [@apple-pcc-release-transparency]&lt;/td&gt;
&lt;td&gt;Integrated Apple GPU [@apple-pcc-blog]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure CC-AI&lt;/td&gt;
&lt;td&gt;AMD + Intel + NVIDIA + MS [@ms-cc-overview]&lt;/td&gt;
&lt;td&gt;AMD PSP + NVIDIA RoT + vTPM + MAA [@ms-maa-overview] [@amd-kds]&lt;/td&gt;
&lt;td&gt;None (MAA claims only) [@ms-maa-overview]&lt;/td&gt;
&lt;td&gt;NVIDIA H100 CC-On [@nvidia-dev-blog]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS Nitro Enclaves&lt;/td&gt;
&lt;td&gt;AWS-signed, CPU-agnostic [@aws-nitro-hw]&lt;/td&gt;
&lt;td&gt;Nitro Hypervisor + Security Chip [@aws-nitro-enclave]&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None at GA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GCP Confidential Space&lt;/td&gt;
&lt;td&gt;Intel TDX + AMD SEV-SNP [@gcp-cs-overview]&lt;/td&gt;
&lt;td&gt;vTPM + TDX Module + GCA (+ optional ITA) [@gcp-gca] [@ita-gcp]&lt;/td&gt;
&lt;td&gt;None (second verifier via ITA)&lt;/td&gt;
&lt;td&gt;None at GA on Confidential Space&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Third-party tier (CoCo / Contrast / Anjuna)&lt;/td&gt;
&lt;td&gt;Composes on top of cloud SKUs [@coco-gh] [@edgeless-contrast]&lt;/td&gt;
&lt;td&gt;Inherits underlying TEE root&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Inherits underlying GPU TEE&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Five substrates, one rough trade-off space. But every one of them rests on silicon, and silicon has its own theoretical limits. What can no TEE-based confidential AI architecture do?&lt;/p&gt;
&lt;h2&gt;8. What No TEE Can Do&lt;/h2&gt;
&lt;p&gt;The Confidential Computing Consortium&apos;s &quot;A Technical Analysis of Confidential Computing&quot; v1.3 -- the vendor-neutral definitional document both Apple and Microsoft anchor on -- explicitly enumerates side-channels as a residual risk [@ccc-technical-analysis]. This is not a contestable empirical claim. It is the field&apos;s own lower bound on what TEE-based confidential AI can deliver. The CCC names what the architecture &lt;em&gt;does not&lt;/em&gt; close, in plain text, in the same document that defines what it &lt;em&gt;does&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;There are roughly six classes of limit, and the architectures we have walked do not close any of them by construction.&lt;/p&gt;
&lt;h3&gt;1. Side-channels on shared silicon&lt;/h3&gt;
&lt;p&gt;The Foreshadow / L1TF, SgxPectre, and Plundervolt cascade [@foreshadow] [@sgxpectre] [@plundervolt] is the historical evidence. The principled extension is direct: any TEE built on shared microarchitectural state -- shared caches, shared branch predictors, shared functional units, shared voltage / frequency control -- inherits a side-channel surface that the architectural threat model does not name. Both Apple&apos;s SEP and the AMD-Intel-NVIDIA composition rest on silicon that does not have an architectural primitive that closes this surface. Wojtczuk and Rutkowska&apos;s 2009 paper on Intel TXT made the same point fifteen years earlier in a different generation, demonstrating that SMM-based bypasses of TXT were not addressed by TXT&apos;s own threat model [@txt-attack]. The cycle keeps repeating.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&quot;Even Intel SGX&apos;s memory encryption/authentication technology cannot protect against Plundervolt.&quot; -- the Plundervolt project page [@plundervolt]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;2. Trust-anchor compromise&lt;/h3&gt;
&lt;p&gt;Every vendor behind a hardware root of trust is itself a trust anchor that nothing inside the architecture can close. AMD-as-signer through the PSP and VCEK certificate chains [@amd-kds]; Intel-as-signer for the TDX Module, SEAMLDR, and Provisioning Service; NVIDIA-as-signer for the on-die RoT and NRAS; Microsoft-as-signer for the MAA service [@ms-maa-overview]; and Apple-as-signer for the SEP-bound CA and the Apple-controlled Transparency Log [@apple-pcc-blog]. If any of those signing infrastructures is compromised, the architecture cannot defend itself against the signer. PCC&apos;s trust root collapses to Apple; Azure&apos;s spreads across four vendors but each one is still a trust anchor for the workload that depends on it.&lt;/p&gt;
&lt;h3&gt;3. ROM-burned single-signer revocation&lt;/h3&gt;
&lt;p&gt;Fuse-burned silicon roots of trust are not field-revocable on a chip already deployed. If an attacker recovers a vendor signing key whose public half has been burned into the boot ROM of millions of chips, the recovery path is fleet rotation, not credential revocation. This is not a flaw of any specific vendor; it is a property of how hardware roots of trust are physically anchored. The recovery model for a leaked AMD ARK key, an Intel SEAM key, or an Apple SEP signing key is the same: replace the silicon. That is a multi-quarter operation at fleet scale.&lt;/p&gt;
&lt;h3&gt;4. Supply-chain compromise of the AI model&lt;/h3&gt;
&lt;p&gt;Apple binds the model into the attested image hash. The same Transparency Log that proves what &lt;em&gt;code&lt;/em&gt; is running also proves what &lt;em&gt;model weights&lt;/em&gt; are running, because the model is part of the published image [@apple-pcc-blog] [@apple-pcc-release-transparency]. PCC closes the model supply-chain question at the architecture level.&lt;/p&gt;
&lt;p&gt;Azure shifts model integrity to customer-controlled SKR of model artefacts. The model weights become encrypted blobs that the workload unwraps inside the TEE using a customer-managed key released only on a satisfying MAA JWT [@ms-cc-overview] [@ms-workshop-llm]. The customer is the trust anchor for the model&apos;s identity, not the cloud provider. This is a different trust-rooting model -- not stronger or weaker in the abstract, but routed through different organizations. It is &lt;em&gt;not&lt;/em&gt; accurate to say only Apple defends against model supply-chain compromise.&lt;/p&gt;
&lt;h3&gt;5. Prompt-output exfiltration via the model itself&lt;/h3&gt;
&lt;p&gt;The TEE protects the &lt;em&gt;input&lt;/em&gt; boundary -- it can prove the cloud operator never saw the prompt. It does not constrain what the model puts in the &lt;em&gt;output&lt;/em&gt;. A model that is fine-tuned, prompt-injected, or simply chooses to emit memorised data can exfiltrate information through its own output channel, and no architectural primitive in either PCC or Azure CC-AI prevents that. Both architectures are equally exposed on this axis. This is also why prompt-output safety, content filtering, and model-side privacy controls are unrelated work that confidential computing does not subsume.&lt;/p&gt;
&lt;h3&gt;6. Compelled vendor and lawful access&lt;/h3&gt;
&lt;p&gt;This is a property of the trust-rooting model, not of any one architecture. If a vendor is compelled by law to push a software update that exfiltrates user data, the architecture cannot defend itself against that vendor. PCC&apos;s compelled-vendor exposure is concentrated on Apple. Azure&apos;s is distributed across AMD, Intel, NVIDIA, and Microsoft, but a compelled Microsoft is sufficient to compromise an MAA-rooted workload; the diffusion does not multiply protections.&lt;/p&gt;
&lt;h3&gt;And one more: MAA-as-service compromise&lt;/h3&gt;
&lt;p&gt;Azure&apos;s centralised verifier is a control point Apple does not have, because Apple&apos;s verifier is the user&apos;s device itself. If MAA is compromised -- if an attacker controls the MAA signing key, or if the MAA policy-evaluation code is modified maliciously -- every relying party that trusts MAA-issued JWTs trusts the attacker.&lt;/p&gt;

&lt;p&gt;The CCC&apos;s &quot;A Technical Analysis of Confidential Computing&quot; v1.3 explicitly enumerates side-channels as a residual risk that the architecture does not close by construction. This is the field&apos;s own acknowledged lower bound. Any product claim that &quot;our confidential computing stack defends against all side-channels&quot; is, in 2026, either overstated or contradicting the CCC&apos;s own technical analysis [@ccc-technical-analysis]. The honest framing is that confidential computing defends against the architecturally-named threats (memory disclosure to the operator, hypervisor-mediated remap, plaintext-in-DRAM at-rest exposure) and that side-channels remain a separate research and engineering domain.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Threat&lt;/th&gt;
&lt;th&gt;Apple PCC&lt;/th&gt;
&lt;th&gt;Azure CC-AI&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Malicious cloud operator (passive memory disclosure)&lt;/td&gt;
&lt;td&gt;Defended (SEP-rooted attestation, OHTTP relay) [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;Defended (SEV-SNP / TDX guest measurement, MAA verifier) [@ms-maa-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compromised hypervisor (active remap / Iago attacks)&lt;/td&gt;
&lt;td&gt;Defended (Apple-controlled kernel + SEP-rooted measured boot) [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;Defended (SEV-SNP RMP enforces page ownership; TDX Module isolates) [@ms-cc-overview]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Supply-chain compromise of the AI model&lt;/td&gt;
&lt;td&gt;Defended at architecture level (model bound into Transparency-Log-published image) [@apple-pcc-blog]&lt;/td&gt;
&lt;td&gt;Defended via customer-controlled SKR of model artefacts; trust shifts to customer [@ms-workshop-llm]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Side-channels on shared silicon&lt;/td&gt;
&lt;td&gt;Not closed by construction [@ccc-technical-analysis] [@plundervolt]&lt;/td&gt;
&lt;td&gt;Not closed by construction [@ccc-technical-analysis] [@cipherleaks]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compelled-vendor / lawful access&lt;/td&gt;
&lt;td&gt;Not closed by construction (trust collapses to Apple)&lt;/td&gt;
&lt;td&gt;Not closed by construction (trust spreads across four vendors; compelled MAA suffices)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verifier / signer compromise&lt;/td&gt;
&lt;td&gt;Apple SEP-CA + Transparency Log signer is a control point&lt;/td&gt;
&lt;td&gt;MAA signer + AMD / Intel / NVIDIA signers are control points&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prompt-output exfiltration via model&lt;/td&gt;
&lt;td&gt;Not closed by construction&lt;/td&gt;
&lt;td&gt;Not closed by construction&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Neither architecture closes the gap by construction. Apple&apos;s verifier is the user&apos;s device, and the user&apos;s device trusts Apple&apos;s SEP-bound CA and the Apple-controlled Transparency Log signer. Azure&apos;s verifier is MAA, which is a Microsoft-operated service with its own signing infrastructure. Apple&apos;s single-vendor problem and Microsoft&apos;s centralised-verifier problem are two shapes of the same architectural gap: the verifier itself is a trust root the architecture cannot externally audit.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Trust diffusion (Azure&apos;s contribution) and verifiable transparency (Apple&apos;s contribution) close &lt;em&gt;different&lt;/em&gt; trust-anchor gaps. Neither closes both. No production substrate as of mid-2026 closes both gaps simultaneously. A hypothetical Generation-7 design that combined Azure-style multi-vendor TEE composition with Apple-style append-only transparency of production images would close that gap. No vendor has shipped it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Two architectures, two distinct upper bounds, neither closing the same gap. So what is the field actually working on?&lt;/p&gt;
&lt;h2&gt;9. Where Active Work Is Happening&lt;/h2&gt;
&lt;p&gt;September 5, 2024, arXiv. Ceren Kocaoğullar (University of Cambridge), Tina Marjanov (Cambridge), Ivan Petrov (Google), Ben Laurie (Google), Al Cutter (Google), Christoph Kern (Google), Alice Hutchings (Cambridge), and Alastair R. Beresford (Cambridge) post &quot;A Confidential Computing Transparency Framework for a Trust Chain&quot; [@kocaogullar-transparency]. The paper does not name MAA specifically. It generalises the question Apple PCC raises in concrete form: can the verifiable-transparency primitive be replicated on commodity multi-vendor silicon without collapsing to a single trust root? The authors propose &quot;a three-level conceptual framework providing organisations with a practical pathway to incrementally improve Confidential Computing transparency&quot; [@kocaogullar-transparency]. The inclusion of Ben Laurie -- one of the original architects of Certificate Transparency (RFC 6962) -- is not incidental. The paper is the direct architectural descendant of CT brought into the confidential-computing domain.&lt;/p&gt;
&lt;p&gt;The v2 December 5, 2024 revision of the Kocaoğullar et al. paper added an 800+ participant empirical study showing that greater transparency improves end-user trust in confidential computing services [@kocaogullar-transparency]. That empirical signal is the closest thing the field has, as of mid-2026, to a measurement of the procurement consequences of verifiable transparency vs verifier-as-a-service. The framework itself is conceptual; the empirical contribution is the part procurement teams should read.&lt;/p&gt;
&lt;p&gt;Six open problems are visible in the current production work.&lt;/p&gt;
&lt;h3&gt;9.1 Verifiable transparency of the verifier itself&lt;/h3&gt;
&lt;p&gt;No major-cloud verifier ships a public append-only log of its own code. MAA does not; Google Cloud Attestation does not; AWS Nitro&apos;s hypervisor signer does not. The Intel Trust Authority integration on GCP introduces a &lt;em&gt;second&lt;/em&gt; verifier, which is a partial cross-check, but a second verifier is not the same architectural primitive as a transparency log [@ita-gcp]. Where the work is happening: the CCC Attestation Special Interest Group on GitHub coordinates Formal Specifications of Attestation Mechanisms, an RA-TLS proof of concept, an interoperable RA-TLS effort, an IETF RATS terms cheat sheet, and a formal-spec-KBS (key broker service) project [@ccc-attestation-gh]. The IETF RATS Working Group continues to extend RFC 9334 with Entity Attestation Token (EAT) and Concise Reference Integrity Manifest (CoRIM) drafts [@ietf-rfc9334].&lt;/p&gt;
&lt;h3&gt;9.2 GPU confidential-computing parity across vendors&lt;/h3&gt;
&lt;p&gt;NVIDIA H100 CC-On is the only confidential-GPU mode at GA on a major commercial cloud as of mid-2026 [@nvidia-dev-blog] [@ms-sku-nccads]. AMD MI300X ships as compute across multiple clouds but has no production-equivalent SEV-TIO confidential-GPU mode at GA on a major commercial cloud. PCIe TDISP and SEV-TIO Linux support is landing in 2025-2026 kernels, but the GA gap is the load-bearing fact for any procurement that wants AMD silicon end-to-end. AMD&apos;s MI400X-class roadmap is forward-looking. Until a second confidential GPU is at GA, single-vendor lock-in at the accelerator tier is the unavoidable procurement reality for any cloud confidential-AI workload.&lt;/p&gt;
&lt;h3&gt;9.3 Cross-vendor attestation portability&lt;/h3&gt;
&lt;p&gt;IETF RFC 9334 standardises the vocabulary [@ietf-rfc9334]; CoRIM and EAT, in active drafting in the IETF RATS WG, aim at portable claim formats. The vocabulary work matters because a confidential workload that wants to run unchanged on Azure SEV-SNP and Azure TDX and GCP TDX needs a single attestation parser that understands all three evidence formats. The MAA approach maps onto RFC 9334&apos;s Passport pattern; the GCA approach maps onto OIDC tokens that play well with federated-identity tooling. As of mid-2026 no single relying-party library handles all three production verifiers transparently, and that is one of the things the CCC Attestation SIG is working on [@ccc-attestation-gh].&lt;/p&gt;
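&lt;p&gt;The portability problem can be pictured as an adapter layer in the relying party: one small function per verifier, each mapping that verifier&apos;s decoded token payload to a common result type. The sketch below assumes the token signatures have already been validated. Apart from the standard JWT &lt;code&gt;iss&lt;/code&gt; claim, every claim name, value, and issuer URL is a hypothetical placeholder rather than the documented MAA or Google Cloud Attestation schema.&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NormalizedResult:
    verifier: str
    tee: str
    debug_disabled: bool

# Each adapter maps one verifier's decoded payload to the common shape.
# Claim names and values are hypothetical placeholders, not real schemas.
def from_maa(claims):
    return NormalizedResult("maa", claims["tee_type"], not claims["debuggable"])

def from_gca(claims):
    return NormalizedResult("gca", claims["hardware_model"],
                            claims["debug_status"] == "disabled")

# Dispatch on the token issuer; the URLs are made-up examples.
ADAPTERS = {
    "https://maa.example/": from_maa,
    "https://gca.example/": from_gca,
}

def normalize(claims):
    issuer = claims["iss"]
    if issuer not in ADAPTERS:
        raise ValueError(f"no adapter for issuer {issuer!r}")
    return ADAPTERS[issuer](claims)

azure = normalize({"iss": "https://maa.example/", "tee_type": "sevsnp", "debuggable": False})
gcp = normalize({"iss": "https://gca.example/", "hardware_model": "sevsnp",
                 "debug_status": "disabled"})
print(azure)
print(gcp)
```

&lt;p&gt;Downstream policy code then depends only on &lt;code&gt;NormalizedResult&lt;/code&gt;, so adding a third verifier means writing one more adapter rather than touching every policy check.&lt;/p&gt;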
&lt;h3&gt;9.4 Confidential inferencing for Azure OpenAI models&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s &lt;code&gt;Azure-Samples/confidential-ai-workshop&lt;/code&gt; repository [@ms-workshop] is the cleanest procurement-grade reference for what confidential inferencing actually looks like in production on Azure today. It contains three end-to-end tutorials at three different points on the cost-versus-isolation curve, and reading them in sequence is the fastest way for a procurement team to map the abstract architecture to concrete SKU lines.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tutorial 1: ML-training on a CPU-only confidential VM (&lt;code&gt;Standard_DCasv5&lt;/code&gt;).&lt;/strong&gt; The &lt;code&gt;confidential-ml-training&lt;/code&gt; directory walks through training an XGBoost-class classical-ML model on a &lt;code&gt;Standard_DCasv5&lt;/code&gt; SKU, which is an AMD SEV-SNP confidential VM &lt;em&gt;without&lt;/em&gt; a confidential GPU [@ms-workshop-ml]. Data and model are in plaintext only inside the TEE-protected VM, with the SEV-SNP attestation gating access to encrypted training data in Azure Storage via the standard MAA + SKR path. The deliberate choice of XGBoost over a deep-learning model is the architectural lesson: when the model and training data fit in CPU memory and TCB-sealed CPU compute is sufficient, the confidential GPU SKU is overkill. This is the lowest-cost on-ramp into the architecture.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tutorial 2: LLM inferencing on a confidential GPU (&lt;code&gt;Standard_NCC40ads_H100_v5&lt;/code&gt;).&lt;/strong&gt; The &lt;code&gt;confidential-llm-inferencing&lt;/code&gt; directory walks serving &lt;code&gt;microsoft/Phi-4-mini-reasoning&lt;/code&gt; on a &lt;code&gt;Standard_NCC40ads_H100_v5&lt;/code&gt; SKU [@ms-workshop-llm]. Phi-4-mini-reasoning is a 3.8 B-parameter dense decoder-only Transformer with a 128 K-token context window, MIT-licensed on Hugging Face [@hf-phi4-mini], chosen because it fits comfortably in the H100 NVL&apos;s 94 GB HBM3 capacity with room for activation memory. The novel architectural feature here is &lt;strong&gt;double attestation&lt;/strong&gt;: the tutorial&apos;s setup script uses &lt;code&gt;Azure/az-cgpu-onboarding&lt;/code&gt; [@az-cgpu-onboarding] to verify both the SEV-SNP CVM attestation (against AMD VCEK) &lt;em&gt;and&lt;/em&gt; the NVIDIA H100 GPU attestation (against NVIDIA&apos;s on-die root of trust via NRAS) before model weights are released from Azure Key Vault Premium via SKR. This is the architectural pattern any production GPU-confidential workload should match.&lt;/p&gt;
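
The double-attestation gate can be sketched as a fail-closed conjunction. This is a minimal sketch, not the logic of `Azure/az-cgpu-onboarding` or of Azure Key Vault: the `Verdict` type and `may_release_weights` function are hypothetical, and the nonce comparison is an added assumption standing in for whatever freshness binding a real deployment uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """Outcome of one attestation check (hypothetical shape)."""
    component: str   # "cvm" for the SEV-SNP VM, "gpu" for the H100
    verified: bool   # True only if the evidence chained to the vendor root
    nonce: str       # freshness value echoed back in the evidence (assumption)


def may_release_weights(cvm: Verdict, gpu: Verdict, expected_nonce: str) -> bool:
    """Fail closed: both attestations must pass and both must be fresh."""
    return all([
        cvm.component == "cvm",
        gpu.component == "gpu",
        cvm.verified,
        gpu.verified,
        cvm.nonce == expected_nonce,
        gpu.nonce == expected_nonce,
    ])
```

A passing CPU attestation with a failing GPU attestation yields `False`, which is the property that distinguishes double attestation from a CPU-only confidential VM check.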
&lt;p&gt;&lt;strong&gt;Tutorial 3: Inferencing via the Confidential Whisper service (OHTTP + HPKE).&lt;/strong&gt; Whisper, the speech-to-text model, is the publicly demoed Microsoft Build 2024 confidential inferencing reference workload. The &lt;code&gt;confidential-whisper-inferencing&lt;/code&gt; tutorial directory confirms the Azure AI Foundry Confidential Whisper service uses &lt;strong&gt;Oblivious HTTP&lt;/strong&gt; with &lt;strong&gt;HPKE&lt;/strong&gt; end-to-end encryption to keep audio encrypted until it reaches the TEE-protected Whisper model [@ms-workshop-whisper]. The reference implementation is &lt;code&gt;microsoft/attested-ohttp-client&lt;/code&gt; together with its server-side counterpart, described as &quot;an Attested OHTTP gateway and client implementation by Microsoft&quot; that &quot;uses the Cloudflare OHTTP client/server implementation as a basis&quot; [@ms-attested-ohttp]. This is the closest architectural pattern Azure has to PCC&apos;s non-targetability requirement -- a third-party-operated OHTTP relay strips the client IP before the request reaches the confidential inferencing endpoint, the same architectural primitive Apple uses for PCC at network ingress.&lt;/p&gt;
&lt;p&gt;The three tutorials are the canonical references because they walk the wire-level flow. A procurement team that wants to know &quot;what does confidential inferencing actually look like on Azure&quot; can read the README files, the Bicep templates, the attestation-policy JSON, and the SKR-policy JSON, and answer the question without speculation. GPT-class confidential endpoints staging through 2024-2026 are forward-looking roadmap. There is no May-2024 GA for &quot;Confidential GPT-4,&quot; but the three workshop tutorials cover the architectural primitives that such a GA would compose.&lt;/p&gt;
&lt;h3&gt;9.5 The Apple PCC node-chip transition&lt;/h3&gt;
&lt;p&gt;Apple has not publicly named the chip family used in PCC nodes. Firmware identifiers and independent analyses make the transition story concrete enough to reason about. When PCC was announced in June 2024 the nodes ran on M2-Ultra-class silicon, identified by the firmware string &lt;code&gt;ComputeModule14,1&lt;/code&gt; visible in independent device-identifier databases [@appledb-cm14]. During 2026 the PCC fleet transitioned to a new node generation identified as &lt;code&gt;J226C&lt;/code&gt; and reported (independently, not by Apple) as built around M5-class silicon manufactured in Houston, Texas [@nine-to-five-mac-m5] [@winbuzzer-m5]. The 9to5Mac report dated February 17, 2026 describes Apple&apos;s M5-based Private Cloud Compute servers tied to iOS 26.4 [@nine-to-five-mac-m5], and the parallel Winbuzzer coverage from the next day confirms a new &quot;Private Cloud Compute Agent Worker&quot; component running on M5-class node hardware [@winbuzzer-m5].&lt;/p&gt;
&lt;p&gt;What is architecturally interesting is not the chip identity. It is what the transition &lt;em&gt;did not&lt;/em&gt; change. The Transparency Log architecture absorbs a generational chip change as a matter of routine policy because the log&apos;s verifier policy is a list of approved image hashes and the SEP-rooted attestation envelope structure, not a list of approved chip families. New node generation, new image hashes (visible in &lt;code&gt;PrivateCloudCompute/Release.swift&lt;/code&gt; and validated by &lt;code&gt;PrivateCloudCompute/NodeValidator.swift&lt;/code&gt; [@apple-pcc-nodevalidator] [@apple-pcc-release-swift]), same envelope structure, same client-side verification. From a procurement-trust perspective, the transition was an architectural non-event in exactly the way Apple&apos;s public commitments said it should be.&lt;/p&gt;

**Two invariants held across the M2-Ultra to M5 node transition.** First, the device-side envelope check is stable: the `NodeValidator` validates SEP-signed attestation against the `SEPAttestationPolicy` it parses from the release artefact [@apple-pcc-nodevalidator] [@apple-pcc-sepattestpolicy], and the policy schema did not change. Second, the public transparency log absorbed the transition without any client-side trust ceremony because the chip family is not in the verifier policy -- only the image hash is. A device that started talking to the M2-Ultra fleet in 2024 and woke up in 2026 talking to the M5 fleet did exactly one new thing: it fetched the new approved image hashes from the log.

**Two things did change.** First, the on-node software stack (firmware, kernel, OS, inference runtime) was rebuilt for the new silicon; that is why the image hashes changed. Second, the routing policy may have shifted -- some workloads may schedule onto the new node generation preferentially.

**One caveat on naming.** Apple has not publicly named the chip family; the M5 identification is inferential from independent reporting plus firmware identifiers, not from a primary Apple source. Procurement narratives should use &quot;Apple-designed silicon, not publicly named&quot; when precision matters, and reach for the inferential M5 identification only when chip-family granularity is load-bearing.
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The architectural payoff of a public transparency log is precisely that it absorbs a generational chip transition without any client-side trust ceremony, because the chip family is not in the verifier policy -- only the image hash is. This is what &quot;verifiable transparency&quot; buys procurement teams in practice: the trust contract survives silicon turnover because the contract was never about silicon. It was about which bits the silicon ran.&lt;/p&gt;
&lt;/blockquote&gt;
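
The point that the chip family is absent from the verifier policy can be made concrete with a toy policy check. This is a minimal sketch, assuming the log is modelled as an in-memory set of digests; `NodeEvidence`, `node_is_trusted`, and the placeholder hash strings are all hypothetical, and the real client performs the envelope and inclusion-proof checks described in §10.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class NodeEvidence:
    image_hash: str    # digest of the software image the node attests to
    chip_family: str   # informational only; never consulted by the policy


def node_is_trusted(evidence: NodeEvidence, logged_hashes: FrozenSet[str]) -> bool:
    """Trust decision keyed on the image hash alone."""
    return evidence.image_hash in logged_hashes


# Placeholder digests: one image per node generation, both published to the log.
log_2024 = frozenset({"hash-old-generation"})
log_2026 = log_2024 | {"hash-new-generation"}

old_node = NodeEvidence("hash-old-generation", "older silicon")
new_node = NodeEvidence("hash-new-generation", "newer silicon")
```

Under these assumptions `node_is_trusted(new_node, log_2024)` is false and `node_is_trusted(new_node, log_2026)` is true: the only thing that changed between the two calls is the set of logged hashes, not the policy function.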
&lt;h3&gt;9.6 Third-party PCC equivalents&lt;/h3&gt;
&lt;p&gt;Could AWS or Google replicate Apple&apos;s Transparency-Log model on commodity multi-vendor silicon? The architectural feasibility is open. The Kocaoğullar et al. framework provides a conceptual pathway [@kocaogullar-transparency]. The CCC Attestation SIG&apos;s interoperable-ra-tls work is one of several substrates that a multi-vendor transparency log could ride on top of [@ccc-attestation-gh]. Whether any major cloud will actually ship it is the architectural bet the next generation hinges on. No GA product as of mid-2026.&lt;/p&gt;

A regulated workload that needs second-source availability has to be able to run on at least two confidential substrates. As of mid-2026 the practical cross-vendor option for a TEE-based confidential workload is &quot;AMD SEV-SNP on Azure, Intel TDX on GCP, AWS Nitro on AWS&quot; -- three different attestation evidence formats consumed by three different verifiers. CoRIM and EAT in the IETF RATS WG are trying to make those three formats parseable by one library. Until that lands, second-source confidential AI is an integration project, not a configuration change.
&lt;p&gt;The field is wide open. But the reader&apos;s procurement deadline is not. How do you actually choose between PCC and Azure today?&lt;/p&gt;
&lt;h2&gt;10. A Procurement Decision Tree&lt;/h2&gt;
&lt;p&gt;Six questions, asked in order. The first determines whether PCC is even in play; the rest sharpen the choice.&lt;/p&gt;
&lt;h3&gt;Question 1: Do you control the device that originates the request, and is it Apple-Intelligence-capable?&lt;/h3&gt;
&lt;p&gt;PCC requires Apple-Intelligence-capable client devices. The supported set as of mid-2026 is iPhone 15 Pro and later, iPads on M1 silicon or later, and Macs on M1 silicon or later [@apple-pcc-blog]. If your end users are on Windows laptops, Android phones, browsers, or any non-Apple endpoint, PCC is out of scope by construction. Azure / GCP / AWS confidential AI workloads do not have an analogous client-side requirement -- they are workload-shape-agnostic and the client can be any HTTPS-speaking device.&lt;/p&gt;
&lt;h3&gt;Question 2: Can you accept Apple-as-signer as the trust root?&lt;/h3&gt;
&lt;p&gt;PCC&apos;s trust collapses to Apple&apos;s signing infrastructure. The SEP-bound CA, the Apple-operated Transparency Log signer, the Apple bug-bounty program, and the Apple Security Engineering and Architecture team are the entire trust root [@apple-pcc-blog]. Azure spreads trust across AMD plus Intel plus NVIDIA plus Microsoft as separate signers [@ms-maa-overview] [@amd-kds] [@nvidia-dev-blog]. If your security posture explicitly requires multi-vendor trust diffusion -- for example, because your regulator does not accept single-vendor SBOMs as evidence -- Azure wins this axis (see §6 for the architectural reasoning).&lt;/p&gt;
&lt;h3&gt;Question 3: Do you need customer-managed key material?&lt;/h3&gt;
&lt;p&gt;Azure: yes, via SKR from Azure Key Vault Premium or Azure Managed HSM, with a release policy bound to MAA-issued claims [@ms-cc-overview] [@ms-maa-overview]. Apple: no by design, because PCC nodes are stateless and there is no customer key material on the node to be released [@apple-pcc-blog]. Regulated buyers whose framework requires customer-held keys -- for example, a FIPS 140-3 Level 3 customer-key-escrow requirement -- cannot map PCC into that framework, because PCC does not have the architectural primitive the framework is asking for.&lt;/p&gt;
&lt;h3&gt;Question 4: Do you need verifiable transparency of the actually-running code?&lt;/h3&gt;
&lt;p&gt;Apple: yes, via the published Transparency Log [@apple-pcc-release-transparency]. Azure: not via the architecture itself. You can build a customer-side log of the MAA tokens you have observed, or you can accept MAA&apos;s claims at face value. There is no Azure architectural primitive that proves the bits MAA verified are the same bits the workload is actually executing today, in the way that PCC&apos;s Transparency Log proves the image hash served to &lt;em&gt;you&lt;/em&gt; is the same one served to every other PCC user.&lt;/p&gt;
&lt;p&gt;This is the one axis where the architectures differ in &lt;em&gt;kind&lt;/em&gt;. If your threat model requires that &lt;em&gt;you&lt;/em&gt; be able to confirm what code the cloud is running, not just that &lt;em&gt;the cloud&lt;/em&gt; says it is running specific code, PCC is the only production answer.&lt;/p&gt;
&lt;h3&gt;Question 5: Do you need GPU-class confidential compute?&lt;/h3&gt;
&lt;p&gt;Both ship it. Pay attention to two facts. First, Azure&apos;s confidential GPU is H100 only at GA in mid-2026 [@nvidia-dev-blog] [@ms-sku-nccads]. AMD MI300X CC-On is not at GA on a major commercial cloud; NVIDIA H200 and Blackwell-class GB200 GPUs are GA on Azure as non-confidential SKUs. If you need confidential GPU compute, the only major-cloud answer is &lt;code&gt;NCCads_H100_v5&lt;/code&gt; (or its successor). Second, Apple&apos;s GPU is integrated on the SoC and is inside the SEP-rooted attestation envelope by construction; there is no separate cross-vendor GPU attestation step, which simplifies the trust analysis at the cost of being available only on the Apple stack.&lt;/p&gt;
&lt;h3&gt;Question 6: What does your auditor accept as evidence?&lt;/h3&gt;
&lt;p&gt;The MAA JWT is consumable by every off-the-shelf JWT verifier. It is also broadly accepted in regulated audits because the JWT format and the &lt;code&gt;x-ms-*&lt;/code&gt; claim names are documented in publicly-fetchable Microsoft Learn pages [@ms-maa-overview], and auditors can map MAA tokens onto NIST SP 800-53 attestation evidence requirements without exotic tooling.&lt;/p&gt;
&lt;p&gt;PCC&apos;s Transparency Log proof is newer. An audit that accepts a Merkle inclusion proof against an Apple-published log root as evidence is uncommon as of mid-2026; most regulated audit programs were designed before such a primitive existed in cloud AI. If your auditor needs PCC evidence, expect to write explainer documentation that translates &quot;your image hash is in append-only public log at Merkle position N with signed root R&quot; into the language your audit framework uses.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-javascript&quot;&gt;// Sketch of a Certificate-Transparency-style Merkle inclusion proof check.
// The PCC Transparency Log inherits this structural primitive from RFC 6962.
// This is educational -- a production verifier would use a maintained library.

const hexToBytes = (hex) =&amp;gt; Uint8Array.from(hex.match(/.{2}/g).map((h) =&amp;gt; parseInt(h, 16)));
const bytesToHex = (bytes) =&amp;gt; [...bytes].map((b) =&amp;gt; b.toString(16).padStart(2, &apos;0&apos;)).join(&apos;&apos;);

const sha256 = async (bytes) =&amp;gt; new Uint8Array(await crypto.subtle.digest(&apos;SHA-256&apos;, bytes));

const concat = (a, b) =&amp;gt; {
  const out = new Uint8Array(a.length + b.length);
  out.set(a); out.set(b, a.length);
  return out;
};

// RFC 6962 prefixes internal node hashes with 0x01 (leaf hashes use 0x00).
const hashChildren = (left, right) =&amp;gt; sha256(concat(new Uint8Array([0x01]), concat(left, right)));

async function verifyInclusion(leafHashHex, leafIndex, treeSize, auditPath, rootHex) {
  // auditPath is the array of sibling node hashes (hex), ordered leaf to root.
  if (leafIndex &amp;gt;= treeSize) return false;
  let node = hexToBytes(leafHashHex);
  let fn = leafIndex;     // index of the current node within its level
  let sn = treeSize - 1;  // index of the last node within that level
  for (const siblingHex of auditPath) {
    if (sn === 0) return false; // the proof is longer than the tree is deep
    const sibling = hexToBytes(siblingHex);
    if (fn % 2 === 1 || fn === sn) {
      // Right child, or last node of its level: the sibling sits on the left.
      node = await hashChildren(sibling, node);
      // A last node with no right sibling is promoted unchanged, so skip
      // those levels. This is what makes non-power-of-two tree sizes work.
      while (!(fn % 2 === 1 || fn === 0)) {
        fn = Math.floor(fn / 2);
        sn = Math.floor(sn / 2);
      }
    } else {
      node = await hashChildren(node, sibling);
    }
    fn = Math.floor(fn / 2);
    sn = Math.floor(sn / 2);
  }
  // Every level must be consumed and the recomputed root must match.
  return sn === 0 &amp;amp;&amp;amp; bytesToHex(node) === rootHex.toLowerCase();
}

// In production: fetch (signed log root, audit path) from the log
// and the leaf hash from the attestation envelope&apos;s image-hash field.
// If verifyInclusion returns true AND the signed root matches what your
// device trusts, the image you are about to talk to is in the public log.
console.log(&apos;Educational sketch only; use a maintained CT library in production.&apos;);
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;The decision tree in one diagram&lt;/h3&gt;

flowchart TD
    Q1{&quot;Apple-Intelligence-capable&lt;br /&gt;client device required?&quot;}
    Q2{&quot;Single-vendor (Apple)&lt;br /&gt;trust root acceptable?&quot;}
    Q3{&quot;Customer-managed key&lt;br /&gt;material required?&quot;}
    Q4{&quot;Need public-log&lt;br /&gt;verifiable transparency?&quot;}
    Q5{&quot;Need GPU TEE&lt;br /&gt;at fleet scale?&quot;}
    Q6{&quot;Auditor accepts&lt;br /&gt;Merkle inclusion proof?&quot;}
    Q1 --&amp;gt;|No| AZ[Azure / GCP / AWS]
    Q1 --&amp;gt;|Yes| Q2
    Q2 --&amp;gt;|No| AZ
    Q2 --&amp;gt;|Yes| Q3
    Q3 --&amp;gt;|Yes| AZ
    Q3 --&amp;gt;|No| Q4
    Q4 --&amp;gt;|Yes| Q5
    Q4 --&amp;gt;|No| AZ
    Q5 --&amp;gt;|Yes, Apple integrated GPU OK| PCC[Apple PCC]
    Q5 --&amp;gt;|Yes, need NVIDIA H100| AZ
    PCC --&amp;gt; Q6
    Q6 --&amp;gt;|Yes| PCC2[PCC fits the audit posture]
    Q6 --&amp;gt;|No| PCC3[Write explainer documentation,&lt;br /&gt;or fall back to Azure JWT-based evidence]
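
The six questions can also be condensed into a short function. This is a minimal sketch of the decision tree as drawn, not a procurement tool: the `Requirements` fields and the result strings are hypothetical names, and Question 5 is modelled as a single flag (`needs_nvidia_h100`) on the assumption that a workload with no NVIDIA-specific requirement stays on the PCC branch.

```python
from dataclasses import dataclass

AZURE = "Azure / GCP / AWS"
PCC_FITS = "Apple PCC: fits the audit posture"
PCC_EXPLAIN = "Apple PCC: write explainer documentation, or fall back to Azure JWT-based evidence"


@dataclass(frozen=True)
class Requirements:
    apple_intelligence_clients: bool     # Q1: all clients are Apple-Intelligence-capable
    apple_trust_root_ok: bool            # Q2: single-vendor (Apple) trust root acceptable
    customer_managed_keys: bool          # Q3: customer-managed key material required
    public_log_transparency: bool        # Q4: public-log verifiable transparency required
    needs_nvidia_h100: bool              # Q5: workload specifically needs NVIDIA H100
    auditor_accepts_merkle_proof: bool   # Q6: auditor accepts a Merkle inclusion proof


def recommend(r: Requirements) -> str:
    """Walk the questions in order; the first disqualifying answer wins."""
    if not r.apple_intelligence_clients:
        return AZURE
    if not r.apple_trust_root_ok:
        return AZURE
    if r.customer_managed_keys:
        return AZURE
    if not r.public_log_transparency:
        return AZURE
    if r.needs_nvidia_h100:
        return AZURE
    return PCC_FITS if r.auditor_accepts_merkle_proof else PCC_EXPLAIN
```

Encoding the tree this way makes the asymmetry visible: five of the six questions can each route a buyer away from PCC, while only Question 6 changes what a PCC buyer has to do next.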

The MAA JWT maps cleanly onto NIST SP 800-53 SA-12 (Supply Chain Protection in Rev. 4, folded into the SR control family in Rev. 5) and SC-12 (Cryptographic Key Establishment and Management) evidence requirements, because the JWT format and the claim semantics are publicly documented and JWT verifiers are standard library code [@ms-maa-overview]. PCC&apos;s Transparency Log evidence is newer; SA-12-style framings exist for Certificate Transparency in the web-PKI context but not yet (as of mid-2026) as a recognised confidential-AI evidence pattern. Expect explainer documentation to be required. Both architectures interact with FedRAMP, but Azure&apos;s confidential AI offerings are further along the FedRAMP path because Microsoft&apos;s broader Azure compliance suite is older.

&lt;blockquote&gt;
&lt;p&gt;Azure is the first cloud provider to offer confidential computing with NVIDIA H100 GPUs. -- NVIDIA Blog, September 24, 2024 [@nvidia-h100-ga]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;What the verifier actually does, on the wire&lt;/h3&gt;
&lt;p&gt;Once procurement has chosen the architecture, an engineer somewhere has to &lt;em&gt;write the verifier&lt;/em&gt;. The two architectures end up being symmetric in this regard: each produces a cryptographic envelope, and a relying party has to parse, validate signatures, and check inclusion or claims. Three procurement-grade reference primitives anchor the choice -- two from Azure (already shown above), one from Apple PCC.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On Azure&lt;/strong&gt;, the relying party walks an MAA JWT verification flow (decode the JWT, validate signature against the MAA JWKS, match claims against an SKR release policy -- the JavaScript reference appears in §6 Axis 3 alongside the MAA JWT decode) [@ms-maa-overview]. For customers who do &lt;em&gt;not&lt;/em&gt; want to trust MAA, the alternative path uses &lt;code&gt;snpguest&lt;/code&gt; to fetch the AMD VCEK chain and verify the SEV-SNP attestation directly (the bash reference also in §6 Axis 3) [@virtee-snpguest]. Both paths check the same underlying SEV-SNP evidence; they differ only in who performs the verification.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On Apple PCC&lt;/strong&gt;, the relying-party verifier is &lt;code&gt;PrivateCloudCompute/NodeValidator.swift&lt;/code&gt; and its sibling files [@apple-pcc-nodevalidator]. The flow is: parse the &lt;code&gt;AttestationBundle&lt;/code&gt; from the response (the bundle structure is defined in &lt;code&gt;SEPAttestation.swift&lt;/code&gt; [@apple-pcc-sepattest]); call the SEP attestation context verifier (&lt;code&gt;aks_attest_context_verify&lt;/code&gt;) on the SEP signature against the per-die Apple-rooted certificate chain; parse the &lt;code&gt;Release&lt;/code&gt; struct defined in &lt;code&gt;Release.swift&lt;/code&gt; as ASN.1 DER and compute its SHA-256 digest [@apple-pcc-release-swift]; check that the SEP attestation policy claims (&lt;code&gt;SEPAttestationPolicy.swift&lt;/code&gt; [@apple-pcc-sepattestpolicy]) constrain the release digest; then call &lt;code&gt;SWTransparencyVerifier.verifyExpiringInclusion&lt;/code&gt; to verify the release digest&apos;s inclusion proof in the public transparency log [@apple-pcc-swtrans-verifier] [@apple-pcc-transparencypolicy]. The full reference is the &lt;code&gt;apple/security-pcc&lt;/code&gt; repository together with the Virtual Research Environment tutorial [@apple-pcc-vre].&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# This is a procurement-grade SKETCH, not production code. It walks the four
# verification steps a real PCC client performs (see PrivateCloudCompute/
# NodeValidator.swift for the canonical reference [@apple-pcc-nodevalidator]).
# Each function is a stub showing the contract the caller must satisfy.

from hashlib import sha256
from typing import Optional
from dataclasses import dataclass

@dataclass
class AttestationBundle:
    &quot;&quot;&quot;The Apple PCC AttestationBundle, parsed from the response envelope.
    Structure defined in SEPAttestation.swift [@apple-pcc-sepattest].&quot;&quot;&quot;
    sep_signature: bytes
    sep_cert_chain: list
    release_der: bytes
    sep_attestation_policy_claims: dict
    transparency_inclusion_proof: dict

def aks_attest_context_verify(
    sep_signature: bytes,
    sep_cert_chain: list,
    apple_root_anchor: bytes,
) -&amp;gt; bool:
    &quot;&quot;&quot;Step 1: verify the SEP signature against the per-die Apple-rooted
    certificate chain. In the real client this calls the Security framework&apos;s
    aks_attest_context_verify; the SEP cert chain is rooted at Apple&apos;s PCC CA.
    Returns True if the signature chains to the pinned anchor.&quot;&quot;&quot;
    raise NotImplementedError(&quot;calls Security.framework in a real client&quot;)

def compute_release_digest(release_der: bytes) -&amp;gt; bytes:
    &quot;&quot;&quot;Step 2: the Release struct is serialised as ASN.1 DER; the canonical
    release digest is SHA-256 over the DER bytes. See Release.swift for the
    schema [@apple-pcc-release-swift].&quot;&quot;&quot;
    return sha256(release_der).digest()

def check_sep_attestation_policy(
    claims: dict,
    expected_release_digest: bytes,
) -&amp;gt; bool:
    &quot;&quot;&quot;Step 3: the SEP attestation policy claims must constrain the release
    digest. See SEPAttestationPolicy.swift for the policy schema
    [@apple-pcc-sepattestpolicy]. A real client checks the policy version,
    the claimed release digest, and the attestation freshness window.&quot;&quot;&quot;
    claimed_digest = claims.get(&quot;release_digest&quot;)
    return claimed_digest == expected_release_digest

def verify_expiring_inclusion(
    release_digest: bytes,
    inclusion_proof: dict,
    log_witness_root: bytes,
) -&amp;gt; bool:
    &quot;&quot;&quot;Step 4: verify the release digest&apos;s inclusion in the public PCC
    transparency log against a witness-cosigned tree head. Reference impl:
    SWTransparencyVerifier.verifyExpiringInclusion
    [@apple-pcc-swtrans-verifier] [@apple-pcc-transparencypolicy].&quot;&quot;&quot;
    raise NotImplementedError(&quot;merkle proof + cosigned witness check&quot;)

def verify_pcc_envelope(
    bundle: AttestationBundle,
    apple_root_anchor: bytes,
    log_witness_root: bytes,
) -&amp;gt; bool:
    &quot;&quot;&quot;The four-step PCC verifier flow. Returns True only if every step
    passes. A real client refuses to send the user&apos;s prompt if this returns
    False.&quot;&quot;&quot;
    if not aks_attest_context_verify(
        bundle.sep_signature, bundle.sep_cert_chain, apple_root_anchor
    ):
        return False
    release_digest = compute_release_digest(bundle.release_der)
    if not check_sep_attestation_policy(
        bundle.sep_attestation_policy_claims, release_digest
    ):
        return False
    if not verify_expiring_inclusion(
        release_digest, bundle.transparency_inclusion_proof, log_witness_root
    ):
        return False
    return True
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The symmetry is the procurement point. &lt;strong&gt;Azure&lt;/strong&gt;: validate JWT signature against MAA JWKS, match claims against SKR policy. &lt;strong&gt;Apple PCC&lt;/strong&gt;: validate SEP signature against Apple PCC CA, validate inclusion proof against transparency log witness root. Both are cryptographic; both produce a yes/no decision against a hardware-anchored chain of trust. The architectural difference is what the relying party is allowed to know: with PCC, the relying party knows the exact image hash that ran (because the log says so); with Azure, the relying party knows the workload met an MAA policy (because the JWT says so). The two are not interchangeable evidence, but the verifier code-paths are roughly the same shape.&lt;/p&gt;
&lt;p&gt;The decision tree handles the typical questions. The atypical questions, and the misconceptions, are next.&lt;/p&gt;
&lt;h2&gt;11. Frequently Asked Questions&lt;/h2&gt;


Yes, in both architectures, against the threats the architecture names. Apple PCC&apos;s SEP-rooted attestation envelope plus the Transparency Log refusal to forward to unlogged images defends against a malicious Apple operator passively reading prompts [@apple-pcc-blog]. Azure CC-AI&apos;s SEV-SNP RMP-enforced memory plus MAA-gated SKR defends against a malicious Microsoft operator on the SEV-SNP path [@ms-maa-overview]. Neither closes side-channels on shared silicon [@ccc-technical-analysis]; neither closes compelled-vendor or lawful-access exposure; neither closes prompt-output exfiltration via the model itself. The &quot;the cloud cannot see your prompt&quot; claim is true against the named threat model and not against every conceivable threat.

Yes. Vendor mitigations closed the specific attacks of the 2018-2020 SGX-era cascade -- Foreshadow / L1TF [@foreshadow], SgxPectre [@sgxpectre], Plundervolt (CVE-2019-11157) [@plundervolt] -- but the principled extension is that any TEE built on shared microarchitectural state inherits a similar surface. The CCC&apos;s &quot;A Technical Analysis of Confidential Computing&quot; v1.3 names this explicitly as a residual risk that the architecture does not close by construction [@ccc-technical-analysis]. CipherLeaks (USENIX Security 2021) demonstrated the same point on the AMD SEV side via a deterministic-ciphertext side channel [@cipherleaks]. Vendor microcode updates are an ongoing operational requirement, not a one-time fix.

No. Per the `apple/security-pcc` README verbatim: &quot;The publication of this code is intended for security research and verification purposes only&quot; [@apple-pcc-github]. The publication&apos;s purpose is research-grade transparency -- so that an independent researcher can inspect what is running, exercise the architecture inside the Virtual Research Environment, and submit findings to the Apple Security Bounty program with rewards up to \$1,000,000 [@apple-pcc-research]. It is not a typical open-source contribution model and the license and intended use are explicitly different. The substantive thing PCC ships is verifiable transparency of the running fleet, not community-driven development.

No. Both Linux and Windows guest OSes are supported on Azure confidential VMs, and the reference confidential-inferencing stack Microsoft publishes is Linux-based. The `Azure-Samples/confidential-ai-workshop` repository contains three Linux-based tutorial directories: `confidential-llm-inferencing`, `confidential-whisper-inferencing`, and `confidential-ml-training`, with reusable modules for attestation, key management, key origin, model sourcing, and OS disk encryption [@ms-workshop]. The LLM inferencing tutorial deploys a `Standard_NCC40ads_H100_v5` confidential VM with a vLLM-plus-Streamlit-plus-Caddy stack [@ms-workshop-llm]. Windows is supported; Linux is the canonical reference.

Confidential Containers is an orchestration-layer abstraction that maps Kubernetes pods onto Generation-3 confidential VMs running on AMD SEV-SNP, Intel TDX, or IBM Secure Execution [@coco-gh]. It composes on top of the same substrate Azure CC-AI uses. It does not compete with Apple PCC architecturally -- they live at different layers of the stack. A CoCo deployment on Azure can use MAA and SKR for its attestation and key-release primitives, and orchestration vendors like Edgeless Systems&apos; Contrast wrap that pattern into a workload-level confidential-computing primitive on Kubernetes [@edgeless-contrast].

No. Both rest on vendor-controlled signing infrastructure. PCC&apos;s compelled-vendor exposure is concentrated on Apple, because the signer of every PCC attestation chain is Apple. Azure&apos;s is distributed across AMD, Intel, NVIDIA, and Microsoft, but a compelled Microsoft is sufficient to compromise an MAA-rooted workload because MAA is the single verifier whose JWT every downstream relying party trusts [@ms-maa-overview]. Trust diffusion across multiple vendors makes the *collapse* harder, but it does not make any one vendor&apos;s compelled-update path architecturally impossible. This is a property of the trust-rooting model, not a flaw of either architecture, and neither closes it by construction.

No. The canonical late-2024 Mark Russinovich confidential-AI session is **Microsoft Ignite 2024 BRK430**, &quot;Inside Azure Innovations with Mark Russinovich,&quot; also published on YouTube as &quot;Confidential AI and Inference -- Inside Azure Innovations.&quot; Russinovich&apos;s &quot;data in use&quot; framing for confidential computing originally appeared in his September 14, 2017 Azure blog &quot;Introducing Azure confidential computing,&quot; not in an academic OSDI venue [@ms-russinovich-2017]. Microsoft Build 2024&apos;s confidential-inferencing session was BRK227, &quot;Inside AI Security with Mark Russinovich,&quot; which announced confidential inferencing for the Azure OpenAI Whisper speech-to-text model -- not for GPT-4, and not under the title &quot;Confidential GPT&quot; [@ms-workshop-whisper].

&lt;h3&gt;What to carry into the next conversation&lt;/h3&gt;
&lt;p&gt;Two architectures. One promise. One axis on which they differ in kind. The end-user pitch -- &quot;the cloud cannot see your prompt&quot; -- is now functionally identical across Apple Private Cloud Compute and Azure Confidential AI, but the architectural machinery underneath ships two genuinely different things. PCC ships &lt;em&gt;verifiable transparency of the production fleet&lt;/em&gt; through an Apple-controlled stack and a public Transparency Log. Azure CC-AI ships &lt;em&gt;multi-vendor trust diffusion plus customer-managed keys&lt;/em&gt; through AMD SEV-SNP plus NVIDIA H100 CC-On plus MAA plus SKR. Each closes a trust-anchor gap the other leaves open. Neither closes the gap the other closes. Neither closes the side-channel, compelled-vendor, or model-output exfiltration gaps -- the CCC&apos;s own v1.3 analysis names these as residual risks for any TEE-based design [@ccc-technical-analysis].&lt;/p&gt;
&lt;p&gt;The next architectural generation -- the one that combines Azure-style multi-vendor TEE composition with Apple-style append-only transparency of production images -- would close the gap both leave open. The Kocaoğullar et al. transparency framework is the conceptual sketch [@kocaogullar-transparency]; the CCC Attestation SIG and the IETF RATS Working Group are where the production work is happening [@ccc-attestation-gh] [@ietf-rfc9334]. No vendor has shipped it.&lt;/p&gt;
&lt;p&gt;For now, the load-bearing decision is the one Question 4 in §10 asks. If your threat model requires that &lt;em&gt;you&lt;/em&gt; be able to confirm what code the cloud is actually running -- and not just that &lt;em&gt;the cloud&lt;/em&gt; says it is running specific code -- PCC is the only production answer in mid-2026. If your threat model is satisfied by multi-vendor trust diffusion and a managed-verifier JWT, Azure CC-AI gives you a richer key-management story and broader silicon optionality. The architectures are not better and worse. They are answers to different questions. The first useful step in any confidential-AI procurement is naming which question you are actually trying to answer.&lt;/p&gt;
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;apple-pcc-vs-azure-confidential-ai&quot; keyTerms={[
  { term: &quot;Trusted Execution Environment (TEE)&quot;, definition: &quot;Hardware-isolated execution context that protects confidentiality and integrity of code and data even from the host OS, hypervisor, or peripheral firmware.&quot; },
  { term: &quot;Secure Enclave Processor (SEP)&quot;, definition: &quot;Apple-designed separate processor core on the same SoC as the main application processor, with its own boot ROM, AES engine, and protected memory. Per-node hardware root of trust on every Apple PCC server.&quot; },
  { term: &quot;Reverse Map Table (RMP)&quot;, definition: &quot;Hardware-maintained table in AMD SEV-SNP recording owner and validation state for every 4 KB physical page. Defends against SEVered-style hypervisor remap attacks by construction.&quot; },
  { term: &quot;Microsoft Azure Attestation (MAA)&quot;, definition: &quot;Managed Microsoft verifier service that consumes hardware attestation evidence (SEV-SNP, TDX, SGX, vTPM) and issues a signed JWT whose claims downstream relying parties consume.&quot; },
  { term: &quot;Secure Key Release (SKR)&quot;, definition: &quot;Azure Key Vault Premium / Managed HSM capability that gates release of a wrapped key on a successful MAA JWT verification against a customer-defined release policy.&quot; },
  { term: &quot;Transparency Log (Apple PCC)&quot;, definition: &quot;Append-only public log of every production PCC node software image hash. The user&apos;s device refuses to forward a request to a node whose image hash is not in the log.&quot; },
  { term: &quot;Security Protocol and Data Model (SPDM)&quot;, definition: &quot;DMTF DSP0274 standard for mutually-authenticated PCIe-endpoint sessions, used by the NVIDIA H100 CC-On architecture to bind the host CPU TEE to the GPU.&quot; },
  { term: &quot;Oblivious HTTP (OHTTP, RFC 9458)&quot;, definition: &quot;IETF protocol for forwarding HTTP requests through a third-party relay that strips the client IP, preventing the origin or any single intermediary from linking requests to a client.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>confidential-computing</category><category>apple-private-cloud-compute</category><category>azure-confidential-computing</category><category>attestation</category><category>trusted-execution-environment</category><category>ai-privacy</category><category>h100</category><category>transparency-log</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Hyper-V Enlightenments, VMBus, and the Synthetic Device Model</title><link>https://paragmali.com/blog/hyper-v-enlightenments-vmbus-and-the-synthetic-device-model/</link><guid isPermaLink="true">https://paragmali.com/blog/hyper-v-enlightenments-vmbus-and-the-synthetic-device-model/</guid><description>How Hyper-V guests get high-performance device I/O without emulating legacy hardware: enlightenments, the TLFS, VMBus rings, the VSP/VSC pair, and why the host-side parser is the attack surface.</description><pubDate>Thu, 14 May 2026 00:00:00 GMT</pubDate><content:encoded>
Hyper-V&apos;s guest OSes do not see emulated 1990s hardware. They see a published, versioned hypervisor ABI called the **Top-Level Functional Specification**, a transport called **VMBus** that consists of two ring buffers per channel, and a catalogue of synthetic devices whose backends live in the privileged root partition. This design is what makes Windows and Linux equally fast inside Hyper-V, and it is also why the host-side parsers in `vmswitch.sys` keep producing critical CVEs. The 2024 OpenHCL paravisor moves those parsers into the guest&apos;s own trust boundary in memory-safe Rust, which is the most consequential change to the Hyper-V device model since 2008.
&lt;h2&gt;1. The Type-1 hypervisor foundation&lt;/h2&gt;
&lt;p&gt;Open &lt;code&gt;Task Manager&lt;/code&gt; on a modern Windows 11 desktop, switch to the &lt;code&gt;Performance&lt;/code&gt; tab, and look at the line that says &quot;Virtualization: Enabled.&quot; That single line hides one of the most consequential design choices in modern operating systems: when Microsoft shipped Hyper-V with Windows Server 2008 in June 2008 [@ms-hyperv-server-overview], they did not bolt a virtualization product on top of Windows. They put a small hypervisor &lt;em&gt;underneath&lt;/em&gt; it.&lt;/p&gt;
&lt;p&gt;That ordering matters more than it sounds. In the older Microsoft Virtual Server 2005 model, Windows ran on the bare metal and a user-mode service emulated PC hardware for guests inside it. In the Hyper-V architecture documented by Microsoft in 2008 [@ms-hyperv-architecture], the hypervisor boots first and Windows itself becomes a guest of the hypervisor. Microsoft calls this guest the &lt;strong&gt;root partition&lt;/strong&gt;. Every other VM on the box is a &lt;strong&gt;child partition&lt;/strong&gt;.&lt;/p&gt;

**Type-1 hypervisor**: a hypervisor that runs directly on the physical hardware rather than inside a host operating system. Hyper-V, VMware ESXi, and Xen are Type-1; VirtualBox and the original Microsoft Virtual Server are Type-2 (hosted). In a Type-1 design no general-purpose OS sits between the hypervisor and the silicon, which lets the hypervisor enforce isolation directly using CPU virtualization extensions like Intel VT-x and AMD-V.
&lt;p&gt;The root partition is not just another VM. It is a privileged partition: it owns the physical I/O devices, runs the parent stack of synthetic-device backends, and brokers everything that touches real hardware. Children get virtual processors and a slice of memory, and they communicate with the root over a software bus called VMBus that we will spend most of this article taking apart.&lt;/p&gt;

flowchart TD
    HW[&quot;Physical hardware (CPU, RAM, NICs, NVMe)&quot;]
    HV[&quot;Hyper-V hypervisor (microkernel)&quot;]
    Root[&quot;Root partition (Windows Server)&quot;]
    VSP[&quot;Virtualization Service Providers (VSPs): vmswitch.sys, storvsp.sys, ...&quot;]
    C1[&quot;Child partition: Windows VM&quot;]
    C2[&quot;Child partition: Linux VM&quot;]
    VSC1[&quot;VSCs: netvsc, storvsc, ...&quot;]
    VSC2[&quot;VSCs: hv_netvsc, hv_storvsc, ...&quot;]
    HW --&amp;gt; HV
    HV --&amp;gt; Root
    HV --&amp;gt; C1
    HV --&amp;gt; C2
    Root --&amp;gt; VSP
    VSP -. &quot;VMBus channel&quot; .-&amp;gt; VSC1
    VSP -. &quot;VMBus channel&quot; .-&amp;gt; VSC2
    C1 --&amp;gt; VSC1
    C2 --&amp;gt; VSC2
&lt;p&gt;The hypervisor itself is small by design. The Hyper-V architecture page on Microsoft Learn [@ms-hyperv-architecture-perf] describes it as a microkernel: it does the minimum a hypervisor must do (CPU scheduling, memory partitioning, interrupt routing, an inter-partition message bus) and pushes everything else, including the device models, out to the root partition. This is the opposite of the early VMware ESX design, where the hypervisor itself contained large device drivers. The microkernel choice was pragmatic, not ideological. A monolithic hypervisor with built-in NIC and storage drivers would have been a catastrophic certification problem: every NIC firmware update would risk a hypervisor patch. By delegating I/O to the Windows root partition, Microsoft re-used the entire Windows driver stack.&lt;/p&gt;
&lt;p&gt;The split also explains why Hyper-V &quot;feels Windows-shaped&quot; even though it is technically not Windows. The root partition is Windows, with all of its drivers, its WMI, its event log, its &lt;code&gt;Get-VM&lt;/code&gt; PowerShell cmdlets. The hypervisor underneath is a small, separate binary (&lt;code&gt;hvix64.exe&lt;/code&gt; on Intel, &lt;code&gt;hvax64.exe&lt;/code&gt; on AMD) that you almost never have a reason to think about. Microsoft&apos;s own overview stresses that all device-model traffic flows through the root: &quot;the management operating system hosts virtual service providers (VSPs) that communicate over the VMBus to handle device access requests from child partitions&quot; (Microsoft Learn: Overview of Hyper-V [@ms-overview-hyper-v]).&lt;/p&gt;
&lt;p&gt;This sets up the question the rest of the article answers: if the hypervisor is small, the guest is unmodified Windows or Linux, and the root partition owns the real devices, then how does a guest actually do disk and network I/O at gigabit-or-better speeds without paying enormous costs to traverse all of these boundaries?&lt;/p&gt;
&lt;p&gt;The short answer is in three pieces: &lt;strong&gt;enlightenments&lt;/strong&gt; (the guest knows it is virtualized and uses hypercalls), &lt;strong&gt;VMBus&lt;/strong&gt; (the inter-partition transport), and the &lt;strong&gt;VSP/VSC pair&lt;/strong&gt; (split drivers that share memory through VMBus rings). The next section starts with the first of those three.&lt;/p&gt;
&lt;h2&gt;2. Enlightenments: what &quot;knowing you are virtualized&quot; buys you&lt;/h2&gt;
&lt;p&gt;In the early 2000s, the dominant intuition was that a hypervisor&apos;s job is to fool the guest. A perfectly faithful emulation of an Intel 440BX motherboard, a DEC 21140 NIC, and an IDE controller is what made VMware Workstation a useful product in 1999. It is also what made Microsoft Virtual Server 2005 too slow to saturate gigabit links: every &lt;code&gt;out&lt;/code&gt; instruction on a fake NIC port trapped to the hypervisor, was decoded against an in-memory chip model, and produced a synthetic interrupt that itself trapped on the way out. The Microsoft Virtual Server retrospective on Wikipedia [@wikipedia-virtual-server] notes that the architecture had no paravirtualization support and that performance was constrained relative to later hardware-assisted designs.&lt;/p&gt;
&lt;p&gt;Hyper-V&apos;s answer was to drop the pretence. If the guest &lt;em&gt;knows&lt;/em&gt; it is in a VM, it can use a fast path designed for VMs instead of pretending to drive imaginary chips. Microsoft calls this knowledge an &lt;strong&gt;enlightenment&lt;/strong&gt;, and the Hyper-V feature discovery page [@ms-tlfs-feature-discovery] is the contract a guest uses to learn what enlightenments the hypervisor offers.&lt;/p&gt;

**Enlightenment**: a modification or feature in a guest operating system that takes advantage of running under a specific hypervisor. An enlightened guest detects the hypervisor (on x86, by reading the `cpuid` leaves at `0x40000000` and above), then opts in to using paravirtual interfaces (hypercalls, synthetic timers, synthetic interrupt controllers, shared TSC pages) instead of trapping on emulated hardware. An unmodified guest would still boot, but slower.
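&lt;p&gt;The detection step can be sketched in a few lines. This is a minimal illustration, not guest code: Python cannot execute &lt;code&gt;cpuid&lt;/code&gt;, so the register values are hard-coded, and the sketch assumes the commonly documented convention that leaf &lt;code&gt;0x40000000&lt;/code&gt; returns the 12-byte vendor string &quot;Microsoft Hv&quot; spread across EBX, ECX, and EDX. The function names are hypothetical.&lt;/p&gt;

```python
# Register values a guest would see from cpuid leaf 0x40000000 under Hyper-V.
# Each 32-bit register carries four ASCII bytes, least significant byte first.
HYPERV_SIGNATURE = "Microsoft Hv"


def decode_vendor_signature(ebx, ecx, edx):
    """Join EBX, ECX, EDX into the 12-byte hypervisor vendor string."""
    raw = b"".join(reg.to_bytes(4, "little") for reg in (ebx, ecx, edx))
    return raw.decode("ascii", errors="replace")


def is_hyperv(ebx, ecx, edx):
    """True when the vendor signature matches the Hyper-V string."""
    return decode_vendor_signature(ebx, ecx, edx) == HYPERV_SIGNATURE


# Sample register contents (hard-coded here; a real guest executes cpuid).
sample = (0x7263694D, 0x666F736F, 0x76482074)
print(decode_vendor_signature(*sample))
print(is_hyperv(*sample))
```

&lt;p&gt;A real guest would go on to query the higher leaves for feature flags before enabling any enlightenment; that part is omitted here.&lt;/p&gt;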
&lt;p&gt;Detection is the cheap part. The Linux kernel&apos;s Hyper-V overview document [@kernel-hyperv-overview] describes four cooperating mechanisms, layered atop one another: implicit traps that the hypervisor handles transparently, &lt;strong&gt;explicit hypercalls&lt;/strong&gt; the guest issues on purpose, &lt;strong&gt;synthetic registers&lt;/strong&gt; exposed as model-specific registers (MSRs) in the architectural CPU register file, and &lt;strong&gt;VMBus&lt;/strong&gt; for high-bandwidth device traffic. Each layer builds on the one below it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The contract between Hyper-V and its guests is &lt;em&gt;published&lt;/em&gt;. Microsoft maintains the &lt;strong&gt;Top-Level Functional Specification&lt;/strong&gt; as a public document under the Open Specification Promise. That single decision is why Linux ships an in-tree Hyper-V driver stack and why VMBus is not a black box.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The hypercall page&lt;/h3&gt;
&lt;p&gt;The first thing an enlightened guest does is set up a hypercall page. The TLFS Hypercall Interface page [@ms-tlfs-hypercall] describes the dance: the guest writes its identity into &lt;code&gt;HV_X64_MSR_GUEST_OS_ID&lt;/code&gt; (MSR &lt;code&gt;0x40000000&lt;/code&gt;), then writes a guest-physical address and an &lt;code&gt;enable&lt;/code&gt; bit into &lt;code&gt;HV_X64_MSR_HYPERCALL&lt;/code&gt; (MSR &lt;code&gt;0x40000001&lt;/code&gt;). The hypervisor responds by populating that page with the right opcode for the current CPU: &lt;code&gt;vmcall&lt;/code&gt; on Intel, &lt;code&gt;vmmcall&lt;/code&gt; on AMD. From that moment on, &quot;make a hypercall&quot; is a normal &lt;code&gt;call&lt;/code&gt; into a known address rather than an opcode the kernel must hand-assemble per CPU vendor. This trick neatly externalises the vendor-specific calling convention. Microsoft can later swap to a new opcode (say, on ARM64, where the equivalent is an &lt;code&gt;HVC&lt;/code&gt; instruction) without any guest code change. The guest just learns the new page contents.&lt;/p&gt;
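&lt;p&gt;The two MSR writes can be modelled against a simulated register file. This is a sketch under stated assumptions: the dictionary stands in for real MSRs, the guest OS ID and page address are made-up values, and the layout (enable flag in bit 0, page-aligned address in the upper bits) follows my reading of the TLFS rather than anything verified here.&lt;/p&gt;

```python
HV_X64_MSR_GUEST_OS_ID = 0x40000000
HV_X64_MSR_HYPERCALL = 0x40000001
PAGE_SIZE = 4096
ENABLE_BIT = 1


def hypercall_msr_value(page_gpa):
    """Combine a page-aligned guest-physical address with the enable bit."""
    if page_gpa % PAGE_SIZE != 0:
        raise ValueError("hypercall page must be page-aligned")
    return page_gpa | ENABLE_BIT


def enable_hypercalls(msrs, guest_os_id, page_gpa):
    """Perform the two MSR writes in order against a simulated MSR file."""
    if guest_os_id == 0:
        raise ValueError("guest OS ID must be non-zero before enabling")
    msrs[HV_X64_MSR_GUEST_OS_ID] = guest_os_id
    msrs[HV_X64_MSR_HYPERCALL] = hypercall_msr_value(page_gpa)
    return msrs


# Hypothetical values: the OS ID and the page address are made up.
msrs = enable_hypercalls({}, guest_os_id=0x1234, page_gpa=0x10000)
print(hex(msrs[HV_X64_MSR_HYPERCALL]))
```

&lt;p&gt;The ordering matters in the sketch for the same reason it matters in the text: the identity write comes first, and only then does the guest ask for the page to be populated.&lt;/p&gt;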
&lt;p&gt;The same TLFS page documents two hypercall classes: &lt;strong&gt;simple&lt;/strong&gt; hypercalls (one operation, returns or faults) and &lt;strong&gt;rep&lt;/strong&gt; (repeated) hypercalls that take a counter and a start index, so a long-running operation can yield mid-flight without losing work. Three calling conventions exist: a memory-based one for large parameter blocks, a register-only fast variant for the very common case of one or two inputs, and an XMM-register variant that lets a guest pass up to 112 bytes of input through SSE registers.&lt;/p&gt;
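&lt;p&gt;The 112-byte figure falls out of simple register arithmetic. The sketch below assumes a budget of two 64-bit general-purpose registers for the fast variant and six additional 128-bit XMM registers for the XMM variant; that budget and the &lt;code&gt;pick_convention&lt;/code&gt; helper are illustrative assumptions, and real hypercalls impose per-call rules on which conventions they accept.&lt;/p&gt;

```python
GPR_BYTES = 8    # one 64-bit general-purpose register
XMM_BYTES = 16   # one 128-bit SSE register

# Assumed register budget: two GPRs for the fast variant, plus six XMM
# registers for the XMM variant.
FAST_LIMIT = 2 * GPR_BYTES
XMM_LIMIT = FAST_LIMIT + 6 * XMM_BYTES


def pick_convention(input_bytes):
    """Return the cheapest convention whose capacity covers the input."""
    if input_bytes != abs(input_bytes):
        raise ValueError("input size cannot be negative")
    if input_bytes in range(0, FAST_LIMIT + 1):
        return "fast (registers only)"
    if input_bytes in range(FAST_LIMIT + 1, XMM_LIMIT + 1):
        return "fast (XMM registers)"
    return "memory-based"


print(XMM_LIMIT)
for size in (8, 16, 64, 112, 4096):
    print(size, pick_convention(size))
```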
&lt;p&gt;That XMM variant is unusual enough to flag. Most kernel ABIs do not touch SSE in privileged code because saving and restoring the full SSE state is expensive. Hyper-V&apos;s hypercall ABI uses XMM precisely because the round-trip cost of a hypercall is dominated by the &lt;code&gt;VMEXIT&lt;/code&gt; itself, so squeezing a few more bytes into registers is cheaper than spilling them to memory and reading them back.&lt;/p&gt;
&lt;h3&gt;Synthetic interrupts and synthetic timers&lt;/h3&gt;
&lt;p&gt;A guest&apos;s virtual processor has its own emulated local APIC by default, but an enlightened guest can also use a &lt;strong&gt;Synthetic Interrupt Controller (SynIC)&lt;/strong&gt;, defined in the TLFS. Each virtual processor gets 16 SINT slots, a per-CPU shared message page, and a per-CPU shared event page. SINTs are how VMBus signals events to the guest without going through the legacy LAPIC fast path.&lt;/p&gt;

**SINT** (synthetic interrupt source): one of 16 logical interrupt sources per virtual processor that the Hyper-V Synthetic Interrupt Controller can signal. SINTs are reachable through MSRs (`HV_X64_MSR_SINT0` through `HV_X64_MSR_SINT15`) and back the doorbell mechanism for VMBus channels and for synthetic timers. They are paravirtual: they would not exist on a bare-metal CPU.
&lt;p&gt;The clock side is even more interesting. The Linux kernel Hyper-V clocks documentation [@kernel-clocks] describes a &lt;strong&gt;reference TSC page&lt;/strong&gt; that the hypervisor maintains in shared memory: it contains a scale factor and an offset such that&lt;/p&gt;
&lt;p&gt;$$
\text{guest\_time} = \bigl((\text{TSC} \times \text{scale}) &amp;gt;&amp;gt; 64\bigr) + \text{offset}
$$&lt;/p&gt;
&lt;p&gt;ticks at a constant 10 MHz frequency regardless of the underlying TSC. The guest&apos;s &lt;code&gt;clock_gettime&lt;/code&gt; and &lt;code&gt;gettimeofday&lt;/code&gt; can read TSC, multiply, shift, add, and return, all in user space via vDSO, with no kernel transition and no hypercall.&lt;/p&gt;
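&lt;p&gt;The multiply, shift, add sequence is easy to check numerically. The sketch below is illustrative only: the host frequencies are made up, &lt;code&gt;scale_for&lt;/code&gt; is a hypothetical helper showing how a scale could be derived for a given TSC frequency, Python integers stand in for the 128-bit intermediate product, and any synchronisation a real guest needs when reading the shared page is left out.&lt;/p&gt;

```python
REFERENCE_HZ = 10_000_000  # the 10 MHz reference clock described above
TWO_64 = 2 ** 64


def scale_for(tsc_hz):
    """64.64 fixed-point factor converting TSC ticks to 10 MHz ticks."""
    return REFERENCE_HZ * TWO_64 // tsc_hz


def guest_time(tsc, scale, offset):
    """Multiply, drop the low 64 bits (the shift), then add the offset."""
    return (tsc * scale) // TWO_64 + offset


# Hypothetical hosts: the frequencies are illustrative, not measured.
for tsc_hz in (2_500_000_000, 3_000_000_000):
    one_second_of_tsc = tsc_hz
    ticks = guest_time(one_second_of_tsc, scale_for(tsc_hz), offset=0)
    print(tsc_hz, ticks)
```

&lt;p&gt;Both hosts yield a value within one tick of ten million for one second of TSC counts; the truncated fixed-point scale loses a fraction of a tick. The guest-visible clock rate stays the same even though the underlying TSC frequencies differ, which is the property the shared page exists to provide.&lt;/p&gt;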

A web server that calls `clock_gettime` once per request, on a million-requests-per-second box, is a ridiculous workload that real systems run constantly. Without enlightenments, every call would be a `rdmsr` on a virtualised TSC or a trap into the hypervisor. With the reference TSC page, the same call is four arithmetic ops and a memory load. The kernel doc explains that this scale and offset survive live migration: &quot;in the case of a live migration to a host with a different TSC frequency, Hyper-V adjusts the scale and offset values in the shared page so that the 10 MHz frequency is maintained&quot; (Linux kernel: Hyper-V clocks [@kernel-clocks]).
&lt;p&gt;Synthetic timers complete the picture. Each virtual CPU has four synthetic timers programmable via MSRs; they fire SINTs into the SynIC. The guest does not need to touch an emulated PIT or HPET. Combined, SynIC + synthetic timers + the reference TSC page mean that an enlightened guest can do most of its time-keeping and inter-partition signalling without ever touching the legacy interrupt/timer chip surface.&lt;/p&gt;
&lt;h3&gt;The TLFS as a contract&lt;/h3&gt;
&lt;p&gt;All of this is published. The Top-Level Functional Specification [@ms-tlfs] is the document a guest author reads to know which MSRs to write, which &lt;code&gt;cpuid&lt;/code&gt; leaves to query, which hypercalls exist, and which features the hypervisor signals via feature flags. Microsoft maintains it under the Open Specification Promise. That promise is a deliberate contractual choice. Without it, Linux could not ship &lt;code&gt;drivers/hv/&lt;/code&gt; in-tree and Microsoft could not credibly claim that Linux is a first-class Hyper-V guest. The TLFS is the artefact that makes the rest of the architecture cooperative rather than reverse-engineered.&lt;/p&gt;
&lt;p&gt;The next layer up uses these primitives to build something more ambitious: a general-purpose inter-partition transport.&lt;/p&gt;
&lt;h2&gt;3. VMBus: the inter-partition transport&lt;/h2&gt;
&lt;p&gt;If enlightenments are the alphabet, VMBus is the language that synthetic devices speak. The Linux kernel VMBus document [@kernel-vmbus] puts the definition tersely: &quot;VMBus is a software construct provided by Hyper-V to guest VMs. It consists of a control path and common facilities used by synthetic devices that Hyper-V presents to guest VMs. The common facilities include software channels for communicating between the device driver in the guest VM and the synthetic device implementation that is part of Hyper-V, and signaling primitives to allow Hyper-V and the guest to interrupt each other.&quot;&lt;/p&gt;
&lt;p&gt;There is a lot in that paragraph. Let me unpack it, because this is the architectural core.&lt;/p&gt;

**VMBus**: a software-only inter-partition communication bus provided by Hyper-V. It has a control path (channel offer, open, close, rescind), and per-device data channels built on shared memory ring buffers. VMBus is not a real bus in any hardware sense; nothing on the PCIe topology is named VMBus. It is a contract between guest drivers and the hypervisor.
&lt;h3&gt;Channels and the offer protocol&lt;/h3&gt;
&lt;p&gt;Every synthetic device a guest sees corresponds to a &lt;strong&gt;VMBus channel&lt;/strong&gt;. The root partition advertises (&lt;code&gt;OfferChannel&lt;/code&gt;) the list of devices a guest is permitted to use. The guest&apos;s VMBus driver iterates the offers, matches each to a class GUID (synthetic SCSI is one GUID, synthetic NIC is another, the input-style &lt;code&gt;vmbusrhid&lt;/code&gt; device is a third), and binds an in-kernel device driver to each one. The reverse operation, &lt;code&gt;RescindChannel&lt;/code&gt;, lets the host revoke a device cleanly, which is what happens during live migration when an SR-IOV virtual function gets pulled out from under a running VM.&lt;/p&gt;

sequenceDiagram
    participant Root as Root partition (VSP)
    participant HV as Hyper-V hypervisor
    participant Guest as Guest VM (VSC)
    Root-&amp;gt;&amp;gt;HV: OfferChannel(class_guid, instance_guid)
    HV-&amp;gt;&amp;gt;Guest: ChannelOffer message via SynIC
    Guest-&amp;gt;&amp;gt;HV: OpenChannel(ringbuf_gpa, signal_event)
    HV-&amp;gt;&amp;gt;Root: Channel opened
    loop steady-state I/O
        Guest-&amp;gt;&amp;gt;Root: write descriptor + payload to ring, signal SINT
        Root-&amp;gt;&amp;gt;Guest: write response to ring, signal SINT
    end
    Root-&amp;gt;&amp;gt;HV: RescindChannel(instance_guid)
    HV-&amp;gt;&amp;gt;Guest: ChannelRescind via SynIC
    Guest-&amp;gt;&amp;gt;Root: CloseChannel
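&lt;p&gt;The guest side of the offer, bind, and rescind steps can be sketched as a lookup table keyed by class GUID. Everything here is a toy model: the GUID strings are placeholders rather than real class GUIDs, and &lt;code&gt;bind_offers&lt;/code&gt; and &lt;code&gt;rescind&lt;/code&gt; are hypothetical names, not functions from any VMBus driver.&lt;/p&gt;

```python
# Hypothetical class GUIDs: placeholders for illustration, not real values.
DRIVERS_BY_CLASS = {
    "class-guid-scsi": "storvsc",
    "class-guid-nic": "netvsc",
}


def bind_offers(offers):
    """Map each offered instance to a driver chosen by its class GUID."""
    bound = {}
    for class_guid, instance_guid in offers:
        driver = DRIVERS_BY_CLASS.get(class_guid)
        if driver is not None:  # unknown classes are simply left unbound
            bound[instance_guid] = driver
    return bound


def rescind(bound, instance_guid):
    """Drop a device the host has revoked; returns the driver to tear down."""
    return bound.pop(instance_guid, None)


offers = [
    ("class-guid-nic", "instance-1"),
    ("class-guid-scsi", "instance-2"),
    ("class-guid-unknown", "instance-3"),
]
bound = bind_offers(offers)
print(bound)
print(rescind(bound, "instance-1"), bound)
```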
&lt;h3&gt;Two ring buffers, one channel&lt;/h3&gt;
&lt;p&gt;Each open channel is two unidirectional ring buffers in shared memory: one for guest-to-host messages, one for host-to-guest. Each ring has a 4 KiB header page that holds the read index, the write index, and control flags, plus a power-of-two payload region. The guest tells the hypervisor which guest-physical pages back the ring through an object called a &lt;strong&gt;GPA Descriptor List&lt;/strong&gt; (GPADL), built up via the &lt;code&gt;vmbus_establish_gpadl&lt;/code&gt; API.&lt;/p&gt;
&lt;p&gt;The kernel doc reveals a small but durable engineering detail. The Linux guest maps the ring into its kernel virtual address space in three contiguous parts: the header page first, the ring contents next, and then &lt;em&gt;the ring contents again&lt;/em&gt;. Why? Because that lets a copy loop walk past the end of the ring without writing wrap-around code; the next byte after the ring&apos;s last byte is the ring&apos;s first byte, by virtual-memory arrangement. The same double-mapping trick shows up in other circular-buffer implementations. It costs a little address space; it saves the wrap-around check in the copy path. The Linux kernel doc is explicit that this double-mapping convenience exists in the guest only. If you are writing a userspace tool that ingests a captured VMBus ring (for forensics or debugging) you must implement wrap-around manually. This is exactly the kind of detail that source code documentation captures and prose articles forget.&lt;/p&gt;
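&lt;p&gt;For a tool working on a captured ring, the manual wrap-around read looks roughly like this. The sketch is a model, not driver code: &lt;code&gt;read_wrapped&lt;/code&gt; and &lt;code&gt;read_mirrored&lt;/code&gt; are hypothetical helpers, the 16-byte ring is a toy, and the concatenation in &lt;code&gt;read_mirrored&lt;/code&gt; only imitates the double mapping (the real arrangement maps the same pages twice rather than copying them).&lt;/p&gt;

```python
def read_wrapped(ring, start, length):
    """Copy `length` bytes beginning at `start`, wrapping at the ring end."""
    if start not in range(len(ring)) or length not in range(len(ring) + 1):
        raise ValueError("start or length outside the ring")
    first = min(length, len(ring) - start)   # bytes before the wrap point
    return ring[start:start + first] + ring[:length - first]


def read_mirrored(ring, start, length):
    """Same read against a doubled view, mimicking the double mapping."""
    mirrored = ring + ring                   # second copy follows the first
    return mirrored[start:start + length]    # no wrap handling needed


ring = bytes(range(16))                      # toy 16-byte payload region
print(read_wrapped(ring, 12, 8).hex())
print(read_mirrored(ring, 12, 8).hex())
```

&lt;p&gt;Both reads return the same bytes. The first pays for the split explicitly; the second gets it for free from the layout, which is the point of the mapping.&lt;/p&gt;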
&lt;p&gt;The total amount of GPADL-shared memory a single guest can hold is capped per Windows version. The kernel doc records the numbers: roughly &lt;strong&gt;1280 MiB on Windows Server 2019 and later&lt;/strong&gt;, roughly &lt;strong&gt;384 MiB on earlier hosts&lt;/strong&gt; (Linux kernel: VMBus [@kernel-vmbus]). For a guest with 30+ channels (multiple netvsc subchannels, multiple storvsc subchannels, vPCI, KVP, time sync, VSS, balloon, framebuffer), that ceiling is real but not yet limiting at typical ring sizes of 1 to 16 MiB per direction.&lt;/p&gt;
&lt;h3&gt;The doorbell&lt;/h3&gt;
&lt;p&gt;Shared memory alone is not enough. The guest can write into the ring all it wants; the host will not look until it is told to. Conversely, the host can write into the ring; the guest will not check until something signals it. That signal is the doorbell, and it is implemented via the &lt;strong&gt;Synthetic Interrupt Controller&lt;/strong&gt; SINTs introduced in the previous section.&lt;/p&gt;
&lt;p&gt;When the guest enqueues a request and the host&apos;s read pointer is already chasing it (i.e., the host is still processing the last batch), the guest can suppress the doorbell entirely. Only the &lt;em&gt;first&lt;/em&gt; request after the host has caught up triggers a hypercall. This is &lt;strong&gt;interrupt coalescing in software&lt;/strong&gt;, and it is the single most important performance lever on a software data plane: the round-trip cost of a &lt;code&gt;VMEXIT&lt;/code&gt; is amortised across many packets.&lt;/p&gt;
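&lt;p&gt;The suppression rule can be modelled with a queue and a counter. This is a deliberately simplified sketch: &lt;code&gt;ToyChannel&lt;/code&gt; is hypothetical, &quot;the host has caught up&quot; is approximated as &quot;the queue is empty&quot;, and the real ring-header flags that drive this decision are not modelled.&lt;/p&gt;

```python
from collections import deque


class ToyChannel:
    """One direction of a channel: a queue plus a doorbell counter."""

    def __init__(self):
        self.ring = deque()
        self.doorbells = 0

    def guest_enqueue(self, request):
        """Signal only when the host had already drained everything."""
        host_was_idle = len(self.ring) == 0
        self.ring.append(request)
        if host_was_idle:
            self.doorbells += 1   # stands in for the signalling hypercall

    def host_drain(self):
        """Host consumes every pending request in one pass."""
        drained = list(self.ring)
        self.ring.clear()
        return drained


chan = ToyChannel()
for i in range(1000):
    chan.guest_enqueue(i)         # host has not drained yet: one doorbell
chan.host_drain()
chan.guest_enqueue("next batch")  # host caught up: doorbell fires again
print(chan.doorbells)
```

&lt;p&gt;A thousand and one requests cost two doorbells in this model, which is the amortisation the paragraph above describes.&lt;/p&gt;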
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This same shape, shared memory rings plus an event-channel doorbell, was the central insight of Xen&apos;s split-driver paravirtualization model in 2003 [@xen-pv-wiki]. Hyper-V&apos;s contribution was not the shape; it was packaging the shape so unmodified Windows guests could use it via in-box drivers, and publishing the protocol so unmodified Linux could too.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;VSPs and VSCs&lt;/h3&gt;
&lt;p&gt;The two endpoints of a channel have specific names. The &lt;strong&gt;Virtualization Service Provider (VSP)&lt;/strong&gt; is the kernel module in the root partition that owns the device backend. The &lt;strong&gt;Virtualization Service Client (VSC)&lt;/strong&gt; is the guest-side driver that talks to the VSP through the channel. Microsoft&apos;s own architecture page is precise: &quot;the Hyper-V-specific I/O architecture consists of virtualization service providers (VSPs) in the root partition and virtualization service clients (VSCs) in the child partition. Each service is exposed as a device over VM Bus, which acts as an I/O bus and enables high-performance communication between VMs that use mechanisms such as shared memory&quot; (Microsoft Learn: Hyper-V architecture [@ms-hyperv-architecture-perf]).&lt;/p&gt;

**VSP** (Virtualization Service Provider): a kernel module in the root partition that exposes a synthetic device backend to guests over a VMBus channel. Examples: `vmswitch.sys` (synthetic NIC), `storvsp.sys` (synthetic SCSI), the `vmbusrhid` server (synthetic input). **VSC** (Virtualization Service Client): the matching driver in the guest that consumes the channel and presents an OS-native device interface (a NIC, a SCSI controller, a keyboard) to the rest of the kernel.
&lt;p&gt;The split is symmetric in transport (both sides use the same ring) but asymmetric in trust. The VSP runs in the &lt;em&gt;most&lt;/em&gt; privileged context on the box, the root partition&apos;s kernel. The VSC runs in a normal guest kernel. Every byte that flows from guest to host crosses a trust boundary and gets parsed by code with full system privilege. The next two sections will return to this fact at length, because it is where the security story lives.&lt;/p&gt;
&lt;h3&gt;Why this works for closed-source guests&lt;/h3&gt;
&lt;p&gt;The Xen project tried something similar in 2003 with &lt;code&gt;netfront&lt;/code&gt;/&lt;code&gt;blkfront&lt;/code&gt; rings and event channels, but Xen PV required a paravirtualised guest kernel: the guest had to know it was running on Xen at compile time. Closed-source guests like Windows could not be modified, so Xen&apos;s wiki [@xen-pv-wiki] eventually documents PV-on-HVM as a workaround.&lt;/p&gt;
&lt;p&gt;Hyper-V finessed this with hardware virtualization. The guest kernel runs unmodified inside VT-x or AMD-V; CPU-level privilege separation handles the privileged instructions. The only thing the guest needs to do to opt into VMBus is &lt;em&gt;load a driver&lt;/em&gt;. Every supported Windows version since Windows 7 / Server 2008 R2 ships those drivers in-box. Linux ships them in-tree from kernel 2.6.32 onward. There is no separate &quot;install paravirt drivers&quot; step, which is why Hyper-V &quot;just works&quot; for almost any guest you point at it.&lt;/p&gt;
&lt;p&gt;The transport is settled. What rides on it is a catalogue.&lt;/p&gt;
&lt;h2&gt;4. Synthetic device classes: storage, network, input, video, vPCI&lt;/h2&gt;
&lt;p&gt;A modern Hyper-V guest, on first boot, sees a small zoo of devices that have nothing to do with PC hardware. There is no IDE controller, no PS/2 keyboard, no Cirrus VGA. There is a synthetic SCSI controller, a synthetic NIC, a synthetic keyboard and mouse, a synthetic framebuffer, and (often) a synthetic PCI passthrough channel. Each is a VSP/VSC pair on top of VMBus.&lt;/p&gt;
&lt;p&gt;The Linux kernel VMBus document [@kernel-vmbus] enumerates the catalogue: synthetic SCSI controller (&lt;code&gt;storvsc&lt;/code&gt;), synthetic NIC (&lt;code&gt;netvsc&lt;/code&gt;), synthetic framebuffer (&lt;code&gt;synthvid&lt;/code&gt;), synthetic keyboard, synthetic mouse, PCI passthrough, plus the non-device services: heartbeat, time sync, shutdown, memory balloon, KVP exchange, and online backup (VSS).&lt;/p&gt;

flowchart LR
    subgraph Guest
        nv[&quot;netvsc (NIC)&quot;]
        st[&quot;storvsc (SCSI)&quot;]
        sv[&quot;synthvid (framebuffer)&quot;]
        kb[&quot;hyperv-keyboard&quot;]
        ms[&quot;hyperv-mouse&quot;]
        pc[&quot;pci-hyperv (vPCI)&quot;]
        kvp[&quot;hv_kvp (KVP)&quot;]
        ts[&quot;hv_utils (timesync, shutdown, heartbeat)&quot;]
    end
    subgraph Root
        vsw[&quot;vmswitch.sys&quot;]
        sto[&quot;storvsp.sys&quot;]
        sfb[&quot;synthvid VSP&quot;]
        rhid[&quot;vmbusrhid VSP&quot;]
        vpci[&quot;vPCI VSP&quot;]
        kvpd[&quot;KVP daemon&quot;]
        tsd[&quot;IS daemons&quot;]
    end
    nv -- &quot;VMBus channel&quot; --- vsw
    st -- &quot;VMBus channel(s)&quot; --- sto
    sv -- &quot;VMBus channel&quot; --- sfb
    kb -- &quot;VMBus channel&quot; --- rhid
    ms -- &quot;VMBus channel&quot; --- rhid
    pc -- &quot;VMBus channel&quot; --- vpci
    kvp -- &quot;VMBus channel&quot; --- kvpd
    ts -- &quot;VMBus channel&quot; --- tsd
&lt;h3&gt;Synthetic SCSI: storvsc&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;storvsc&lt;/code&gt; VSC presents itself to the guest as a SCSI host bus adapter. Disks attached to the VM appear as SCSI LUNs hanging off that HBA. The wire protocol uses ring buffers carrying SRB (SCSI Request Block) style commands. To scale, storvsc can open multiple &lt;strong&gt;sub-channels&lt;/strong&gt;, one per host CPU, so that I/O completion interrupts and request submission spread across cores rather than serialising on a single VMBus channel.&lt;/p&gt;
&lt;p&gt;This is also why Hyper-V&apos;s &quot;Generation 2&quot; VMs work. A Generation 2 VM [@ms-gen1-gen2-vms], introduced in Windows Server 2012 R2 in 2013, has no IDE controller in the boot path at all. UEFI loads the OS loader from a synthetic SCSI device, the OS loader hands off to the kernel, and the kernel binds storvsc to the same device. The legacy IDE emulator simply never runs. That removes a lot of attack surface and lets boot volumes grow up to 64 TB on VHDX.&lt;/p&gt;
&lt;h3&gt;Synthetic NIC: netvsc&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;netvsc&lt;/code&gt; is the synthetic NIC. The wire protocol historically wrapped Microsoft&apos;s NDIS-style RNDIS frames around payloads sent through the channel ring, which is why some Linux discussions mention &quot;RNDIS frames over VMBus.&quot; The Linux driver lives in &lt;code&gt;drivers/net/hyperv/&lt;/code&gt; and the kernel netvsc documentation [@kernel-netvsc] describes how it can spread receive-side traffic across multiple VMBus subchannels via Receive Side Scaling.&lt;/p&gt;
&lt;p&gt;netvsc is also the one device class where Hyper-V composes with hardware passthrough. Section 8 will take this apart in detail; for now, note that the same &lt;code&gt;netvsc&lt;/code&gt; VSC can run alongside an SR-IOV virtual function in the guest, with &lt;code&gt;netvsc&lt;/code&gt; acting as the slow-path failover and the VF carrying the steady-state traffic.&lt;/p&gt;
&lt;h3&gt;Synthetic input: vmbusrhid&lt;/h3&gt;
&lt;p&gt;The synthetic keyboard, the synthetic mouse, and a few related input streams ride on a server in the root partition called &lt;code&gt;vmbusrhid&lt;/code&gt; (the name is shorthand for &quot;VMBus relay HID&quot;). It is a small surface in bytes, but architecturally it has the same shape as netvsc: guest-controllable messages parsed in kernel mode in the root partition. Anyone evaluating the trust boundary should treat it the same way as netvsc, even though the data rate is six orders of magnitude lower.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A path that carries 100 keystrokes per second is, on the wire, almost free. As an attack surface, it is identical to a path that carries a million packets per second: both are guest-controlled bytes parsed by privileged code. Section 7 walks through why the security community treats &lt;code&gt;vmbusrhid&lt;/code&gt; the way it treats &lt;code&gt;vmswitch.sys&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Synthetic video: synthvid&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;synthvid&lt;/code&gt; is a synthetic framebuffer. It is what lets you connect to a Hyper-V VM through the Virtual Machine Connection client without dragging in an emulated VGA. It is intentionally simple: there is no 3D acceleration in the synthetic path. Workloads that need GPU acceleration use a different mechanism, vPCI / DDA, to assign a real GPU to the VM.&lt;/p&gt;
&lt;h3&gt;vPCI: synthetic PCI passthrough&lt;/h3&gt;
&lt;p&gt;The most subtle device class is &lt;code&gt;pci-hyperv&lt;/code&gt;, which exposes a virtual PCIe topology to the guest. The Linux kernel vPCI document [@kernel-vpci] describes the trick: a passthrough device is offered to the guest &lt;em&gt;initially&lt;/em&gt; over VMBus (the channel carries the device&apos;s PCI configuration space and BARs), and once the guest&apos;s vPCI driver has constructed a real PCI device object for it, the device has a dual identity: it is both a VMBus device and a normal PCIe device. The vendor driver can then load against it.&lt;/p&gt;
&lt;p&gt;This is the mechanism behind both Hyper-V&apos;s Discrete Device Assignment (DDA) [@ms-dda] and Azure&apos;s Accelerated Networking, which we will return to in Section 8. The DDA planning document is explicit that Microsoft formally supports DDA for &lt;strong&gt;GPUs and NVMe storage&lt;/strong&gt; as device classes; other PCIe devices are &quot;likely to work&quot; but require vendor support.&lt;/p&gt;
&lt;h3&gt;Generation-1 vs Generation-2: a quick decoder&lt;/h3&gt;
&lt;p&gt;Putting the device classes side by side clarifies why the move from Generation-1 to Generation-2 VMs simplified so much:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Element&lt;/th&gt;
&lt;th&gt;Generation-1 VM (legacy)&lt;/th&gt;
&lt;th&gt;Generation-2 VM (since 2013)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Firmware&lt;/td&gt;
&lt;td&gt;BIOS&lt;/td&gt;
&lt;td&gt;UEFI with Secure Boot&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Boot disk&lt;/td&gt;
&lt;td&gt;Emulated IDE&lt;/td&gt;
&lt;td&gt;Synthetic SCSI (&lt;code&gt;storvsc&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network on boot&lt;/td&gt;
&lt;td&gt;Emulated DEC 21140 fallback&lt;/td&gt;
&lt;td&gt;Synthetic NIC (&lt;code&gt;netvsc&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Input&lt;/td&gt;
&lt;td&gt;Emulated PS/2 + &lt;code&gt;vmbusrhid&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;vmbusrhid&lt;/code&gt; only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Display&lt;/td&gt;
&lt;td&gt;Emulated VGA + &lt;code&gt;synthvid&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;synthvid&lt;/code&gt; only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max boot VHDX&lt;/td&gt;
&lt;td&gt;2 TB&lt;/td&gt;
&lt;td&gt;64 TB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source&lt;/td&gt;
&lt;td&gt;Microsoft Learn: Gen 1 vs Gen 2 [@ms-gen1-gen2-vms]&lt;/td&gt;
&lt;td&gt;Same&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Generation-2 is what the Hyper-V architecture wanted to be from the beginning: an all-synthetic stack with no fallback to imaginary 1990s chipsets. The two-generation existence was not a design preference; it was the cost of supporting older operating systems whose boot loaders only knew about BIOS and IDE. Today, every modern Windows and modern Linux supports Generation-2; Generation-1 remains for legacy guests.&lt;/p&gt;
&lt;h3&gt;Counting boundary crossings&lt;/h3&gt;
&lt;p&gt;The shape of the hot path is now visible. To send one network packet from a guest:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The guest writes one descriptor and one payload copy into the netvsc TX ring (one memory copy).&lt;/li&gt;
&lt;li&gt;The guest possibly fires a doorbell (one hypercall, often suppressed if the host has not caught up).&lt;/li&gt;
&lt;li&gt;The host&apos;s &lt;code&gt;vmswitch.sys&lt;/code&gt; reaps the descriptor, parses it, and forwards it through the virtual switch to a real NIC.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;A single packet&apos;s hot path is &lt;strong&gt;at most one hypercall and one memory copy in the guest&lt;/strong&gt;, plus host-side ring traversal. Section 8&apos;s comparison table will quantify how this stacks up against virtio and SR-IOV, but the scale is clear: paravirt I/O on Hyper-V is orders of magnitude cheaper per packet than full PC emulation, and the gap closes only when you go all the way to hardware passthrough.&lt;/p&gt;
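&lt;p&gt;The accounting above can be made concrete with a toy model. The sketch below assumes a simplified ring in which the host drains everything it finds and then goes idle; &lt;code&gt;ToyTxRing&lt;/code&gt; and its fields are hypothetical names for illustration, not part of &lt;code&gt;hv_netvsc&lt;/code&gt;. It counts copies and doorbell hypercalls for a burst of packets.&lt;/p&gt;

```javascript
// Toy model of the guest-side TX hot path: one copy per packet, and a
// doorbell hypercall only when the host is not already draining the ring.
// All names are hypothetical; this is not the hv_netvsc implementation.
class ToyTxRing {
  constructor() {
    this.pending = [];          // descriptors written but not yet reaped
    this.hostIsPolling = false; // true while the host is draining the ring
    this.copies = 0;
    this.hypercalls = 0;
  }

  // Guest side: copy one packet into the ring, ring the doorbell if needed.
  send(packet) {
    this.pending.push(packet);
    this.copies += 1;
    if (!this.hostIsPolling) {
      this.hypercalls += 1;      // the only hypercall on this path
      this.hostIsPolling = true; // host wakes up and starts draining
    }
  }

  // Host side: reap everything, then go idle until the next doorbell.
  hostDrain() {
    const reaped = this.pending.splice(0);
    this.hostIsPolling = false;
    return reaped.length;
  }
}

const txRing = new ToyTxRing();
for (let i = 0; i !== 8; i += 1) txRing.send({ id: i }); // burst of 8 packets
const reapedCount = txRing.hostDrain();
txRing.send({ id: 8 });                                  // host was idle again

console.log({ reapedCount, copies: txRing.copies, hypercalls: txRing.hypercalls });
```

&lt;p&gt;In this model nine packets cost nine copies but only two doorbells, which is the sense in which the hypercall is often suppressed: it is paid per burst, not per packet.&lt;/p&gt;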
&lt;p&gt;The catalogue is set. Now, who actually wrote the Linux side of all this?&lt;/p&gt;
&lt;h2&gt;5. Linux Integration Services: Microsoft writes Linux drivers&lt;/h2&gt;
&lt;p&gt;In December 2009, Microsoft did something quietly historic. Linux kernel 2.6.32 merged a set of drivers under &lt;code&gt;drivers/staging/hv/&lt;/code&gt;, contributed by Microsoft itself, that taught the Linux kernel to be an enlightened Hyper-V guest. The kernel.org Hyper-V index page [@kernel-hyperv-index] is the maintained landing page for that work. Over the next several releases the drivers moved out of &lt;code&gt;staging/&lt;/code&gt;, settled at &lt;code&gt;drivers/hv/&lt;/code&gt;, &lt;code&gt;drivers/net/hyperv/&lt;/code&gt;, &lt;code&gt;drivers/scsi/storvsc_drv.c&lt;/code&gt;, and &lt;code&gt;drivers/pci/controller/pci-hyperv.c&lt;/code&gt;, and became the default in every mainstream distribution.&lt;/p&gt;
&lt;p&gt;That set of drivers is collectively called &lt;strong&gt;Linux Integration Services (LIS)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Linux Integration Services (LIS):&lt;/strong&gt; The set of in-kernel Hyper-V guest drivers that Microsoft contributes to upstream Linux. It includes &lt;code&gt;hv_vmbus&lt;/code&gt; (the VMBus core), &lt;code&gt;hv_netvsc&lt;/code&gt; (synthetic NIC), &lt;code&gt;hv_storvsc&lt;/code&gt; (synthetic SCSI), &lt;code&gt;hv_utils&lt;/code&gt; (KVP, time sync, shutdown, heartbeat, VSS), &lt;code&gt;pci-hyperv&lt;/code&gt; (vPCI), and &lt;code&gt;hv_balloon&lt;/code&gt; (memory ballooning). The same code that Microsoft maintains in the Linux tree powers Linux guests on Hyper-V on Windows Server, on Azure, and on developer Hyper-V on Windows 11.&lt;/p&gt;
&lt;p&gt;The reason this matters is bigger than convenience. In 2009, Linux had a long, painful history with Hyper-V&apos;s competitors. VMware shipped &lt;code&gt;open-vm-tools&lt;/code&gt; but the deepest paravirt drivers (VMXNET3, PVSCSI) lived in vendor packages. Xen&apos;s PV drivers existed in-tree but their evolution depended on Citrix and the Xen project. By contributing the full driver stack upstream and committing to keep it there, Microsoft chose a different route: they put the &lt;em&gt;spec&lt;/em&gt; (the TLFS) and the &lt;em&gt;implementation&lt;/em&gt; (LIS) in the open at the same time.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Microsoft did not just publish a hypervisor specification and hope Linux would adopt it. They wrote the Linux drivers themselves and upstreamed them, and then they kept doing it for fifteen years.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;You can see the maintenance pattern in any current kernel. The &lt;code&gt;drivers/hv/&lt;/code&gt; directory has continuous commit activity from Microsoft engineers. Kernel-doc files like the VMBus [@kernel-vmbus], clocks [@kernel-clocks], vPCI [@kernel-vpci], overview [@kernel-hyperv-overview], and CoCo VM [@kernel-coco] pages are written by the same engineers who write the drivers. Several of those documents are the most lucid descriptions of the architecture that exist anywhere in public.&lt;/p&gt;
&lt;p&gt;One unexpected consequence: the Linux kernel docs are often easier to read for the architecture than Microsoft&apos;s own customer-facing docs. The customer docs answer &quot;how do I configure this?&quot;; the kernel docs answer &quot;what is actually happening?&quot; When researching this article, I found that the cleanest single description of VMBus channel lifecycle is the Linux kernel doc, not the TLFS.&lt;/p&gt;
&lt;h3&gt;What &quot;in-box&quot; really means&lt;/h3&gt;
&lt;p&gt;Both major guests now ship VMBus support without any post-install step:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;On Windows, the VMBus client stack is built into every supported Windows version since Windows 7 / Windows Server 2008 R2. The legacy Integration Services package, which once shipped as an ISO you mounted into the VM, is no longer needed on supported Windows.&lt;/li&gt;
&lt;li&gt;On Linux, the drivers are in-tree from kernel 2.6.32 (December 2009) onward and ship in every mainstream distro.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The kernel.org Hyper-V overview document [@kernel-hyperv-overview] explicitly warns against installing legacy LIS packages on top of a kernel that already has the in-tree drivers: it can break MSI-X handling and PCI passthrough. This is the kind of operational footgun that survives precisely because the in-box answer is correct and the LIS package is a holdover from earlier kernels.&lt;/p&gt;
&lt;h3&gt;A practical smoke test&lt;/h3&gt;
&lt;p&gt;You can confirm a Linux guest is using its enlightenments without any vendor tooling. The kernel exposes &lt;code&gt;cpuid&lt;/code&gt; leaves and Hyper-V detection through &lt;code&gt;dmesg&lt;/code&gt; and through &lt;code&gt;/sys&lt;/code&gt;. A small script makes it concrete:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-javascript&quot;&gt;// This logic mirrors what `dmesg | grep -i hyperv` and a look at
// /sys/bus/vmbus/devices would tell you on a real Linux Hyper-V guest.
// The observations below are hard-coded sample values, not live readings.

const guestObservations = {
  cpuidSig: &apos;Microsoft Hv&apos;,       // vendor signature returned by CPUID leaf 0x40000000
  guestOsIdMsr: 0x40000000,       // HV_X64_MSR_GUEST_OS_ID, written by the guest
  hypercallMsr: 0x40000001,       // HV_X64_MSR_HYPERCALL, used to enable the hypercall page
  vmbusModuleLoaded: true,
  netvscDevice: &apos;/sys/class/net/eth0/device/driver&apos;,
  netvscDriverName: &apos;hv_netvsc&apos;,
  storvscModuleLoaded: true,
};

// True only when the signature, the VMBus core, and both synthetic drivers are present.
function isEnlightenedHyperVGuest(o) {
  if (o.cpuidSig !== &apos;Microsoft Hv&apos;) return false;
  if (!o.vmbusModuleLoaded) return false;
  if (o.netvscDriverName !== &apos;hv_netvsc&apos;) return false;
  if (!o.storvscModuleLoaded) return false;
  return true;
}

console.log(
  isEnlightenedHyperVGuest(guestObservations)
    ? &apos;Yes: Hyper-V enlightened, using netvsc + storvsc&apos;
    : &apos;No: running on emulated PC hardware or non-Hyper-V hypervisor&apos;
);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The point is not the script itself (anyone can write a few lines of &lt;code&gt;awk&lt;/code&gt; against &lt;code&gt;dmesg&lt;/code&gt;); it is that the verification surface is &lt;em&gt;public&lt;/em&gt;. The hypervisor vendor signature reported through CPUID, the MSRs, the kernel module names, and the &lt;code&gt;/sys&lt;/code&gt; paths are all documented. There is nothing to reverse-engineer.&lt;/p&gt;
&lt;h3&gt;Why this earned trust&lt;/h3&gt;
&lt;p&gt;Two pieces of practical evidence persuaded the Linux community that LIS was not a strategic trap:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The drivers stayed upstream.&lt;/strong&gt; From 2009 to the present, Microsoft has maintained the &lt;code&gt;drivers/hv/&lt;/code&gt; tree, responded to maintainer feedback, and shipped patches through the normal kernel process.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The TLFS stayed accurate.&lt;/strong&gt; Successive Hyper-V releases either matched what the TLFS said or updated the TLFS. There was no second, secret protocol.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The combination put Microsoft in the unusual position of being the most open hypervisor vendor for Linux guest support. (VirtIO on KVM has a richer cross-vendor story; that comparison is Section 8.) This open posture is also what set up the 2024 OpenVMM open-sourcing as a credible move rather than a stunt.&lt;/p&gt;
&lt;p&gt;But before we get to OpenVMM, we need to look at a different way Hyper-V matters: not just as a substrate for VMs, but as a substrate for in-VM security boundaries inside Windows itself.&lt;/p&gt;
&lt;h2&gt;6. VBS and HVCI: Hyper-V as the trust anchor inside Windows&lt;/h2&gt;
&lt;p&gt;Up to this point the article has treated Hyper-V as a virtualization product: a thing that hosts VMs. Starting in Windows 10 and Windows Server 2016 [@ms-server-2016], Microsoft began using the same hypervisor for a different job: enforcing security boundaries inside a single OS install. The umbrella name is &lt;strong&gt;Virtualization-Based Security (VBS)&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The mechanism is simple in description and subtle in consequences. The hypervisor splits a single guest&apos;s address space into two &lt;strong&gt;Virtual Trust Levels (VTLs)&lt;/strong&gt;. The lower one, VTL0, runs the normal Windows kernel and user mode (this is where &lt;code&gt;explorer.exe&lt;/code&gt; and your browser live). The higher one, VTL1, runs a much smaller stack called the &lt;strong&gt;Secure Kernel&lt;/strong&gt; plus a set of isolated user-mode services called &lt;strong&gt;trustlets&lt;/strong&gt;. A compromise of VTL0, even of &lt;code&gt;ntoskrnl.exe&lt;/code&gt;, cannot read or write VTL1 memory because the hypervisor enforces that boundary using the same hardware machinery (Intel EPT / AMD NPT, plus Intel VT-d / AMD-Vi for DMA) that it uses to isolate one VM from another.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Virtual Trust Level (VTL):&lt;/strong&gt; A Hyper-V construct that partitions a single guest&apos;s address space into multiple privilege tiers enforced by the hypervisor. VTL0 hosts the normal kernel and user mode; VTL1 hosts the Secure Kernel and trustlets. The hypervisor presents each VTL with its own separate set of memory mappings, system registers, and interrupt state, so code running at VTL0 cannot read VTL1&apos;s memory even if it runs with NT AUTHORITY\SYSTEM privilege.&lt;/p&gt;

flowchart TD
    HV[&quot;Hyper-V hypervisor&quot;]
    subgraph Guest[&quot;A single Windows guest&quot;]
        subgraph VTL0[&quot;VTL0 (normal world)&quot;]
            User0[&quot;User mode: apps&quot;]
            Kernel0[&quot;NT kernel&quot;]
        end
        subgraph VTL1[&quot;VTL1 (secure world)&quot;]
            SK[&quot;Secure Kernel&quot;]
            Trustlets[&quot;Trustlets: LsaIso, BioIso, ...&quot;]
        end
    end
    HV --&amp;gt; Guest
    HV -. &quot;EPT + IOMMU enforcement&quot; .-&amp;gt; VTL0
    HV -. &quot;EPT + IOMMU enforcement&quot; .-&amp;gt; VTL1
    Kernel0 -. &quot;VTL switch (hypercall)&quot; .-&amp;gt; SK
&lt;h3&gt;What lives in VTL1&lt;/h3&gt;
&lt;p&gt;The flagship inhabitant of VTL1 is &lt;strong&gt;Hypervisor-protected Code Integrity (HVCI)&lt;/strong&gt;, which moves kernel-mode page-table integrity checking into the Secure Kernel. With HVCI on, no VTL0 driver can mark a kernel page as both writable and executable; the Secure Kernel mediates the page tables and refuses the request. The result is that attackers who already have code execution in the NT kernel cannot trivially load arbitrary unsigned kernel code or build new executable JIT pages on the fly.&lt;/p&gt;
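&lt;p&gt;The rule described above is a write-xor-execute policy with a code-integrity gate. A minimal sketch of that decision, assuming a request object with hypothetical &lt;code&gt;writable&lt;/code&gt;, &lt;code&gt;executable&lt;/code&gt;, and &lt;code&gt;codeIntegrityVerified&lt;/code&gt; fields, looks like the following. It illustrates the policy only; it is not a Windows or Hyper-V interface.&lt;/p&gt;

```javascript
// Toy W^X mediator: VTL0 asks for page permissions, a hypothetical
// secure-side policy function decides. Field names are illustrative only.
function mediateKernelPageRequest(request) {
  const wantsWrite = request.writable === true;
  const wantsExec = request.executable === true;

  // Data pages may be writable as long as they are never executable.
  if (!wantsExec) return { granted: true, reason: 'non-executable page' };

  // Executable pages must never be writable at the same time.
  if (wantsWrite) return { granted: false, reason: 'writable and executable' };

  // Executable pages must hold code that passed code-integrity checks.
  if (request.codeIntegrityVerified !== true) {
    return { granted: false, reason: 'unverified code' };
  }
  return { granted: true, reason: 'verified read-only code' };
}

const pageRequests = [
  { writable: true, executable: false },
  { writable: false, executable: true, codeIntegrityVerified: true },
  { writable: true, executable: true, codeIntegrityVerified: true },
  { writable: false, executable: true, codeIntegrityVerified: false },
];
const decisions = pageRequests.map((r) => mediateKernelPageRequest(r));
decisions.forEach((d) => console.log(d.granted, d.reason));
```

&lt;p&gt;The last two requests are the ones an attacker with NT kernel code execution would want, and both are refused by code that the attacker cannot patch from VTL0.&lt;/p&gt;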
&lt;p&gt;The other tenants of VTL1 are &lt;strong&gt;trustlets&lt;/strong&gt;. The most familiar is &lt;code&gt;lsaiso.exe&lt;/code&gt; (LSA Isolation), which holds the cached domain credentials that historically lived in &lt;code&gt;lsass.exe&lt;/code&gt; and were the prime target for tools like Mimikatz. With Credential Guard on, those secrets move to a trustlet whose memory is unreadable from VTL0; even SYSTEM-level malware in the normal world cannot extract them. Other trustlets handle biometric template storage, key isolation for code integrity policy, and similar small, security-sensitive workloads.&lt;/p&gt;
&lt;h3&gt;Why the hypervisor is the right place for this&lt;/h3&gt;
&lt;p&gt;Putting these protections inside the hypervisor rather than inside the kernel has a property that no in-kernel mitigation can match: &lt;strong&gt;the protected component does not share an address space with the attacker&lt;/strong&gt;. A defence built inside &lt;code&gt;ntoskrnl.exe&lt;/code&gt; (&lt;code&gt;PatchGuard&lt;/code&gt;, &lt;code&gt;KASLR&lt;/code&gt;, control-flow guard) lives in the same memory the attacker is trying to corrupt. A defence built inside VTL1 lives in memory the attacker cannot touch, because the page tables that map it are themselves invisible from VTL0.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Pre-VBS Windows had decades of memory-safety bugs in the NT kernel. After VBS, exploiting one of those bugs no longer immediately yields the attacker the ability to read LSASS secrets or load arbitrary kernel code. The attacker now needs a &lt;em&gt;second&lt;/em&gt; bug, in the much smaller Secure Kernel codebase. The defender&apos;s effective budget went up by a large multiplier without rewriting a single line of NT.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;How this connects back to VMBus&lt;/h3&gt;
&lt;p&gt;VBS would not be possible without the work the previous sections described. The Secure Kernel is what runs in VTL1; it needs to communicate with VTL0 for ordinary system services (the &lt;code&gt;lsaiso.exe&lt;/code&gt; process must respond to authentication requests from VTL0 callers, the HVCI mediator must answer page-table requests, and so on). The signalling and shared-memory primitives that make those calls cheap are the same SynIC and shared-page primitives that VMBus uses between partitions.&lt;/p&gt;
&lt;p&gt;In other words, the architecture Microsoft built in 2008 to give a Windows VM a fast network card became, in 2016, the architecture that gives a single Windows install a security boundary stronger than its own kernel. The same hypervisor, the same trust-mediation primitives, two completely different applications.&lt;/p&gt;
&lt;p&gt;Windows Server 2019 [@ms-server-2019] extended this further with Hyper-V isolation for containers, where a container&apos;s lightweight VM gets its own kernel inside a tiny VTL0 of its own. The pattern is consistent: every time Windows wanted a stronger isolation primitive, the answer was &quot;use the hypervisor.&quot;&lt;/p&gt;
&lt;p&gt;This dual-use is the reason a serious Windows security review touches the Hyper-V codebase even on machines that nobody thinks of as virtualization hosts. A Hyper-V escape (a guest-to-host VMBus exploit) is not just &quot;an exploit against Azure&quot;; it is also, on a typical Windows 11 desktop with VBS enabled, an exploit against the boundary that protects LSASS secrets from kernel-mode malware.&lt;/p&gt;
&lt;p&gt;That makes the next section&apos;s question urgent: how strong is the VMBus boundary, in practice?&lt;/p&gt;
&lt;h2&gt;7. VMBus security: every message is a parser at the trust boundary&lt;/h2&gt;
&lt;p&gt;Here is the part of the architecture worth being honest about. The same property that makes VMBus fast, namely that the host-side VSP runs in the root partition&apos;s kernel and parses guest-supplied bytes directly, also makes the VSP the most consequential piece of attack surface in the entire stack. Microsoft itself prices it that way: the Hyper-V Bug Bounty Program [@ms-bounty-hyperv] pays up to &lt;strong&gt;USD 250,000&lt;/strong&gt; specifically for guest-to-host escapes that hit this surface, which is among the highest payouts Microsoft offers for any category of vulnerability.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Every byte that crosses a VMBus channel from a guest is a byte that a kernel-mode parser in the most privileged partition on the host has to interpret. The performance argument for a software data plane and the security argument against it are the same argument, looked at from opposite directions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The historical record&lt;/h3&gt;
&lt;p&gt;Three CVEs make the pattern concrete:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CVE-2017-0075&lt;/strong&gt; is the Hyper-V escape that the Qihoo 360 Vulcan Team demonstrated at Pwn2Own 2017. The NVD entry [@nvd-cve-2017-0075] describes it as a Hyper-V flaw that &quot;allows guest OS users to execute arbitrary code on the host OS via a crafted application.&quot; The reachable code was in a VMBus message handler on the host side.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CVE-2021-28476&lt;/strong&gt; is the canonical example. The NVD record [@nvd-cve-2021-28476] classifies it as a critical Hyper-V remote code execution vulnerability with a CVSS score of 9.9. The Akamai writeup with Guardicore and SafeBreach [@akamai-cve-2021-28476] traces the bug to &lt;code&gt;vmswitch.sys&lt;/code&gt;, the synthetic-NIC VSP, and shows it had been present in production since the August 2019 vmswitch build. The exploit primitive is exactly what the architecture invites: a guest crafts an OID-style RNDIS request, sends it through the netvsc VMBus channel, and the host&apos;s kernel parser misvalidates a length, producing memory corruption in the most privileged kernel on the box.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CVE-2024-21407&lt;/strong&gt; is a more recent Hyper-V remote code execution vulnerability patched in March 2024 (NVD [@nvd-cve-2024-21407]). Its existence demonstrates that the bug class did not vanish; the same shape (guest-controlled message, host kernel parser, escalation to host code execution) keeps reappearing.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Payouts on the MSRC bounty page range from $5,000 for low-impact bugs to $250,000 for full guest-to-host escapes (Microsoft bounty page [@ms-bounty-hyperv]). That price point is not a marketing number; it is Microsoft signalling what its threat model says these bugs are worth. A defender pricing their own controls should give any VSP code path that parses guest-controlled data the same level of attention as an internet-facing remote service.&lt;/p&gt;
&lt;h3&gt;Why the bug class is structural&lt;/h3&gt;
&lt;p&gt;The pattern in all three CVEs is the same:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A guest writes carefully crafted bytes into a VMBus channel ring.&lt;/li&gt;
&lt;li&gt;The guest fires the doorbell.&lt;/li&gt;
&lt;li&gt;The host&apos;s VSP, running in the root partition&apos;s kernel, dequeues the message.&lt;/li&gt;
&lt;li&gt;The VSP parses the message in C or C++ kernel code.&lt;/li&gt;
&lt;li&gt;A memory-safety mistake (length confusion, missing bounds check, integer overflow) becomes a write or read primitive in the host kernel.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;There is no exotic mechanism here. The exploit surface is &quot;kernel C code parsing untrusted input,&quot; which has been the dominant source of remote-code-execution bugs in operating systems since the 1990s. The novelty is the location: the parser runs in the kernel of the most privileged partition on the box, with access to every other tenant&apos;s memory.&lt;/p&gt;
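&lt;p&gt;The length-confusion step in the list above is easy to show in miniature. The sketch below assumes a made-up wire format (a 16-bit type, a 16-bit declared length, then the payload) that stands in for any guest-controlled message; it is not the RNDIS or VMBus layout, and &lt;code&gt;parseGuestMessage&lt;/code&gt; is a hypothetical name. The two length checks are the ones whose absence turns a parser into a memory-corruption primitive in an unchecked language.&lt;/p&gt;

```javascript
// Hypothetical wire format: [u16 type][u16 declaredLength][payload...],
// little-endian. Illustrative only; not the RNDIS or VMBus message layout.
const HEADER_BYTES = 4;
const MAX_PAYLOAD_BYTES = 1024;

function parseGuestMessage(bytes) {
  if (HEADER_BYTES > bytes.length) return { ok: false, error: 'truncated header' };

  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const type = view.getUint16(0, true);
  const declaredLength = view.getUint16(2, true);

  // Never trust the declared length: check it against a protocol maximum
  // and against the number of bytes that actually arrived.
  if (declaredLength > MAX_PAYLOAD_BYTES) return { ok: false, error: 'length over limit' };
  if (declaredLength > bytes.length - HEADER_BYTES) {
    return { ok: false, error: 'length exceeds message' };
  }

  // slice() copies, so the parsed payload is a private snapshot.
  const payload = bytes.slice(HEADER_BYTES, HEADER_BYTES + declaredLength);
  return { ok: true, type, payload };
}

const honestMessage = Uint8Array.from([1, 0, 2, 0, 0xaa, 0xbb]);
const lyingMessage = Uint8Array.from([1, 0, 0xff, 0x00, 0xaa, 0xbb]); // claims 255 bytes, carries 2

const honestResult = parseGuestMessage(honestMessage);
const lyingResult = parseGuestMessage(lyingMessage);
console.log(honestResult.ok, lyingResult.ok, lyingResult.error);
```

&lt;p&gt;In JavaScript a missed check produces a wrong answer. In kernel C or C++ the same omission produces an out-of-bounds read or write, which is why the bug class keeps recurring.&lt;/p&gt;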

sequenceDiagram
    participant Mal as Malicious guest VM
    participant Ring as VMBus ring (shared memory)
    participant SInt as Synthetic Interrupt Controller
    participant VSP as Host VSP (e.g., vmswitch.sys, kernel)
    Mal-&amp;gt;&amp;gt;Ring: Write crafted RNDIS-style message
    Mal-&amp;gt;&amp;gt;SInt: Hypercall: signal channel event
    SInt--&amp;gt;&amp;gt;VSP: SINT delivered on host CPU
    VSP-&amp;gt;&amp;gt;Ring: Read message header
    note over VSP: Length confusion / missing bounds check
    VSP-&amp;gt;&amp;gt;VSP: Out-of-bounds write in root partition kernel
    note over VSP: Result: arbitrary code in the most privileged partition
&lt;h3&gt;Mitigations short of a rewrite&lt;/h3&gt;
&lt;p&gt;Microsoft&apos;s first line of defence is the same one every kernel team uses: ASLR, control-flow integrity, kernel hardening, fuzzing the parsers, code review of every new device class, and, on Azure specifically, isolating each tenant&apos;s compute hypervisor so a single compromised host does not become a multi-tenant disaster. The MSRC bounty program is partly a procurement mechanism for this same effort: pay researchers to find and report bugs before attackers find them in the wild.&lt;/p&gt;
&lt;p&gt;A second line of defence is &lt;strong&gt;Generation-2 VMs&lt;/strong&gt; (Microsoft Learn [@ms-gen1-gen2-vms]), which remove the legacy emulators (IDE, PS/2, PIC) from the host data path entirely. Every emulator removed is one fewer parser in the most privileged kernel.&lt;/p&gt;
&lt;p&gt;A third is the &quot;minimise root-partition exposure&quot; guidance on the Microsoft Hyper-V architecture page [@ms-hyperv-architecture-perf]: configure hosts with the smallest set of root-partition services that the workload requires, since every service is potential attack surface.&lt;/p&gt;
&lt;p&gt;These all help, but none of them change the structural fact that VSPs parse guest-controlled data in C/C++ kernel code. The next architectural shift, the one that does change that fact, is what Section 9 is about.&lt;/p&gt;
&lt;h3&gt;Side channels and the Spectre era&lt;/h3&gt;
&lt;p&gt;VMBus also has to defend against side-channel attacks across the partition boundary. The same Spectre / Meltdown / L1TF mitigations that apply to a multi-tenant hypervisor in general apply to Hyper-V specifically. Microsoft&apos;s broader hypervisor mitigation strategy interacts with VMBus mostly indirectly: the SynIC, the hypercall page, and the timer subsystem all needed audit and adjustment when these classes of attacks emerged. The detail is largely outside the scope of an article about the device model, but the takeaway is consistent with the rest of this section: any shared CPU resource between partitions is a potential attack surface, and &quot;shared via the hypervisor&apos;s bus&quot; is no exception.&lt;/p&gt;
&lt;p&gt;The structural answer to all of this, the one Microsoft itself has been working toward, is to change the languages and the trust boundaries. To set that up, the next section first widens the field by comparing VMBus to its peer in the KVM world, virtio.&lt;/p&gt;
&lt;h2&gt;8. VMBus vs virtio: two answers to the same question&lt;/h2&gt;
&lt;p&gt;Hyper-V is not the only hypervisor with a paravirt I/O story. The KVM world evolved its own answer to the same problem at roughly the same time, and it ended up with a different design with different trade-offs. The standard is &lt;strong&gt;virtio&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The original virtio paper, Rusty Russell&apos;s &quot;virtio: Towards a De-Facto Standard For Virtual I/O Devices&quot; [@rusty-virtio-paper], was published in ACM SIGOPS Operating Systems Review in 2008, the same year Hyper-V shipped. The proposal was explicit in its motivation: every hypervisor was reinventing paravirt drivers, and a single hypervisor-independent specification could let one guest driver work everywhere. OASIS later standardised virtio 1.0 in 2016, then virtio 1.1 in 2019 [@oasis-virtio-1-1], then virtio 1.2 as a Committee Specification in 2022 [@oasis-virtio-1-2].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;virtio:&lt;/strong&gt; A hypervisor-independent paravirtual I/O specification, governed by OASIS. A virtio device is presented to the guest over a transport (PCI, MMIO, or s390 channel I/O) that advertises capability bits. The data plane is a generic ring layout called a &lt;strong&gt;virtqueue&lt;/strong&gt;: a ring of descriptors, an &lt;code&gt;avail&lt;/code&gt; ring (guest-to-host), and a &lt;code&gt;used&lt;/code&gt; ring (host-to-guest). Each device class (virtio-net, virtio-blk, virtio-scsi, virtio-fs, virtio-gpu) defines its own message format on top of virtqueues.&lt;/p&gt;
&lt;h3&gt;The same shape, viewed sideways&lt;/h3&gt;
&lt;p&gt;Architecturally, virtio and VMBus are sibling answers to the same shaped problem.&lt;/p&gt;

flowchart LR
    subgraph virtio_pci[&quot;virtio over PCI&quot;]
        gv[&quot;Guest virtio driver&quot;]
        vq[&quot;virtqueue (descriptors + avail + used)&quot;]
        host_be[&quot;Host backend (vhost-net, vhost-user, OpenVMM)&quot;]
        gv -- &quot;PIO doorbell write&quot; --&amp;gt; host_be
        gv -- &quot;shared memory&quot; --- vq
        host_be -- &quot;shared memory&quot; --- vq
        host_be -- &quot;MSI-X&quot; --&amp;gt; gv
    end
    subgraph vmbus[&quot;Hyper-V VMBus&quot;]
        gv2[&quot;Guest VSC&quot;]
        ring[&quot;Two ring buffers + GPADL&quot;]
        vsp[&quot;Host VSP (kernel)&quot;]
        gv2 -- &quot;Hypercall doorbell&quot; --&amp;gt; vsp
        gv2 -- &quot;shared memory&quot; --- ring
        vsp -- &quot;shared memory&quot; --- ring
        vsp -- &quot;SINT&quot; --&amp;gt; gv2
    end
&lt;p&gt;Both:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;shared-memory rings&lt;/strong&gt; for payload. The phrase &quot;shared-memory rings&quot; hides a small subtlety: a ring buffer is a circular buffer with separate read and write indices. Producer and consumer can run concurrently as long as each one writes only its own index, which is what makes ring buffers a wait-free communication primitive on cache-coherent hardware.&lt;/li&gt;
&lt;li&gt;Use a &lt;strong&gt;doorbell&lt;/strong&gt; for signalling.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Batch&lt;/strong&gt; many requests per doorbell so per-message hypercall cost amortises.&lt;/li&gt;
&lt;li&gt;Have &lt;strong&gt;per-class device protocols&lt;/strong&gt; layered on top of a common transport.&lt;/li&gt;
&lt;/ul&gt;
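&lt;p&gt;The index discipline behind the first bullet can be sketched in a few lines. This is a single-threaded illustration with hypothetical names (&lt;code&gt;SpscRing&lt;/code&gt;, &lt;code&gt;readIndex&lt;/code&gt;, &lt;code&gt;writeIndex&lt;/code&gt;), not the VMBus ring or virtqueue layout; a real shared-memory ring also needs memory barriers between writing a slot and publishing the index.&lt;/p&gt;

```javascript
// Single-producer/single-consumer ring: the producer only advances
// writeIndex, the consumer only advances readIndex.
// Hypothetical names; not the VMBus or virtqueue memory layout.
class SpscRing {
  constructor(capacity) {
    this.slots = new Array(capacity);
    this.capacity = capacity;
    this.readIndex = 0;  // written only by the consumer
    this.writeIndex = 0; // written only by the producer
  }

  // Producer side. Returns false when the ring is full.
  push(item) {
    const next = (this.writeIndex + 1) % this.capacity;
    if (next === this.readIndex) return false; // full: one slot stays empty
    this.slots[this.writeIndex] = item;
    this.writeIndex = next; // publish only after the slot is written
    return true;
  }

  // Consumer side. Returns undefined when the ring is empty.
  pop() {
    if (this.readIndex === this.writeIndex) return undefined;
    const item = this.slots[this.readIndex];
    this.readIndex = (this.readIndex + 1) % this.capacity;
    return item;
  }
}

const demoRing = new SpscRing(4);
const pushResults = ['a', 'b', 'c', 'd'].map((x) => demoRing.push(x));
const firstPopped = demoRing.pop();
const retryPush = demoRing.push('d');
console.log(pushResults, firstPopped, retryPush);
```

&lt;p&gt;With four slots the ring holds three items, so the fourth push is refused until the consumer frees a slot. Keeping one slot empty is a common way to tell a full ring from an empty one using nothing but the two indices.&lt;/p&gt;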
&lt;p&gt;The differences are where the trade-offs show up:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;VMBus&lt;/th&gt;
&lt;th&gt;virtio (1.2)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Transport&lt;/td&gt;
&lt;td&gt;Software-only &quot;bus&quot;, channel offer/open/close&lt;/td&gt;
&lt;td&gt;PCI, MMIO, s390 channel I/O&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Doorbell&lt;/td&gt;
&lt;td&gt;Hypercall (&lt;code&gt;HV_SIGNAL_EVENT&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;MMIO or PIO write to the device&apos;s notification region&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reverse signal&lt;/td&gt;
&lt;td&gt;Synthetic interrupt (SINT)&lt;/td&gt;
&lt;td&gt;MSI-X&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standardisation&lt;/td&gt;
&lt;td&gt;Microsoft-owned, Open Specification Promise [@ms-tlfs]&lt;/td&gt;
&lt;td&gt;OASIS-ratified, multi-vendor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windows in-box drivers&lt;/td&gt;
&lt;td&gt;Yes, every supported version&lt;/td&gt;
&lt;td&gt;No; out-of-box signed VirtIO INFs from cloud vendors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Device classes beyond I/O&lt;/td&gt;
&lt;td&gt;Yes: KVP, time sync, VSS, balloon&lt;/td&gt;
&lt;td&gt;Limited; non-I/O often built on virtio-vsock or out-of-band agents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-hypervisor portability&lt;/td&gt;
&lt;td&gt;Hyper-V only&lt;/td&gt;
&lt;td&gt;Universal: KVM, QEMU, Cloud Hypervisor, Firecracker, Xen HVM, OpenVMM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spec governance&lt;/td&gt;
&lt;td&gt;Single vendor under OSP&lt;/td&gt;
&lt;td&gt;Multi-vendor with formal conformance clauses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source for Linux side&lt;/td&gt;
&lt;td&gt;drivers/hv/ [@kernel-hyperv-index]&lt;/td&gt;
&lt;td&gt;drivers/virtio in the Linux tree&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;h3&gt;Where each design wins&lt;/h3&gt;
&lt;p&gt;Virtio&apos;s strongest claim is portability. The same Linux guest VM image, with the same in-tree virtio drivers, runs on KVM, QEMU, Cloud Hypervisor, AWS Firecracker, and (since 2024) Microsoft&apos;s own OpenVMM, which added virtio backend support. A workload that has to move between cloud providers benefits from this directly: the guest does not need a different driver stack per host.&lt;/p&gt;
&lt;p&gt;Virtio also has a richer multi-vendor governance story. The spec is OASIS-ratified, with explicit conformance clauses; multiple commercial hypervisors implement it; multiple SmartNIC vendors implement virtio data planes in hardware (the &lt;code&gt;vDPA&lt;/code&gt; and &lt;code&gt;VDUSE&lt;/code&gt; work, described by Red Hat [@redhat-vdpa] and the Linux kernel VDUSE doc [@kernel-vduse]).&lt;/p&gt;
&lt;p&gt;VMBus&apos;s strongest claim is &lt;strong&gt;integration&lt;/strong&gt;. Every supported Windows ships with the VSCs in-box; there is nothing for an admin to install. The transport carries not just I/O but a service catalogue: KVP for guest configuration, time sync, VSS for online backup, the heartbeat and shutdown channels. The TLFS, while owned by Microsoft, is published under the Open Specification Promise and is a &lt;em&gt;single&lt;/em&gt; document a guest author can read end-to-end.&lt;/p&gt;
&lt;p&gt;This is why &quot;VirtIO drivers for Windows&quot; exist as a separate project (the Fedora/Red Hat-signed &lt;code&gt;virtio-win&lt;/code&gt; package) for KVM clouds: out of the box, Windows does not know virtio. The Hyper-V world inverts the problem: out of the box, Linux does not need any third-party install because the drivers are upstream.&lt;/p&gt;
&lt;h3&gt;Where they coexist&lt;/h3&gt;
&lt;p&gt;The most interesting recent development is that the two camps have stopped being purely competitive. Microsoft&apos;s OpenVMM [@github-openvmm] implements both VMBus and virtio backends, so a Linux guest using virtio drivers can run on a Microsoft-developed VMM, and a Windows guest using VMBus drivers can run on the same VMM. This is partially ideological (Microsoft is no longer pretending its way is the only way) and partially pragmatic (a single VMM that supports both transports is simpler than maintaining two).&lt;/p&gt;
&lt;p&gt;Beyond the protocol-level comparison, both VMBus and virtio sit inside a larger composition with hardware passthrough, where the &lt;strong&gt;transport becomes the slow path&lt;/strong&gt; and a real PCIe device carries the steady-state traffic.&lt;/p&gt;
&lt;h3&gt;Hardware passthrough as a complement&lt;/h3&gt;
&lt;p&gt;The composition that runs almost every modern Azure VM is &lt;strong&gt;VMBus + SR-IOV&lt;/strong&gt;, packaged as Accelerated Networking [@ms-accelerated-networking]. The same VM gets both a synthetic NIC (&lt;code&gt;netvsc&lt;/code&gt; over VMBus) and an SR-IOV virtual function. The Linux netvsc driver documentation describes the failover mechanic: &quot;If SR-IOV is enabled in both the vSwitch and the guest configuration, then the Virtual Function (VF) device is passed to the guest as a PCI device. In this case, both a synthetic (netvsc) and VF device are visible in the guest OS and both NIC&apos;s have the same MAC address. The VF is enslaved by netvsc device. The netvsc driver will transparently switch the data path to the VF when it is available and up.&quot; (Linux kernel: netvsc [@kernel-netvsc]).&lt;/p&gt;
&lt;p&gt;When live migration starts, Azure revokes the VF, the data plane falls back to the netvsc/VMBus path, the VM moves, and a new VF on the destination host gets re-attached, all without dropping TCP connections. The VMBus path was never the production hot path, but its existence is what enables migration. The KVM world&apos;s analogue is &lt;strong&gt;vDPA&lt;/strong&gt;, which gives a virtio-shaped guest interface backed by a hardware data plane.&lt;/p&gt;
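&lt;p&gt;The failover mechanic can be sketched as a tiny state model. This is a toy under stated assumptions, not the netvsc driver: the &lt;code&gt;GuestNic&lt;/code&gt; class, its field names, and the &lt;code&gt;migrate&lt;/code&gt; helper are all hypothetical, and the only rule taken from the kernel doc is that the data path switches to the VF &quot;when it is available and up&quot;.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class GuestNic:
    """Toy model of a synthetic NIC plus its optional SR-IOV VF (hypothetical names)."""

    vf_present: bool = False  # VF has been passed to the guest as a PCI device
    vf_link_up: bool = False  # VF is available and up

    def data_path(self):
        # Traffic uses the VF only when it is both present and up;
        # otherwise it stays on the synthetic netvsc/VMBus path.
        if self.vf_present and self.vf_link_up:
            return "vf"
        return "netvsc"


def migrate(nic):
    """Record which data path carries traffic at each step of a migration."""
    trace = [("before", nic.data_path())]
    nic.vf_present = nic.vf_link_up = False  # source host revokes the VF
    trace.append(("during", nic.data_path()))
    nic.vf_present = nic.vf_link_up = True  # destination host attaches a new VF
    trace.append(("after", nic.data_path()))
    return trace
```

&lt;p&gt;Calling &lt;code&gt;migrate(GuestNic(vf_present=True, vf_link_up=True))&lt;/code&gt; walks the path from VF to netvsc and back to VF. The guest-visible NIC never disappears, which is the property that keeps connections alive.&lt;/p&gt;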
&lt;p&gt;A modern Azure NIC stack is pushing this even further. Azure Boost [@ms-azure-boost] moves both storage and networking data planes off the host CPU into dedicated FPGAs, with a stable Microsoft-engineered NIC interface called MANA [@ms-mana]. Microsoft&apos;s documentation reports up to 200 Gbps of network bandwidth and 6.6 million IOPS on local storage with this design, with the host&apos;s vmswitch still acting as the live-migration fallback path. The architectural insight is that the VMBus-based slow path is the durable invariant; what changes is whether the steady-state data plane is software, an SR-IOV VF, or a SmartNIC firmware path. Frameworks like DPDK [@dpdk-about] sit on top of whichever data plane the VM exposes.&lt;/p&gt;
&lt;p&gt;What none of this changes is the property Section 7 cared about: as long as a host-side VSP exists and parses guest-controlled bytes in kernel C/C++, the bug class is open. The next section is about the architectural move that closes it.&lt;/p&gt;
&lt;h2&gt;9. OpenVMM and OpenHCL: the 2024 open-source pivot&lt;/h2&gt;
&lt;p&gt;In 2024, Microsoft did two things that would have been hard to imagine a decade earlier. First, they open-sourced OpenVMM [@github-openvmm], a Rust implementation of the virtualization stack including the VSPs and the VMBus protocol. Second, they introduced OpenHCL [@ms-openhcl-deep-explainer], a &quot;paravisor&quot; configuration of OpenVMM that runs &lt;em&gt;inside&lt;/em&gt; a confidential VM as a higher-trust mediator between the workload and the (now-untrusted) host.&lt;/p&gt;
&lt;p&gt;Both moves are explained by the same trend the article has been circling: confidential computing fundamentally inverts the trust boundary, and the device model has to follow.&lt;/p&gt;

**Paravisor:** a higher-privileged software layer that runs *inside* a guest VM (not on the host) and mediates the guest&apos;s interaction with the hypervisor. In the Hyper-V model, a paravisor lives in VTL2 of the same VM whose workload runs in VTL0; the host hypervisor is outside the VM&apos;s trust boundary. The paravisor presents the workload with a familiar VMBus + VSP interface while internally talking to a hardware-isolated confidential VM substrate (AMD SEV-SNP or Intel TDX).
&lt;h3&gt;What changed in confidential computing&lt;/h3&gt;
&lt;p&gt;The classical Hyper-V trust model places the root partition at the apex of trust. The guest trusts the host. Memory the guest writes is, in the worst case, readable by the host. In &lt;strong&gt;confidential computing&lt;/strong&gt;, that is no longer acceptable. A regulated workload (a healthcare database, a financial processor) needs to run in a VM whose contents are protected even from a malicious or compromised hypervisor. AMD&apos;s &lt;strong&gt;SEV-SNP&lt;/strong&gt; and Intel&apos;s &lt;strong&gt;TDX&lt;/strong&gt; are CPU features that encrypt and integrity-protect VM memory in hardware so that a compromised host cannot read the guest&apos;s secrets.&lt;/p&gt;
&lt;p&gt;Azure Confidential Computing [@ms-confidential-computing] made these capabilities available as a product starting around 2022. The Azure confidential VM options page [@ms-coco-vm-options] documents the SKUs.&lt;/p&gt;
&lt;p&gt;This breaks the old VMBus story. In the classical model, the host&apos;s &lt;code&gt;vmswitch.sys&lt;/code&gt; reads the guest&apos;s network packets out of the VMBus ring. In a confidential VM, the host can no longer be allowed to see those bytes; letting it do so would defeat the entire point. So the question becomes: where does the synthetic-device backend live, if not in the host?&lt;/p&gt;
&lt;h3&gt;The paravisor answer&lt;/h3&gt;
&lt;p&gt;The Linux kernel&apos;s Hyper-V CoCo VMs document [@kernel-coco] describes the design directly: &quot;Paravisor mode. In this mode, a paravisor layer between the guest and the host provides some operations needed to run as a CoCo VM. The guest operating system can have fewer CoCo enlightenments than is required in the fully-enlightened case ... some aspects of CoCo VMs are handled by the Hyper-V paravisor while the guest OS must be enlightened for other aspects.&quot;&lt;/p&gt;
&lt;p&gt;OpenHCL is that paravisor. It runs in a higher-trust virtual trust level inside the same confidential VM (VTL2), it has access to the encrypted-memory primitives the CPU provides, and it presents the workload (in VTL0) with the same VMBus + VSP world a non-confidential VM would see. The workload OS does not need to be heavily modified; it sees what looks like Hyper-V, talks to what look like normal VSPs, and never has to know that those VSPs are now inside its own VM rather than on the host.&lt;/p&gt;

flowchart TD
    HW[&quot;Confidential CPU (SEV-SNP / TDX)&quot;]
    HV[&quot;Host hypervisor (untrusted by the workload)&quot;]
    subgraph CoCoVM[&quot;Confidential VM (memory encrypted)&quot;]
        VTL2[&quot;VTL2: OpenHCL paravisor (Rust VSPs)&quot;]
        VTL0[&quot;VTL0: workload OS (Windows or Linux, lightly enlightened)&quot;]
        VTL0 -- &quot;VMBus, looks normal&quot; --- VTL2
    end
    HW --&amp;gt; HV
    HV --&amp;gt; CoCoVM
    HV -. &quot;no access to guest plaintext&quot; .-&amp;gt; CoCoVM
&lt;h3&gt;The Rust rewrite&lt;/h3&gt;
&lt;p&gt;The other half of the story is &lt;strong&gt;memory safety&lt;/strong&gt;. Recall Section 7&apos;s CVE list: every headline Hyper-V escape in the past decade involved a parser bug in C/C++ kernel code. OpenVMM&apos;s choice to implement the entire VMM, including the VSPs, in Rust is a direct response to that history. Rust&apos;s ownership model rules out, by construction, a large class of memory-safety bugs (use-after-free, out-of-bounds access on slices, double-free) that produced those CVEs.&lt;/p&gt;
&lt;p&gt;This does not magically eliminate every vulnerability. A logic bug in a state machine, an integer overflow on a length field, a side-channel timing leak: all of these still exist in Rust. But the categories that produced CVE-2017-0075, CVE-2021-28476, and CVE-2024-21407 are exactly the categories Rust was designed to make hard.&lt;/p&gt;
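&lt;p&gt;The distinction between memory safety and parser logic is easy to see in a toy length-prefixed parser. The sketch below is Python purely for brevity and is not how a VSP is written; the 4-byte little-endian header, the &lt;code&gt;MAX_PAYLOAD&lt;/code&gt; cap, and the &lt;code&gt;parse_message&lt;/code&gt; name are all assumptions made for illustration, not VMBus definitions. A memory-safe language stops the out-of-bounds read, but the parser still has to reject a guest-supplied length that lies.&lt;/p&gt;

```python
HEADER_LEN = 4      # hypothetical 4-byte little-endian length prefix
MAX_PAYLOAD = 4096  # hypothetical per-message cap, not a VMBus constant


class MalformedMessage(ValueError):
    """Raised when guest-controlled bytes fail validation."""


def parse_message(buf):
    """Return the payload of one length-prefixed message, or raise."""
    if HEADER_LEN > len(buf):
        raise MalformedMessage("truncated header")
    claimed = int.from_bytes(buf[:HEADER_LEN], "little")
    if claimed > MAX_PAYLOAD:
        raise MalformedMessage("claimed length exceeds cap")
    available = len(buf) - HEADER_LEN
    if claimed > available:
        # Without this check a Python slice would silently come back short:
        # no memory corruption, but still a logic bug downstream.
        raise MalformedMessage("claimed length exceeds buffer")
    return bytes(buf[HEADER_LEN:HEADER_LEN + claimed])
```

&lt;p&gt;Every check here is a decision the parser author has to make explicitly. The language only guarantees that forgetting one does not turn into an arbitrary memory write.&lt;/p&gt;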

Garbage-collected languages are wrong for a kernel-mode parser: GC pauses are unacceptable in a hypervisor-adjacent fast path, and you cannot afford a runtime that allocates memory during interrupt handling. Rust&apos;s compile-time memory safety with no GC is, today, the only mature option that gives you both the safety and the predictability a VSP needs. Microsoft&apos;s choice is consistent with the rest of the industry; comparable pieces of low-level systems infrastructure (Cloudflare&apos;s `quiche`, Mozilla&apos;s `neqo`, the Android Bluetooth stack) have all converged on Rust.
&lt;h3&gt;What you can actually look at&lt;/h3&gt;
&lt;p&gt;OpenVMM is not a press release; it is a public repository that ships:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The full Rust source tree at github.com/microsoft/openvmm [@github-openvmm].&lt;/li&gt;
&lt;li&gt;A separate repository for the Linux kernel fork that the paravisor runs on top of, at github.com/microsoft/OHCL-Linux-Kernel [@github-ohcl-linux].&lt;/li&gt;
&lt;li&gt;Project documentation centred at openvmm.dev [@openvmm-dev].&lt;/li&gt;
&lt;li&gt;Both VMBus and virtio backends, so the same VMM can host Windows guests on VMBus and Linux guests on virtio.&lt;/li&gt;
&lt;li&gt;Documentation through the deeper Microsoft Tech Community explainer [@ms-openhcl-deep-explainer] and the original announcement [@ms-openhcl-announce] describing the paravisor&apos;s role.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For a security researcher or a regulated-cloud customer, this is a meaningful change. For the first time, the VMBus + VSP stack is auditable end-to-end in source.&lt;/p&gt;

If you want to see how a VSP actually consumes a channel, the OpenVMM repository contains the Rust modules that implement the VMBus channel state machine. Cloning the repo and grepping for `Channel::open` and `RingBuffer` shows the same offer/open/close/rescind pattern Section 3 described, expressed in Rust types whose lifetimes the compiler checks. Reading the same logic in Rust after reading the Linux C version in `drivers/hv/channel_mgmt.c` is a useful exercise; the abstraction is identical, and the safety guarantees diverge.
&lt;h3&gt;What still has to be solved&lt;/h3&gt;
&lt;p&gt;The kernel CoCo doc is candid about an open architectural problem that OpenHCL alone cannot solve: &quot;Unfortunately, there is no standardized enumeration of feature/functions that might be provided in the paravisor, and there is no standardized mechanism for a guest OS to query the paravisor for the feature/functions it provides. The understanding of what the paravisor provides is hard-coded in the guest OS.&quot; (Linux kernel: CoCo VMs [@kernel-coco]).&lt;/p&gt;
&lt;p&gt;In other words, the TLFS gave us a portable contract between guests and Hyper-V hypervisors. The paravisor world does not yet have an equivalent portable contract between guests and paravisors. Today&apos;s guests have OpenHCL-specific knowledge baked in. A future &quot;paravisor TLFS&quot; would let any compliant paravisor host any compliant guest, the same way the original TLFS did for the hypervisor. That standard does not exist yet, and writing it is the most consequential open problem in this corner of the architecture.&lt;/p&gt;
&lt;p&gt;The architecture is moving. Section 10 takes stock of what that means for engineers building or operating on this stack today.&lt;/p&gt;
&lt;h2&gt;10. Engineering takeaways and open problems&lt;/h2&gt;
&lt;p&gt;A working architecture is one where the trade-offs are &lt;em&gt;visible&lt;/em&gt;. Hyper-V&apos;s enlightenments + VMBus + VSP/VSC stack is a working architecture in exactly that sense: every property it has, including the security ones, is a consequence of design choices a reader can name.&lt;/p&gt;
&lt;h3&gt;What the design optimises for&lt;/h3&gt;
&lt;p&gt;Three explicit optimisations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;In-box drivers for closed-source guests.&lt;/strong&gt; Hardware virtualization handles privileged CPU instructions; the guest only needs to load a VMBus client driver to opt in to the fast path. Every supported Windows ships those drivers in-box. Every modern Linux ships them in-tree. There is no &quot;install paravirt drivers&quot; step, which is a large reason &quot;it just works.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A single transport that carries everything.&lt;/strong&gt; VMBus carries 12+ device classes plus non-device services (KVP, time sync, VSS, balloon, heartbeat). One protocol, one set of primitives, one debugging surface. This is the engineering equivalent of &quot;everything is a file&quot; applied to inter-partition communication.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Live migration.&lt;/strong&gt; Because the data plane is software in the root partition, the VM is not bound to a specific host. The VSPs serialise their state during migration without guest cooperation. This is the property that makes VMBus the durable invariant under hardware-passthrough acceleration: SR-IOV gives you throughput; VMBus gives you mobility.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;What it pays for those properties&lt;/h3&gt;
&lt;p&gt;Two costs:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The host CPU is on the data plane.&lt;/strong&gt; A software ring serviced by &lt;code&gt;vmswitch.sys&lt;/code&gt; on a single host CPU core cannot keep up with the line rate of a 100 GbE NIC. Microsoft&apos;s answer is hybrid composition with SR-IOV (Accelerated Networking [@ms-accelerated-networking]) and SmartNIC offload (Azure Boost + MANA [@ms-azure-boost]). The KVM analogue is vDPA [@redhat-vdpa]. Both camps accept the structural truth that for the highest throughputs, the host CPU has to leave the data plane.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The host kernel parses guest-controlled bytes.&lt;/strong&gt; Section 7&apos;s CVE record is the catalogue of what that costs. The architectural answer is OpenHCL: move the parser into the guest&apos;s own trust boundary and rewrite it in Rust.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;A four-property idealisation&lt;/h3&gt;
&lt;p&gt;It is useful to write down what an idealised paravirt I/O stack would do, so it is clear which properties any real stack today is trading away.&lt;/p&gt;
&lt;p&gt;The four idealised properties:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Zero hypercalls per packet&lt;/strong&gt; in steady state.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Live-migration parity&lt;/strong&gt; with a software baseline.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cross-vendor / cross-hypervisor portability&lt;/strong&gt; of the guest driver.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No host-side memory-unsafe parser&lt;/strong&gt; of guest-controlled data.&lt;/li&gt;
&lt;/ol&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;(1) Zero hypercall&lt;/th&gt;
&lt;th&gt;(2) Live migration&lt;/th&gt;
&lt;th&gt;(3) Portability&lt;/th&gt;
&lt;th&gt;(4) No unsafe host parser&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;VMBus + in-kernel VSP&lt;/td&gt;
&lt;td&gt;partial (batched)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;virtio + vhost-net&lt;/td&gt;
&lt;td&gt;partial (batched)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SR-IOV / DDA&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Accelerated Networking (VMBus + SR-IOV)&lt;/td&gt;
&lt;td&gt;yes (steady)&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;vDPA&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenHCL paravisor + VMBus&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure Boost + MANA&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;partial&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;No single approach today matches all four properties. The Hyper-V production composition is roughly &lt;strong&gt;(VMBus baseline) + (Accelerated Networking for throughput) + (OpenHCL for confidential workloads)&lt;/strong&gt;. The KVM-world composition is &lt;strong&gt;(virtio baseline) + (vDPA / SmartNIC for throughput)&lt;/strong&gt;. SmartNIC-based stacks (Azure Boost, AWS Nitro, Google&apos;s offload) approach the same four-corner problem from yet another angle.&lt;/p&gt;
&lt;p&gt;This is a synthesis, not a single-source claim: the matrix combines properties documented separately in the Microsoft Accelerated Networking docs [@ms-accelerated-networking], the Linux kernel CoCo doc [@kernel-coco], the Discrete Device Assignment doc [@ms-dda], the SR-IOV overview [@ms-sriov-overview], the Linux netvsc driver doc [@kernel-netvsc], the VDUSE userspace interface [@kernel-vduse], the vPCI doc [@kernel-vpci], and the OpenHCL explainer [@ms-openhcl-deep-explainer]. The underlying properties are sourced; the yes/partial/no rating in each cell is the author&apos;s reading of those sources.&lt;/p&gt;
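&lt;p&gt;The matrix is small enough to encode as data and query mechanically. The sketch below copies the table cell for cell; the property keys and the helper names &lt;code&gt;full_matches&lt;/code&gt; and &lt;code&gt;gaps&lt;/code&gt; are hypothetical, and the &quot;yes (steady)&quot; cell for Accelerated Networking is recorded as a plain &quot;yes&quot;.&lt;/p&gt;

```python
PROPERTIES = (
    "zero_hypercall",
    "live_migration",
    "portability",
    "no_unsafe_host_parser",
)

# One row per approach, cells in the same order as PROPERTIES.
MATRIX = {
    "VMBus + in-kernel VSP": ("partial", "yes", "no", "no"),
    "virtio + vhost-net": ("partial", "yes", "yes", "no"),
    "SR-IOV / DDA": ("yes", "no", "no", "yes"),
    "Accelerated Networking (VMBus + SR-IOV)": ("yes", "yes", "no", "no"),
    "vDPA": ("yes", "partial", "yes", "no"),
    "OpenHCL paravisor + VMBus": ("partial", "yes", "partial", "yes"),
    "Azure Boost + MANA": ("yes", "yes", "no", "partial"),
}


def full_matches(matrix):
    """Approaches rated a plain 'yes' on every property."""
    return [name for name, cells in matrix.items() if all(c == "yes" for c in cells)]


def gaps(matrix, name):
    """Properties on which the named approach is rated 'partial' or 'no'."""
    return [prop for prop, cell in zip(PROPERTIES, matrix[name]) if cell != "yes"]
```

&lt;p&gt;&lt;code&gt;full_matches(MATRIX)&lt;/code&gt; comes back empty, which is the claim above in executable form, and &lt;code&gt;gaps&lt;/code&gt; shows what each approach trades away.&lt;/p&gt;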
&lt;h3&gt;Practical pitfalls for operators&lt;/h3&gt;
&lt;p&gt;A few things the customer-facing docs do not always say plainly:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;vmbusrhid&lt;/code&gt; is not low-risk.&lt;/strong&gt; The keyboard/mouse channel is a kernel-level RPC surface from guest to root. Treat it the same way you would treat netvsc when modelling threat exposure.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Generation-2 VMs reduce attack surface.&lt;/strong&gt; Choosing Generation-2 for new workloads removes the legacy IDE/PS/2/PIC emulators from the host data path entirely (Microsoft Learn: Gen 1 vs Gen 2 [@ms-gen1-gen2-vms]).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mixing in-box and out-of-band Integration Services breaks things.&lt;/strong&gt; Modern Windows and modern Linux already have the drivers; installing the legacy LIS package on top can break MSI-X handling and PCI passthrough (Linux kernel: overview [@kernel-hyperv-overview]).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DDA is not SR-IOV.&lt;/strong&gt; Discrete Device Assignment covers any PCIe device passthrough, but Microsoft formally supports only &lt;strong&gt;GPUs and NVMe&lt;/strong&gt; as device classes (Microsoft Learn: DDA planning [@ms-dda]).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Confidential VMs do not have the same device set.&lt;/strong&gt; Hardware constraints reduce or alter the device classes available; always validate the specific synthetic devices your workload depends on are present in the target SKU (Linux kernel: CoCo [@kernel-coco]).&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Confidential VM (SEV-SNP / TDX)? Use the OpenHCL paravisor mode (Azure CoCo VM options [@ms-coco-vm-options]).&lt;/li&gt;
&lt;li&gt;Need ≥40 Gbps with live migration? Use Accelerated Networking; on Boost-enabled SKUs, Boost adds another tier of offload.&lt;/li&gt;
&lt;li&gt;Need ≥100 Gbps and accept binding to host? Use Discrete Device Assignment / SR-IOV.&lt;/li&gt;
&lt;li&gt;Maximum guest portability across hypervisors? Use virtio; for bandwidth-sensitive workloads, vDPA.&lt;/li&gt;
&lt;li&gt;Default Hyper-V workload, broad device coverage, native migration? VMBus + VSP (the default).&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
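&lt;p&gt;The note above reads naturally as a small decision function. This is a sketch of the note&apos;s logic only: the function name, parameter names, and return strings are hypothetical, and real SKU selection involves constraints the note does not cover.&lt;/p&gt;

```python
def pick_io_path(confidential=False, gbps=0, needs_migration=True, cross_hypervisor=False):
    """Map workload requirements to the I/O composition suggested in the note."""
    if confidential:
        return "OpenHCL paravisor mode"
    if gbps >= 40 and needs_migration:
        return "Accelerated Networking"
    if gbps >= 100 and not needs_migration:
        return "Discrete Device Assignment / SR-IOV"
    if cross_hypervisor:
        return "virtio (vDPA if bandwidth-sensitive)"
    return "VMBus + VSP (default)"
```

&lt;p&gt;The confidential-VM check comes first because it constrains everything else. The two throughput branches are mutually exclusive, so their relative order does not matter.&lt;/p&gt;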
&lt;h3&gt;Open problems worth watching&lt;/h3&gt;
&lt;p&gt;The substantive open problems are:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;A standardised paravisor feature-enumeration interface.&lt;/strong&gt; OpenHCL is the first auditable paravisor, but there is no portable contract a guest can use to query &quot;what does this paravisor support.&quot; The TLFS gave us this for hypervisors; the paravisor analogue is missing (Linux kernel: CoCo [@kernel-coco]).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Confidential-VM-friendly live migration with paravirt devices.&lt;/strong&gt; Hardware-attested state cannot be cloned trivially; today&apos;s pragmatic answer is to constrain migration in CoCo VMs. A general solution is open.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A formal model of the VMBus offer/rescind state machine.&lt;/strong&gt; The kernel docs describe it narratively. A model that the VSP code could be checked against would let static analysis rule out the bug class behind the headline CVEs.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Live-migrating stateful SR-IOV VFs without device cooperation.&lt;/strong&gt; Vendor proposals exist; an industry standard does not.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Erasing memory-unsafety in legacy VSPs.&lt;/strong&gt; The Rust rewrite path in OpenVMM is correct; the multi-year engineering effort to convert every existing VSP is real. CVE-2024-21407 is recent enough to remind everyone the bug class is still producing fresh entries.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;What to remember in five years&lt;/h3&gt;
&lt;p&gt;The most important sentence in this article is one I have been quietly preparing throughout: the durable architectural invariant in Hyper-V is &lt;strong&gt;shared-memory ring + doorbell, with a published guest-side contract&lt;/strong&gt;. Everything else, including the choice of programming language for the VSP, the question of whether the data plane is software or hardware, and even whether the trust boundary places the VSP on the host or in a paravisor, is implementation. The transport is the invariant. That is the lesson the next decade of CoCo VMs and SmartNIC offload is converging toward: keep the contract stable, and let everything else change.&lt;/p&gt;
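&lt;p&gt;For readers who want the invariant in executable form, here is a toy ring with a doorbell. It is a sketch under loud assumptions: &lt;code&gt;ToyRing&lt;/code&gt; is hypothetical, it ignores the real VMBus ring layout and signalling rules, and it uses the simplest possible heuristic, ringing the doorbell only when the ring goes from empty to non-empty.&lt;/p&gt;

```python
from collections import deque


class ToyRing:
    """Toy bounded ring with a doorbell counter (not the VMBus ring layout)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = deque()
        self.doorbells = 0  # stand-in for the number of signal hypercalls

    def send(self, packet):
        if len(self.slots) == self.capacity:
            return False  # ring full; the producer must retry later
        was_empty = not self.slots
        self.slots.append(packet)
        if was_empty:
            # Wake the consumer only on the empty to non-empty transition;
            # later packets ride along without another signal.
            self.doorbells += 1
        return True

    def drain(self):
        """Consumer side: take everything currently queued as one batch."""
        batch = list(self.slots)
        self.slots.clear()
        return batch
```

&lt;p&gt;Sending five packets before the consumer drains costs one doorbell, not five. That amortisation is the property the contract preserves regardless of which side of the trust boundary the consumer lives on.&lt;/p&gt;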
&lt;h2&gt;FAQ&lt;/h2&gt;

**Do I still need to install the Linux Integration Services (LIS) package on a modern Linux guest?** No. The drivers (`hv_vmbus`, `hv_netvsc`, `hv_storvsc`, `hv_utils`, `pci-hyperv`, `hv_balloon`) have been in the upstream Linux kernel since 2.6.32 in December 2009 and ship in every mainstream distribution. The legacy LIS package is a holdover from the era before in-tree support and can in fact break MSI-X handling and PCI passthrough if installed on top of a modern kernel (Linux kernel: Hyper-V overview [@kernel-hyperv-overview]).

**Why do the headline Hyper-V escapes land in the host-side VSP rather than the guest-side VSC?** Because the trust gradient is asymmetric. The VSP runs in the root partition&apos;s kernel, the most privileged context on the box; the VSC runs in a normal guest kernel. Bytes flowing from guest to host get parsed by code with full system privilege. A VSC bug typically harms only the guest; a VSP bug can be a cross-tenant compromise. The pattern is visible in the CVE record: CVE-2017-0075 [@nvd-cve-2017-0075], CVE-2021-28476 [@nvd-cve-2021-28476], and CVE-2024-21407 [@nvd-cve-2024-21407] all hit host-side parsers.

**Why does Accelerated Networking keep a synthetic NIC alongside the SR-IOV virtual function?** For live migration. SR-IOV gives you near-bare-metal throughput but binds the VM to a specific physical NIC; you cannot migrate that state. Keeping a VMBus-backed `netvsc` device in the same guest gives the hypervisor a software path it can fall back to during migration windows. The Linux kernel netvsc doc describes this failover explicitly: when SR-IOV is enabled, the VF is enslaved by netvsc and the data path switches transparently when the VF is up (Linux kernel: netvsc [@kernel-netvsc]).

**What is the difference between OpenVMM and OpenHCL?** OpenHCL is a *configuration* of OpenVMM, not a separate codebase. OpenVMM is the Rust virtualization stack at github.com/microsoft/openvmm [@github-openvmm]; OpenHCL is OpenVMM run as a paravisor inside a confidential VM&apos;s higher-trust virtual trust level (VTL2), so that the synthetic-device backends sit inside the guest&apos;s own trust boundary rather than on a host the guest cannot trust. The same Rust code can run as a host-side VMM (when paired with a hypervisor on the host) or as an in-guest paravisor (when running inside a SEV-SNP or TDX VM).

**Can a virtio guest run on a Microsoft VMM, or a VMBus guest on KVM?** Both directions exist with caveats. OpenVMM, when used as a host VMM, supports both VMBus and virtio backends, so a Linux virtio guest can run on a Microsoft-developed VMM (github.com/microsoft/openvmm [@github-openvmm]). Native Hyper-V on a Windows Server host historically expects VMBus-driven guests; there is no in-box virtio device emulation on a stock Hyper-V Server. KVM hosts can technically present a VMBus-shaped device, but in practice the production answer on KVM is virtio.

**Why do Generation-2 VMs have a smaller attack surface?** Generation-2 VMs use UEFI with Secure Boot, boot from synthetic SCSI, and have no emulated IDE, PS/2, or PIC in the data path (Microsoft Learn: Gen 1 vs Gen 2 [@ms-gen1-gen2-vms]). Every emulator that is removed is one fewer parser running in the most privileged kernel on the host, so the host-side attack surface is meaningfully smaller. Generation-1 still exists for legacy guests that only know how to boot from BIOS + IDE.

**What does Virtualization-Based Security (VBS) do for a machine that never runs a VM?** VBS uses the Hyper-V hypervisor to split a single Windows install into VTL0 (the normal kernel and apps) and VTL1 (the Secure Kernel and trustlets like `lsaiso.exe`). The hypervisor enforces that VTL0 cannot read or modify VTL1&apos;s memory, even with kernel privileges. So an attacker who already has SYSTEM-level code execution in the normal world cannot trivially extract LSASS secrets or load arbitrary unsigned kernel code; the hypervisor stops them. This works on any modern Windows machine with the right CPU features, regardless of whether you ever run a VM yourself (Microsoft Learn: Windows Server 2016 What&apos;s New [@ms-server-2016]).
&lt;p&gt;&amp;lt;StudyGuide slug=&quot;hyper-v-enlightenments-vmbus-and-the-synthetic-device-model&quot; keyTerms={[
  { term: &quot;Type-1 hypervisor&quot;, definition: &quot;A hypervisor that runs directly on hardware rather than inside a host OS. Hyper-V is Type-1; the original Microsoft Virtual Server was Type-2.&quot; },
  { term: &quot;Root partition&quot;, definition: &quot;The privileged partition under Hyper-V that owns physical I/O devices and hosts the synthetic-device VSPs. Runs Windows Server.&quot; },
  { term: &quot;Child partition&quot;, definition: &quot;An unprivileged partition that hosts a guest OS. Communicates with the root partition over VMBus.&quot; },
  { term: &quot;Enlightenment&quot;, definition: &quot;A guest-OS modification or feature that takes advantage of running under a specific hypervisor by using paravirtual interfaces (hypercalls, synthetic timers, SINTs) instead of trapping on emulated hardware.&quot; },
  { term: &quot;Top-Level Functional Specification (TLFS)&quot;, definition: &quot;Microsoft&apos;s published hypervisor ABI for Hyper-V, governing hypercalls, synthetic MSRs, synthetic interrupts, synthetic timers, and the VMBus protocol. Released under the Open Specification Promise.&quot; },
  { term: &quot;VMBus&quot;, definition: &quot;Hyper-V&apos;s software-only inter-partition transport. Has a control path (channel offer/open/close/rescind) and per-device shared-memory ring channels with SINT-based doorbells.&quot; },
  { term: &quot;VSP / VSC&quot;, definition: &quot;Virtualization Service Provider (root-partition kernel module that owns a synthetic-device backend) and Virtualization Service Client (guest-side driver that consumes the channel).&quot; },
  { term: &quot;Synthetic Interrupt Controller (SynIC)&quot;, definition: &quot;Per-vCPU synthetic interrupt subsystem with 16 SINT slots and shared message/event pages; the doorbell mechanism for VMBus and synthetic timers.&quot; },
  { term: &quot;Reference TSC page&quot;, definition: &quot;A guest-readable page maintained by Hyper-V containing scale and offset such that the guest can compute a 10 MHz monotonic clock from the hardware TSC entirely in user space.&quot; },
  { term: &quot;Generation-2 VM&quot;, definition: &quot;A Hyper-V VM that boots UEFI with Secure Boot from synthetic SCSI, with no emulated IDE/PS/2/PIC. Reduces host-side attack surface and supports VHDX up to 64 TB.&quot; },
  { term: &quot;Discrete Device Assignment (DDA)&quot;, definition: &quot;Hyper-V&apos;s general PCIe-passthrough mechanism. Microsoft formally supports GPUs and NVMe; other devices may work with vendor support.&quot; },
  { term: &quot;Accelerated Networking&quot;, definition: &quot;An Azure/Hyper-V feature that attaches both a synthetic NIC (netvsc over VMBus) and an SR-IOV virtual function to a guest, with netvsc as the live-migration fallback path.&quot; },
  { term: &quot;VBS / HVCI / VTL&quot;, definition: &quot;Virtualization-Based Security uses the Hyper-V hypervisor to split a single guest into Virtual Trust Levels (VTL0 normal, VTL1 secure). HVCI (Hypervisor-protected Code Integrity) and trustlets like lsaiso.exe live in VTL1.&quot; },
  { term: &quot;Paravisor&quot;, definition: &quot;A higher-trust software layer running inside a confidential VM (typically in VTL2) that mediates between the workload and the untrusted host hypervisor; presents the workload with a familiar VMBus + VSP world.&quot; },
  { term: &quot;OpenVMM / OpenHCL&quot;, definition: &quot;Microsoft&apos;s 2024 open-source Rust virtualization stack and its paravisor configuration. Re-implements the VSPs in memory-safe Rust to address the bug class behind CVE-2017-0075, CVE-2021-28476, and CVE-2024-21407.&quot; }
]} questions={[
  { q: &quot;Why does Microsoft maintain the Top-Level Functional Specification under the Open Specification Promise rather than as an internal document?&quot;, a: &quot;Because the OSP is what makes it legally and practically safe for the Linux community to ship in-tree drivers (drivers/hv/) implementing the hypervisor&apos;s guest-side ABI. Without the published, OSP-protected spec, Linux could only support Hyper-V via reverse-engineering, which would not have been politically or technically acceptable upstream. The OSP is the contractual artefact that turned &apos;Hyper-V can host Linux&apos; from a vendor claim into a maintained, in-tree reality.&quot; },
  { q: &quot;Walk through the lifecycle of a single network packet from a Hyper-V guest&apos;s userspace to the wire.&quot;, a: &quot;(1) The guest application calls send(); (2) the guest TCP/IP stack hands a packet to the hv_netvsc driver; (3) hv_netvsc allocates a slot in the netvsc TX VMBus ring, copies the descriptor and payload, and writes the new write index; (4) if the host is not already chasing the writes, the guest issues an HvCallSignalEvent hypercall (HVCALL_SIGNAL_EVENT in the Linux sources; one VMEXIT) to fire the SINT for that channel; (5) the host&apos;s vmswitch.sys VSP reaps the descriptor from the ring, parses the RNDIS frame, and forwards it to the virtual switch; (6) the virtual switch dispatches it to a real NIC. In the steady state, a single VMEXIT can amortise across many packets through batching.&quot; },
  { q: &quot;Explain why the host-side VSP is the historical CVE locus for Hyper-V escapes.&quot;, a: &quot;Because the VSP runs in the root partition&apos;s kernel (the most privileged context on the box) and parses guest-controlled bytes from the VMBus ring. Any memory-safety mistake (length confusion, missing bounds check, integer overflow) in C/C++ kernel code translates directly to code execution in the most privileged supervisor on the host. CVE-2017-0075, CVE-2021-28476 (vmswitch.sys), and CVE-2024-21407 all instantiate this pattern. The attack surface is structural, not incidental.&quot; },
  { q: &quot;What does an enlightened Linux guest do when it first boots on Hyper-V, before any network or storage I/O happens?&quot;, a: &quot;It executes cpuid leaf 0x40000000 to detect the Microsoft hypervisor signature; reads further leaves to enumerate available enlightenments; writes HV_X64_MSR_GUEST_OS_ID to declare itself; writes HV_X64_MSR_HYPERCALL with a guest-physical address and an enable bit, prompting the hypervisor to populate that page with the right vmcall/vmmcall opcode; sets up SINT slots and a per-CPU SynIC message page; optionally reads the reference TSC page; loads the hv_vmbus driver, which begins receiving channel offers from the root partition; and binds class-specific drivers (hv_netvsc, hv_storvsc, etc.) to each offered channel.&quot; },
  { q: &quot;Why is OpenHCL described as a paravisor rather than a hypervisor or a VMM?&quot;, a: &quot;Because it sits inside a guest VM (in VTL2 of that VM), not on the host, and its job is to mediate between the guest workload and a hypervisor that the guest does not trust. A hypervisor on the host runs underneath all VMs; a VMM owns and controls VMs from outside; a paravisor lives inside one VM, at higher privilege than that VM&apos;s workload, and presents the workload with a familiar device-model surface (VMBus + VSPs) that is now backed by code inside the guest&apos;s own trust boundary rather than by the host kernel. The architecture inverts the historical Hyper-V trust model so that confidential VMs can be protected from a malicious host.&quot; },
  { q: &quot;Compare VMBus&apos;s ring-buffer transport to virtio&apos;s virtqueues. What is the same and what is different?&quot;, a: &quot;Same: shared-memory rings carrying descriptors and payload; doorbell-based signalling so per-message hypercall cost amortises across batches; per-device-class protocols layered on a common transport. Different: VMBus uses a software-only &apos;bus&apos; with offer/open/close/rescind control, while virtio rides on a real PCI/MMIO/channel-I/O transport with a generic capability-bit mechanism. VMBus&apos;s reverse signal is a SINT; virtio&apos;s is MSI-X. VMBus is Microsoft-owned under the OSP; virtio is OASIS-ratified and multi-vendor. VMBus has in-box Windows drivers and broader synthetic-service coverage (KVP, time sync, VSS); virtio has cross-hypervisor portability and a multi-vendor implementation pool.&quot; }
]} /&amp;gt;&lt;/p&gt;
</content:encoded><category>hyper-v</category><category>virtualization</category><category>vmbus</category><category>paravirtualization</category><category>azure</category><category>confidential-computing</category><category>security</category><author>noreply@paragmali.com (Parag Mali)</author></item><item><title>Inside Azure Confidential VMs: SEV-SNP, Intel TDX, and the Paravisor that Makes Them a Cloud Product</title><link>https://paragmali.com/blog/inside-azure-confidential-vms-sev-snp-intel-tdx-and-the-para/</link><guid isPermaLink="true">https://paragmali.com/blog/inside-azure-confidential-vms-sev-snp-intel-tdx-and-the-para/</guid><description>Azure Confidential VMs combine AMD SEV-SNP and Intel TDX with the OpenHCL paravisor and MAA policy v1.2. A textbook tour from silicon to relying party.</description><pubDate>Wed, 13 May 2026 00:00:00 GMT</pubDate><content:encoded>
**Azure Confidential VMs are Windows or Linux guests that the cloud operator&apos;s hypervisor cannot read or silently modify.** They are built on two distinct CPU primitives -- AMD SEV-SNP (Reverse Map Table + Virtual Machine Privilege Level + SNP_REPORT) and Intel TDX (Secure Arbitration Mode + the signed TDX Module + RTMR0-3) -- and wrapped on Azure by the open-source Rust paravisor OpenHCL running inside the trust boundary at VMPL0 or the L1 TD seat.&lt;p&gt;Inside that boundary the paravisor synthesises a vTPM whose quotes chain to the SEV-SNP or TDX hardware report, and Microsoft Azure Attestation runs a customer-defined policy v1.2 file (with JmesPath claim rules) against the evidence to release HSM-backed keys via Secure Key Release.&lt;/p&gt;
&lt;p&gt;The Generation-2 integrity rail closes the SEVered and SEVurity ciphertext-remapping class architecturally, but four 2024-era papers (CacheWarp, WeSee, Heckler, Ahoi) demonstrate that side-channel and notification-injection seams remain. Read this if you need to draw the Azure CVM stack from silicon to MAA, decide between SEV-SNP and TDX SKUs, and write an attestation policy that says exactly what you mean.
&lt;/p&gt;&lt;p&gt;&lt;/p&gt;
&lt;h2&gt;1. Even the cloud operator must not see your memory&lt;/h2&gt;
&lt;p&gt;A Windows Server VM is running a SQL query on Azure right now. It is joining a million-row variant table against a patient-genome reference, building an index in RAM, and serving the answer back to a clinician&apos;s web portal. The customer who owns that VM has every reason to want the query to succeed and every reason to make sure that nobody else can ever read the index it builds: not the hypervisor it runs on, not the host firmware below it, not the Microsoft engineer holding the on-call pager, not even a court-ordered datacentre raid carried out with full physical access to the rack.&lt;/p&gt;
&lt;p&gt;As of 2026, that is not a thought experiment. It is the contract Azure signs when you provision a &lt;code&gt;DCasv5&lt;/code&gt; or &lt;code&gt;DCesv5&lt;/code&gt; confidential VM [@msdocs-overview-products]. And the contract has a shape -- an architecturally enforced shape rooted in two distinct CPU mechanisms, wrapped in an open-source Rust paravisor [@openhcl-blog], verified by a policy-driven attestation service [@msdocs-maa-overview], and dented by four published 2024 attacks that this article will name in order.&lt;/p&gt;
&lt;p&gt;The Confidential Computing Consortium defines the contract in one sentence: &quot;Confidential Computing protects data in use by performing computation in a hardware-based, attested Trusted Execution Environment&quot; [@ccc-about]. That sentence finishes a longer thought. Data at rest gets BitLocker and full-disk encryption. Data in transit gets TLS. Data in use -- the gigabytes that sit in DRAM while a process actually computes against them -- has historically been the unencrypted leg of a three-legged stool.&lt;/p&gt;

Confidential VM (CVM). A virtual machine whose memory and CPU state are cryptographically protected from the host hypervisor and the cloud operator&apos;s infrastructure, and whose configuration is bound to a hardware-rooted attestation report a remote verifier can check. The Confidential Computing Consortium&apos;s framing is the canonical one: &quot;These secure and isolated environments prevent unauthorized access or modification of applications and data while in use&quot; [@ccc-about].

Trusted Execution Environment (TEE). A computing environment whose confidentiality, integrity, and attestability are enforced by hardware mechanisms below the level of the operating system. A TEE may be process-scoped (Intel SGX enclaves), VM-scoped (AMD SEV-SNP, Intel TDX), or board-scoped (AWS Nitro Enclaves). The Confidential VM is the VM-scoped specialisation.
&lt;p&gt;Three concrete workloads make the contract operationally legible. A regulated clean room running joint analytics over patient genomes between an academic medical centre and a pharmaceutical sponsor, where the contract literally forbids the sponsor&apos;s staff from reading raw genotypes. A multi-party anti-money-laundering analytic between two competing banks who will share encrypted features but not raw transactions. A sovereign-cloud control plane that must not leak to the hyperscaler&apos;s host kernel under any subpoena. In each case the threat model treats the cloud operator as semi-trusted at best and adversarial at worst, and in each case the customer wants the cipher engine to live below the operator&apos;s reach.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Encryption at rest hides bytes on storage. Encryption in transit hides bytes on the wire. Encryption in use is the missing third leg -- the one that asks the cipher engine to live inline with the memory controller, so that a VM&apos;s working set never appears in plaintext to anyone but the VM itself. That is what AMD SEV-SNP and Intel TDX do at the silicon layer, and what Azure productises with the OpenHCL paravisor and Microsoft Azure Attestation [@ccc-about; @msdocs-azure-cvm].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The architecture that makes this contract real takes vocabulary from Internet standards as well as silicon. RFC 9334, published in January 2023, gives us the verifier / evidence / relying party language we will use throughout the article [@rfc9334]. An &lt;em&gt;attester&lt;/em&gt; (the guest VM plus the paravisor) generates &lt;em&gt;evidence&lt;/em&gt; (a hardware attestation report plus a vTPM quote). A &lt;em&gt;verifier&lt;/em&gt; (Microsoft Azure Attestation in Azure&apos;s case) checks the evidence against a policy and emits an &lt;em&gt;attestation result&lt;/em&gt; (a signed JWT). A &lt;em&gt;relying party&lt;/em&gt; (Azure Key Vault, or any customer service) consumes the result and decides whether to release a secret. The article you are reading is, at heart, a tour of how a SEV-SNP or TDX guest, an OpenHCL paravisor, and Microsoft Azure Attestation realise that abstract diagram on commodity silicon.&lt;/p&gt;
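&lt;p&gt;The RFC 9334 roles can be sketched in a few lines of Python. This is a toy under stated assumptions, not how Microsoft Azure Attestation is implemented: the function names, the &lt;code&gt;VERIFIER_KEY&lt;/code&gt; constant, and the use of a shared-secret HMAC in place of a signed JWT are all hypothetical simplifications chosen to keep the example self-contained.&lt;/p&gt;

```python
import hashlib
import hmac
import json

# Hypothetical shared secret standing in for the verifier's signing key.
# A real verifier signs a JWT with an asymmetric key; HMAC keeps the toy short.
VERIFIER_KEY = b"demo-verifier-key"

def attester_make_evidence(measurement, nonce):
    # Evidence: what the attester claims about itself, bound to a fresh nonce.
    return {"measurement": measurement, "nonce": nonce}

def verifier_appraise(evidence, expected_measurement, nonce):
    # The verifier checks evidence against policy and emits an authenticated result.
    ok = (evidence["measurement"] == expected_measurement
          and evidence["nonce"] == nonce)
    body = json.dumps({"compliant": ok, "nonce": nonce}, sort_keys=True)
    tag = hmac.new(VERIFIER_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}

def relying_party_release(result, secret):
    # The relying party releases the secret only for an authentic, compliant result.
    expected = hmac.new(VERIFIER_KEY, result["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, result["tag"]):
        return None
    return secret if json.loads(result["body"])["compliant"] else None
```

&lt;p&gt;The point of the split is that the relying party never inspects raw evidence; it only decides whether to trust the verifier&apos;s result.&lt;/p&gt;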
&lt;p&gt;That leads to the obvious question. How can a CPU enforce that even the hypervisor cannot read RAM? And once it can, why does a single mechanism turn out to be insufficient -- why does the architecture need a separate integrity rail on top? The next two sections trace the wrong answers that came first.&lt;/p&gt;
&lt;h2&gt;2. Why enclaves were not enough&lt;/h2&gt;
&lt;p&gt;In August 2016 David Kaplan stood on the USENIX Security stage in Austin and described &quot;two new x86 ISA features developed by AMD&quot; that he called &quot;the first general-purpose memory encryption features to be integrated into the x86 architecture&quot; [@usenix-kaplan-2016]. Kaplan was, in the conference biography&apos;s words, the &quot;lead architect for the AMD memory encryption features&quot; [@usenix-kaplan-2016]. His argument was deceptively simple. An enclave that lives inside a single process is the wrong unit of confidential computation for a cloud workload. The workloads customers actually run -- database engines, analytic services, language runtimes -- want gigabytes of working memory, multiple threads, and an unmodified operating system. None of that fits inside a roughly 96-MiB SGX enclave [@costan-devadas-2016].&lt;/p&gt;
&lt;p&gt;Two design ancestors set the shape of the problem before either AMD or Intel solved it.&lt;/p&gt;
&lt;p&gt;The first ancestor is the Trusted Platform Module. The TCG TPM specification dates back to 2003, when &quot;the first TPM version that was deployed was 1.1b&quot; [@wiki-tpm]. TPM 2.0 was announced on April 9, 2014 [@wiki-tpm] and standardised as ISO/IEC 11889. The TPM contributed three concepts that remain load-bearing two decades later: &lt;em&gt;platform configuration registers&lt;/em&gt; (the extend-only PCR digests that a measured-boot chain builds), &lt;em&gt;attestation identity keys&lt;/em&gt;, and a &lt;em&gt;quote&lt;/em&gt; operation that signs PCR state with a key whose origin a remote verifier can trust. The TPM is not a TEE in the modern sense -- it does not host computation -- but it is the first widely deployed device that lets a remote party gain cryptographic assurance about what a machine is running. Every confidential VM design ships a TPM-shaped attestation surface inside it.&lt;/p&gt;
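&lt;p&gt;The extend-only behaviour of a PCR is easy to show concretely. The sketch below assumes a SHA-256 bank and a register that starts at all zeros; the helper names &lt;code&gt;pcr_extend&lt;/code&gt; and &lt;code&gt;replay&lt;/code&gt; are illustrative, not part of any TPM library.&lt;/p&gt;

```python
import hashlib

def pcr_extend(pcr, event_data):
    # Extend: new PCR = H(old PCR || H(event data)). There is no "set" operation.
    event_digest = hashlib.sha256(event_data).digest()
    return hashlib.sha256(pcr + event_digest).digest()

def replay(events):
    # A verifier replays the event log and compares the result to the quoted PCR.
    pcr = bytes(32)  # assumed initial value: 32 zero bytes
    for event in events:
        pcr = pcr_extend(pcr, event)
    return pcr
```

&lt;p&gt;Because each value folds in the previous one, the final digest depends on both the content and the order of every measured component.&lt;/p&gt;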
&lt;p&gt;The second ancestor is Intel Software Guard Extensions. First described publicly at the HASP 2013 workshop and delivered on Skylake in 2015 [@costan-devadas-2016], SGX introduced the &lt;em&gt;enclave&lt;/em&gt;: a process-scoped TEE backed by the Enclave Page Cache, a CPU-managed memory region whose contents are decrypted only inside the CPU package. Programs enter and leave through &lt;code&gt;ENCLU&lt;/code&gt;-family instructions; cross-domain calls use a partitioned model called &lt;code&gt;ECALL&lt;/code&gt; / &lt;code&gt;OCALL&lt;/code&gt;; remote attestation is mediated by Intel through a quoting enclave. SGX worked, in the strict sense that the threat model included even a malicious operating system. But three things kept it from generalising.&lt;/p&gt;

Enclave Page Cache (EPC). A CPU-protected DRAM region that holds an SGX enclave&apos;s working memory in encrypted, integrity-checked form. On early Skylake / Kaby Lake parts the EPC was capped at approximately 128 MiB of physical memory, of which roughly 93 to 96 MiB was usable once the BIOS reservation and the EPCM metadata were accounted for [@costan-devadas-2016]. Anything beyond the cap paged through the encrypted-page-eviction path with a substantial performance cliff, which is one of the architectural reasons SGX did not generalise to whole-VM cloud workloads.
&lt;p&gt;The EPC cap was the first. A working set of ~96 MiB is fine for a key-wrapping service or a small ML model, but it is not a cloud-database VM. The second was the partitioned programming model. Real applications had to be split into trusted and untrusted halves with explicit &lt;code&gt;ECALL&lt;/code&gt; / &lt;code&gt;OCALL&lt;/code&gt; boundaries, which is a refactoring tax that few existing codebases would pay. The third was the side-channel question: Foreshadow [@foreshadow], SgxPectre [@sgxpectre], and SGAxe [@sgaxe] each demonstrated that a determined attacker with microarchitectural access could extract secrets from SGX, often without ever defeating the cipher itself. Microsoft&apos;s response was &lt;em&gt;Haven&lt;/em&gt;, an OSDI 2014 project that put a Windows library OS (Drawbridge) inside an SGX enclave to run unmodified Windows binaries. Haven worked as a proof of concept but was effectively obviated by the EPC cap and by the slow pace of SGX silicon delivery in Xeon-class CPUs. The library-OS-in-an-enclave became one of several dead ends on the road to whole-VM TEEs.&lt;/p&gt;
&lt;p&gt;Microsoft staked Azure publicly to &quot;data in use&quot; on September 14, 2017, when Mark Russinovich announced Azure confidential computing on the company blog: &quot;Microsoft Azure is the first cloud to offer new data security capabilities with a collection of features and services called Azure confidential computing&quot; [@russinovich-azure-2017]. The same post named the initial backing TEEs. &quot;Initially we support two TEEs, Virtual Secure Mode and Intel SGX. Virtual Secure Mode (VSM) is a software-based TEE that&apos;s implemented by Hyper-V in Windows 10 and Windows Server 2016&quot; [@russinovich-azure-2017]. VSM was already the substrate of Credential Guard and HVCI inside the operating system; pulling it up as a &quot;TEE the cloud customer can target&quot; was the bridge between the in-OS Secure Kernel story and the eventually-needed silicon-rooted CVM.&lt;/p&gt;
&lt;p&gt;The industry got organised two years later. The Confidential Computing Consortium formed under the Linux Foundation on October 17, 2019. The press release names the founding premiere members verbatim: &quot;Alibaba, Arm, Google Cloud, Huawei, Intel, Microsoft and Red Hat&quot; and the general members &quot;Baidu, ByteDance, decentriq, Fortanix, Kindite, Oasis Labs, Swisscom, Tencent and VMware&quot; [@lf-ccc-press]. An earlier Microsoft Open Source blog post on August 21, 2019, announced the formation with a slightly different membership list (including IBM but not Huawei) [@ms-ccc-blog]; the October press release is the formal founding roster.&lt;/p&gt;

Across three load-bearing AMD whitepapers -- SME/SEV (2016), SEV-ES (February 17, 2017), and SEV-SNP (January 9, 2020) -- the PDF cover-page metadata records &quot;David Kaplan&quot; as the named author [@amd-mem-enc-whitepaper; @amd-sev-es-whitepaper; @amd-snp-whitepaper], and the USENIX Security 2016 biography corroborates &quot;lead architect for the AMD memory encryption features&quot; [@usenix-kaplan-2016]. Across the parallel Intel artefacts -- the September 2020 TDX whitepaper and the Architecture Specification doc 344425-001 -- PDF metadata names only &quot;Intel Corporation&quot; as the institutional author and does not enumerate individual architects [@intel-tdx-spec-344425]. We name David Kaplan throughout because the documentary record names him; we deliberately do not name individual Intel architects because the documentary record does not.

flowchart TD
    Data[&quot;Customer data&quot;] --&amp;gt; Rest[&quot;At rest -- BitLocker, SED, KMS&quot;]
    Data --&amp;gt; Transit[&quot;In transit -- TLS 1.3, IPsec&quot;]
    Data --&amp;gt; Use[&quot;In use -- ?&quot;]
    Use --&amp;gt; CVM[&quot;Confidential VMs -- SEV-SNP / Intel TDX&quot;]
    CVM --&amp;gt; Para[&quot;Paravisor -- OpenHCL&quot;]
    Para --&amp;gt; MAA[&quot;MAA verifier&quot;]
&lt;p&gt;If a TEE has to be smaller than a single page cache, the unit of confidential computation is wrong. What if the unit were a whole VM, and the cipher engine lived inline with the memory controller? The next section is the first time someone tried.&lt;/p&gt;
&lt;h2&gt;3. Generation 1 and 1.5: confidentiality without integrity&lt;/h2&gt;
&lt;p&gt;April 2016. David Kaplan, Jeremy Powell, and Tom Woller publish the AMD whitepaper &lt;em&gt;AMD Memory Encryption&lt;/em&gt; [@amd-mem-enc-whitepaper]. The paper introduces two features in a single document. Secure Memory Encryption (SME) is a chassis-wide bulk cipher: a per-boot AES-128 key, managed by the on-die AMD Secure Processor, encrypts main memory transparently to the operating system. Secure Encrypted Virtualization (SEV) takes the same engine and gives each VM its own AES key, selected by the VM&apos;s Address Space Identifier (ASID), which also tags that VM&apos;s lines in the cache, so two co-resident VMs cannot read each other&apos;s memory and neither can the hypervisor. The &quot;C-bit&quot; in the guest page table marks which pages are encrypted [@amd-mem-enc-whitepaper]. The first silicon to ship SEV was the first-generation EPYC &quot;Naples&quot; launched June 20, 2017 [@wiki-epyc].&lt;/p&gt;

C-bit. A high physical-address bit in an AMD SEV guest&apos;s page-table entries that signals to the memory controller &quot;this page is encrypted with my VM&apos;s key.&quot; The C-bit is the per-page opt-in that lets a SEV guest mix encrypted private memory with explicitly shared bounce buffers in the same address space. Its absence means a page is cleartext to the hypervisor; its presence means the AES engine in the memory controller decrypts on every read and encrypts on every write [@amd-mem-enc-whitepaper].
&lt;p&gt;The threat model was clear and the architecture was honest about it. The hypervisor sees ciphertext on every encrypted page. What the architecture did &lt;em&gt;not&lt;/em&gt; do, and what the original whitepaper did &lt;em&gt;not&lt;/em&gt; claim, was integrity. The hypervisor remained authoritative over the nested page tables -- it could remap which host physical page a given guest physical address pointed to, and the cipher engine would happily decrypt whatever blob it found under the same key.&lt;/p&gt;
&lt;p&gt;That gap produced the architectural lesson.&lt;/p&gt;
&lt;h3&gt;SEVered (Morbitzer et al., EuroSec 2018)&lt;/h3&gt;
&lt;p&gt;In May 2018, four authors from Fraunhofer AISEC -- Mathias Morbitzer, Manuel Huber, Julian Horsch, and Sascha Wessel -- published a paper whose abstract is unambiguous: &quot;We present the design and implementation of SEVered, an attack from a malicious hypervisor capable of extracting the full contents of main memory in plaintext from SEV-encrypted virtual machines&quot; [@severed-arxiv]. The attack did not break the cipher. It exploited the fact that a malicious hypervisor could &lt;em&gt;remap&lt;/em&gt; the guest physical page backing a resource that a guest service hands out on request (say, a web page served over the network) so that it pointed at the host page holding the secret the attacker wanted. The guest then decrypted that page with its own key and served the secret in plaintext. Because there was no architectural binding between a guest physical address and the host page that should sit behind it, the hypervisor could read the entire VM by repeating such remappings page by page.&lt;/p&gt;

We present the design and implementation of SEVered, an attack from a malicious hypervisor capable of extracting the full contents of main memory in plaintext from SEV-encrypted virtual machines. -- Morbitzer, Huber, Horsch, Wessel, EuroSec&apos;18 [@severed-arxiv]
&lt;p&gt;The architectural lesson, stated as bluntly as the paper deserves, is that confidentiality without integrity is not confidentiality.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; Confidentiality without integrity is not confidentiality. The hypervisor that can move ciphertext between addresses is the hypervisor that can read it. The integrity of the guest-physical-to-host-physical mapping is as load-bearing as the cipher itself.&lt;/p&gt;
&lt;/blockquote&gt;
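&lt;p&gt;One way to picture the remapping is a toy model in which the guest decrypts whatever host page its address happens to resolve to. Everything here is an illustrative assumption: the single-byte XOR &quot;cipher&quot;, the addresses, and the names &lt;code&gt;guest_serve_resource&lt;/code&gt; and &lt;code&gt;hypervisor_remap&lt;/code&gt; are invented for the sketch and do not model real SEV hardware.&lt;/p&gt;

```python
GUEST_KEY = 0x5A  # toy single-byte key; real SEV uses AES in the memory controller

def xor_cipher(data, key):
    # Symmetric toy cipher: the same call encrypts and decrypts.
    return bytes(b ^ key for b in data)

# Host physical memory as the hypervisor sees it: ciphertext only.
host_memory = {
    0x1000: xor_cipher(b"index.html", GUEST_KEY),   # page backing a public resource
    0x2000: xor_cipher(b"TLS-PRIVKEY", GUEST_KEY),  # page holding a guest secret
}

# Nested page table (GPA to HPA), owned by the hypervisor.
nested_page_table = {0xA000: 0x1000, 0xB000: 0x2000}

def guest_serve_resource():
    # The guest service reads GPA 0xA000; hardware decrypts with the guest key.
    hpa = nested_page_table[0xA000]
    return xor_cipher(host_memory[hpa], GUEST_KEY)

def hypervisor_remap(gpa, new_hpa):
    # No integrity check: the hypervisor may point any GPA at any host page.
    nested_page_table[gpa] = new_hpa
```

&lt;p&gt;After &lt;code&gt;hypervisor_remap(0xA000, 0x2000)&lt;/code&gt; the service returns the secret instead of the public resource, even though the hypervisor never learned the key.&lt;/p&gt;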
&lt;h3&gt;SEV-ES (February 2017): half a fix&lt;/h3&gt;
&lt;p&gt;AMD&apos;s first response was SEV-ES, dated February 17, 2017 in the whitepaper&apos;s PDF cover page [@amd-sev-es-whitepaper]. SEV-ES introduced register-state encryption on VMEXIT. Before SEV-ES, every VM exit handed the hypervisor a complete dump of guest CPU registers, including pointers into otherwise-encrypted memory. SEV-ES encrypted the saved register state under the guest key, surfaced a new &lt;code&gt;#VC&lt;/code&gt; (VMM Communication) exception (vector 29), and required the guest to use a deliberately shared page called the Guest-Hypervisor Communication Block (GHCB) for everything that genuinely needed to cross the boundary -- emulated I/O, MMIO, time, the works.&lt;/p&gt;

Guest-Hypervisor Communication Block (GHCB). A page that a SEV-ES (and later SEV-SNP) guest deliberately shares with the hypervisor for the purposes of communicating about events the hypervisor genuinely needs to handle: emulated I/O, MMIO accesses, certain control-plane operations. The GHCB is the explicit, audited &quot;side channel&quot; through the trust boundary. Everything else stays encrypted [@amd-sev-es-whitepaper].
&lt;p&gt;SEV-ES closed one channel and left the other open. The integrity of the GPA-to-HPA mapping was still the hypervisor&apos;s problem to behave on, and the cipher was still XEX-mode AES without any keyed authentication. Two more papers made the architectural pressure unbearable.&lt;/p&gt;
&lt;h3&gt;ICUP (Buhren et al., CCS 2019) and SEVurity (Wilke et al., S&amp;amp;P 2020)&lt;/h3&gt;
&lt;p&gt;In August 2019, Robert Buhren, Christian Werling, and Jean-Pierre Seifert published &lt;em&gt;Insecure Until Proven Updated&lt;/em&gt; [@icup-arxiv]. The abstract makes the operational point cleanly: &quot;We demonstrate that it is possible to extract critical CPU-specific keys that are fundamental for the security of the remote attestation protocol. This effectively renders the SEV technology on current AMD Epyc CPUs useless when confronted with an untrusted cloud provider&quot; [@icup-arxiv]. The mechanism was a firmware rollback against the AMD-SP that exposed attestation keys.&lt;/p&gt;
&lt;p&gt;In May 2020, Wilke, Wichelmann, Morbitzer, and Eisenbarth published &lt;em&gt;SEVurity: No Security Without Integrity&lt;/em&gt; at IEEE S&amp;amp;P [@sevurity-uzl]. Their two new methods, the project-page abstract records verbatim, &quot;allow us to inject arbitrary code into SEV-ES secured virtual machines. Due to the lack of proper integrity protection, it is sufficient to reuse existing ciphertext to build a high-speed encryption oracle&quot; [@sevurity-uzl]. The architectural diagnosis was now overdetermined: integrity had to enter the design, not as a side feature, but as a load-bearing rail. The same Buhren-led group escalated to physical fault injection in August 2021 with &lt;em&gt;One Glitch to Rule Them All&lt;/em&gt;, voltage-glitching the AMD Secure Processor on Zen 1 / 2 / 3 to execute custom payloads [@one-glitch-arxiv]. The PSPReverse GitHub artefact contains the supporting tooling [@pspreverse-github]. This is the &lt;em&gt;physical-fault&lt;/em&gt; lower bound on the AMD-SP: an adversary with the right glitcher can subvert the security processor itself. The SEV-SNP design assumes a logical adversary; physical-access adversaries remain a known residual that §8 will revisit.&lt;/p&gt;
&lt;h3&gt;Intel&apos;s parallel road: TME and MKTME&lt;/h3&gt;
&lt;p&gt;Intel&apos;s bottom-of-stack cipher engine ran on a parallel track. In December 2017, Intel published &lt;em&gt;Architecture Memory Encryption Technologies Specification&lt;/em&gt;, document 336907 rev 1.1 [@intel-mem-enc-spec-336907], introducing Total Memory Encryption (TME). The multi-key successor, MKTME (later TME-MK), surfaced publicly through a September 7, 2018 Linux-kernel RFC by Alison Schofield archived on LWN: &quot;Multi-Key Total Memory Encryption API (MKTME) ... allows multiple encryption domains, each having their own key. While the main use case for the feature is virtual machine isolation&quot; [@lwn-mktme]. TME-MK is the per-keyID memory cipher that the eventual Intel TDX architecture will mount its trust-domain model on top of.&lt;/p&gt;
&lt;p&gt;Three papers, two vendors, one architectural verdict: confidentiality without integrity is not confidentiality, and the architecture has to change. What did AMD and Intel actually build in response?&lt;/p&gt;

flowchart LR
    SME[&quot;SME (2016) -- Bulk memory cipher&quot;]
    SEV[&quot;SEV (Naples, 2017) -- Per-VM AES key&quot;]
    ES[&quot;SEV-ES (Feb 2017) -- + Register-state cipher&quot;]
    SNP[&quot;SEV-SNP (Jan 2020) -- + Integrity rail&quot;]
    SME --&amp;gt; SEV
    SEV -- &quot;SEVered -- (EuroSec 2018)&quot; --&amp;gt; ES
    ES -- &quot;ICUP (CCS 2019) -- SEVurity (S&amp;amp;P 2020)&quot; --&amp;gt; SNP
&lt;h2&gt;4. Generation 2: the integrity rail&lt;/h2&gt;
&lt;p&gt;January 9, 2020. AMD publishes the 20-page SEV-SNP whitepaper, sole-authored by David Kaplan, with the title &lt;em&gt;Strengthening VM Isolation with Integrity Protection and More&lt;/em&gt; [@amd-snp-whitepaper]. Eight months later, in September 2020, Intel publishes the first public TDX whitepaper (document 343961-002US, filename &lt;code&gt;tdx-whitepaper-final9-17.pdf&lt;/code&gt;, PDF creation date Thursday September 17, 2020) and the companion Architecture Specification doc 344425-001 dated September 1, 2020 [@intel-tdx-spec-344425]. Two vendors, two different architectural answers, one shared diagnosis: the hypervisor must be excluded from the GPA-to-HPA mapping, not just from the ciphertext. Wikipedia describes Intel TDX as &quot;proposed by Intel in May 2021&quot; [@wiki-tdx], but the PDF cover-page metadata extracted from both the TDX whitepaper and the Architecture Specification places the public release in September 2020. Where Wikipedia and the Intel-authored PDFs disagree, the PDFs are the primary record.&lt;/p&gt;
&lt;h3&gt;AMD SEV-SNP: four ingredients&lt;/h3&gt;
&lt;p&gt;SEV-SNP keeps the per-VM AES cipher from SEV and the register-state encryption from SEV-ES, and adds four new architectural ingredients that together close the integrity gap.&lt;/p&gt;
&lt;p&gt;The first is the &lt;em&gt;Reverse Map Table&lt;/em&gt; (RMP). The RMP is a system-wide per-page metadata table consulted on every nested page-table walk. Each entry binds a host physical page to the tuple &lt;code&gt;(assigned ASID, expected guest physical address, VMPL, immutable bit, validated bit)&lt;/code&gt;. If the hypervisor tries to remap a guest physical address to a different host page, the RMP entry will fail to match and the CPU raises an &lt;code&gt;#NPF(rmpfault)&lt;/code&gt;. AMD&apos;s own description reads, verbatim: &quot;SEV-SNP adds strong memory integrity protection to help prevent malicious hypervisor-based attacks like data replay, memory re-mapping, and more to create an isolated execution environment&quot; [@amd-sev-portal]. This is the integrity rail. It is not a separate keyed MAC over memory; it is a structural binding that turns SEVered-class remappings into faults.&lt;/p&gt;

Reverse Map Table (RMP). A system-wide AMD SEV-SNP data structure that records, for every host physical page, the guest ASID it belongs to, the guest physical address it is mapped at, the VMPL ACL, an immutable flag, and a validated flag. Every nested page-table walk consults the RMP; mismatches raise `#NPF(rmpfault)`. The RMP is the architectural answer to SEVered: the hypervisor remains in charge of nested page tables, but the RMP says what each host page is allowed to be used for [@amd-snp-whitepaper; @amd-sev-portal].
&lt;p&gt;The second is the &lt;code&gt;PVALIDATE&lt;/code&gt; instruction. A SEV-SNP guest must explicitly &lt;em&gt;validate&lt;/em&gt; a page before it uses it for confidential storage. The hypervisor cannot fake validation; if the page has not been validated by the guest, accesses fault. This pushes the responsibility for tracking &quot;is this page really part of my private memory&quot; into the guest, where the hypervisor cannot lie about it.&lt;/p&gt;
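&lt;p&gt;The RMP lookup and the validation step can be modelled as a dictionary check. This is a deliberately reduced sketch: the &lt;code&gt;RmpEntry&lt;/code&gt; fields are a subset of the tuple described above, &lt;code&gt;RmpFault&lt;/code&gt; stands in for &lt;code&gt;#NPF(rmpfault)&lt;/code&gt;, and the &lt;code&gt;pvalidate&lt;/code&gt; and &lt;code&gt;rmp_check&lt;/code&gt; helpers are hypothetical names rather than the real instruction semantics.&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class RmpEntry:
    asid: int                # guest the host page is assigned to
    gpa: int                 # guest physical address the page must be mapped at
    validated: bool = False  # set only by the guest

class RmpFault(Exception):
    """Stands in for #NPF(rmpfault)."""

def pvalidate(rmp, hpa, asid):
    # Toy PVALIDATE: only the owning guest can mark its page as validated.
    entry = rmp[hpa]
    if entry.asid != asid:
        raise RmpFault("page not assigned to this guest")
    entry.validated = True

def rmp_check(rmp, hpa, asid, gpa):
    # Performed on every nested walk: the host page must be bound to exactly
    # this guest and this guest physical address, and must be validated.
    entry = rmp.get(hpa)
    if entry is None or entry.asid != asid or entry.gpa != gpa or not entry.validated:
        raise RmpFault(hex(hpa))
    return True
```

&lt;p&gt;In this model a SEVered-style remap, where a guest physical address is pointed at a host page bound to a different address, fails the check instead of decrypting.&lt;/p&gt;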
&lt;p&gt;The third is the Virtual Machine Privilege Level lattice.&lt;/p&gt;

Virtual Machine Privilege Level (VMPL). A four-level privilege lattice (VMPL0 most privileged, VMPL3 least privileged) introduced by AMD SEV-SNP. Each RMP entry includes per-VMPL access-control bits, so a single SEV-SNP guest can split itself into multiple ring-like partitions where a more-privileged component (for example, a paravisor at VMPL0) sees pages that a less-privileged component (the customer&apos;s kernel at VMPL2) cannot. VMPL appears as a field inside the SNP_REPORT, so a remote verifier can tell which VMPL produced a given quote [@amd-snp-whitepaper].
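&lt;p&gt;The per-VMPL access bits amount to a small permission table per page. The sketch below is illustrative only: the page names, the permission letters, and the &lt;code&gt;vmpl_can_access&lt;/code&gt; helper are assumptions made for the example, not the RMP entry layout.&lt;/p&gt;

```python
# Toy per-page permissions, keyed by VMPL (0 is most privileged, 3 least).
page_perms = {
    "paravisor_page": {0: {"r", "w"}, 1: set(), 2: set(), 3: set()},
    "guest_page": {0: {"r", "w"}, 1: set(), 2: {"r", "w"}, 3: set()},
}

def vmpl_can_access(page, vmpl, access):
    # An access succeeds only if that VMPL holds the requested permission.
    return access in page_perms[page][vmpl]
```

&lt;p&gt;With this table a paravisor at VMPL0 can read both pages, while the customer kernel at VMPL2 can read only its own.&lt;/p&gt;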
&lt;p&gt;The fourth is the attestation report. The SNP_REPORT is an ECDSA-P384 signed blob produced by the AMD-SP, carrying fields including the launch &lt;em&gt;measurement&lt;/em&gt;, the guest &lt;em&gt;policy&lt;/em&gt;, the user-supplied &lt;em&gt;report_data&lt;/em&gt; nonce, the issuing &lt;em&gt;vmpl&lt;/em&gt;, the unique &lt;em&gt;chip_id&lt;/em&gt;, and the &lt;em&gt;tcb_version&lt;/em&gt;. The signing key is the Versioned Chip Endorsement Key (VCEK), derived per chip per TCB version from a long-lived endorsement key, and the certificate chain runs &lt;code&gt;VCEK_cert -&amp;gt; ASK -&amp;gt; AMD root&lt;/code&gt; [@amd-sev-portal].&lt;/p&gt;

Versioned Chip Endorsement Key (VCEK). The AMD SEV-SNP attestation signing key. Derived deterministically from each chip&apos;s individual endorsement secret and the current TCB version (firmware level), so a single chip exposes one VCEK per TCB version. The certificate chain anchors back to AMD&apos;s root via the AMD Signing Key (ASK). The VCEK is what makes SEV-SNP attestation chain to silicon: the verifier checks the SNP_REPORT signature against a VCEK certificate AMD will only issue for genuine AMD-SP firmware [@amd-snp-whitepaper; @amd-sev-portal].
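&lt;p&gt;Once the signature and certificate chain have been validated, a verifier still has to appraise the report fields themselves. The sketch below assumes the report has already been parsed into a dictionary using the field names from the text; the &lt;code&gt;policy&lt;/code&gt; keys and the &lt;code&gt;appraise_snp_report&lt;/code&gt; function are hypothetical, and the ECDSA-P384 signature check against the VCEK chain is deliberately left out.&lt;/p&gt;

```python
def appraise_snp_report(report, policy, nonce):
    # Returns a list of failed checks; an empty list means the toy policy passed.
    # Signature and certificate-chain validation (VCEK, ASK, AMD root) is
    # omitted here and must come first in a real verifier.
    failures = []
    if report["report_data"] != nonce:
        failures.append("report_data does not match the fresh nonce")
    if report["measurement"] not in policy["allowed_measurements"]:
        failures.append("unexpected launch measurement")
    if report["vmpl"] != policy["required_vmpl"]:
        failures.append("report issued from an unexpected VMPL")
    if report["tcb_version"] not in policy["allowed_tcb_versions"]:
        failures.append("TCB version not on the allow-list")
    return failures
```

&lt;p&gt;Returning every failed check, rather than stopping at the first, makes it easier to see why a given guest was refused a key.&lt;/p&gt;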

SEV-SNP adds strong memory integrity protection to help prevent malicious hypervisor-based attacks like data replay, memory re-mapping, and more in order to create an isolated execution environment. -- AMD SEV-SNP whitepaper, January 2020 [@amd-snp-whitepaper]

sequenceDiagram
    autonumber
    participant Guest as Guest CPU access
    participant NPT as Nested Page Walker
    participant RMP as Reverse Map Table
    participant AES as AES engine (memory ctrl)
    Guest-&amp;gt;&amp;gt;NPT: Resolve GVA -&amp;gt; GPA -&amp;gt; HPA
    NPT-&amp;gt;&amp;gt;RMP: Lookup (HPA)
    RMP--&amp;gt;&amp;gt;NPT: ASID, expected GPA, VMPL
    alt RMP entry matches request
        NPT-&amp;gt;&amp;gt;AES: Decrypt under VM key
        AES--&amp;gt;&amp;gt;Guest: Plaintext
    else Mismatch (SEVered-style remap)
        RMP--&amp;gt;&amp;gt;Guest: #NPF (rmpfault)
    end
&lt;h3&gt;Intel TDX: a different geometry, the same end-state&lt;/h3&gt;
&lt;p&gt;Intel reached the same architectural conclusion with a different mechanism. Rather than bake integrity into microcode plus the AMD-SP, Intel introduced a new CPU mode and a separately signed software module that runs in it. The Intel TDX overview reads, verbatim: &quot;A CPU-measured Intel TDX module enables Intel TDX. This software module runs in a new CPU Secure Arbitration Mode (SEAM) as a peer virtual machine manager (VMM) ... hosted in a reserved memory space identified by the SEAM Range Register (SEAMRR)&quot; [@intel-tdx-overview].&lt;/p&gt;
&lt;p&gt;The ingredients are seven, not four.&lt;/p&gt;

Secure Arbitration Mode (SEAM). A new CPU privilege state introduced by Intel TDX. Code running in SEAM is hosted in a physical-memory range identified by the SEAM Range Register (SEAMRR) that the legacy VMM cannot inspect. Only the signed Intel TDX Module runs in SEAM, and it does so as a peer VMM that mediates every interaction between the legacy hypervisor and a Trust Domain [@intel-tdx-overview].
&lt;p&gt;The Intel &lt;strong&gt;TDX Module&lt;/strong&gt; is the second ingredient: a CPU-measured firmware binary, loaded by the SEAMLDR at boot, that mediates every entry into and exit from a Trust Domain via &lt;code&gt;SEAMCALL&lt;/code&gt; and &lt;code&gt;SEAMRET&lt;/code&gt; instructions. The Intel-signed &lt;code&gt;intel-tdx-module-1.5-base-spec-348549002.pdf&lt;/code&gt; is the canonical specification for the current generation [@intel-tdx-module-base-348549].&lt;/p&gt;
&lt;p&gt;The third is the &lt;strong&gt;Trust Domain&lt;/strong&gt;, a VM-shaped container that carries a &lt;em&gt;Shared Bit&lt;/em&gt; in the guest physical address. A clear shared bit means the page is private; a set shared bit means the page is deliberately shared with the hypervisor for I/O bounce buffers. The fourth is &lt;strong&gt;TME-MK&lt;/strong&gt; memory encryption, derived from the December 2017 TME spec [@intel-mem-enc-spec-336907] and the September 2018 MKTME Linux-kernel RFC [@lwn-mktme]: AES-128 in XTS mode, with the keyID embedded in the upper physical-address bits, gives one key per Trust Domain.&lt;/p&gt;
&lt;p&gt;The fifth ingredient is the structural analogue of AMD&apos;s RMP, the &lt;strong&gt;Physical-Address-Metadata table&lt;/strong&gt; (PAMT). The Intel TDX overview enumerates the architectural elements precisely: &quot;Intel TDX uses architectural elements such as SEAM, a shared bit in Guest Physical Address (GPA), secure Extended Page Table (EPT), physical-address-metadata table, Intel Total Memory Encryption -- Multi-Key (Intel TME-MK), and remote attestation&quot; [@intel-tdx-overview].&lt;/p&gt;
&lt;p&gt;The sixth ingredient is the measurement registers. The &lt;strong&gt;MRTD&lt;/strong&gt; is the build-time measurement of the initial TD image, similar to a TPM PCR fixed at launch. &lt;strong&gt;RTMR0 through RTMR3&lt;/strong&gt; are the runtime measurement registers, four PCR-equivalents the TDX Module exposes for runtime measured-boot extensions. These four registers are what a TDX-aware Trusted Boot chain extends.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MRTD and RTMR0-3&lt;/strong&gt; are the build-time and runtime measurement registers exposed by an Intel TDX Trust Domain. MRTD is hashed by the TDX Module over the initial TD launch image and is the SEAM analogue of an immutable launch PCR. RTMR0-3 are four extendable runtime registers, the SEAM analogue of the runtime-extension TPM PCRs (the same conceptual role as PCRs 8-15 in the canonical static-OS measurement chain), that hold a measured-boot chain of subsequent components (loaders, kernel, initrd, paravisor pages). The canonical TDX-vTPM event-log convention used by Linux IMA and systemd-stub maps RTMR[0] to PCR[1, 7]; RTMR[1] to PCR[2-6]; RTMR[2] to PCR[8-9]; and RTMR[3] to PCR[14, 17-22]. A TD Quote carries all five values; a verifier evaluates them against a customer-defined policy [@intel-tdx-overview; @intel-tdx-spec-344425].&lt;/p&gt;
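&lt;p&gt;To make the &quot;extendable register&quot; idea concrete, here is a minimal sketch of an extend-and-replay loop. It assumes 48-byte registers folded as SHA-384 over the old value concatenated with the new digest, the same pattern a TPM PCR uses; &lt;code&gt;extendRtmr&lt;/code&gt; and &lt;code&gt;replay&lt;/code&gt; are hypothetical names, and the event strings are placeholders rather than real boot components.&lt;/p&gt;

```javascript
// Sketch of the extend operation behind a runtime measurement register.
// Assumption: 48-byte registers folded as SHA-384(old || digest), the
// same pattern a TPM PCR uses. extendRtmr and replay are hypothetical names.
const crypto = require('node:crypto');

const RTMR_SIZE = 48; // SHA-384 digest length in bytes

function sha384(...parts) {
  const h = crypto.createHash('sha384');
  for (const p of parts) h.update(p);
  return h.digest();
}

// Fold one event digest into the current register value.
function extendRtmr(current, eventDigest) {
  if (current.length !== RTMR_SIZE || eventDigest.length !== RTMR_SIZE) {
    throw new RangeError('register and digest must both be 48 bytes');
  }
  return sha384(current, eventDigest);
}

// Replay an event log starting from the all-zero reset value.
function replay(events) {
  return events.reduce(
    (reg, ev) => extendRtmr(reg, sha384(Buffer.from(ev))),
    Buffer.alloc(RTMR_SIZE)
  );
}

const a = replay(['loader', 'kernel', 'initrd']);
const b = replay(['kernel', 'loader', 'initrd']);
console.log(a.toString('hex'));
console.log('order sensitive:', !a.equals(b));
```

&lt;p&gt;Because each value depends on everything extended before it, a verifier that replays the event log and arrives at the quoted register value has checked both the components and their order.&lt;/p&gt;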
&lt;p&gt;The seventh is the &lt;strong&gt;TD Quote&lt;/strong&gt;. A TD Quote is produced in two stages. The TD guest first issues &lt;code&gt;TDCALL[TDG.MR.REPORT]&lt;/code&gt;, which lands in the TDX Module (the VMM reaches the Module through the separate &lt;code&gt;SEAMCALL&lt;/code&gt; interface listed in the comparison table below); the TDX Module returns a &lt;code&gt;TDREPORT&lt;/code&gt; structure, built with the in-SEAM &lt;code&gt;SEAMREPORT&lt;/code&gt; operation and MAC-protected with a key bound to the platform. A host-side SGX Quoting Enclave then converts that Report into a Quote signed with the SGX-resident QE attestation key. The Quote carries MRTD, RTMR0-3, the TD&apos;s TCB SVN (a per-component firmware version vector), and a caller nonce. The Intel Trust Authority (or Microsoft Azure Attestation, or Google&apos;s verifier) checks the quote [@intel-tdx-overview; @intel-tdx-module-base-348549].&lt;/p&gt;

flowchart TB
    HW[&quot;Silicon: TME-MK + SEAMRR -- + Secure EPT + PAMT&quot;]
    SEAM[&quot;Intel TDX Module -- (SEAM mode)&quot;]
    VMM[&quot;Legacy VMM -- (Hyper-V / KVM)&quot;]
    TD1[&quot;Trust Domain 1&quot;]
    TD2[&quot;Trust Domain 2&quot;]
    HW --&amp;gt; SEAM
    HW --&amp;gt; VMM
    VMM -- &quot;SEAMCALL&quot; --&amp;gt; SEAM
    SEAM -- &quot;SEAMRET&quot; --&amp;gt; VMM
    SEAM -- &quot;TDENTER / TDEXIT&quot; --&amp;gt; TD1
    SEAM -- &quot;TDENTER / TDEXIT&quot; --&amp;gt; TD2
&lt;h3&gt;Side by side&lt;/h3&gt;
&lt;p&gt;The two architectures answer the same question and arrive at the same end-state contract through fundamentally different trust geometries.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Ingredient&lt;/th&gt;
&lt;th&gt;AMD SEV-SNP&lt;/th&gt;
&lt;th&gt;Intel TDX&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;Memory cipher&lt;/td&gt;
&lt;td&gt;AES-128, per-VM key in memory controller&lt;/td&gt;
&lt;td&gt;AES-128-XTS, per-TD key by keyID (TME-MK)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Integrity binding&lt;/td&gt;
&lt;td&gt;Reverse Map Table per host page&lt;/td&gt;
&lt;td&gt;Physical-Address-Metadata table + Secure EPT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mediating component&lt;/td&gt;
&lt;td&gt;AMD-SP firmware (microcode + on-die security processor)&lt;/td&gt;
&lt;td&gt;Signed Intel TDX Module in SEAM mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privilege lattice&lt;/td&gt;
&lt;td&gt;VMPL0-VMPL3 (four levels)&lt;/td&gt;
&lt;td&gt;TD Partitioning L1/L2 (TDX Module 1.5)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build-time measurement&lt;/td&gt;
&lt;td&gt;Launch measurement in SNP_REPORT&lt;/td&gt;
&lt;td&gt;MRTD inside the TDX Module&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runtime measurement&lt;/td&gt;
&lt;td&gt;None at module level (vTPM provides it)&lt;/td&gt;
&lt;td&gt;RTMR0-RTMR3 inside the TDX Module&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Attestation signing key&lt;/td&gt;
&lt;td&gt;VCEK (ECDSA-P384), per chip per TCB version&lt;/td&gt;
&lt;td&gt;SGX-resident Quoting Enclave key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Certificate chain&lt;/td&gt;
&lt;td&gt;VCEK -&amp;gt; ASK -&amp;gt; AMD root&lt;/td&gt;
&lt;td&gt;Quoting Enclave -&amp;gt; Intel root&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Page-validation primitive&lt;/td&gt;
&lt;td&gt;&lt;code&gt;PVALIDATE&lt;/code&gt; (guest-driven)&lt;/td&gt;
&lt;td&gt;TDX Module-mediated page acceptance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shared-page indicator&lt;/td&gt;
&lt;td&gt;C-bit (clear = shared, set = encrypted)&lt;/td&gt;
&lt;td&gt;Shared bit in GPA (set = shared)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hypervisor-to-trust-component call&lt;/td&gt;
&lt;td&gt;Mediated VMRUN&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SEAMCALL&lt;/code&gt; / &lt;code&gt;SEAMRET&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;pre&gt;&lt;code&gt;// Sketch of how a SEV-SNP guest assembles an SNP_REPORT via
// SNP_GUEST_REQUEST. The AMD-SP is replaced by a stub, so nothing here
// talks to silicon; the point is the shape of the evidence the verifier
// receives.

// Stub standing in for the AMD-SP. Real firmware fills these fields and
// signs with the VCEK; every value below is a placeholder.
function spGuestRequest(request) {
  const zeros = (n) =&amp;gt; &apos;00&apos;.repeat(n);
  return {
    version: 0, guestSvn: 0, policy: 0, tcbVersion: 0,
    familyId: zeros(16), measurement: zeros(48), chipId: zeros(64),
    reportData: request.reportData, vmpl: request.vmpl,
    signature: &apos;placeholder&apos;,
  };
}

function buildSnpReport(reportData64) {
  // Guest builds a request carrying 64 bytes of REPORT_DATA, for example
  // a value derived from the relying party&apos;s nonce.
  const request = { reportData: reportData64, vmpl: 0 };

  // The request is relayed to the AMD-SP, which signs with the VCEK.
  const report = spGuestRequest(request);

  return {
    version:        report.version,        // structure version
    guestSvn:       report.guestSvn,       // guest SVN
    policy:         report.policy,         // guest policy bits at launch
    familyId:       report.familyId,       // 16-byte ID set by launch
    measurement:    report.measurement,    // 48-byte launch measurement
    reportData:     report.reportData,     // echoes the 64-byte REPORT_DATA
    vmpl:           report.vmpl,           // VMPL of issuing component
    chipId:         report.chipId,         // 64-byte unique chip ID
    tcbVersion:     report.tcbVersion,     // boot loader / TEE / SNP / microcode SVNs
    signature:      report.signature,      // ECDSA P-384 over the report
  };
}

// The verifier walks the certificate chain VCEK -&amp;gt; ASK -&amp;gt; AMD root,
// re-checks the signature, and then evaluates policy on the claims.
const reportData = Buffer.alloc(64, &apos;nonce_from_relying_party&apos;).toString(&apos;hex&apos;);
console.log(JSON.stringify(buildSnpReport(reportData), null, 2));
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; SEV-SNP and TDX answer the same question differently. AMD bakes integrity into microcode plus the AMD-SP, signs with a per-chip per-TCB VCEK, and exposes a four-level VMPL lattice. Intel puts integrity into a separately loaded, separately signed software module running in a new CPU mode, signs with an SGX-resident Quoting Enclave, and exposes L1/L2 partitioning. The trust roots, the breaking surfaces, and the supply chains are different even when the end-state contract is the same.&lt;/p&gt;
&lt;/blockquote&gt;

flowchart LR
    subgraph AMD[&quot;AMD SEV-SNP&quot;]
        A1[&quot;AMD-SP firmware&quot;]
        A2[&quot;Reverse Map Table&quot;]
        A3[&quot;VMPL0-3 lattice&quot;]
        A4[&quot;SNP_REPORT -- VCEK signed&quot;]
    end
    subgraph INTEL[&quot;Intel TDX&quot;]
        I1[&quot;Signed TDX Module&quot;]
        I2[&quot;PAMT + Secure EPT&quot;]
        I3[&quot;L1 / L2 partitioning&quot;]
        I4[&quot;TD Quote -- Quoting Enclave&quot;]
    end
    A1 --- I1
    A2 --- I2
    A3 --- I3
    A4 --- I4
&lt;p&gt;Generation 2 makes a confidential VM architecturally possible. But a SEV-SNP guest is not yet a Windows Server VM you can lift and shift onto Azure -- there is a whole productisation problem still to solve. How does Microsoft put a paravisor inside that trust boundary, and what does it deliver?&lt;/p&gt;
&lt;h2&gt;5. The contract: a cloud-shaped TEE&lt;/h2&gt;
&lt;p&gt;A confidential VM is two rails, not one. Rail 1 is &lt;strong&gt;confidentiality plus integrity&lt;/strong&gt; of memory and CPU state. Rail 2 is &lt;strong&gt;measurement plus attestation&lt;/strong&gt;. SEV-SNP and TDX each deliver both rails. Anyone who has read the equivalent Secure Boot / Trusted Boot story will recognise the shape: a measurement chain anchored in silicon, terminated in a remote verifier, with a signed result that a relying party can act on.&lt;/p&gt;
&lt;p&gt;The Confidential Computing Consortium&apos;s framing, repeated here as a contract the architectures actually realise: &quot;Confidential Computing protects data in use by performing computation in a hardware-based, attested Trusted Execution Environment&quot; [@ccc-about]. &lt;em&gt;Hardware-based&lt;/em&gt; is rail 1. &lt;em&gt;Attested&lt;/em&gt; is rail 2. The two words together are why a TPM-only system, however well-measured, is not a CVM, and why a SEV-only system, however well-encrypted, is not a CVM either.&lt;/p&gt;
&lt;p&gt;RFC 9334 names the actors. The &lt;em&gt;attester&lt;/em&gt; is the guest plus the paravisor producing evidence. The &lt;em&gt;evidence&lt;/em&gt; is the SNP_REPORT or TD Quote, plus optionally a vTPM quote chained to it. The &lt;em&gt;verifier&lt;/em&gt; is the entity that checks the evidence against a policy and emits an attestation result. The &lt;em&gt;relying party&lt;/em&gt; is the consumer who acts on the result -- typically a key vault releasing a wrapped secret [@rfc9334].&lt;/p&gt;

&lt;p&gt;The IETF Remote ATtestation procedureS working group&apos;s RFC 9334 (January 2023) fixes the vocabulary the rest of the confidential-computing industry uses: an &lt;em&gt;attester&lt;/em&gt; produces &lt;em&gt;evidence&lt;/em&gt;; a &lt;em&gt;verifier&lt;/em&gt; checks it against endorsements from an &lt;em&gt;endorser&lt;/em&gt; and reference values from a &lt;em&gt;reference value provider&lt;/em&gt; and emits an &lt;em&gt;attestation result&lt;/em&gt;; a &lt;em&gt;relying party&lt;/em&gt; acts on the result. RFC 9334 §5 names two topologies. In the &lt;em&gt;Passport&lt;/em&gt; model (§5.1), the attester sends evidence directly to the verifier, collects a signed result, and presents that result to the relying party. In the &lt;em&gt;Background-Check&lt;/em&gt; model (§5.2), the attester sends evidence to the relying party, which forwards it to the verifier and receives the result on the attester&apos;s behalf. Microsoft Azure Attestation, Intel Trust Authority, Google&apos;s verifier, and AWS KMS attestation all implement variants of these two models [@rfc9334].&lt;/p&gt;
&lt;p&gt;Microsoft Azure Attestation implements the &lt;em&gt;Passport&lt;/em&gt; model. The attester -- the CVM, through its in-guest agent -- sends evidence (an SNP_REPORT or TD Quote, plus a vTPM quote) directly to MAA. MAA validates the evidence against the customer-authored policy and returns a signed JWT. The attester then presents that JWT to the relying party. Azure Key Vault authorises Secure Key Release against the MAA-issued claim set, not against raw SNP evidence. The relying party never sees the SNP_REPORT and never calls MAA on the attester&apos;s behalf, which is the design signature of Passport rather than Background-Check [@rfc9334; @msdocs-maa-overview].&lt;/p&gt;
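&lt;p&gt;The Passport flow described above can be sketched as a toy model. Every name in it (&lt;code&gt;verifier&lt;/code&gt;, &lt;code&gt;relyingParty&lt;/code&gt;, &lt;code&gt;passport&lt;/code&gt;, the claim fields) is hypothetical; it does not call any real MAA or Key Vault API, and the &lt;code&gt;signedBy&lt;/code&gt; field is a placeholder for a real signature check.&lt;/p&gt;

```javascript
// Toy model of the RATS Passport topology. All names and claim fields are
// hypothetical; 'signedBy' stands in for a real signature verification.
function verifier(evidence, policy) {
  // The verifier appraises raw evidence against its policy.
  if (!policy.allowedMeasurements.includes(evidence.measurement)) {
    return null; // attestation rejected, no token issued
  }
  return {
    claims: { tee: evidence.tee, measurement: evidence.measurement },
    signedBy: 'verifier',
  };
}

function relyingParty(token, releasePolicy) {
  // The relying party sees only the verifier's result, never raw evidence.
  if (token === null || token.signedBy !== 'verifier') return 'denied';
  return token.claims.tee === releasePolicy.requiredTee ? 'key-released' : 'denied';
}

function passport(evidence, policy, releasePolicy) {
  const token = verifier(evidence, policy);  // step 1: attester to verifier
  return relyingParty(token, releasePolicy); // step 2: attester to relying party
}

const policy = { allowedMeasurements: ['m-good'] };
const release = { requiredTee: 'sevsnpvm' };
console.log(passport({ tee: 'sevsnpvm', measurement: 'm-good' }, policy, release));
console.log(passport({ tee: 'sevsnpvm', measurement: 'm-evil' }, policy, release));
```

&lt;p&gt;In a Background-Check variant of the same sketch, &lt;code&gt;relyingParty&lt;/code&gt; would receive the evidence and call &lt;code&gt;verifier&lt;/code&gt; itself; here the attester carries the token, which is the distinction the paragraph above draws.&lt;/p&gt;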

flowchart LR
    Rail1[&quot;Rail 1 -- Confidentiality + Integrity&quot;] --&amp;gt; Mem[&quot;Encrypted DRAM -- + RMP / PAMT -- + encrypted register state&quot;]
    Rail2[&quot;Rail 2 -- Measurement + Attestation&quot;] --&amp;gt; Ev[&quot;Evidence: -- SNP_REPORT / TD Quote -- + vTPM quote&quot;]
    Ev --&amp;gt; Ver[&quot;Verifier: -- MAA / Intel Trust Authority&quot;]
    Ver --&amp;gt; Tok[&quot;Attestation Result -- (signed JWT)&quot;]
    Tok --&amp;gt; RP[&quot;Relying Party -- (Azure Key Vault)&quot;]
    RP --&amp;gt; Secret[&quot;Wrapped secret release&quot;]
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; A Confidential VM is not a memory-encryption product. It is a contract: confidentiality with integrity, plus an evidence-bearing attestation chain that a relying party can verify before it releases a secret. Anyone who sells you &quot;confidential&quot; infrastructure without rail 2 is selling you half the product.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If this is the contract, how does Azure actually build a usable Windows-guest CVM on top of it? What lives where, and who signs what?&lt;/p&gt;
&lt;h2&gt;6. State of the art on Azure: from silicon to MAA&lt;/h2&gt;
&lt;p&gt;July 20, 2022. Microsoft Azure announces general availability of the DCasv5 and ECasv5 confidential VM SKUs on AMD third-generation EPYC silicon. The Register&apos;s coverage captures the framing: &quot;Microsoft is expanding its Azure confidential computing portfolio with virtual machines that use the encryption and memory protection features of AMD&apos;s third-gen Epyc processors. ... Customers using them can also use the free Microsoft Azure Attestation (MAA) service to remotely verify the operating environment and integrity of the software binaries running on it&quot; [@theregister-azure-cvm]. That is the moment a confidential VM stops being a research paper and starts being a product the customer can pay for by the hour.&lt;/p&gt;
&lt;p&gt;This section walks the Azure stack bottom-up. It is the longest section because it is the article&apos;s reason to exist.&lt;/p&gt;
&lt;h3&gt;The Azure CVM SKU family&lt;/h3&gt;
&lt;p&gt;Microsoft Learn&apos;s confidential-computing products page enumerates the current Azure CVM SKU map. On AMD SEV-SNP: &quot;DCasv5 and ECasv5 enable rehosting of existing workloads&quot; [@msdocs-overview-products]. These are the third-generation EPYC Milan SKUs that went GA in July 2022. The Learn page continues: &quot;DCasv6 and ECasv6 confidential VMs based on fourth-generation AMD EPYC processors are currently in gated preview&quot; [@msdocs-overview-products]. Lenovo Press corroborates that &quot;SEV-SNP is supported on AMD EPYC processors starting with the AMD EPYC 7003 series processors&quot; [@lenovo-lp1893], which makes the third-generation 7003 series (Milan) the first SEV-SNP silicon.&lt;/p&gt;
&lt;p&gt;On Intel TDX: &quot;DCesv5 and ECesv5&quot; are the fourth-generation Xeon Sapphire Rapids SKUs, generally available. SecurityWeek&apos;s coverage anchors the Sapphire Rapids launch: &quot;Intel announced on Tuesday that it has added Intel Trust Domain Extensions (TDX) to its confidential computing portfolio with the launch of its new 4th Gen Xeon enterprise processors. ... The feature will be available through cloud providers such as Microsoft, Google, IBM and Alibaba&quot; [@securityweek-tdx]. Wikipedia notes that &quot;TDX is available for 5th generation Intel Xeon processors (codename Emerald Rapids) and Edge Enhanced Compute variants of 4th generation Xeon processors (codename Sapphire Rapids)&quot; [@wiki-tdx]. The fifth-generation Emerald Rapids SKUs DCesv6 and ECesv6 are in preview at the time of writing, per the Learn products page [@msdocs-overview-products].&lt;/p&gt;
&lt;p&gt;GPU CVMs anchor on the same CPU-side TEEs and add a GPU TEE. The Learn page describes the NCCadsH100v5 SKU: &quot;NCCadsH100v5 confidential VMs come with a GPU ... use linked CPU and GPU Trusted Execution Environments (TEEs)&quot; [@msdocs-overview-products]. This is the linked-attestation product for confidential AI -- a SEV-SNP host CVM bound by attestation to an NVIDIA H100 in Confidential Compute mode.&lt;/p&gt;
&lt;p&gt;March 30, 2026 brings a pricing change customers should plan for. Microsoft Learn states: &quot;From March 30 2026, encrypted OS disks will incur higher costs&quot; [@msdocs-azure-cvm]. Confidential OS-disk encryption remains the recommended configuration where the workload requires it; the change is to the billing line, not to the architecture.&lt;/p&gt;
&lt;h3&gt;The paravisor: OpenHCL on OpenVMM&lt;/h3&gt;
&lt;p&gt;The single most important productisation move Azure made is what Microsoft calls a &lt;em&gt;paravisor&lt;/em&gt;. The framing from the October 17, 2024 Tech Community announcement is verbatim: &quot;Microsoft developed the first paravisor in the industry, and for years, we have been enhancing the paravisor offered to Azure customers. This effort now culminates in the release of a new, open source paravisor, called OpenHCL&quot; [@openhcl-blog].&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;paravisor&lt;/strong&gt; is a thin operating system running inside the trust boundary of a confidential VM, between the host hypervisor and the customer guest. The paravisor exposes the synthetic devices, the vTPM, and the GPA partitioning that a Windows or Linux guest expects from a Hyper-V environment -- without trusting any of those services to the host below the trust boundary. The paravisor is itself part of the TCB, but on Azure its source is open [@openhcl-blog; @openvmm-repo].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenHCL&lt;/strong&gt; is Microsoft&apos;s open-source paravisor, released on October 17, 2024. It is built on top of OpenVMM, &quot;a modular, cross-platform Virtual Machine Monitor (VMM), written in Rust&quot; [@openvmm-repo]. On Azure SEV-SNP CVMs OpenHCL runs at VMPL0; on TDX CVMs it runs in the L1 partition seat under TD Partitioning [@openhcl-blog; @openvmm-dev]. It mediates virtual devices, brokers the vTPM, manages GPA partitioning between private and shared pages, and handles diagnostics, all inside the trust boundary.&lt;/p&gt;

&lt;p&gt;The OpenVMM repository README puts the focus crisply: &quot;OpenVMM is a modular, cross-platform Virtual Machine Monitor (VMM), written in Rust. Although it can function as a traditional VMM, OpenVMM&apos;s development is currently focused on its role in the OpenHCL paravisor&quot; [@openvmm-repo]. The OpenVMM Guide lists the virtualisation APIs OpenVMM supports, including &quot;MSHV (using VSM / TDX / SEV-SNP)&quot; for paravisor mode, WHP for a Windows host, and KVM for a Linux host [@openvmm-dev]. The use cases listed include Azure Boost, Trusted Launch, and Confidential VMs.&lt;/p&gt;
&lt;p&gt;Because OpenHCL is in the TCB, customers do not avoid trusting Microsoft by running it -- but they can now &lt;em&gt;read the source&lt;/em&gt;. That is a categorical change from earlier closed paravisors. The point about a TCB is not its size but its auditability.&lt;/p&gt;
&lt;p&gt;The canonical Linux-side analogue is AMD&apos;s &lt;strong&gt;Secure VM Service Module (SVSM)&lt;/strong&gt;, which runs at VMPL0 inside an SEV-SNP guest and provides the same kind of in-trust-boundary services (virtual TPM, paravirtualised I/O brokering, attestation surface) that OpenHCL provides on Azure [@amd-svsm]. SVSM and OpenHCL solve the same problem with different implementations and different signing chains. The Linux community&apos;s reference SVSM is the COCONUT-SVSM open-source project [@coconut-svsm]. A reader who needs a confidential-VM paravisor on a non-Azure Linux host should look at SVSM; a reader who needs it on Azure gets OpenHCL.&lt;/p&gt;
&lt;h3&gt;The vTPM&lt;/h3&gt;
&lt;p&gt;Inside the paravisor&apos;s protected memory, OpenHCL synthesises a per-VM virtual TPM. Microsoft Learn is verbatim: &quot;Azure confidential VMs feature a virtual TPM (vTPM) for Azure VMs. ... Confidential VMs have their own dedicated vTPM instance, which runs in a secure environment outside the reach of any VM&quot; [@msdocs-azure-cvm]. What matters architecturally is the binding. The vTPM&apos;s endorsement key is bound at provision time to the SEV-SNP or TDX hardware attestation report, so a vTPM quote can be transitively chained back to silicon: &lt;code&gt;vTPM quote -&amp;gt; EK certificate -&amp;gt; SNP_REPORT or TD Quote -&amp;gt; VCEK or Intel signing root&lt;/code&gt; [@msdocs-azure-cvm].&lt;/p&gt;
&lt;p&gt;The practical consequence is that a Windows Server CVM runs an unmodified Trusted Boot chain inside the guest. PCR-7 still indexes the Secure Boot signer. Code Integrity policies still extend their own PCRs. BitLocker still seals the Volume Master Key to the TPM. None of those operating-system features need to know that the TPM they are talking to is synthesised by OpenHCL inside an SEV-SNP guest -- and yet every one of those features is now anchored, transitively, to AMD or Intel silicon rather than to a discrete TPM chip on a motherboard the cloud customer cannot inspect.&lt;/p&gt;
&lt;h3&gt;Microsoft Azure Attestation&lt;/h3&gt;
&lt;p&gt;The verifier in Azure&apos;s confidential-computing stack is Microsoft Azure Attestation. The Learn overview describes it: &quot;Microsoft Azure Attestation is a unified solution for remotely verifying the trustworthiness of a platform and integrity of the binaries running inside it. The service supports attestation of the platforms backed by Trusted Platform Modules (TPMs) alongside the ability to attest to the state of Trusted Execution Environments (TEEs) such as Intel Software Guard Extensions (SGX) enclaves, Virtualization-based Security (VBS) enclaves ... and Azure confidential VMs&quot; [@msdocs-maa-overview].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Microsoft Azure Attestation (MAA)&lt;/strong&gt; is Azure&apos;s unified verifier service for confidential platforms. MAA accepts evidence -- an SNP_REPORT or TD Quote, plus a vTPM quote, plus boot measurements -- evaluates it against a customer-defined attestation policy, and returns a signed JWT carrying the issued claims. MAA&apos;s role in the RATS architecture is the &lt;em&gt;verifier&lt;/em&gt;, in &lt;em&gt;Passport&lt;/em&gt; topology: the attester collects MAA&apos;s signed result and presents it to the relying party (Azure Key Vault) [@msdocs-maa-overview; @rfc9334].&lt;/p&gt;
&lt;p&gt;The SKR loop is documented verbatim. &quot;When a CVM boots up, SNP report containing the guest VM firmware measurements are sent to Azure Attestation. The service validates the measurements and issues an attestation token that is used to release keys from Managed-HSM or Azure Key Vault. These keys are used to decrypt the vTPM state of the guest VM, unlock the OS disk and start the CVM&quot; [@msdocs-maa-overview].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Secure Key Release (SKR)&lt;/strong&gt; is the Azure Key Vault / Managed HSM operation that releases a wrapped key only after the requesting party presents a valid Microsoft Azure Attestation token that satisfies the key&apos;s release policy. SKR is what closes the loop between rail 1 (memory protection) and rail 2 (attestation) at the customer&apos;s perimeter: a key never leaves the HSM unless the attesting CVM has been verified [@msdocs-maa-overview; @msdocs-azure-cvm].&lt;/p&gt;
&lt;h3&gt;MAA policy v1.2&lt;/h3&gt;
&lt;p&gt;The policy language is the operational surface customers actually interact with. The MAA policy v1.2 grammar has four segments, verbatim from the Microsoft Learn page: &quot;Policy version 1.2 has four segments: version, configurationrules, authorizationrules, issuancerules&quot; [@maa-policy-v12]. The critical operational distinction is between the last two. Authorization rules can fail attestation; issuance rules cannot. The docs are explicit: &quot;&lt;strong&gt;authorizationrules&lt;/strong&gt;: ... These rules can be used to fail attestation. &lt;strong&gt;issuancerules&lt;/strong&gt;: ... These rules can be used to add to the outgoing claim set and the response token. These rules can&apos;t be used to fail attestation&quot; [@maa-policy-v12].&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The most common bug in hand-authored MAA policies is writing a security gate as an issuance rule. If you want a missing SecureBoot value to &lt;em&gt;reject&lt;/em&gt; the attestation, the predicate must live in &lt;code&gt;authorizationrules&lt;/code&gt;. Putting it in &lt;code&gt;issuancerules&lt;/code&gt; only adds a claim to the resulting JWT; the relying party then has to enforce the gate. The verifier will mint the token either way [@maa-policy-v12].&lt;/p&gt;
&lt;/blockquote&gt;
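&lt;p&gt;The difference between the two segments is easy to see in a toy evaluator. This is a sketch under stated assumptions: &lt;code&gt;evaluatePolicy&lt;/code&gt; and the rule shapes are hypothetical and only mirror the behaviour quoted from the docs; this is not MAA policy syntax.&lt;/p&gt;

```javascript
// Toy evaluator. evaluatePolicy and the rule shapes are hypothetical and
// only mirror the two-segment behaviour described in the text.
function evaluatePolicy(policy, claims) {
  // Authorization rules are gates: any false predicate rejects attestation.
  for (const rule of policy.authorizationrules) {
    if (!rule(claims)) return { accepted: false, token: null };
  }
  // Issuance rules only add claims to the token; they cannot reject.
  const issued = {};
  for (const rule of policy.issuancerules) Object.assign(issued, rule(claims));
  return { accepted: true, token: issued };
}

const secureBootOn = (c) => c.secureBoot === true;

// The same predicate, placed in each of the two segments.
const asGate = { authorizationrules: [secureBootOn], issuancerules: [] };
const asClaim = {
  authorizationrules: [],
  issuancerules: [(c) => ({ secureBootEnabled: secureBootOn(c) })],
};

const evidence = { secureBoot: false };
console.log(evaluatePolicy(asGate, evidence));  // rejected, no token
console.log(evaluatePolicy(asClaim, evidence)); // token minted, claim is false
```

&lt;p&gt;With the predicate in the issuance segment, a token is still minted for a guest without Secure Boot, so the relying party has to inspect &lt;code&gt;secureBootEnabled&lt;/code&gt; itself.&lt;/p&gt;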
&lt;p&gt;The configuration-rule defaults give you sane behaviour out of the box: &lt;code&gt;require_valid_aik_cert&lt;/code&gt; defaults to &lt;code&gt;true&lt;/code&gt; and &lt;code&gt;required_pcr_mask&lt;/code&gt; defaults to &lt;code&gt;0xFFFFFF&lt;/code&gt; (the first twenty-four PCRs must appear in the quote) [@maa-policy-v12].&lt;/p&gt;
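&lt;p&gt;The arithmetic behind the default mask is worth spelling out. The sketch below assumes bit &lt;em&gt;i&lt;/em&gt; of the mask corresponds to PCR &lt;em&gt;i&lt;/em&gt;; the helper names &lt;code&gt;requiredPcrs&lt;/code&gt; and &lt;code&gt;missingPcrs&lt;/code&gt; are hypothetical and not part of any MAA API.&lt;/p&gt;

```javascript
// Sketch only. Assumption: bit i of the mask means "PCR i must be present".
// requiredPcrs and missingPcrs are hypothetical helper names.
function requiredPcrs(mask) {
  const indices = [];
  for (let i = 0; mask > 0; i += 1) {
    if (mask % 2 === 1) indices.push(i); // lowest bit set means PCR i is required
    mask = Math.floor(mask / 2);         // move to the next bit
  }
  return indices;
}

// Which required PCRs are absent from the quote?
function missingPcrs(mask, quotedPcrs) {
  const present = new Set(quotedPcrs);
  return requiredPcrs(mask).filter((i) => !present.has(i));
}

const defaults = requiredPcrs(0xFFFFFF);
console.log(defaults.length, defaults[0], defaults[defaults.length - 1]);
console.log(missingPcrs(0xFFFFFF, defaults.slice(0, 23)));
```

&lt;p&gt;A narrower mask relaxes the requirement: under the same assumption, a mask of &lt;code&gt;0x80&lt;/code&gt; would require only PCR 7.&lt;/p&gt;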
&lt;p&gt;Claim extraction uses JMESPath. The Learn page reproduces a Secure Boot detection rule that the verifier can use to flip a &lt;code&gt;secureBootEnabled&lt;/code&gt; claim:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Based on the Secure Boot detection rule on Microsoft Learn (MAA policy
// v1.2). This JS object walks the rule structure; it is not runnable
// MAA policy syntax.

const policyRule = {
  segment: &apos;issuancerules&apos;,
  // &quot;Claim rules&quot; use JMESPath queries against parsed event data.
  step1: {
    when: &apos;type == &quot;events&quot; &amp;amp;&amp;amp; issuer == &quot;AttestationService&quot;&apos;,
    add:  &apos;efiConfigVariables&apos;,
    via:  &quot;Events[?EventTypeString == &apos;EV_EFI_VARIABLE_DRIVER_CONFIG&apos; &quot; +
          &quot;&amp;amp;&amp;amp; ProcessedData.VariableGuid == &apos;8BE4DF61-93CA-11D2-AA0D-00E098032B8C&apos;]&quot;
  },
  // GUID 8BE4DF61-93CA-11D2-AA0D-00E098032B8C is the EFI Global Variable
  // namespace, which is where &apos;SecureBoot&apos; lives.
  step2: {
    issue: &apos;secureBootEnabled&apos;,
    via: &quot;[?ProcessedData.UnicodeName == &apos;SecureBoot&apos;] &quot; +
         &quot;| length(@) == 1 &amp;amp;&amp;amp; @[0].ProcessedData.VariableData == &apos;AQ&apos;&quot;
  },
  // &apos;AQ&apos; is base64(&apos;\x01&apos;), i.e. SecureBoot==1.
  fallback: { issue: &apos;secureBootEnabled&apos;, value: false }
};

console.log(&apos;Segment :&apos;, policyRule.segment);                 // issuancerules
console.log(&apos;Yields  :&apos;, &apos;secureBootEnabled claim in JWT&apos;);
console.log(&apos;Lesson  :&apos;, &apos;Gate on this claim in authorizationrules to actually fail attestation&apos;);
&lt;/code&gt;&lt;/pre&gt;

sequenceDiagram
    participant E as Evidence (SNP_REPORT + vTPM)
    participant C as configurationrules
    participant A as authorizationrules
    participant I as issuancerules
    participant J as Signed JWT
    E-&amp;gt;&amp;gt;C: parse + defaults -- (require_valid_aik_cert, PCR mask)
    C-&amp;gt;&amp;gt;A: typed claim set
    A--&amp;gt;&amp;gt;A: predicate checks
    alt All authorization rules pass
        A-&amp;gt;&amp;gt;I: continue
        I-&amp;gt;&amp;gt;J: mint claims (secureBootEnabled, x-ms-isolation-tee, ...)
        J--&amp;gt;&amp;gt;E: signed attestation token
    else Any authorization rule fails
        A--&amp;gt;&amp;gt;E: attestation rejected
    end
&lt;h3&gt;The two-axis privilege model: VMPL crossed with VTL&lt;/h3&gt;
&lt;p&gt;A common misconception is that a SEV-SNP CVM makes Virtualization-Based Security inside the guest redundant. The argument goes: &quot;the whole VM is in a TEE, so why do I still need a Secure Kernel?&quot; The answer is that VMPL and VTL are orthogonal axes.&lt;/p&gt;
&lt;p&gt;The VMPL axis is &lt;em&gt;cloud-operator threat model&lt;/em&gt;. VMPL0 (the OpenHCL paravisor) sees pages that the customer&apos;s kernel at VMPL2 does not, and the host hypervisor below VMPL0 sees none of the encrypted memory at all. VMPL keeps the operator out.&lt;/p&gt;
&lt;p&gt;The VTL axis is &lt;em&gt;intra-guest threat model&lt;/em&gt;. Inside the guest, VTL1 hosts the Secure Kernel, IUM (Isolated User Mode) trustlets like LSAIso for Credential Guard, and the HVCI code-integrity verifier. VTL0 hosts the normal Windows kernel and user mode. VTL keeps a kernel-mode attacker out of LSA secrets and credential blobs. Without VTL, the customer&apos;s own kernel can read its own LSAIso heap; without VMPL, the hypervisor can read the customer&apos;s RAM.&lt;/p&gt;
&lt;p&gt;VBS-inside-CVM is therefore not a duplication. It closes two different attack classes.&lt;/p&gt;

flowchart TB
    subgraph Host[&quot;Host below trust boundary&quot;]
        H[&quot;Hyper-V host kernel -- (no access to encrypted RAM)&quot;]
    end
    subgraph Boundary[&quot;Inside SEV-SNP / TDX trust boundary&quot;]
        subgraph V0[&quot;VMPL0 / L1 TD partition&quot;]
            P[&quot;OpenHCL paravisor -- (synthetic devices, vTPM)&quot;]
        end
        subgraph V2[&quot;VMPL2 / L2 TD partition (customer guest)&quot;]
            subgraph T1[&quot;VTL1 (Secure Kernel)&quot;]
                SK[&quot;Secure Kernel -- + IUM trustlets: -- LSAIso, Credential Guard&quot;]
            end
            subgraph T0[&quot;VTL0 (normal OS)&quot;]
                W[&quot;Windows Server kernel -- + user mode&quot;]
            end
        end
    end
    H -. &quot;blocked by VMPL + -- RMP / PAMT&quot; .-&amp;gt; P
    W -. &quot;blocked by VTL 1 -- VBS / HVCI&quot; .-&amp;gt; SK
    P --&amp;gt; V2
&lt;h3&gt;Confidential Containers: three Azure surfaces&lt;/h3&gt;
&lt;p&gt;Confidential VMs are not the only Azure surface where SEV-SNP attestation can land. Three more follow, along with the Azure Red Hat OpenShift variant.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Confidential Containers on Azure Container Instances (ACI), GA.&lt;/strong&gt; Microsoft Learn: &quot;Confidential containers on Azure Container Instances are deployed in a container group with a Hyper-V isolated TEE, which includes a memory encryption key generated and managed by an AMD SEV-SNP capable processor&quot; [@msdocs-aci-confidential]. ACI Confidential Containers use &lt;em&gt;confidential computing enforcement&lt;/em&gt; (CCE) policies generated by the &lt;code&gt;confcom&lt;/code&gt; Azure CLI extension, and they expose SNP attestation reports for the SKR sidecar pattern.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Confidential Containers on AKS, preview, sunsetting.&lt;/strong&gt; The Learn AKS page is explicit: &quot;The Confidential Containers preview is set to sunset in March 2026. After this date, customers with existing Confidential Container node pools should expect to see reduced functionality, and you won&apos;t be able to spin up any new nodes with the &lt;code&gt;KataCcIsolation&lt;/code&gt; runtime&quot; [@msdocs-aks-confidential-containers]. Microsoft routes customers to four alternatives: Confidential VM AKS node pools, ACI Confidential Containers, ARO Confidential Containers, and the upstream Confidential Containers project [@msdocs-aks-confidential-containers].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Confidential VM AKS worker nodes, GA.&lt;/strong&gt; A different model -- node-granularity CVM rather than per-pod CVM. Learn: &quot;AKS now supports confidential VM node pools with Azure confidential VMs. These confidential VMs are the generally available DCasv5 and ECasv5 confidential VM-series using 3rd Gen AMD EPYC processors with Secure Encrypted Virtualization-Secure Nested Paging (SEV-SNP) security features&quot; [@msdocs-aks-cvm-nodes]. This is a lift-and-shift path for existing AKS workloads.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Confidential Containers on ARO&lt;/strong&gt; is the Red Hat OpenShift equivalent, with Kata-isolated per-container SEV-SNP enforcement.&lt;/p&gt;
&lt;p&gt;The cross-cloud parallel is the CNCF Confidential Containers project, accepted to CNCF on March 8, 2022 at the Sandbox maturity level [@cncf-coco]. The project documentation describes it as &quot;an open source project that brings confidential computing to Cloud Native environments, using hardware technology to protect complex workloads&quot; [@coco-docs]. Trustee is the canonical attestation broker on the CNCF side. CoCo&apos;s substrate is Kata Containers&apos; MicroVM model; the TEE backing is currently Linux-only. The open-source community floor under all of this includes Edgeless&apos;s Constellation (historically the canonical confidential-Kubernetes distribution; the upstream repo was archived in 2025-2026 and Edgeless&apos;s successor project Contrast [@contrast] now carries the work forward at the workload-confidential-container layer rather than the whole-cluster layer) [@constellation], COCONUT-SVSM (the AMD-side reference SVSM running at VMPL0) [@coconut-svsm], and the CoCo Trustee attestation broker.&lt;/p&gt;
&lt;h3&gt;NVIDIA H100 CC on NCCadsH100v5&lt;/h3&gt;
&lt;p&gt;The Azure NCCadsH100v5 SKU pairs an SEV-SNP CVM with an NVIDIA H100 in Confidential Compute mode and links the two attestations together. CPU-side rail 1 is SEV-SNP. GPU-side rail 1 is H100 CC. Rail 2 must compose both: the relying party only releases the workload&apos;s key if both attestations check out. Cross-vendor attestation composition is one of the open standards problems §9 will revisit.&lt;/p&gt;
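&lt;p&gt;The rail-2 rule above can be sketched as a small gate. This is a minimal illustration, not Azure&apos;s or NVIDIA&apos;s implementation: the &lt;code&gt;Verdict&lt;/code&gt; type, the &lt;code&gt;release_key&lt;/code&gt; function, and the use of a shared nonce to tie the two attestations to one session are all assumptions introduced here.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """Outcome of one attestation check (hypothetical shape)."""
    source: str   # e.g. "sev-snp" or "h100-cc"
    valid: bool   # evidence verified against its vendor root and policy
    nonce: str    # freshness value echoed back in the evidence


def release_key(cpu: Verdict, gpu: Verdict, expected_nonce: str) -> bool:
    """Rail 2: release only if BOTH attestations pass for the same session."""
    both_valid = cpu.valid and gpu.valid
    # Assumption: a shared nonce is what links the two reports together.
    same_session = cpu.nonce == expected_nonce and gpu.nonce == expected_nonce
    return both_valid and same_session


cpu_ok = Verdict("sev-snp", True, "n-123")
gpu_ok = Verdict("h100-cc", True, "n-123")
gpu_bad = Verdict("h100-cc", False, "n-123")   # GPU evidence failed verification
gpu_stale = Verdict("h100-cc", True, "n-old")  # valid evidence, wrong session

print(release_key(cpu_ok, gpu_ok, "n-123"))
print(release_key(cpu_ok, gpu_bad, "n-123"))
print(release_key(cpu_ok, gpu_stale, "n-123"))
```

&lt;p&gt;The point of the sketch is the conjunction: a passing CPU report never compensates for a failing or stale GPU report, and vice versa.&lt;/p&gt;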

flowchart TB
    subgraph S[&quot;Silicon&quot;]
        AMD[&quot;AMD-SP firmware -- + SEV-SNP RMP&quot;]
        INTEL[&quot;Intel TDX Module -- (SEAM, SEAMRR)&quot;]
    end
    subgraph H[&quot;Host&quot;]
        HV[&quot;Azure Hyper-V -- (below trust boundary)&quot;]
    end
    subgraph P[&quot;Paravisor (in TCB)&quot;]
        OH[&quot;OpenHCL on OpenVMM -- VMPL0 / L1 TD seat&quot;]
        VT[&quot;vTPM synthesised -- by paravisor&quot;]
    end
    subgraph G[&quot;Customer guest&quot;]
        WS[&quot;Windows Server CVM -- (VTL0 + VTL1, VBS / HVCI)&quot;]
    end
    subgraph V[&quot;Verifier&quot;]
        MAA[&quot;Microsoft Azure Attestation -- (policy v1.2)&quot;]
    end
    subgraph R[&quot;Relying party&quot;]
        AKV[&quot;Azure Key Vault / -- Managed HSM (SKR)&quot;]
        APP[&quot;Customer application&quot;]
    end
    AMD --&amp;gt; HV
    INTEL --&amp;gt; HV
    HV --&amp;gt; OH
    OH --&amp;gt; VT
    OH --&amp;gt; WS
    WS -- &quot;SNP_REPORT -- or TD Quote -- + vTPM quote&quot; --&amp;gt; MAA
    MAA -- &quot;Signed JWT&quot; --&amp;gt; AKV
    AKV --&amp;gt; APP
&lt;p&gt;That is the Azure stack. But Azure is not the only design point -- Google and AWS chose different glue, and one of them is on a fundamentally different threat model. How do they compare?&lt;/p&gt;
&lt;h2&gt;7. Competing approaches&lt;/h2&gt;
&lt;p&gt;Three competitors share the design space with very different choices. Two are near-peers to Azure; one is a fundamentally different model that customers routinely confuse for the same product.&lt;/p&gt;
&lt;h3&gt;Google Cloud Confidential VMs&lt;/h3&gt;
&lt;p&gt;Google Cloud supports the same two CPU TEEs. The GCP Confidential VM docs are explicit: &quot;AMD Secure Encrypted Virtualization-Secure Nested Paging (SEV-SNP) expands on SEV, adding hardware-based security to help prevent malicious hypervisor-based attacks like data replay and memory remapping. Attestation reports can be requested at any time directly from the AMD Secure Processor&quot; [@gcp-cvm-overview]. And on the Intel side: &quot;Intel Trust Domain Extensions (TDX) creates an isolated trust domain (TD) within a VM, and uses hardware extensions for managing and encrypting memory&quot; [@gcp-cvm-overview].&lt;/p&gt;
&lt;p&gt;GCP&apos;s machine-type mapping is direct. AMD SEV / SEV-SNP runs on N2D and C3D; Intel TDX runs on C3 Confidential VMs. The Confidential Computing product hub lists &quot;Confidential VMs on the C3 machine series brings hardware-level protection to your AI models and data&quot; and &quot;Confidential VMs on the accelerator-optimized A3 machine series with NVIDIA H100 GPUs&quot; as the parallel GPU-CC product [@gcp-confidential-overview]. There is a Confidential Space product on top for multi-party analytics, plus Confidential GKE Nodes and Confidential Dataflow.&lt;/p&gt;
&lt;p&gt;The verifier-of-record is Google&apos;s own attestation service, with the guest&apos;s vTPM as the default trust root. Intel Trust Authority is supported as a plug-in alternative for TDX evidence.&lt;/p&gt;

The GCP Confidential VM docs make a claim Azure does not match: "AMD SEV machines that use the N2D and C3D machine types support live migration" [@gcp-cvm-overview]. Live migration of a confidential VM is genuinely hard: the encrypted state has to be re-keyed under the destination host's per-VM key without ever exposing the plaintext to either host, and on SEV-SNP the integrity-rail structures (RMP entries) would also have to be coherently re-established. AMD's SEV migration helper is the underlying mechanism. Azure does not currently expose live migration on its confidential VM SKUs. This is the most operationally consequential cross-cloud difference today.
&lt;p&gt;A small correction to a widely repeated framing. It is sometimes said that GCP&apos;s confidential offerings are &quot;also SEV-SNP.&quot; Per the GCP docs, GCP supports &lt;strong&gt;both&lt;/strong&gt; SEV-SNP and TDX [@gcp-cvm-overview]. If you are picking a CVM cloud for a multi-vendor strategy, treat GCP as a near-peer to Azure on the CPU dimension and differentiate on the verifier, the SKU mapping, and the live-migration story instead.&lt;/p&gt;
&lt;h3&gt;AWS Nitro Enclaves: a genuinely different model&lt;/h3&gt;
&lt;p&gt;The most common confusion in this design space is the assumption that AWS Nitro Enclaves is &quot;AWS&apos;s confidential VM product.&quot; It is not. It is a different model on a different threat boundary.&lt;/p&gt;
&lt;p&gt;The Nitro Enclaves user guide is unambiguous about the threat model. &quot;AWS Nitro Enclaves is an Amazon EC2 feature that allows you to create isolated execution environments ... Enclaves are separate, hardened, and highly-constrained virtual machines. They provide only secure local socket connectivity with their parent instance. They have no persistent storage, interactive access, or external networking&quot; [@aws-nitro-enclaves]. The same page continues: &quot;Nitro Enclaves is processor agnostic and it is supported on most Intel, AMD, and AWS Graviton-based Amazon EC2 instance types built on the AWS Nitro System&quot; [@aws-nitro-enclaves]. And: &quot;Nitro Enclaves use the same Nitro Hypervisor technology that provides CPU and memory isolation for Amazon EC2 instances&quot; [@aws-nitro-enclaves].&lt;/p&gt;
&lt;p&gt;Three differences matter.&lt;/p&gt;
&lt;p&gt;First, there is no CPU memory cipher. Isolation is enforced by the Nitro hypervisor on a dedicated Nitro System card, not by SEV-SNP or TDX. Memory is in the clear in DRAM, just architecturally walled off by the hypervisor and the hardware root of trust below it.&lt;/p&gt;
&lt;p&gt;Second, attestation signs through the Nitro hypervisor and integrates with AWS KMS. There is no VCEK or TDX Quoting Enclave.&lt;/p&gt;
&lt;p&gt;Third, the threat model is parent-instance and co-tenant isolation, not cloud-operator isolation. Amazon is in the TCB by design. A subpoena or a compromised AWS operator is within the threat model of Azure / GCP CVMs and outside the threat model of Nitro Enclaves.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If your threat model includes a malicious or compelled cloud operator, AWS Nitro Enclaves does not protect you. The Nitro hypervisor enforces the enclave boundary; it is software AWS owns and operates. Use Nitro Enclaves for what it is good at -- a hardened compartment for key material against your own parent instance and your own application bugs. Use SEV-SNP / TDX on Azure or GCP if you need cryptographic protection against the operator&apos;s hypervisor [@aws-nitro-enclaves].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Nitro Enclaves still has a role: it is excellent at isolating a long-lived signing service from a more loosely audited application instance, and four enclaves per parent EC2 instance is a generous concurrency budget for that pattern.&lt;/p&gt;
&lt;h3&gt;Confidential Containers and NVIDIA H100 CC&lt;/h3&gt;
&lt;p&gt;The Confidential Containers project crosses cloud boundaries. CNCF accepted it in March 2022 [@cncf-coco]. The project docs describe it as &quot;an open source project that brings confidential computing to Cloud Native environments, using hardware technology to protect complex workloads&quot; [@coco-docs]. The Azure surfaces (ACI, AKS, ARO) were covered in §6; the equivalent on AWS is the Kata Containers + Confidential Containers combination on top of bare-metal Nitro hosts, and on GCP it lands on Confidential GKE Nodes.&lt;/p&gt;
&lt;p&gt;The NVIDIA H100 CC story is roughly cross-cloud parity. Azure NCCadsH100v5 pairs SEV-SNP with H100 CC; Google&apos;s A3 series pairs Intel TDX with H100 CC. Cross-vendor attestation composition is the open standards problem on which the relying party experience still depends. On the silicon side, ARM&apos;s Confidential Compute Architecture (CCA, with the Realm Management Extension) is the ARM-side analogue of SEV-SNP/TDX, and Apple&apos;s Secure Enclave Processor is a board-scoped TEE with a different form factor; both are adjacent TEE designs (VM-scoped and board-scoped respectively) but out of scope for the cloud-CVM body of this article.&lt;/p&gt;
&lt;h3&gt;The head-to-head matrix&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Azure CVM&lt;/th&gt;
&lt;th&gt;GCP CVM&lt;/th&gt;
&lt;th&gt;AWS Nitro Enclaves&lt;/th&gt;
&lt;th&gt;Confidential Containers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;&lt;tr&gt;
&lt;td&gt;CPU TEE&lt;/td&gt;
&lt;td&gt;SEV-SNP, Intel TDX&lt;/td&gt;
&lt;td&gt;SEV / SEV-SNP, Intel TDX&lt;/td&gt;
&lt;td&gt;None (Nitro hypervisor)&lt;/td&gt;
&lt;td&gt;SEV-SNP, TDX (varies by host)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory cipher&lt;/td&gt;
&lt;td&gt;AES (per-VM, per-TD)&lt;/td&gt;
&lt;td&gt;AES (per-VM, per-TD)&lt;/td&gt;
&lt;td&gt;None (host RAM)&lt;/td&gt;
&lt;td&gt;Inherited from host TEE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Integrity rail&lt;/td&gt;
&lt;td&gt;RMP (AMD), PAMT (Intel)&lt;/td&gt;
&lt;td&gt;RMP, PAMT&lt;/td&gt;
&lt;td&gt;Nitro hypervisor isolation&lt;/td&gt;
&lt;td&gt;Inherited from host TEE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Attestation evidence&lt;/td&gt;
&lt;td&gt;SNP_REPORT, TD Quote, vTPM quote&lt;/td&gt;
&lt;td&gt;SNP_REPORT, TD Quote, vTPM&lt;/td&gt;
&lt;td&gt;Nitro attestation document&lt;/td&gt;
&lt;td&gt;TEE evidence + container measurement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verifier&lt;/td&gt;
&lt;td&gt;Microsoft Azure Attestation&lt;/td&gt;
&lt;td&gt;Google attestation, Intel Trust Authority&lt;/td&gt;
&lt;td&gt;AWS KMS&lt;/td&gt;
&lt;td&gt;Trustee (CNCF)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operator threat model&lt;/td&gt;
&lt;td&gt;Yes (operator excluded)&lt;/td&gt;
&lt;td&gt;Yes (operator excluded)&lt;/td&gt;
&lt;td&gt;No (Nitro in TCB)&lt;/td&gt;
&lt;td&gt;Yes (operator excluded)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lift-and-shift Windows&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No (custom enclave format)&lt;/td&gt;
&lt;td&gt;Linux containers only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Live migration of CVM&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes (SEV on N2D / C3D)&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2024-era CVE exposure&lt;/td&gt;
&lt;td&gt;CacheWarp, WeSee, Heckler (SEV-SNP); Heckler (TDX)&lt;/td&gt;
&lt;td&gt;Same upstream CVEs&lt;/td&gt;
&lt;td&gt;Distinct (Nitro hypervisor)&lt;/td&gt;
&lt;td&gt;Inherited from host TEE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Granularity&lt;/td&gt;
&lt;td&gt;Whole VM, container&lt;/td&gt;
&lt;td&gt;Whole VM&lt;/td&gt;
&lt;td&gt;Per enclave (up to 4 per parent instance)&lt;/td&gt;
&lt;td&gt;Per pod / per container&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

flowchart LR
    Nitro[&quot;AWS Nitro Enclaves -- (parent-instance threat model)&quot;]
    Azure[&quot;Azure / GCP CVMs -- (cloud-operator threat model, -- whole VM)&quot;]
    CoCo[&quot;Confidential Containers -- (per pod / per container)&quot;]
    H100[&quot;NVIDIA H100 CC -- (CPU + GPU linked TEE)&quot;]
    Nitro --- Azure
    Azure --- CoCo
    CoCo --- H100
&lt;p&gt;If the contract is settled and the products ship, what is still wrong with this picture? Why do four published papers in 2024 demonstrate extracting secrets from a fully-patched SEV-SNP CVM?&lt;/p&gt;
&lt;h2&gt;8. Theoretical limits and the 2024 attack class&lt;/h2&gt;
&lt;p&gt;May 2, 2024. ETH Zurich&apos;s ZISC group publishes the Ahoi family of attacks. The lab&apos;s announcement is brisk: &quot;Researchers from the SECTRS group have now discovered a new class of attacks, dubbed Ahoi attacks, that exploit vulnerabilities in the notification framework in Intel TDX and AMD SEV-SNP. ... the vulnerabilities are tracked under 2 CVEs: CVE-2024-25744, CVE-2024-25743&quot; [@eth-ahoi-news]. A third CVE, CVE-2024-25742, covers WeSee. WeSee won the Distinguished Paper Award at IEEE S&amp;amp;P 2024 [@ahoi-wesee]. Heckler appeared at USENIX Security 2024 [@heckler-usenix]. CISPA&apos;s CacheWarp, also at USENIX Security 2024, targets AMD SEV-ES and SEV-SNP [@cachewarp-usenix].&lt;/p&gt;
&lt;p&gt;Four 2024-era papers attacking shipping confidential VMs, and a key observation: none of them broke the Generation-2 integrity rail itself. They all exploit seams &lt;em&gt;around&lt;/em&gt; it.&lt;/p&gt;
&lt;h3&gt;Trusted Computing Base accounting&lt;/h3&gt;
&lt;p&gt;The irreducible silicon-vendor trust root is non-zero by design. On SEV-SNP the customer must trust AMD-SP firmware and the ECDSA-P384 VCEK chain rooted at AMD. On TDX the customer must trust the signed TDX Module binary and the SGX-resident Quoting Enclave&apos;s signing root rooted at Intel. On Azure the customer additionally trusts Microsoft&apos;s signed OpenHCL binary -- with the consolation that OpenHCL is open source and reviewable [@openhcl-blog; @openvmm-repo]. The verifier (MAA, Intel Trust Authority, Google&apos;s verifier) is a separate trust component the relying party must extend.&lt;/p&gt;

Trusted Computing Base (TCB): the set of hardware, firmware, and software components whose correct operation is necessary for a system to enforce its security properties. For an Azure SEV-SNP CVM the TCB is the AMD silicon, the AMD-SP firmware, the OpenHCL paravisor binary, and Microsoft Azure Attestation acting as the verifier. The TCB cannot be empty; the goal is to make it small, auditable, and named [@amd-snp-whitepaper; @openhcl-blog].
&lt;p&gt;The lower bound on TCB is at least one signing root the customer cannot independently rebuild from public artefacts. Reproducible-build transparency over the AMD-SP firmware and the Intel TDX Module is one of the open standards problems on the 2026 frontier. The Google-Intel joint TDX security review from April 2023 is the best public substitute for a reproducible build of the TDX Module today [@gcp-tdx-review].&lt;/p&gt;
&lt;h3&gt;The 2024 attack class, in order of architectural depth&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;CacheWarp (USENIX Security 2024; CVE-2023-20592; AMD-SB-3005).&lt;/strong&gt; A software fault injection. The mechanism, in NVD&apos;s verbatim language: &quot;Improper or unexpected behavior of the INVD instruction in some AMD CPUs may allow an attacker with a malicious hypervisor to affect cache line write-back behavior of the CPU leading to a potential loss of guest virtual machine (VM) memory integrity&quot; [@nvd-cve-2023-20592]. The project page is plain: &quot;CacheWarp is a new software fault attack on AMD SEV-ES and SEV-SNP. It allows attackers to hijack control flow, break into encrypted VMs, and perform privilege escalation inside the VM&quot; [@cachewarp-site]. The CacheWarp authors -- Ruiyi Zhang, Lukas Gerlach, Daniel Weber, Lorenz Hetterich (CISPA), Youheng Lü (Independent), Andreas Kogler (Graz), Michael Schwarz (CISPA) -- demonstrated full RSA key recovery from Intel IPP, passwordless OpenSSH login, and &lt;code&gt;sudo&lt;/code&gt;-to-&lt;code&gt;root&lt;/code&gt; escalation [@cachewarp-usenix]. SEV-SNP is affected; the fix is the AMD microcode update tracked by AMD-SB-3005 [@amd-sb-3005].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;WeSee (IEEE S&amp;amp;P 2024 Distinguished Paper; CVE-2024-25742).&lt;/strong&gt; A malicious &lt;code&gt;#VC&lt;/code&gt; injection. The hypervisor coerces the guest&apos;s &lt;code&gt;#VC&lt;/code&gt; handler into doing the wrong thing by injecting a &lt;code&gt;#VC&lt;/code&gt; at a moment the guest does not expect one. The arXiv abstract is verbatim: &quot;We present WeSee attack, where the hypervisor injects malicious #VC into a victim VM&apos;s CPU to compromise the security guarantees of AMD SEV-SNP. ... WeSee can leak sensitive VM information (kTLS keys for NGINX), corrupt kernel data (firewall rules), and inject arbitrary code (launch a root shell from the kernel space)&quot; [@wesee-arxiv]. SEV-SNP only. The arXiv &lt;code&gt;citation_author&lt;/code&gt; metadata for 2404.03526 lists the WeSee co-authors as Schlueter, Sridhara, Bertschi, and Shinde [@wesee-arxiv]. Some earlier writeups listed the third co-author as &quot;Wilke,&quot; an inadvertent crossover from the SEVurity author list. The canonical author list names Andrin Bertschi (ETH Zurich), which matches the project page on &lt;code&gt;ahoi-attacks.github.io/wesee/&lt;/code&gt; [@ahoi-wesee].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Heckler (USENIX Security 2024; CVE-2024-25743, CVE-2024-25744).&lt;/strong&gt; A malicious non-timer interrupt injection. The hypervisor injects &lt;code&gt;int 0x80&lt;/code&gt; or a signal-mapped exception into the guest at a moment that breaks an invariant. The Ahoi Heckler page captures the scope: &quot;All Intel TDX and AMD SEV-SNP processors are vulnerable to Heckler&quot; [@ahoi-heckler]. The arXiv extended version demonstrates &quot;Heckler on OpenSSH and sudo to bypass authentication. On AMD SEV-SNP we break execution integrity of C, Java, and Julia applications that perform statistical and text analysis&quot; [@heckler-arxiv]. Mitigations are kernel-side interrupt filtering plus AMD&apos;s protected interrupt delivery feature.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Ahoi Attacks (umbrella).&lt;/strong&gt; The family page describes scope: &quot;Ahoi Attacks is a family of attacks on Hardware-based Trusted Execution Environments (TEEs) to break AMD SEV-SNP, Intel TDX and Intel SGX&quot; [@ahoi-site]. The ZISC news framing names the SECTRS group at ETH Zurich (Shweta Shinde&apos;s lab) as the locus [@eth-ahoi-news].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;One Glitch to Rule Them All (CCS 2021).&lt;/strong&gt; The physical-fault lower bound established in §3, included here for completeness. Buhren et al. voltage-glitched the AMD-SP on Zen 1 / 2 / 3 to execute custom payloads and to &quot;reverse-engineer the Versioned Chip Endorsement Key (VCEK) mechanism introduced with SEV Secure Nested Paging (SEV-SNP)&quot; [@one-glitch-arxiv]. Supplemental tooling is published in the PSPReverse GitHub artefact [@pspreverse-github]. With physical access and the right glitcher, the AMD-SP is breakable.&lt;/p&gt;

SEV cannot adequately protect confidential data in cloud environments from insider attackers, such as rogue administrators, on currently available CPUs. -- Buhren, Jacob, Krachenfels, Seifert, *One Glitch to Rule Them All*, 2021 [@one-glitch-arxiv]

flowchart TB
    INTG[&quot;Generation-2 integrity rail -- (RMP / PAMT)&quot;]
    INVD[&quot;CacheWarp -- CVE-2023-20592 -- INVD seam -- (SEV-ES, SEV-SNP)&quot;]
    VC[&quot;WeSee -- CVE-2024-25742 -- #VC handler seam -- (SEV-SNP)&quot;]
    INT[&quot;Heckler -- CVE-2024-25743/4 -- Interrupt-injection seam -- (SEV-SNP, TDX)&quot;]
    GLITCH[&quot;One Glitch -- Physical voltage-fault -- (AMD-SP firmware)&quot;]
    INTG -. &quot;intact&quot; .-&amp;gt; INVD
    INTG -. &quot;intact&quot; .-&amp;gt; VC
    INTG -. &quot;intact&quot; .-&amp;gt; INT
    INTG -. &quot;intact&quot; .-&amp;gt; GLITCH
&lt;h3&gt;Composition limits and operational corollaries&lt;/h3&gt;
&lt;p&gt;Can the verifier itself be a CVM? Can SKR survive a verifier compromise? These are open standards questions; the Confidential Computing Consortium is iterating on them and there is no settled answer. What there &lt;em&gt;is&lt;/em&gt; is operational guidance.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Every 2024-era SEV-SNP and TDX attack has a corresponding microcode or firmware update with a higher TCB SVN. Policies that accept &quot;any TCB SVN at or above the floor of last year&apos;s launch&quot; leave the door open to CacheWarp-class CPUs. Bind your MAA policy to &lt;code&gt;tcb_version &amp;gt;= latest_advisory&lt;/code&gt; and update the floor when AMD or Intel publishes a new security bulletin [@amd-sb-3005; @nvd-cve-2023-20592].&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Confidential VMs do not promise side-channel resistance. They promise that the hypervisor cannot &lt;em&gt;directly read&lt;/em&gt; memory and that an integrity-broken page cannot be silently substituted. The current equilibrium against the 2024 attack class is patch-after-disclosure plus attestation-policy hygiene. That equilibrium is itself an architectural statement.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key idea:&lt;/strong&gt; The 2024 attacks do not break the SEV-SNP or TDX integrity rail. They exploit seams &lt;em&gt;around&lt;/em&gt; the rail: the INVD instruction, the &lt;code&gt;#VC&lt;/code&gt; handler, the interrupt-injection path, and the physical AMD-SP. The architecture is settled. The residuals are the work.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So what is the 2026 research frontier actually working on?&lt;/p&gt;
&lt;h2&gt;9. Open problems&lt;/h2&gt;
&lt;p&gt;Six open problems shape the 2026 confidential-VM research frontier.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP1. Nested CVMs.&lt;/strong&gt; Intel TDX Module 1.5 ships TD Partitioning, where an L1 TD can host L2 TDs of its own [@intel-tdx-td-partitioning-354807]. AMD&apos;s analogue is the VMPL0 / VMPL2 layout that Azure OpenHCL already exploits. The portable cross-vendor formulation -- nested-CVM evidence that composes both vendors&apos; attestation reports into a single relying-party-checkable artefact -- is not yet standardised. Customers who want a verifier-inside-a-CVM design must build the composition themselves.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP2. Cross-vendor attestation composition for CPU+GPU CVMs.&lt;/strong&gt; Azure NCCadsH100v5 and GCP A3 already compose AMD or Intel CPU attestation with NVIDIA H100 GPU attestation in production. The relying party today consumes two separate evidence packages and runs two separate policy evaluations. The RATS working group&apos;s RFC 9711 (The Entity Attestation Token, EAT) [@rfc9711] is the canonical wire-format vocabulary: a JWT- or CWT-encoded attested claims set of the kind a Passport-topology verifier such as Microsoft Azure Attestation produces. EAT is the path to a single composed evidence package, but the cross-vendor standards work is unsettled.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP3. Transparency and reproducible builds of the AMD-SP firmware and the Intel TDX Module.&lt;/strong&gt; Both are signed binaries customers trust but do not build. Google&apos;s April 2023 joint security review of TDX, authored by Erdem Aktas, Cfir Cohen, Josh Eads (Google Cloud Security), James Forshaw, and Felix Wilhelm (Google Project Zero), enumerated specific vulnerabilities including &quot;Non-Persistent SEAM Loader, Exit Path Interrupt Hijacking, Unsafe Performance Monitoring VMCS Configuration&quot; [@gcp-tdx-review]. That review is the closest thing to public auditability the TDX Module has today. A reproducible build with a binary transparency log (Rekor-style) would close the residual auditability gap that even open-source OpenHCL leaves on the table for the silicon vendor&apos;s firmware.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP4. Post-quantum attestation signatures.&lt;/strong&gt; SNP_REPORT signs with ECDSA-P384. TD Quotes are Intel-signed with RSA / ECDSA. The NIST FIPS 204 (ML-DSA) and FIPS 205 (SLH-DSA) standards are final, but vendor-side migration of the CVM signing roots has not been announced for either AMD or Intel. The deployment-feasible path is dual-signing: the SNP_REPORT or TD Quote carries both an ECDSA signature and an ML-DSA signature, the verifier accepts either, and the relying party gates on whichever signing root it trusts most. The transition is non-trivial because the VCEK derivation itself uses a classical KDF chain rooted in classical entropy.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP5. Side-channel-resistant CVMs at deployment scale.&lt;/strong&gt; The CacheWarp, WeSee, Heckler, and Ahoi family is the &lt;em&gt;active&lt;/em&gt; frontier. The current operational equilibrium is policy-pinning to the latest TCB SVN plus microcode-update discipline. There is no production CVM architecture that promises constant-time execution across the integrity rail or that closes the cache-side and notification-injection seams at the silicon layer. The 2026 frontier is what &lt;em&gt;architectural&lt;/em&gt; mitigations look like, not what microcode patches catch up to.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;OP6. Confidential container portability after AKS KataCcIsolation sunset (March 2026).&lt;/strong&gt; The Azure CoCo surface fragments into ACI per-pod CVM, ARO per-container CVM, AKS Confidential VM node pools at node granularity, and the upstream CoCo project [@msdocs-aks-confidential-containers]. Customers picking a confidential-containers strategy today need to plan for one of those four routes; the CoCo project itself is Linux-only as of 2026-05. Windows confidential containers remain out of scope on every shipping cloud.&lt;/p&gt;

This article does not deep-cover Intel SGX (the sibling enclave article handles that), ARM Confidential Compute Architecture (CCA) or Apple&apos;s Secure Enclave Processor (different threat models and form factors), the full text of the TDX Module Architecture Specification (it is 285 pages [@intel-tdx-spec-344425]; this article cites the load-bearing parts), the regulatory and sovereign-cloud framing of CVMs (a separate topic), or the application-level patterns for designing a customer service to be SKR-aware (an operations topic for a future post).

flowchart LR
    OP1[&quot;OP1 -- Nested CVMs -- (TD Part. / VMPL)&quot;]
    OP2[&quot;OP2 -- Cross-vendor -- attestation composition&quot;]
    OP3[&quot;OP3 -- Firmware transparency -- + reproducible build&quot;]
    OP4[&quot;OP4 -- PQ signatures -- (ML-DSA / SLH-DSA)&quot;]
    OP5[&quot;OP5 -- Side-channel- -- resistant CVMs&quot;]
    OP6[&quot;OP6 -- CoCo portability -- (post-March-2026)&quot;]
    OP1 --- OP2
    OP3 --- OP4
    OP5 --- OP6
&lt;p&gt;If you are deploying today, what should you do this quarter? The next section is a practical walk-through that ties the architecture to a runnable workflow.&lt;/p&gt;
&lt;h2&gt;10. Practical guide: VBS-inside-CVM end-to-end&lt;/h2&gt;
&lt;p&gt;Six steps move you from a credit-card swipe to a Windows Server CVM that runs an attested workload with HSM-backed key release. Treat the list as a checklist; each step is a place where the architecture from the previous sections becomes operational.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 1. Provision the CVM.&lt;/strong&gt; Pick a SEV-SNP SKU (DCasv5 or DCasv6 preview), a supported Windows Server image (2019, 2022, or 2025), and turn on Confidential OS-disk encryption with a customer-managed key in Azure Key Vault or Managed HSM. Bind the key to an MAA-aware release policy. The Learn CVM overview describes the SKU family and the OS-image support [@msdocs-azure-cvm]. Plan for the March 30, 2026 encrypted-OS-disk pricing change [@msdocs-azure-cvm].&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Step 2. Confirm VBS inside the CVM.&lt;/strong&gt; A common misconception is that turning on SEV-SNP makes Virtualization-Based Security redundant. It does not -- VMPL and VTL are orthogonal. From an elevated PowerShell session:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; &lt;code&gt;Get-CimInstance -Namespace Root\Microsoft\Windows\DeviceGuard -ClassName Win32_DeviceGuard&lt;/code&gt; should return &lt;code&gt;VirtualizationBasedSecurityStatus = 2&lt;/code&gt; (running) and a non-empty &lt;code&gt;SecurityServicesRunning&lt;/code&gt; array that includes Credential Guard and HVCI. This proves that VTL1 / VTL0 separation is intact inside the SEV-SNP trust boundary -- the cloud operator is excluded by VMPL, and the customer&apos;s own user mode and ring-0 are excluded from the Secure Kernel by VTL.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Step 3. Capture an attestation token and walk it by hand.&lt;/strong&gt; Use the Azure Attestation client (&lt;code&gt;Microsoft.Azure.Attestation&lt;/code&gt;) to send the guest&apos;s SNP_REPORT and vTPM quote to the regional MAA endpoint. Inspect the returned JWT. The decoded claim set will include &lt;code&gt;x-ms-isolation-tee&lt;/code&gt; describing the TEE (SEV-SNP or TDX), &lt;code&gt;x-ms-runtime&lt;/code&gt; describing the guest configuration, the boot measurements, and any custom claims your policy mints. Verify the JWT signature against the region&apos;s MAA signing certificate -- not against an arbitrary trusted root; this is the verifier-identity hygiene that closes the SKR loop.&lt;/p&gt;

A valid MAA JWT will contain `x-ms-attestation-type = sevsnpvm` (or `tdxvm`) and an `x-ms-compliance-status = azure-compliant-cvm` claim. If either is missing or has a different value, the policy did not gate on the TEE and the relying party is about to release a key against unattested evidence.
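&lt;p&gt;Walking the token by hand can be sketched in a few lines of standard-library Python. This is only an inspection aid under stated assumptions: it decodes the payload without verifying the signature (that step still has to be done against the region&apos;s MAA signing certificate), it assumes the two claims may sit either inside &lt;code&gt;x-ms-isolation-tee&lt;/code&gt; or at the top level, and &lt;code&gt;_fake_token&lt;/code&gt; is a hypothetical helper that builds an unsigned demo token.&lt;/p&gt;

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    padding = "=" * (-len(payload_b64) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))


def find_claim(claims: dict, name: str):
    """Prefer the nested x-ms-isolation-tee object, then the top level."""
    nested = claims.get("x-ms-isolation-tee", {})
    return nested.get(name, claims.get(name))


def tee_claims_ok(claims: dict) -> bool:
    """Check the two claims the text says a valid token must carry."""
    tee_type = find_claim(claims, "x-ms-attestation-type")
    status = find_claim(claims, "x-ms-compliance-status")
    return tee_type in ("sevsnpvm", "tdxvm") and status == "azure-compliant-cvm"


def _fake_token(claims: dict) -> str:
    """Hypothetical helper: build an unsigned, illustrative token."""
    def seg(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).decode().rstrip("=")
    return ".".join([seg({"alg": "none"}), seg(claims), ""])


good = _fake_token({"x-ms-isolation-tee": {
    "x-ms-attestation-type": "sevsnpvm",
    "x-ms-compliance-status": "azure-compliant-cvm",
}})
missing = _fake_token({"x-ms-isolation-tee": {
    "x-ms-attestation-type": "sevsnpvm",
}})

print(tee_claims_ok(decode_jwt_payload(good)))
print(tee_claims_ok(decode_jwt_payload(missing)))
```

&lt;p&gt;A token that fails this check should never reach the key-release step; a token that passes it still needs its signature verified before any claim is trusted.&lt;/p&gt;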
&lt;p&gt;&lt;strong&gt;Step 4. Author the policy.&lt;/strong&gt; Write an MAA policy v1.2 file with four pieces. A configuration-rules block that keeps the defaults: &lt;code&gt;require_valid_aik_cert=true&lt;/code&gt; and &lt;code&gt;required_pcr_mask=0xFFFFFF&lt;/code&gt; [@maa-policy-v12]. An authorization-rules block that requires (a) &lt;code&gt;x-ms-attestation-type == &quot;sevsnpvm&quot;&lt;/code&gt;, (b) the SNP_REPORT measurement matches a known reference value for the customer&apos;s golden image, (c) the vTPM PCR-7 matches a known Secure Boot signer baseline, and (d) the VBS-enabled claim is &lt;code&gt;true&lt;/code&gt;. An issuance-rules block that mints a &lt;code&gt;customer-workload-tier&lt;/code&gt; claim from the SNP_REPORT&apos;s &lt;code&gt;tcb_version&lt;/code&gt;. And version &lt;code&gt;1.2&lt;/code&gt;. Bind your HSM key&apos;s release policy to require the issuance-rule claim plus the authorization-rule pass.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Use &lt;code&gt;az attestation policy set&lt;/code&gt; to upload the policy to a non-production attestation provider and replay captured evidence through &lt;code&gt;attestationProvider&lt;/code&gt; REST endpoints. This lets you iterate on JmesPath claim rules without rebooting CVMs. Pre-production failures here are cheap; failures after SKR binding are expensive [@maa-policy-v12].&lt;/p&gt;
&lt;/blockquote&gt;
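&lt;p&gt;Before writing the real policy file, it can help to state the four authorization conditions from Step 4 as plain predicates and run captured claims through them. The sketch below is Python, not MAA policy grammar. The claim names &lt;code&gt;snp_measurement&lt;/code&gt;, &lt;code&gt;pcr7&lt;/code&gt;, and &lt;code&gt;vbs_enabled&lt;/code&gt; are hypothetical stand-ins for whatever your decoded token actually exposes, and both reference values are placeholders.&lt;/p&gt;

```python
# Placeholder reference values; substitute digests from your own golden image.
GOLDEN_MEASUREMENT = "aa" * 48
SECURE_BOOT_PCR7 = "bb" * 32


def authorization_failures(claims: dict) -> list:
    """Return the labels of conditions (a)-(d) that the claims do not satisfy."""
    checks = {
        "a: attestation type": claims.get("x-ms-attestation-type") == "sevsnpvm",
        "b: launch measurement": claims.get("snp_measurement") == GOLDEN_MEASUREMENT,
        "c: PCR-7 baseline": claims.get("pcr7") == SECURE_BOOT_PCR7,
        "d: VBS enabled": claims.get("vbs_enabled") is True,
    }
    return [label for label, passed in checks.items() if not passed]


passing = {
    "x-ms-attestation-type": "sevsnpvm",
    "snp_measurement": GOLDEN_MEASUREMENT,
    "pcr7": SECURE_BOOT_PCR7,
    "vbs_enabled": True,
}
no_vbs = dict(passing, vbs_enabled=False)  # same evidence, VBS switched off

print(authorization_failures(passing))
print(authorization_failures(no_vbs))
```

&lt;p&gt;Returning the list of failed conditions, rather than a single boolean, makes replayed evidence easier to debug while the policy is still in a non-production provider.&lt;/p&gt;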
&lt;p&gt;&lt;strong&gt;Step 5. Repeat on a TDX SKU.&lt;/strong&gt; Provision a DCesv5 or DCesv6 (preview) CVM. The attestation evidence shape changes: TDX evidence carries &lt;code&gt;MRTD&lt;/code&gt; plus &lt;code&gt;RTMR0-3&lt;/code&gt; instead of a single SNP measurement, and the claims JSON shape differs. The JmesPath rules in your policy must be parameterised on &lt;code&gt;productId&lt;/code&gt; to handle both TEEs from one policy file, or split into two policy files keyed by attestation provider region and TEE type [@intel-tdx-overview; @maa-policy-v12].&lt;/p&gt;
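&lt;p&gt;The change in evidence shape can be sketched as a dispatch on TEE type: one launch measurement for SEV-SNP, &lt;code&gt;MRTD&lt;/code&gt; plus four RTMR values for TDX. The dictionary layout, the key names, and the reference digests below are assumptions for illustration; they do not mirror the real MAA claims JSON.&lt;/p&gt;

```python
# Placeholder reference values keyed by TEE type (not real digests).
REFERENCE = {
    "sevsnpvm": {"measurement": "aa" * 48},
    "tdxvm": {
        "mrtd": "cc" * 48,
        "rtmr": ["d0" * 48, "d1" * 48, "d2" * 48, "d3" * 48],  # RTMR0-3
    },
}


def matches_reference(tee: str, evidence: dict) -> bool:
    """Compare evidence against the reference values for its TEE type."""
    ref = REFERENCE.get(tee)
    if ref is None:
        return False  # unknown TEE types are rejected, never defaulted
    if tee == "sevsnpvm":
        return evidence.get("measurement") == ref["measurement"]
    # TDX: the build-time measurement and all four runtime registers must match.
    return evidence.get("mrtd") == ref["mrtd"] and evidence.get("rtmr") == ref["rtmr"]


snp_evidence = {"measurement": "aa" * 48}
tdx_evidence = {"mrtd": "cc" * 48, "rtmr": ["d0" * 48, "d1" * 48, "d2" * 48, "d3" * 48]}
tdx_drifted = dict(tdx_evidence, rtmr=["d0" * 48, "d1" * 48, "ff" * 48, "d3" * 48])

print(matches_reference("sevsnpvm", snp_evidence))
print(matches_reference("tdxvm", tdx_evidence))
print(matches_reference("tdxvm", tdx_drifted))
print(matches_reference("sevsnpvm", tdx_evidence))
```

&lt;p&gt;Whether this lives in one parameterised policy or two separate files, the property to preserve is the same: SEV-SNP evidence is never judged against TDX reference values, or the reverse.&lt;/p&gt;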
&lt;p&gt;&lt;strong&gt;Step 6. Plan TCB SVN hygiene.&lt;/strong&gt; Treat the TCB SVN floor in your policy as a moving target, not a one-time configuration. Subscribe to the AMD security bulletins and the Intel TDX security advisories. When CacheWarp&apos;s microcode shipped via AMD-SB-3005 [@amd-sb-3005], the appropriate operational response was to raise the policy&apos;s TCB SVN floor to the new microcode level, not to leave the floor at the launch baseline. This is the single most important operational habit a CVM customer can adopt.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; A policy that accepts the launch-baseline TCB SVN forever is a policy that grandfathers in every known CVE the silicon vendor has shipped a microcode patch for. The 2024 attack class makes this a load-bearing operational discipline, not a footnote [@nvd-cve-2023-20592; @amd-sb-3005].&lt;/p&gt;
&lt;/blockquote&gt;
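&lt;p&gt;The TCB SVN floor from Step 6 can be sketched as a component-wise comparison. The sketch assumes the SNP TCB version decomposes into bootloader, TEE, SNP firmware, and microcode SVNs; the floor numbers are placeholders, and the names &lt;code&gt;Tcb&lt;/code&gt;, &lt;code&gt;TCB_FLOOR&lt;/code&gt;, and &lt;code&gt;meets_floor&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import operator
from collections import namedtuple

# Assumed decomposition of the SNP TCB version into four SVN components.
Tcb = namedtuple("Tcb", ["bootloader", "tee", "snp", "microcode"])

# Placeholder floor. Raise it whenever a vendor bulletin ships a fix.
TCB_FLOOR = Tcb(bootloader=3, tee=0, snp=8, microcode=115)


def meets_floor(reported, floor=TCB_FLOOR):
    """True only if every component is at or above its floor value."""
    return all(map(operator.ge, reported, floor))


if __name__ == "__main__":
    print(meets_floor(Tcb(3, 0, 8, 115)))  # exactly at the floor
    print(meets_floor(Tcb(9, 9, 9, 114)))  # stale microcode only
```

&lt;p&gt;The comparison is deliberately component-wise rather than a single tuple comparison: with tuple ordering, a newer bootloader SVN would mask a stale microcode SVN, and stale microcode is exactly the case a post-bulletin floor is meant to reject.&lt;/p&gt;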
&lt;p&gt;You can build it today. The FAQ below answers the questions readers most often ask after they have built it.&lt;/p&gt;
&lt;h2&gt;11. FAQ and closing&lt;/h2&gt;


Architecturally, the host hypervisor cannot read your encrypted RAM and cannot silently remap pages without triggering an RMP or PAMT fault [@amd-sev-portal; @intel-tdx-overview]. Operationally, the verifier (Microsoft Azure Attestation) is run by Microsoft, the paravisor (OpenHCL) is built by Microsoft, and the silicon is signed by AMD or Intel. You must still trust those components. The TCB floor is the silicon vendor&apos;s signing root plus at least one verifier; you can shrink the *verifier* trust by using a third party (Intel Trust Authority for TDX, or your own deployment of an attestation broker), but you cannot shrink the silicon-vendor root [@msdocs-maa-overview].


No. VMPL (the SEV-SNP privilege axis) and VTL (the in-guest Virtualization-Based Security axis) are orthogonal -- VMPL gates the *operator*; VTL gates the *guest kernel*. See §6 for the full two-axis treatment; a Windows Server CVM should run with VBS, HVCI, and Credential Guard enabled inside the guest exactly as it would outside a CVM [@msdocs-azure-cvm].


No. The Nitro hypervisor enforces the enclave boundary in software AWS owns and operates; there is no CPU-level memory cipher, and the threat model is parent-instance isolation rather than cloud-operator isolation. See §7 for the three architectural differences and the operator-trustless callout [@aws-nitro-enclaves].


Yes, with limits. The attestation surface changes: the SNP_REPORT measurement (or MRTD plus RTMR extensions on TDX) now reflects your custom image. Your MAA policy must allowlist the new measurement values or use issuance-rule projection to bind to attributes you control. You cannot bypass the paravisor without abandoning the OpenHCL-mediated vTPM, which removes the chained vTPM-quote-to-silicon path most customers depend on [@msdocs-azure-cvm; @openhcl-blog].


Yes -- transitively, through the paravisor. See §6 for the full `vTPM quote -&amp;gt; EK certificate -&amp;gt; SNP_REPORT or TD Quote -&amp;gt; VCEK or Intel signing root` chain, and read it end-to-end before you accept a vTPM quote as silicon-bound [@msdocs-azure-cvm].


Node-granularity CVM versus per-pod CVM. Confidential VM AKS node pools put each worker node inside an SEV-SNP CVM; all pods on that node share the trust boundary [@msdocs-aks-cvm-nodes]. Confidential Containers on AKS used the `KataCcIsolation` runtime to put each pod inside its own SEV-SNP-backed Kata MicroVM; that preview is sunsetting in March 2026 [@msdocs-aks-confidential-containers]. Different SKUs, different runtimes, different sunset timelines. Pick node-granularity for lift-and-shift; pick per-pod when you need stricter blast-radius isolation between pods on the same hardware.


No. See §8 for the architectural finding (the Generation-2 integrity rail remains intact under all four 2024 papers; each attack exploits a seam *around* the rail) and §10 Step 6 for the TCB-SVN-pinning operational habit that translates the finding into deployment policy [@cachewarp-site; @ahoi-heckler; @amd-sb-3005].

&lt;p&gt;Imagine drawing the architecture from memory. Start at the bottom with AMD silicon plus the AMD-SP firmware, or Intel silicon plus the SEAM Range Register and the signed TDX Module. Above that, the Azure Hyper-V host -- below the trust boundary, blind to encrypted RAM. Above that, the OpenHCL paravisor at VMPL0 or the L1 TD seat, mediating synthetic devices and the vTPM. Above that, the Windows Server guest at VMPL2 or the L2 TD, still running VBS, HVCI, and Credential Guard inside. Then evidence flows up: SNP_REPORT or TD Quote plus vTPM quote into Microsoft Azure Attestation, which evaluates policy v1.2 against the evidence and emits a signed JWT, which Azure Key Vault checks before releasing the wrapped OS-disk key. If you can draw it on a napkin in two minutes, you have understood the article. If you can write the MAA policy that says exactly what you mean by &quot;this VM is one of mine,&quot; you can build with it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Key terms&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Reverse Map Table (RMP):&lt;/strong&gt; AMD SEV-SNP per-page metadata table enforcing GPA-to-HPA binding; mismatched mappings raise #NPF(rmpfault).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Virtual Machine Privilege Level (VMPL):&lt;/strong&gt; AMD SEV-SNP four-level privilege lattice; OpenHCL paravisor at VMPL0, customer kernel at VMPL2.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SNP_REPORT:&lt;/strong&gt; ECDSA-P384 signed attestation report from the AMD-SP, carrying measurement, policy, report_data, vmpl, chip_id, tcb_version.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Secure Arbitration Mode (SEAM):&lt;/strong&gt; Intel CPU privilege state in which the signed TDX Module executes, hosted in the SEAMRR memory range.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Intel TDX Module:&lt;/strong&gt; Signed Intel firmware running in SEAM that mediates entry, exit, and measurement for Trust Domains.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;MRTD:&lt;/strong&gt; Build-time TDX measurement of the initial TD image; SEAM analogue of an immutable launch PCR.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;RTMR0-3:&lt;/strong&gt; Runtime extendable measurement registers exposed by the TDX Module; SEAM analogue of the runtime-extension TPM PCRs. Canonical TDX-vTPM mapping: RTMR[0]&amp;lt;-&amp;gt;PCR[1,7], RTMR[1]&amp;lt;-&amp;gt;PCR[2-6], RTMR[2]&amp;lt;-&amp;gt;PCR[8-9], RTMR[3]&amp;lt;-&amp;gt;PCR[14,17-22].&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OpenHCL paravisor:&lt;/strong&gt; Microsoft&apos;s open-source Rust paravisor on OpenVMM, running inside the CVM trust boundary at VMPL0 or the L1 TD seat.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Microsoft Azure Attestation (MAA):&lt;/strong&gt; Azure&apos;s RATS verifier; evaluates customer policy v1.2 against SNP_REPORT or TD Quote plus vTPM evidence and returns a signed JWT.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Secure Key Release (SKR):&lt;/strong&gt; Azure Key Vault / Managed HSM operation gating wrapped-key release on a valid MAA attestation token.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Versioned Chip Endorsement Key (VCEK):&lt;/strong&gt; AMD per-chip per-TCB-version ECDSA-P384 signing key for SNP_REPORTs; certificate chain anchors to AMD root via the ASK.&lt;/li&gt;
&lt;/ul&gt;
</content:encoded><category>confidential-computing</category><category>sev-snp</category><category>intel-tdx</category><category>azure</category><category>attestation</category><category>paravisor</category><category>windows-security</category><category>tee</category><author>noreply@paragmali.com (Parag Mali)</author></item></channel></rss>