Q: Is Apple's source release the same as open source?

No. Per the apple/security-pcc README verbatim: "The publication of this code is intended for security research and verification purposes only" ^[28]. The publication's purpose is research-grade transparency -- so that an independent researcher can inspect what is running, exercise the architecture inside the Virtual Research Environment, and submit findings to the Apple Security Bounty program with rewards up to $1,000,000 ^[2]. It is not a typical open-source contribution model and the license and intended use are explicitly different. The substantive thing PCC ships is verifiable transparency of the running fleet, not community-driven development.

TL;DR

Apple and Microsoft now ship the same user-facing promise -- "the cloud cannot see your AI prompt" -- through completely different machinery. Apple's Private Cloud Compute (announced June 10, 2024 ^[1]; source release October 24, 2024 ^[2]) runs custom Apple-Silicon servers with a per-node Secure Enclave Processor and publishes every production image hash to a public, append-only Transparency Log that the user's device cryptographically refuses to bypass. Microsoft's Azure confidential AI substrate (NCCads_H100_v5, GA September 24, 2024 ^[3]) composes AMD SEV-SNP confidential VMs with NVIDIA H100 GPUs in CC-On mode, verifies the composed attestation through Microsoft Azure Attestation, and gates customer-managed keys through Secure Key Release from Azure Key Vault. On five of six architectural axes the two designs differ in degree. On the sixth -- verifiable transparency of the production fleet -- they differ in kind.

1. Same Promise, Opposite Architectures

On June 10, 2024, Apple announced Private Cloud Compute and promised that "personal user data sent to PCC isn't accessible to anyone other than the user -- not even to Apple" ^[1]. On September 24, 2024, Microsoft brought its first confidential GPU SKU to general availability. NVIDIA's companion blog called Azure "the first cloud provider to offer confidential computing with NVIDIA H100 GPUs" ^[4]. Microsoft's coordinated Trustworthy AI post framed the same architectural commitment: Microsoft itself cannot view or tamper with the data or the model inference process ^[3] ^[5]. Two vendors. The same user-facing contract. About three and a half months apart.

Open the lid on either one and the machinery is unrecognisable.

Apple PCC runs on custom Apple-Silicon servers, each with a Secure Enclave Processor wired into a vendor-controlled certificate chain. Every production node image hash is published to an append-only public log that the user's device cryptographically refuses to bypass ^[1] ^[6].

Azure's confidential-AI substrate runs on the Standard_NCC40ads_H100_v5 SKU: 40 non-multithreaded 4th-Gen AMD EPYC Genoa vCPUs, 320 GiB of RAM, one NVIDIA H100 NVL GPU with 94 GB of high-bandwidth memory, with the Trusted Execution Environment "spanning confidential VM on the CPU and attached GPU" ^[7]. Trust is rooted in AMD's per-chip signing key, Intel's TDX module on the alternative SKU family, NVIDIA's on-die hardware root of trust on the GPU, and a Microsoft-operated verifier service called Microsoft Azure Attestation ^[8]. None of those signers are Apple, and Apple's signer is none of them.

That is not a difference of brand preference. It is a difference about who you are trusting and how you can check.

This article is a side-by-side architectural treatment of the two designs. It will compare them on six axes you will be able to recite at the end:

Silicon control -- who controls the chip, the firmware, the OS, and the inference runtime.
Hardware root of trust -- which signing keys anchor the attestation chain.
Attestation surface -- what cryptographic artefact the relying party actually consumes.
Key release and state model -- whether the customer holds keys, and how those keys are released to the workload.
GPU TEE -- how confidential compute extends from the CPU into the GPU.
Network anonymization -- whether the operator can correlate requests with their originating client.

By the end you should be able to read a Microsoft Azure Attestation JSON Web Token and an Apple PCC attestation envelope at the same level of fluency, and explain to a non-specialist what each cryptographic artefact actually proves. You should be able to name the threat each architecture defends against, and the threats neither closes by construction.

When the user-facing promise is the same, the architectural divergence is the entire story. To understand what that divergence means, we first have to see where each architecture came from. The two designs did not converge on the same problem by coincidence. They descended from two different ancestor problems that took until 2024 to meet.

2. Confidential Computing's Two Parents

September 14, 2017. Mark Russinovich, Azure CTO, publishes "Introducing Azure confidential computing." Microsoft, he writes, is "the first cloud to offer new data security capabilities with a collection of features and services called Azure confidential computing," and the point of the announcement is "encryption of data while in use" ^[9]. Russinovich names "data in use" as the third protection state, the missing companion to "at rest" and "in transit." Five years later the Confidential Computing Consortium publishes "A Technical Analysis of Confidential Computing" v1.3, the vendor-neutral document both Apple and Microsoft now anchor on, which defines the field formally and gives the lower bounds explicitly ^[10] ^[11].

Russinovich's framing did not appear from nowhere. It was the cloud-operator-side voice of a conversation that had two parents in the underlying hardware.

Parent one: the hardware TEE lineage

A Trusted Execution Environment is a hardware-isolated execution context inside a system whose own host operating system or hypervisor is not trusted to look in. The lineage starts in the early 2000s with ARM TrustZone's split-world NS-bit, then Intel TXT (Trusted Execution Technology) for measured launch on the CPU side -- originally announced as LaGrande Technology at IDF 2003 and rebranded as TXT around 2007 with the vPro / Q35-Q45 chipset rollout. Apple shipped its first Secure Enclave Processor -- a separate Apple-designed processor core on the same SoC as the main application processor, with its own boot ROM, AES engine, and protected memory -- on the iPhone 5s in September 2013 ^[12].

Trusted Execution Environment (TEE)

A hardware-isolated execution context inside a larger system in which code can run with cryptographic guarantees of confidentiality and integrity even when the system's own operating system, hypervisor, or peripheral firmware is compromised or controlled by an adversary. TEEs include process-scope enclaves (Intel SGX), VM-scope confidential VMs (AMD SEV-SNP, Intel TDX), and on-die separate-processor designs (Apple Secure Enclave Processor, Microsoft Pluton).

Intel SGX (Software Guard Extensions) arrived as the first widely-available general-purpose TEE on commodity x86 silicon, with the architectural model first described in the McKeen et al. HASP 2013 paper ^[13] and given general availability on Skylake-era Core CPUs in late 2015. Costan and Devadas's "Intel SGX Explained" (IACR ePrint 2016/086) became the canonical academic systematization ^[14]. SGX let an application author carve out an enclave -- a slice of address space encrypted in DRAM by a per-CPU memory-encryption engine and measured at creation time -- and have a remote party verify, through an Intel-signed attestation report, that a specific code measurement was running before any secret was released to it.

Confidential Computing

Per the Confidential Computing Consortium: protection of data in use through computation in a hardware-based, attested Trusted Execution Environment. The CCC explicitly extends the protection state-pair (at rest, in transit) with a third state (in use) and treats hardware TEEs as the substrate that makes the third state cryptographically enforceable. The CCC v1.3 analysis is the vendor-neutral definitional document both Apple and Microsoft cite ^[10] ^[15].

Parent two: the cloud-operator-as-adversary lineage

The other parent was the cloud. Once enterprise workloads moved into public clouds, the cloud operator itself became part of the threat model. AMD published the first SEV ("Secure Encrypted Virtualization") API specification in April 2016, giving each VM its own memory-encryption key; silicon support shipped with the EPYC 7001 "Naples" family in June 2017. The SEV-ES specification followed in February 2017, adding encrypted register state on world switches. SEV-SNP (Secure Nested Paging), described in an AMD whitepaper in January 2020 ^[16], added integrity protection through the Reverse Map Table. Intel's parallel response was TDX (Trust Domain Extensions), specified in September 2020.

Both AMD and Intel framed the contribution the same way: protect the guest from a hypervisor that may itself be the adversary. That framing was exactly what Russinovich's 2017 post had been pointing at, three years earlier, on the cloud side ^[9].

Convergence

The two parents started speaking a common vocabulary in the early 2020s. The Confidential Computing Consortium was founded in August 2019 as a Linux Foundation project community, with members across silicon vendors (AMD, Intel, NVIDIA, ARM), cloud providers (Microsoft, Google, Oracle), and OS / runtime vendors (Red Hat, Canonical, IBM) ^[11].

In January 2023 the IETF RATS Working Group published RFC 9334, "Remote ATtestation procedureS (RATS) Architecture," giving the field a single vocabulary for the roles in an attestation flow. Four of them matter here: the Attester (the workload making the claim), the Verifier (the party that checks the cryptographic evidence), the Relying Party (the party that makes a decision based on the verified result), and the Endorser (the party that vouches for the Attester's identity, typically the silicon vendor) ^[17].

Both Apple PCC and Microsoft Azure Attestation map cleanly onto RFC 9334's vocabulary. They use the same words for the same roles. The architectures that fill those roles are different.

Diagram source

timeline
title TEE and confidential-computing milestones (2003-2024)
section Hardware TEE lineage
2003 : ARM TrustZone (mobile split-world)
2007 : Intel TXT / LaGrande (measured launch)
2013 : Apple Secure Enclave on iPhone 5s
2015 : Intel SGX general availability (Skylake)
2016 : Costan and Devadas SGX Explained
section Cloud operator as adversary
2016 : AMD SEV (memory encryption)
2017 : AMD SEV-ES (encrypted register state)
2017 : Azure CC introduced (Russinovich)
2020 : AMD SEV-SNP whitepaper (integrity via RMP)
2020 : Intel TDX specification
section Vocabulary and standards
2019 : Confidential Computing Consortium founded
2022 : CCC Technical Analysis v1.3
2023 : IETF RFC 9334 RATS Architecture
2024 : Apple PCC and Azure H100 CC-On GA

Two parents converging on a single vocabulary. The hardware-TEE lineage runs from ARM TrustZone and Apple Secure Enclave through Intel SGX. The cloud-operator-adversary lineage runs from AMD SEV through Intel TDX. Both converge in the CCC and IETF RATS standards in 2019-2023.

Apple's lineage is a third tributary the other two largely overlook. The iPhone Data Protection model, anchored in the SEP since 2013, and iCloud Private Relay's two-hop architecture from 2021 onward both fed into PCC. PCC is the only major-vendor confidential-AI substrate descended from a device-side TEE origin rather than a cloud-side one ^[12] ^[1].

Both parents converged on the same vocabulary by 2023. But the first attempts at putting that vocabulary into production hit walls neither parent had predicted -- starting with the 128 MB enclave that broke deep learning before it began.

3. Process Enclaves and the Operator-Honesty Assumption

August 2018, USENIX Security. Jo Van Bulck and nine co-authors publish "Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-of-Order Execution" ^[18]. The attack reads L1-cached enclave memory transiently and -- this is the load-bearing detail -- recovers the SGX EPID attestation-signing key for the targeted CPU generation. Once an attestation key leaks, every attestation that platform produces is forgeable to the attacker until microcode is updated and the EPID group is revoked. The whole "the enclave really is what it says it is" property collapses for that CPU generation overnight.

To understand what Foreshadow was attacking, it helps to walk SGX's enclave lifecycle. A privileged-mode application invokes ECREATE to reserve an enclave address range; pages are added with EADD and their contents measured with EEXTEND, each step extending a SHA-256 chain that becomes the enclave's MRENCLAVE measurement; EINIT finalises the chain and locks the enclave; EENTER is then the only legal entry point ^[13] ^[14]. When a remote party asks the enclave to prove its identity, the Quoting Enclave -- a small Intel-signed enclave on every SGX-enabled CPU -- signs a REPORT structure with the EPID key. The remote party verifies the EPID signature against the Intel Attestation Service and learns which code measurement the enclave is running.

Diagram source

sequenceDiagram
participant App as Untrusted app
participant CPU as SGX hardware
participant QE as Quoting Enclave
participant IAS as Intel Attestation Service
participant RP as Relying Party
App->>CPU: ECREATE (reserve enclave)
App->>CPU: EADD pages (measured into MRENCLAVE)
App->>CPU: EINIT (finalise measurement)
App->>CPU: EENTER (transfer control)
CPU->>QE: produce local REPORT
QE->>IAS: sign REPORT with EPID key
IAS->>RP: verify quote, return result
RP->>App: release secret if measurement matches

The Intel SGX attestation flow. ECREATE reserves enclave memory, EADD measures pages into MRENCLAVE, EINIT finalises the measurement, EENTER is the only legal entry point, and the Quoting Enclave signs the REPORT for remote verification.
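The build-time measurement described above behaves like a running hash that is extended at each step and then frozen. The sketch below is a simplified model under stated assumptions: the `ToyEnclave` class, the record tags, and the page contents are hypothetical, and the real MRENCLAVE computation hashes fixed-format records defined by Intel rather than these byte strings.

```python
import hashlib


class ToyEnclave:
    """Simplified model of an SGX-style measurement chain (not the real record layout)."""

    def __init__(self, size: int):
        # ECREATE step: start the running hash with the enclave size.
        self._hash = hashlib.sha256(b"ECREATE" + size.to_bytes(8, "little"))
        self.mrenclave = None  # unset until init() is called

    def add_page(self, offset: int, contents: bytes) -> None:
        if self.mrenclave is not None:
            raise RuntimeError("enclave is locked after EINIT")
        # Page-add step: fold the page's position and contents into the chain.
        self._hash.update(b"EADD" + offset.to_bytes(8, "little"))
        self._hash.update(contents)

    def init(self) -> bytes:
        # EINIT step: finalise the chain; no pages can be added afterwards.
        self.mrenclave = self._hash.digest()
        return self.mrenclave


def measure(pages):
    """Build a toy enclave from 4 KB-spaced pages and return its measurement."""
    enclave = ToyEnclave(size=4096 * len(pages))
    for i, page in enumerate(pages):
        enclave.add_page(i * 4096, page)
    return enclave.init()
```

Because the page offsets are hashed along with the contents, changing a single byte or reordering pages yields a different measurement, which is the property a relying party depends on when it compares the reported value with the one it expects.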

Secure Enclave Processor (SEP)

A dedicated secure subsystem integrated into Apple Silicon, isolated from the main application processor with its own boot ROM, AES Engine, and protected memory. The SEP runs an L4-derived microkernel and was first shipped on the iPhone 5s in 2013. It is not a TPM, not the NFC Secure Element used for Apple Pay, and not architecturally related to Intel SGX. It is the per-node hardware root of trust on every Apple Private Cloud Compute server ^[12] ^[1].

SGX scaled to a billion CPUs in three or four years, but it never scaled to deep learning. Three killer constraints stopped it.

Constraint one: the Enclave Page Cache ceiling. On Skylake-class client and Xeon E-2100 / E-2200 (Coffee Lake-based) server SKUs the Enclave Page Cache (EPC) was capped at 128 MB total per socket, of which only ~96 MB was usable for application data after Intel's bookkeeping overhead. That is orders of magnitude too small for any modern deep-learning workload, where a single set of weights for even a small model could easily exceed the EPC by a factor of 100 or more. (Skylake-SP and Cascade Lake-SP server Xeons did not ship SGX at all; SGX at server scale only arrived with Ice Lake-SP in 2021, by which point the cloud-AI story had moved past process-scope enclaves.)
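The size mismatch is easy to check with back-of-the-envelope arithmetic. The sketch below takes the ~96 MB usable figure from the paragraph above; the 7-billion-parameter model and the 2-bytes-per-parameter (16-bit) weight format are illustrative assumptions, not figures from the sources cited here.

```python
MIB = 1024 * 1024
usable_epc_bytes = 96 * MIB  # usable EPC figure quoted above


def weights_bytes(params: int, bytes_per_param: int) -> int:
    """Raw size of a model's weights, ignoring activations and runtime overhead."""
    return params * bytes_per_param


def epc_overcommit(params: int, bytes_per_param: int = 2) -> float:
    """How many times over the weights alone would exceed the usable EPC."""
    return weights_bytes(params, bytes_per_param) / usable_epc_bytes


if __name__ == "__main__":
    # Hypothetical 7B-parameter model stored as 16-bit weights.
    print(f"{epc_overcommit(7_000_000_000):.0f}x the usable EPC")
```

Under those assumptions the weights alone come to roughly 139 times the usable EPC, before counting activations or the runtime itself.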

Constraint two: the programming model. SGX required the application author to split the codebase into a trusted (in-enclave) and untrusted (outside-enclave) half, with explicit ECALL and OCALL transitions and a fixed serialised data interface across the trust boundary. Production codebases written before SGX existed simply refused to be partitioned that way. The handful of teams that tried -- mainly Intel internal proof-of-concepts -- produced systems that worked but did not generalise.

Constraint three: the side-channel cascade. Foreshadow / L1TF in August 2018 ^[18]; SgxPectre at IEEE EuroS&P 2019, demonstrating Spectre-v1-style transient-execution attacks inside SGX enclaves ^[19]; Plundervolt in IEEE S&P 2020, a software-based fault-injection attack via Intel's privileged voltage-control interface, assigned CVE-2019-11157 ^[20]. Each exposed a different residual surface that Intel's threat model had not named. The principled extension -- that any TEE on shared silicon inherits a microarchitectural side-channel surface that the architectural threat model does not cover -- became the field's unspoken second axiom.

SGX's attestation chain itself went through a generational turnover. The original EPID (Enhanced Privacy ID) scheme tied attestation verification to the Intel Attestation Service as a centralised relying party. By 2018 Intel had begun the transition to DCAP (Data Center Attestation Primitives), letting cloud operators host their own attestation infrastructure. The transition happened precisely because EPID-pinned-to-IAS was incompatible with how cloud providers wanted to verify attestations at fleet scale.

AMD's first-generation SEV and SEV-ES belong to the same era. They encrypted guest memory and (in SEV-ES) the saved register state on world switches, but they did not yet have the integrity check that would make a malicious hypervisor architecturally unable to mount remap-style attacks. That defence had to wait for SEV-SNP and a different failure that demonstrated, on the other side of the trust boundary, exactly the same lesson Foreshadow had taught on the Intel side.

Process-scope enclaves were the wrong granularity. The fix had to come from somewhere else. What if you encrypted whole virtual machines instead?

4. Three Architectural Waves That Made Cloud Confidential AI Feasible

EuroSec 2018. Mathias Morbitzer, Manuel Huber, Julian Horsch, and Sascha Wessel publish "SEVered: Subverting AMD's Virtual Machine Encryption" ^[21]. A malicious hypervisor remaps a guest's network-facing service to point at other guest physical pages; the service unwittingly serves the contents of those pages -- still inside the guest, still nominally encrypted at the memory controller -- as plaintext over the network. The encryption did not break. The attack did not need it to.

This is the architectural insight every Generation-3-and-later confidential VM design is built on.

Confidentiality without integrity is not isolation. A confidential VM that encrypts memory but leaves the mapping of guest pages under hypervisor control can be tricked into reading and then leaking its own protected contents on the operator's behalf. Every TEE design from 2020 onward is haunted by the SEVered failure.

Wave 1 (~2020-2022): VM-level TEEs with hardware-enforced page ownership

AMD's response was SEV-SNP and the Reverse Map Table (RMP): one entry per 4 KB physical page in the system, tracking ownership, validation state, and the permitted size class for that page. Guest pages transition from INVALID to VALIDATED only via a guest-initiated PVALIDATE instruction; subsequent hypervisor remap attempts that would violate the RMP fault out at the hardware level. Intel TDX took a parallel architectural path: a new CPU mode beneath the hypervisor called SEAM (Secure Arbitration Mode), running the Intel-signed TDX Module, with per-VM trust-domain encryption keys managed through MK-TME (Multi-Key Total Memory Encryption).

Reverse Map Table (RMP)

A hardware-managed table maintained by AMD SEV-SNP processors with one entry per 4 KB physical page in the system. Each entry records the page's owner (which guest, if any), its validation state (VALIDATED or not), and the permitted size class. The hypervisor cannot remap a guest-owned page into a different guest without triggering a fault. The RMP is AMD's architectural response to SEVered: it makes the SEVered class of attacks impossible by construction.
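The ownership rule the RMP enforces can be illustrated with a small state machine. This is a toy model under stated assumptions: `ToyRmp`, `RmpEntry`, and `RmpFault` are hypothetical names, size classes are omitted, and the real hardware checks happen during page-table walks rather than through method calls.

```python
from dataclasses import dataclass
from typing import Optional


class RmpFault(Exception):
    """Raised where real hardware would fault on an RMP violation."""


@dataclass
class RmpEntry:
    owner: Optional[str] = None  # None means the page is hypervisor-owned
    validated: bool = False


class ToyRmp:
    def __init__(self, num_pages: int):
        self.entries = [RmpEntry() for _ in range(num_pages)]

    def assign(self, page: int, guest: str) -> None:
        # Hypervisor hands over a page it still owns; the page starts unvalidated.
        entry = self.entries[page]
        if entry.owner is not None:
            raise RmpFault(f"page {page} is already owned by {entry.owner}")
        entry.owner, entry.validated = guest, False

    def pvalidate(self, page: int, guest: str) -> None:
        # Only the owning guest can move its page to the validated state.
        entry = self.entries[page]
        if entry.owner != guest:
            raise RmpFault(f"{guest} does not own page {page}")
        entry.validated = True

    def guest_access(self, page: int, guest: str) -> None:
        # Access succeeds only for the owner, and only after validation.
        entry = self.entries[page]
        if entry.owner != guest or not entry.validated:
            raise RmpFault(f"{guest} cannot access page {page}")
```

In this model a second `assign` on a guest-owned page fails outright, which is a simplification of the real behaviour but captures the point that the hypervisor can no longer silently repoint a validated guest page.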

Azure brought the SEV-SNP substrate to general availability in 2022 with the DCasv5 and ECasv5 confidential VM families (the a denotes AMD silicon, the s denotes premium storage) ^[15]. Intel TDX entered public preview on Azure in December 2023. Full general availability of the next-generation Intel TDX confidential VMs on 5th-Gen Intel Xeon Scalable Emerald Rapids -- the DCesv6, DCedsv6, ECesv6, and ECedsv6 families -- followed on February 26, 2026 ^[22] ^[23].

The earlier SEV and SEV-ES generations were not free of side channels either. Li, Zhang, Wang, Li, and Cheng's "CipherLeaks" (USENIX Security 2021) showed a deterministic-ciphertext side channel against SEV-ES: identical plaintext at the same physical address produced identical ciphertext, letting a hypervisor observe constant-time cryptographic implementations and recover keys without ever breaking the encryption ^[24]. SEV-SNP's tweakable ciphertext mode addressed this, but the architectural lesson -- that "the encryption is intact" is not the same as "the operator learns nothing" -- repeats.
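The deterministic-ciphertext property is simple to demonstrate. The sketch below uses a keyed hash as a stand-in for address-tweaked memory encryption; `toy_encrypt` is hypothetical and is not AMD's cipher, but it shares the one property that matters here: the same plaintext at the same physical address always produces the same ciphertext.

```python
import hashlib
import hmac


def toy_encrypt(key: bytes, phys_addr: int, block: bytes) -> bytes:
    """Deterministic, keyed stand-in for address-tweaked block encryption."""
    msg = phys_addr.to_bytes(8, "little") + block
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]


def ciphertext_changed(before: bytes, after: bytes) -> bool:
    """What an observer without the key can still learn: did the value change?"""
    return not hmac.compare_digest(before, after)


if __name__ == "__main__":
    key = b"secret-memory-key"  # never visible to the observer
    snap1 = toy_encrypt(key, 0x1000, b"A" * 16)
    snap2 = toy_encrypt(key, 0x1000, b"A" * 16)  # same value written again
    snap3 = toy_encrypt(key, 0x1000, b"B" * 16)  # different value, same address
    print(ciphertext_changed(snap1, snap2), ciphertext_changed(snap1, snap3))
```

An observer who snapshots ciphertext at a fixed address learns when a value repeats and when it changes without ever decrypting anything, which is the kind of signal the paragraph above describes.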

Wave 2 (~2022-2024): Attestation and key release as managed services

The second wave was less spectacular but more consequential for procurement. Microsoft Azure Attestation (MAA) is a managed verifier that consumes SEV-SNP attestation reports, TDX quotes, SGX quotes, VBS enclave reports, vTPM event logs, and Trusted Launch evidence and issues a JSON Web Token (JWT) with documented x-ms-isolation-tee, x-ms-compliance-status, x-ms-sevsnpvm-*, and x-ms-runtime claims ^[8]. Per the MAA overview verbatim: "Azure Attestation supports both platform- and guest-attestation of AMD SEV-SNP based Confidential VMs (CVMs)" ^[8]. The JWT can then drive Secure Key Release from Azure Key Vault Premium or Azure Managed HSM: the encrypted customer key carries a release policy against MAA-issued claims, and the HSM unwraps the key only when the policy is satisfied ^[15].

Microsoft Azure Attestation (MAA)

A managed Microsoft cloud service that acts as the Verifier (in the IETF RFC 9334 sense) for confidential workloads on Azure. MAA consumes hardware-vendor attestation evidence (SGX quotes, SEV-SNP attestation reports, Intel TDX quotes, vTPM event logs) and produces a signed JSON Web Token whose x-ms-* claims describe the attested TEE state. The JWT is the artefact that downstream relying parties -- including Azure Key Vault's Secure Key Release flow -- consume to decide whether to release a secret to the workload ^[8].

Secure Key Release (SKR)

An Azure Key Vault Premium and Azure Managed HSM capability that gates release of a wrapped key on a successful attestation. The customer attaches a release policy to the key at creation time; the policy is evaluated against the claims of an MAA-issued JWT presented at unwrap time. The key is released to the workload only when the MAA token's claims match the policy. SKR makes customer-managed key material a first-class architectural primitive for Azure confidential workloads ^[15] ^[8].
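The policy-against-claims check can be sketched in a few lines. Everything here is an illustration under stated assumptions: the flat policy format is a hypothetical simplification rather than Key Vault's actual policy grammar, the claim values are illustrative, and the sketch decodes the token without verifying its signature, which a real relying party must do first.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def make_unsigned_token(claims: dict) -> str:
    """Build a demo token with an empty signature segment (illustration only)."""
    header = b64url_encode(json.dumps({"alg": "none"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    return f"{header}.{payload}."


def jwt_claims(token: str) -> dict:
    """Decode the payload only. Signature verification is deliberately omitted."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


def lookup(claims: dict, dotted_path: str):
    """Resolve a dot-separated path such as 'outer.inner' in nested claims."""
    node = claims
    for part in dotted_path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def policy_allows(claims: dict, policy: list) -> bool:
    """Hypothetical flat policy: every listed claim must equal its required value."""
    return all(lookup(claims, rule["claim"]) == rule["equals"] for rule in policy)


# Illustrative claim values; check the service documentation for real ones.
demo_policy = [
    {"claim": "x-ms-isolation-tee.x-ms-compliance-status", "equals": "azure-compliant-cvm"},
]
```

A key holder would call `policy_allows(jwt_claims(token), demo_policy)` only after the token's signature and issuer have been validated; the key is unwrapped when the result is true.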

This is the implementation of what RFC 9334 calls the Passport Model: the Attester collects evidence once, hands it to the Verifier, gets back an Attestation Result (the MAA JWT), and then carries that Result to any Relying Party (the HSM, an external policy engine, an audit log) for the rest of the session ^[17].
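The flow can be sketched with three small functions, one per role. This is a minimal model under stated assumptions: the evidence fields and the function names are hypothetical, and an HMAC with a key shared between the verifier and the relying parties stands in for the asymmetric signature a real verifier would place on its token.

```python
import hashlib
import hmac
import json
from typing import Optional

VERIFIER_KEY = b"demo-only-shared-key"  # hypothetical; real tokens use asymmetric keys


def _tag(body: str) -> str:
    return hmac.new(VERIFIER_KEY, body.encode(), hashlib.sha256).hexdigest()


def appraise(evidence: dict, reference_measurement: str) -> Optional[dict]:
    """Verifier: check the evidence once and return a signed Attestation Result."""
    if evidence.get("measurement") != reference_measurement:
        return None
    body = json.dumps(
        {"tee": evidence.get("tee"), "measurement": evidence["measurement"]},
        sort_keys=True,
    )
    return {"body": body, "tag": _tag(body)}


def relying_party_accepts(result: Optional[dict], required_tee: str) -> bool:
    """Relying Party: trust the Result only if the verifier's tag checks out."""
    if result is None:
        return False
    if not hmac.compare_digest(_tag(result["body"]), result["tag"]):
        return False
    return json.loads(result["body"])["tee"] == required_tee


if __name__ == "__main__":
    evidence = {"tee": "sevsnpvm", "measurement": "abc123"}  # illustrative values
    result = appraise(evidence, reference_measurement="abc123")
    # The same Result is presented to two independent relying parties.
    print(relying_party_accepts(result, "sevsnpvm"), relying_party_accepts(result, "sevsnpvm"))
```

The point of the pattern is that the evidence is appraised once, while the signed Result is checked cheaply by each party that receives it.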

Wave 3 (June-October 2024): GPU TEEs, vendor-controlled fleets, and the public arrival of confidential AI

The third wave landed in five months in 2024 and changed what "confidential AI" could mean in production.

The NVIDIA Hopper H100 confidential-computing whitepaper (WP-11459-001) had landed in July 2023 ^[25], and the NVIDIA Developer Blog technical post that accompanied it described the architecture in detail: an on-die hardware root of trust, secure measured boot of the GPU firmware, an SPDM (Security Protocol and Data Model) session connecting the CPU TEE driver to the GPU with mutual authentication, and encrypted bounce-buffer data movement between CPU encrypted memory and GPU encrypted HBM ^[26]. The blog states the architectural fact verbatim: "The NVIDIA H100 Tensor Core GPU is the first ever GPU to introduce support for confidential computing" ^[26].

Apple announced Private Cloud Compute on June 10, 2024 at WWDC, with the canonical primary titled "Private Cloud Compute: A new frontier for AI privacy in the cloud" ^[1]. Microsoft Build 2024 (May 21, 2024) announced confidential inferencing not for GPT-4 but for the Azure OpenAI Whisper speech-to-text model ^[27].

Microsoft's NCCads_H100_v5 confidential GPU VM family -- 4th-Gen AMD EPYC Genoa CPU plus one NVIDIA H100 NVL GPU per VM, with the TEE spanning both ^[7] -- reached general availability on September 24, 2024 ^[3]. The companion Microsoft Trustworthy AI post made the same architectural commitment: customer data and models remain inaccessible to Microsoft itself ^[5] ^[3]. NVIDIA's parallel announcement underscored the same fact verbatim: "Azure is the first cloud provider to offer confidential computing with NVIDIA H100 GPUs" ^[4].

Then on October 24, 2024 Apple published the supporting source code at github.com/apple/security-pcc, shipped the Virtual Research Environment with macOS Sequoia 15.1 Developer Preview, and extended the Apple Security Bounty to PCC with rewards up to $1,000,000 ^[2] ^[28]. By end of October the substrate for cloud-scale confidential AI existed in two parallel forms. But "shipping" does not mean "settling on one architecture." Two distinct breakthroughs landed within five months of each other and took the substrate in opposite directions.

Diagram source

flowchart LR
A[Attacker controls hypervisor] -->|Remaps guest GPA tables| B[SEV guest network service]
B -->|Reads memory under remapped pages| C[Other guest memory still under encryption]
B -->|Serves bytes over network| D[Attacker collects plaintext]
style A fill:#fee,stroke:#c33,color:#7f1d1d
style D fill:#fee,stroke:#c33,color:#7f1d1d

The SEVered (EuroSec 2018) attack. A malicious hypervisor remaps a guest's network-facing service to point at memory pages belonging to another part of the same guest. The service unwittingly serves those pages over the network in plaintext. The encryption was never broken; it was bypassed.

5. Two Distinct 2024 Designs

June 10, 2024, WWDC. Apple Security Engineering and Architecture -- the institutional author block of the post, along with User Privacy, Core OS, Services Engineering, and Machine Learning and AI -- publishes "Private Cloud Compute: A new frontier for AI privacy in the cloud" ^[1]. The post enumerates five core requirements verbatim: stateless computation on personal user data, enforceable guarantees, no privileged runtime access, non-targetability, and verifiable transparency ^[1]. The fifth requirement is the one nothing in the field had ever shipped at this scale.

(a) Apple's Verifiable Transparency model

Every production PCC node software image hash is published to an append-only Transparency Log. Apple's canonical terminology is "Transparency Log" and "Release Transparency" -- both are reflected in the URL path of the Apple documentation page that defines the model ^[6] ^[29]. The user's device cryptographically refuses to forward a request to a node whose image hash is not in the log; in Apple's words, "your device won't issue requests to PCC unless the OS image running in PCC is logged for inspection" ^[1].

Transparency Log (Apple PCC)

An append-only public log of every production Private Cloud Compute node software image hash. The log is structured along the lines of RFC 6962 Certificate Transparency -- a Merkle tree of measurement entries that can be audited end-to-end without trusting any single party. Apple's canonical primary uses the terms "Transparency Log" and "Release Transparency"; "Verifiable Image Catalog" is not Apple terminology. The user's device refuses to forward a request to a PCC node whose image hash is not in the log, making the log a precondition for any data flow ^[1] ^[6].
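An inclusion check of the kind the definition describes can be sketched using the Merkle tree hashing from RFC 6962. This is a sketch under stated assumptions: it follows the RFC's leaf and node hashing and audit-path construction, but the entry bytes are placeholders and Apple's actual log format and client logic may differ.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly less than n (for n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def tree_hash(leaves) -> bytes:
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split_point(len(leaves))
    return node_hash(tree_hash(leaves[:k]), tree_hash(leaves[k:]))


def audit_path(index: int, leaves) -> list:
    """Sibling hashes from the leaf up to the root, as the log would serve them."""
    if len(leaves) == 1:
        return []
    k = split_point(len(leaves))
    if index < k:
        return audit_path(index, leaves[:k]) + [tree_hash(leaves[k:])]
    return audit_path(index - k, leaves[k:]) + [tree_hash(leaves[:k])]


def root_from_path(index: int, size: int, leaf: bytes, path: list) -> bytes:
    if size == 1:
        if path:
            raise ValueError("proof too long")
        return leaf
    if not path:
        raise ValueError("proof too short")
    k = split_point(size)
    if index < k:
        return node_hash(root_from_path(index, k, leaf, path[:-1]), path[-1])
    return node_hash(path[-1], root_from_path(index - k, size - k, leaf, path[:-1]))


def verify_inclusion(entry: bytes, index: int, size: int, path: list, root: bytes) -> bool:
    """Client side: recompute the root from the entry and its audit path."""
    if not 0 <= index < size:
        return False
    try:
        return root_from_path(index, size, leaf_hash(entry), path) == root
    except ValueError:
        return False
```

A client holding a trusted root needs only the entry, its index, the tree size, and a logarithmic number of sibling hashes to confirm inclusion; it never has to download the whole log.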

On October 24, 2024 Apple released the supporting source code at github.com/apple/security-pcc, shipped the Virtual Research Environment (VRE) with macOS Sequoia 15.1 Developer Preview to let researchers run the PCC software stack (including a virtual Secure Enclave Processor) inside a Mac, and extended the Apple Security Bounty to PCC with rewards up to $1,000,000 ^[2] ^[28]. The README on the source release states the scope plainly: "The publication of this code is intended for security research and verification purposes only" ^[28]. The components in the release include CloudAttestation (the attestation envelope library), Thimble (the on-device PCC client), splunkloggingd (the audited logging path), and srd_tools (security-research tooling).

Personal user data sent to PCC isn't accessible to anyone other than the user -- not even to Apple. -- Apple Security Engineering and Architecture, June 10, 2024 ^[1]

The network ingress path to PCC reinforces the non-targetability requirement. Client requests are routed through an Oblivious HTTP relay, operated by an independent third party rather than by Apple, that strips the client IP address before forwarding the request to the PCC cluster. OHTTP is standardised in IETF RFC 9458 by Martin Thomson and Christopher A. Wood, January 2024, with the explicit goal of letting "a client make multiple requests to an origin server without that server being able to link those requests to the client or to identify the requests as having come from the same client" ^[30].

Apple's Target Diffusion design layers an RSA Blind Signatures protocol -- RFC 9474 ^[31] -- on top of the OHTTP path to issue single-use credentials, so even the relay cannot link two requests as having come from the same client.
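The blinding idea behind such credentials can be shown with textbook RSA. This is a toy sketch only: the key is the classroom example (p = 61, q = 53), the function names are hypothetical, and RFC 9474 additionally specifies message encoding and key sizes that are omitted here, so none of this is usable as real cryptography.

```python
import hashlib
from math import gcd

# Classroom-sized RSA key; far too small for real use.
N, E, D = 3233, 17, 2753


def digest_to_int(message: bytes) -> int:
    """Reduce a message hash into the toy modulus (illustration only)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def blind(m: int, r: int) -> int:
    """Client: hide m from the signer using blinding factor r."""
    if gcd(r, N) != 1:
        raise ValueError("blinding factor must be invertible mod N")
    return (m * pow(r, E, N)) % N


def sign_blinded(blinded: int) -> int:
    """Signer: sign without learning the underlying message."""
    return pow(blinded, D, N)


def unblind(blind_sig: int, r: int) -> int:
    """Client: strip the blinding factor to obtain a signature on m."""
    return (blind_sig * pow(r, -1, N)) % N


def verify(m: int, sig: int) -> bool:
    return pow(sig, E, N) == m


if __name__ == "__main__":
    m = digest_to_int(b"single-use token")
    r = 7  # fixed for the demo; a real client picks a fresh random value
    sig = unblind(sign_blinded(blind(m, r)), r)
    print(verify(m, sig))
```

The unblinded value is an ordinary signature on `m`, yet the signer only ever saw `m` multiplied by a random-looking factor, so it cannot later match the credential to the issuance request.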

The OHTTP relay is third-party operated -- not Apple-operated. This is the architectural detail that makes non-targetability work. If Apple operated both the relay and the PCC cluster, Apple would observe the client IP at the relay and the request payload at the cluster and could correlate them. By splitting the two roles across two organizations whose business interests are not aligned, Apple can argue (and the architecture can enforce) that no single organization holds both halves of the correlation.
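The split of knowledge between relay and cluster can be modelled by recording what each party observes. This is a toy sketch under stated assumptions: `toy_seal` is a symmetric stand-in for the public-key encapsulation that OHTTP actually uses, the response path is left unencrypted for brevity, and the class names and IP addresses are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


def toy_seal(key: bytes, data: bytes) -> bytes:
    """XOR with a derived keystream; a stand-in for real encapsulation."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


toy_open = toy_seal  # XOR is its own inverse


@dataclass
class Gateway:
    key: bytes
    seen: list = field(default_factory=list)

    def handle(self, source_ip: str, sealed: bytes) -> bytes:
        request = toy_open(self.key, sealed)
        self.seen.append((source_ip, request))  # learns what was asked, not by whom
        return b"ok:" + request


@dataclass
class Relay:
    ip: str
    seen: list = field(default_factory=list)

    def forward(self, client_ip: str, sealed: bytes, gateway: Gateway) -> bytes:
        self.seen.append((client_ip, sealed))  # learns who asked, not what
        return gateway.handle(self.ip, sealed)


def client_send(client_ip: str, request: bytes, relay: Relay, gateway: Gateway) -> bytes:
    # The shared key stands in for encrypting to the gateway's public key.
    return relay.forward(client_ip, toy_seal(gateway.key, request), gateway)
```

After one request, the relay's log holds the client address alongside opaque bytes, and the gateway's log holds the plaintext alongside the relay's address; correlating the two requires both organizations to cooperate.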

Diagram source

sequenceDiagram
participant Dev as User device
participant Log as Transparency Log
participant Relay as OHTTP relay (third party)
participant Node as PCC node (SEP-rooted)
Dev->>Log: fetch current log root
Log-->>Dev: signed root, inclusion proofs
Dev->>Dev: verify target image hash is in log
Dev->>Relay: encrypted request (no client IP at origin)
Relay->>Node: forwarded request (relay IP only)
Node->>Node: enforce stateless processing
Node-->>Relay: response, SEP-signed attestation envelope
Relay-->>Dev: response delivered
Dev->>Dev: verify SEP attestation matches logged image

Apple Private Cloud Compute request flow. The user's device fetches the current Transparency Log root, verifies that the target node image hash is included, and mints a single-use credential via RSA Blind Signatures. It then sends the request through a third-party OHTTP relay that strips the client IP. The PCC node returns a SEP-signed attestation envelope, and the device verifies log inclusion before accepting the response.

(b) Microsoft and NVIDIA's cross-vendor CPU+GPU TEE composition

The other 2024 breakthrough was a composition. The Standard_NCC40ads_H100_v5 SKU is a confidential VM whose Trusted Execution Environment "spans confidential VM on the CPU and attached GPU, enabling secure offload of data, models, and computation to the GPU" ^[7]. The substrate is an AMD SEV-SNP confidential VM on a 4th-Gen AMD EPYC Genoa CPU. The accelerator is an NVIDIA H100 NVL GPU with 94 GB of high-bandwidth memory, operating in CC-On mode ^[7] ^[26].

The H100 in CC-On mode performs secure measured boot of its firmware against an on-die hardware root of trust, then establishes mutually-authenticated SPDM (Security Protocol and Data Model) sessions with the CPU TEE driver, and routes all data movement between CPU encrypted memory and GPU encrypted HBM through an encrypted bounce buffer. The NVIDIA Developer Blog states it verbatim: "a chain of trust is established through ... a security protocols and data models (SPDM) session to securely connect to the driver in a CPU TEE" ^[26]. The GPU's attestation report is signed against NVIDIA's on-die root of trust and consumable through NVIDIA's NRAS (NVIDIA Remote Attestation Service) and the open-source nvtrust SDK ^[32].

Oblivious HTTP (OHTTP, RFC 9458)

An IETF protocol for forwarding HTTP requests through an intermediary in a way that prevents either the intermediary or the target from linking requests to a single client. Per RFC 9458 verbatim: "Oblivious HTTP allows a client to make multiple requests to an origin server without that server being able to link those requests to the client or to identify the requests as having come from the same client, while placing only limited trust in the nodes used to forward the messages" ^[30]. Apple Private Cloud Compute uses an OHTTP relay operated by an independent third party to enforce non-targetability.

The CPU-to-GPU interconnect throughput in H100 CC-On is bounded by CPU encryption performance, not by raw PCIe or NVLink bandwidth. The NVIDIA Developer Blog measures it verbatim: "It is limited by CPU encryption performance, which we currently measure at roughly 4 GBytes/sec" ^[26]. Practitioners sizing throughput around H100 NVL's 94 GB HBM3 capacity should reason about the ~4 GB/s encryption ceiling, not the headline NVLink rate. Because the ceiling applies only to data crossing the CPU-GPU boundary, large-model long-sequence workloads amortise the overhead well, while small-model short-prompt workloads pay a higher relative cost.
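A back-of-the-envelope calculation shows what the ceiling means in practice. The sketch below assumes "roughly 4 GBytes/sec" means 4e9 bytes per second and that 94 GB means 94e9 bytes; the 8 KiB prompt size is a hypothetical figure, and real transfers will see additional overheads this ignores.

```javascript
// Assumed reading of the cited "roughly 4 GBytes/sec" figure.
const CC_ON_BOUNCE_BYTES_PER_SEC = 4e9;

// Lower bound on the time to move `bytes` across the encrypted bounce buffer.
function bounceBufferSeconds(bytes, bytesPerSec = CC_ON_BOUNCE_BYTES_PER_SEC) {
  if (!(bytes >= 0) || !(bytesPerSec > 0)) {
    throw new RangeError('bytes must be >= 0 and bytesPerSec must be > 0');
  }
  return bytes / bytesPerSec;
}

// Filling the full 94 GB of HBM once, e.g. when loading model weights.
const fullHbmLoadSeconds = bounceBufferSeconds(94e9);

// A hypothetical 8 KiB prompt crossing the boundary per request.
const promptSeconds = bounceBufferSeconds(8 * 1024);

console.log({ fullHbmLoadSeconds, promptSeconds });
```

At this rate a one-time weight load is measured in tens of seconds, while per-request payloads take microseconds. That split is why the fixed cost matters at start-up and the relative cost matters for short, frequent requests.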

Security Protocol and Data Model (SPDM)

A DMTF standard (DSP0274) that defines a message-exchange protocol between a Requester and a Responder for device authentication, measurement retrieval, and session-key establishment. In the NVIDIA H100 CC-On architecture it runs across the PCIe link to establish a mutually-authenticated secure session between the host CPU TEE driver and the GPU. The session protects all subsequent control-plane and data-plane traffic and lets each endpoint verify the other's identity and measurements before any sensitive data crosses the PCIe link ^[33] ^[26] ^[32].

The SPDM handshake itself is specified by DMTF DSP0274 v1.1.0 ^[33] and walks a precise message sequence that the relying-party implementer needs to know exists: GET_VERSION (§10.2) negotiates the protocol version; GET_CAPABILITIES (§10.3) negotiates supported capabilities; NEGOTIATE_ALGORITHMS (§10.4) negotiates the cryptographic algorithm family; GET_DIGESTS (§10.7) fetches device-certificate digests; GET_CERTIFICATE (§10.8) retrieves the per-die device-identity certificate; CHALLENGE (§10.9) asks the device to sign a host-supplied nonce, and the CHALLENGE_AUTH response carries the signature the host verifies; GET_MEASUREMENTS (§10.11) retrieves the device's runtime measurement vector; and KEY_EXCHANGE (§10.16) establishes the session key over ECDHE on P-384 ^[33]. The first three messages are an ordered prerequisite: per DSP0274 §10.6, no other request is valid until the three-step negotiation completes ^[33].
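The ordering rule can be sketched as a small guard that a Requester-side wrapper might apply before sending anything. This is a hypothetical illustration, not code from any SPDM library: the class name is invented, it models only the three-step prerequisite described above, and it does not model re-negotiation or any other state the specification defines.

```javascript
// The three negotiation requests, in the order they must complete.
const NEGOTIATION_ORDER = ['GET_VERSION', 'GET_CAPABILITIES', 'NEGOTIATE_ALGORITHMS'];

class SpdmOrderingGuard {
  constructor() {
    this.completedSteps = 0; // how many negotiation requests have been accepted
  }

  get negotiated() {
    return this.completedSteps === NEGOTIATION_ORDER.length;
  }

  // Throws if `request` is sent before the negotiation prerequisite allows it.
  accept(request) {
    if (this.negotiated) return; // any later request passes this guard
    const expected = NEGOTIATION_ORDER[this.completedSteps];
    if (request !== expected) {
      throw new Error(`${request} rejected: ${expected} must complete first`);
    }
    this.completedSteps += 1;
  }
}

const guard = new SpdmOrderingGuard();
for (const request of NEGOTIATION_ORDER) guard.accept(request);
guard.accept('GET_DIGESTS'); // allowed only because negotiation has completed
console.log('negotiated:', guard.negotiated);
```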

The negotiated crypto family for the H100 in CC-On mode is SHA-384 / ECDSA-P384 / AES-256-GCM. The device-identity certificate is signed with a per-die ECC-384 hardware-bound key burned into H100 fuses, and revocation runs through the NVIDIA OCSP endpoint -- the GPU-side analogue of the AMD KDS CRL path described later ^[26].

Diagram source

sequenceDiagram
participant Req as Host CVM (Requester)
participant Resp as NVIDIA H100 (Responder)
Req->>Resp: GET_VERSION (DSP0274 10.2)
Resp-->>Req: VERSION
Req->>Resp: GET_CAPABILITIES (10.3)
Resp-->>Req: CAPABILITIES
Req->>Resp: NEGOTIATE_ALGORITHMS (10.4)
Resp-->>Req: ALGORITHMS (SHA-384, ECDSA-P384, AES-256-GCM)
Req->>Resp: GET_DIGESTS (10.7)
Resp-->>Req: DIGESTS
Req->>Resp: GET_CERTIFICATE (10.8)
Resp-->>Req: CERTIFICATE (per-die ECC-384)
Req->>Resp: CHALLENGE (10.9)
Resp-->>Req: CHALLENGE_AUTH (signature over nonce)
Req->>Resp: GET_MEASUREMENTS (10.11)
Resp-->>Req: MEASUREMENTS
Req->>Resp: KEY_EXCHANGE (10.16, ECDHE P-384)
Resp-->>Req: KEY_EXCHANGE_RSP

The SPDM 1.1 handshake between the host CVM (SPDM Requester) and the H100 GPU (SPDM Responder), with DSP0274 v1.1.0 section numbers on each arrow. Per DSP0274 §10.6 the three-step negotiation (GET_VERSION, GET_CAPABILITIES, NEGOTIATE_ALGORITHMS) must complete before any other request is valid; the negotiated crypto family for H100 CC-On is SHA-384 / ECDSA-P384 / AES-256-GCM with ECDHE on P-384 in KEY_EXCHANGE.

The NVIDIA-side verifier reference has recently changed generations: the Python SDK in NVIDIA/nvtrust ^[32] has been superseded by nv-attestation-sdk-cpp (also called "NV Attest"), which NVIDIA describes as "a new and improved version of the NVIDIA nvtrust attestation SDK, redesigned to address key limitations" ^[34]. The C++ SDK is the current canonical reference; the older Python SDK still works but is deprecated. The NVIDIA CC documentation index links both ^[35].

The composed attestation -- the AMD SEV-SNP attestation report from the host CVM, joined with the NVIDIA-signed GPU attestation report from the H100 -- is consumable by Microsoft Azure Attestation as a single policy decision ^[8]. Secure Key Release from Azure Key Vault Premium or Azure Managed HSM then gates customer key material on that composite attestation, so the model weights or the user's prompt encryption key are released to the workload only when the entire chain (AMD silicon, AMD firmware, Microsoft hypervisor, customer guest OS, NVIDIA GPU firmware, NVIDIA hardware root of trust) verifies ^[8] ^[15].

Diagram source

flowchart TD
A[Customer workload] --> B["Host CVM<br/>AMD SEV-SNP + RMP"]
B -->|SPDM session, mutual auth| C["NVIDIA H100 NVL<br/>CC-On mode"]
C -->|Signed GPU attestation| D[NVIDIA NRAS]
B -->|SEV-SNP attestation report| E[Microsoft Azure Attestation]
D --> E
E -->|MAA JWT, x-ms claims| F["Azure Key Vault Premium<br/>or Managed HSM"]
F -->|SKR release policy check| G["Customer key released<br/>to workload"]
style C fill:#e6f3ff,stroke:#36c,color:#1a365d
style E fill:#fff3e6,stroke:#c63,color:#7b341e

Azure NCCads_H100_v5 confidential GPU composition. The host CVM runs under AMD SEV-SNP with the Reverse Map Table enforcing page ownership. An SPDM session establishes mutual authentication between the CVM and the H100 GPU in CC-On mode. NVIDIA NRAS signs the GPU attestation, MAA composes it with the CVM attestation into a single JWT, and SKR uses that JWT to gate customer key release from Azure Key Vault Premium or Azure Managed HSM.

"The NVIDIA H100 Tensor Core GPU is the first ever GPU to introduce support for confidential computing." -- NVIDIA Developer Blog ^[26]

Two breakthroughs. Two cryptographic envelopes. Both prove something about a workload. Both are signed by hardware. Both will satisfy a JWT verifier. And underneath that surface similarity sits a genuinely different epistemological model.

Apple PCC commits, publicly and in advance, to the exact image hash that will be served, and refuses to serve any other. Azure CC-AI does not publicly commit in advance to the bits the verifier runs against -- it produces a JWT that says "I verified what I was given." Both are cryptographic; one is structurally auditable by an independent researcher, the other is a single vendor's word.

This is the distinction to hold on to. "Verify me" is architecturally different from "trust me," even when both produce a JWT.

To turn that distinction into something a reader can carry into procurement, we have to actually walk the six axes. On which do these architectures genuinely differ, and on which do they differ only in implementation strategy?

6. Six Axes, One Difference In Kind

Of the six architectural axes, five are differences in degree -- both PCC and Azure CC-AI do similar things differently. Exactly one is a difference in kind: verifiable transparency of the production fleet. Apple ships a public append-only log of every production node image hash; no other major-cloud confidential-AI substrate ships an architectural equivalent as of mid-2026. The rest of this section walks each axis with the trade-off named, the threat model spelled out, and the primary cited.

Axis 1: Silicon control

PCC is a single-vendor stack end to end. Apple controls the SoC, the SEP, the firmware, the OS, the Swift-based inference runtime, and the bug-bounty program ^[1]. Apple has not publicly named the specific chip family used in PCC nodes; firmware identifiers and independent analyses point to M2-Ultra-class silicon at launch (firmware identifier ComputeModule14,1 ^[36]) with a transition to M5-class silicon during 2026 (identifier J226C ^[37] ^[38]), and the Apple Machine Learning Research introduction confirms only that the cloud-side model runs on "Apple silicon servers" without naming a generation ^[39].

Azure CC-AI is a multi-vendor commodity composition by design. AMD provides the EPYC CPU and the AMD Platform Security Processor; Intel provides the Xeon CPU and the TDX module on the alternate Intel SKU family; NVIDIA provides the H100 GPU and the on-die hardware root of trust; Microsoft provides the hypervisor and MAA; the customer chooses the guest OS ^[15] ^[7] ^[26].

The trade-off is direct. Apple's single-vendor stack is operationally simpler and the trust posture is internally consistent, but the trust root collapses to Apple. Azure's multi-vendor stack spreads trust across four independent signers, but no one of them sees the entire system, and the composition itself is a source of complexity.

Axis 2: Hardware root of trust

PCC anchors per-node trust in the Secure Enclave Processor on each Apple-Silicon server. The SEP is bound to an Apple-controlled certificate authority; the SEP signs the node's attestation envelope; the Apple-controlled CA's chain is the root the user's device trusts ^[1] ^[12].

Azure's hardware root of trust is structurally distributed. A vTPM exposed to the CVM provides one anchor; the AMD Platform Security Processor signs SEV-SNP attestation reports with a per-chip Versioned Chip Endorsement Key (VCEK) ^[40] ^[16]; the NVIDIA on-die RoT signs the GPU attestation; MAA operates as the verifier-of-record that joins these into a single decision artefact ^[8].

Versioned Chip Endorsement Key (VCEK)

A per-die ECDSA signing key derived inside the AMD Platform Security Processor (PSP) from a chip-specific secret fused into the silicon at manufacture. The VCEK signs SEV-SNP attestation reports; the certificate chain runs VCEK -> AMD SEV signing key (ASK) -> AMD Root Key (ARK), with the ARK pinned out-of-band against AMD's published fingerprint and the per-chip VCEK fetched from the AMD Key Distribution Service (KDS) at kdsintf.amd.com keyed on the chip ID plus the four TCB-version-vector *Spl parameters (blSpl, teeSpl, snpSpl, ucodeSpl) parsed out of the 1184-byte attestation report ^[40] ^[16].

The chain itself is short and walkable. The ARK and ASK PEMs are served as a single bundle from the KDS endpoint /vcek/v1/<family>/cert_chain on host kdsintf.amd.com (returning, on the Milan family, an ARK-Milan and SEV-Milan certificate pair issued from AMD Engineering's Santa Clara CA with 25-year validity dated 2020-10-22 ^[40]). The per-die VCEK is served from /vcek/v1/<family>/<chip_id>?blSpl=..&teeSpl=..&snpSpl=..&ucodeSpl=.. on the same KDS host, where the chip ID and the four *Spl TCB-version-vector query parameters are parsed out of the SEV-SNP attestation report itself.

A relying party that wants to verify a SEV-SNP attestation without trusting MAA fetches the chain from KDS, validates the chain against an out-of-band-pinned ARK fingerprint, and checks that the chip ID and TCB version in the report match the chain. The canonical open-source CLI for this is virtee/snpguest ^[41], the active successor to the deprecated AMDESE/sev-tool ^[42].
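The two KDS endpoints described above can be assembled mechanically once the report has been parsed. The sketch below only builds URLs and performs no network access. The function names are hypothetical, the query-parameter spelling follows the text above and should be checked against AMD's KDS specification, and passing the chip ID as a hex string is an assumption.

```javascript
const KDS_HOST = 'https://kdsintf.amd.com';

// URL of the ARK + ASK bundle for a CPU family.
function certChainUrl(family) {
  return `${KDS_HOST}/vcek/v1/${encodeURIComponent(family)}/cert_chain`;
}

// URL of the per-die VCEK, keyed on chip ID and the four TCB-version fields.
function vcekUrl(family, chipIdHex, tcb) {
  if (!/^[0-9a-fA-F]+$/.test(chipIdHex)) {
    throw new TypeError('chipIdHex must be a non-empty hex string');
  }
  const query = new URLSearchParams();
  for (const field of ['blSpl', 'teeSpl', 'snpSpl', 'ucodeSpl']) {
    if (!Number.isInteger(tcb[field]) || tcb[field] < 0) {
      throw new TypeError(`tcb.${field} must be a non-negative integer`);
    }
    query.set(field, String(tcb[field]));
  }
  return `${KDS_HOST}/vcek/v1/${encodeURIComponent(family)}/${chipIdHex}?${query}`;
}

// Hypothetical values standing in for fields parsed from an attestation report.
const exampleChipId = 'ab'.repeat(64);
const exampleTcb = { blSpl: 3, teeSpl: 0, snpSpl: 8, ucodeSpl: 115 };

console.log(certChainUrl('Milan'));
console.log(vcekUrl('Milan', exampleChipId, exampleTcb));
```

Keeping URL construction separate from fetching makes it easy to log exactly which VCEK a verifier asked for, which is useful evidence in an audit.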

Axis 3: Attestation surface

PCC produces a per-device attestation envelope cross-checked against the public Transparency Log. The user's device does not just verify the SEP signature; it verifies that the image hash named in the envelope is included in the public log. If the hash is not in the log, the device refuses to forward the request ^[1] ^[6].

Azure produces an MAA-issued JWT. The customer's relying party parses the JWT and matches claims. The MAA overview documents the SEV-SNP-specific claims and the platform-vs-guest distinction explicitly ^[8]. For confidential GPU workloads, NVIDIA's NRAS claims about the H100 are joined into the same JWT.

The procurement-grade payoff: a customer can verify SEV-SNP attestation without trusting MAA by running the snpguest workflow directly against the AMD KDS ^[41] ^[40]. Or they can trust MAA's JWT and validate it against the MAA JWKS, trading one trust anchor (AMD's ARK fingerprint) for another (Microsoft's JWKS). Both paths are real; most production customers deploy the MAA path because it is operationally simpler, but the snpguest-based path is what unlocks "we do not have to trust MAA" for a procurement audit.

JavaScript: Decode a Microsoft Azure Attestation JWT and inspect the x-ms claims

// Demonstrates the structure of an MAA JWT for an AMD SEV-SNP confidential VM.
// In production the JWT would be signed by an MAA tenant key and verified
// against the tenant's JWKS endpoint. This example only decodes a sample
// payload; it performs no signature verification.

// Sample x-ms claims. The launch measurement is truncated in this sample.
const sampleClaims = {
  'x-ms-isolation-tee': 'sevsnpvm',
  'x-ms-compliance-status': 'azure-compliant-cvm',
  'x-ms-sevsnpvm-guestsvn': 8,
  'x-ms-sevsnpvm-launchmeasurement': 'xDk...',
  'x-ms-runtime': 'e30='
};

// JSON -> base64url without padding, as used for JWT segments.
function toBase64Url(obj) {
  return btoa(JSON.stringify(obj))
    .replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

const sampleMaaJwt = [
  toBase64Url({ alg: 'RS256', typ: 'JWT' }), // header
  toBase64Url(sampleClaims),                 // payload
  'signature'                                // signature placeholder
].join('.');

function decodeJwtPayload(jwt) {
  const [, payload] = jwt.split('.');
  // base64url -> base64, restoring the padding that JWT encoding strips
  const b64 = payload.replace(/-/g, '+').replace(/_/g, '/');
  const padded = b64 + '='.repeat((4 - (b64.length % 4)) % 4);
  return JSON.parse(atob(padded));
}

const payload = decodeJwtPayload(sampleMaaJwt);
console.log('TEE family:        ', payload['x-ms-isolation-tee']);
console.log('Compliance status: ', payload['x-ms-compliance-status']);
console.log('Guest SVN:         ', payload['x-ms-sevsnpvm-guestsvn']);
console.log('Launch measurement:', payload['x-ms-sevsnpvm-launchmeasurement']);

// A Secure Key Release policy would gate key release on claims like:
//   "x-ms-isolation-tee" == "sevsnpvm"
//   "x-ms-compliance-status" == "azure-compliant-cvm"
//   "x-ms-sevsnpvm-guestsvn" >= 8
// matched against the MAA-issued JWT.

The MAA path hides KDS fetching, certificate-chain validation, and TCB-rollback policy enforcement from the relying party by emitting a JWT whose x-ms-attestation-type claim is sevsnpvm and x-ms-compliance-status claim is azure-compliant-cvm. The relying party then validates against the MAA JWKS instead of pinning the AMD ARK fingerprint. Operationally simpler, but it trades trust in AMD for trust in MAA. A customer that wants a procurement-defensible "we do not have to trust MAA" posture runs the six-step snpguest Regular Attestation Workflow directly against the AMD KDS ^[41]. The snpguest verify certs step validates the VCEK -> ASK -> ARK chain but cannot detect a substituted ARK; the ARK fingerprint must be pinned out-of-band against AMD's published value before the chain is trusted. The other architectural delta: snpguest verify attestation checks the TCB version vector in the attestation report against the version baked into the VCEK certificate, surfacing TCB rollback. Once both checks pass, the relying party has cryptographic evidence the workload is running on a specific physical AMD CPU at a specific firmware level -- without ever talking to Microsoft.

Bash: Verify a SEV-SNP attestation report independently of MAA with snpguest

# The six-step Regular Attestation Workflow from the virtee/snpguest README.
# Each step maps to a wire-level KDS GET except step 1 (which talks to the SNP
# guest firmware device locally). Run this from inside an SEV-SNP guest VM on
# Azure (e.g. on a DCasv5 SKU) -- not from the host.

# Step 1: ask the guest firmware for a fresh attestation report bound to a
# 64-byte nonce. The report includes chip_id and the four *Spl TCB vector
# fields the next steps will use to fetch the per-die VCEK.
snpguest report attestation-report.bin request-data.bin --random

# Step 2: fetch the ARK + ASK PEM bundle for this CPU family from AMD KDS.
# Endpoint: GET /vcek/v1/<family>/cert_chain on host kdsintf.amd.com
snpguest fetch ca pem milan ./certs

# Step 3: fetch the per-die VCEK certificate from AMD KDS, keyed on chip_id
# and the four *Spl values parsed out of the attestation report.
# Endpoint: GET /vcek/v1/<family>/<chip_id>?blSpl=..&... on the KDS host
snpguest fetch vcek pem milan ./certs attestation-report.bin

# Step 4: fetch the current AMD CRL so revoked VCEKs can be rejected.
# Endpoint: GET /vcek/v1/<family>/crl on the KDS host
snpguest fetch crl pem milan ./certs

# Step 5: validate the chain locally (VCEK -> ASK -> ARK).
# IMPORTANT: snpguest cannot detect a substituted ARK. Before running this
# command, pin the ARK fingerprint out-of-band against AMD's published value.
snpguest verify certs ./certs

# Step 6: verify the attestation signature with the validated VCEK and check
# the TCB version vector in the report against the VCEK certificate.
# This is the step that surfaces TCB rollback.
snpguest verify attestation ./certs attestation-report.bin

Axis 4: Key release and state model

This is where the architectural philosophies diverge most visibly. PCC nodes are stateless by design. There is no customer key material on the node, no key release ceremony, no HSM gating. Apple's first core requirement names this verbatim: "stateless computation on personal user data" ^[1]. State that needs to persist across requests does so on the user's device, not on the PCC fleet.

Azure treats stateful, customer-managed keys as a first-class architectural primitive. Secure Key Release from Azure Key Vault Premium or Azure Managed HSM gates key release on an MAA-issued JWT whose claims must match the release policy attached to the encrypted key ^[15]. The Microsoft reference confidential-LLM tutorial walks the SKR-from-AKV-Premium flow end to end on a Standard_NCC40ads_H100_v5 SKU ^[43]. Customer-managed keys, customer-controlled HSMs, and customer audit logs are how regulated buyers reason about confidential workloads, and Azure's design accommodates that workflow directly.

What a Secure Key Release policy actually looks like

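A minimal SKR release policy is a JSON document referencing MAA-issued claims. A simplified example for an SEV-SNP CVM target follows; the claim names are flattened for readability, whereas Microsoft's published examples nest the SEV-SNP claims under x-ms-isolation-tee and reference them by dotted path (for example x-ms-isolation-tee.x-ms-attestation-type):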

{
  "version": "1.0.0",
  "anyOf": [
    {
      "authority": "<your MAA tenant URL>",
      "allOf": [
        { "claim": "x-ms-isolation-tee", "equals": "sevsnpvm" },
        { "claim": "x-ms-compliance-status", "equals": "azure-compliant-cvm" },
        { "claim": "x-ms-sevsnpvm-guestsvn", "greater-than-or-equals": 8 }
      ]
    }
  ]
}

At unwrap time the HSM evaluates the policy against the JWT the workload presents. Only if every condition is met is the key material released. The policy is bound to the key at creation time and cannot be modified after the fact without rewrapping under a fresh policy.
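The evaluation step can be sketched as a small function over the simplified policy shown above. This is an illustration of the anyOf / allOf logic only, not Azure Key Vault's evaluator. It assumes the JWT signature has already been verified, the `{ issuer, claims }` token shape and the authority URL are hypothetical, and the operator names are copied from the simplified policy rather than from a product grammar reference.

```javascript
// Evaluate one condition against the token's claims; unknown operators fail closed.
function evaluateCondition(condition, claims) {
  const value = claims[condition.claim];
  if ('equals' in condition) return value === condition.equals;
  if ('greater-than-or-equals' in condition) {
    return typeof value === 'number' && value >= condition['greater-than-or-equals'];
  }
  return false;
}

// Release is allowed if any branch matches the issuer and all of its conditions.
function evaluateReleasePolicy(policy, token) {
  return policy.anyOf.some(
    (branch) =>
      branch.authority === token.issuer &&
      branch.allOf.every((condition) => evaluateCondition(condition, token.claims))
  );
}

const examplePolicy = {
  version: '1.0.0',
  anyOf: [
    {
      authority: 'https://maa.example.invalid', // hypothetical MAA tenant URL
      allOf: [
        { claim: 'x-ms-isolation-tee', equals: 'sevsnpvm' },
        { claim: 'x-ms-compliance-status', equals: 'azure-compliant-cvm' },
        { claim: 'x-ms-sevsnpvm-guestsvn', 'greater-than-or-equals': 8 }
      ]
    }
  ]
};

const exampleToken = {
  issuer: 'https://maa.example.invalid',
  claims: {
    'x-ms-isolation-tee': 'sevsnpvm',
    'x-ms-compliance-status': 'azure-compliant-cvm',
    'x-ms-sevsnpvm-guestsvn': 8
  }
};

console.log('release key:', evaluateReleasePolicy(examplePolicy, exampleToken));
```

Failing closed on an unrecognised operator is the conservative choice here: a policy the evaluator cannot fully understand should never release key material.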

Axis 5: GPU TEE

PCC uses Apple GPUs that are integrated on the same SoC as the CPU and SEP. By construction they sit inside the same SEP-rooted attestation envelope -- there is no separate cross-vendor PCIe attestation handshake because there is no PCIe handshake to begin with ^[1].

Azure uses NVIDIA H100 NVL GPUs in CC-On mode, with the architecture described above: on-die RoT, SPDM session, encrypted bounce buffer, NRAS-signed attestation report joined to the SEV-SNP CVM attestation through MAA ^[7] ^[26]. The NVIDIA H100 exposes three confidential-computing modes: CC-Off (the normal non-confidential default; no isolation, no encryption); CC-On (full confidential mode, the only mode that should be used in production); and CC-DevTools (per NVIDIA's developer blog, "a partial CC mode that will match the workflows of CC-On mode, but with security protections disabled and performance counters enabled" ^[26]) ^[35]. The three modes share a bring-up surface, but only CC-On enforces the full isolation contract.

AMD's MI300X GPU ships as compute across multiple clouds (Oracle OCI, DigitalOcean, Vultr, Crusoe, TensorWave, Hot Aisle, Seeweb ^[44]) but has no production-equivalent confidential-GPU mode at GA on a major commercial cloud as of mid-2026. PCIe TDISP and SEV-TIO Linux support is landing in 2025-2026 kernels, but the GA gap is the load-bearing fact for any procurement that prefers AMD over NVIDIA at the accelerator tier. Azure's confidential GPU offering is H100-only at GA.

A subtle and procurement-critical detail: Microsoft Azure Attestation does not directly attest the GPU. The MAA overview documents the SEV-SNP path and the platform-vs-guest distinction, but the GPU attestation is produced and signed by NVIDIA NRAS, not MAA ^[8] ^[26]. The composed MAA JWT carries the NVIDIA-signed GPU attestation as a nested claim. A customer's relying party that wants to verify the GPU attestation against NVIDIA's hardware root of trust must validate the NRAS signature, not the MAA signature, on that nested portion.

This is the double attestation pattern: the SEV-SNP CVM attestation is signed by AMD VCEK; the H100 GPU attestation is signed by NVIDIA's on-die root of trust; MAA composes them into one JWT, but the two signatures must be verified against two different roots. The Azure confidential-computing-cvm-guest-attestation and az-cgpu-onboarding repositories provide the reference patterns for both halves of this verification ^[45].

The double attestation is one place the "MAA is the verifier of record" framing oversimplifies. MAA is the verifier of record for the composition -- but the underlying signatures still come from AMD and NVIDIA. A relying party that wants to refuse a workload running on a TCB-rolled-back AMD CPU plus a CC-DevTools-mode H100 needs to check the AMD TCB version vector against a TCB-version policy (snpguest can do this) and the NVIDIA GPU mode field against a "CC-On only" policy. MAA can be configured to enforce both of these in the release policy, but the customer has to actively write the policy; the defaults will not catch a CC-DevTools-mode H100.
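The two checks named above can be sketched as a relying-party admission function. This is an illustration only: the evidence field names (`cpuTcb`, `gpuMode`) and the policy shape are hypothetical, the minimum TCB values are made up, and it assumes the AMD and NVIDIA signatures have already been verified against their respective roots.

```javascript
const TCB_FIELDS = ['blSpl', 'teeSpl', 'snpSpl', 'ucodeSpl'];

// True only if every reported TCB component meets the policy minimum.
function tcbAtLeast(reported, minimum) {
  return TCB_FIELDS.every(
    (field) => Number.isInteger(reported[field]) && reported[field] >= minimum[field]
  );
}

// Returns a decision plus the reasons for any refusal.
function admitWorkload(evidence, policy) {
  const reasons = [];
  if (!tcbAtLeast(evidence.cpuTcb, policy.minTcb)) {
    reasons.push('AMD TCB version is below the policy minimum');
  }
  if (evidence.gpuMode !== 'CC-On') {
    reasons.push(`GPU mode ${evidence.gpuMode} is not CC-On`);
  }
  return { admitted: reasons.length === 0, reasons };
}

// Hypothetical policy and evidence values.
const admissionPolicy = { minTcb: { blSpl: 3, teeSpl: 0, snpSpl: 8, ucodeSpl: 115 } };

const goodEvidence = {
  cpuTcb: { blSpl: 3, teeSpl: 0, snpSpl: 8, ucodeSpl: 115 },
  gpuMode: 'CC-On'
};
const rolledBackDevTools = {
  cpuTcb: { blSpl: 3, teeSpl: 0, snpSpl: 6, ucodeSpl: 115 },
  gpuMode: 'CC-DevTools'
};

console.log(admitWorkload(goodEvidence, admissionPolicy));
console.log(admitWorkload(rolledBackDevTools, admissionPolicy));
```

Collecting every failure reason, rather than stopping at the first one, gives an auditor the full picture of why a workload was refused.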

Performance overhead is small. Zhu, Yin, Deng, Almeida, and Zhou (Phala / Fudan / io.net), in arXiv 2409.03992 (v4, November 5, 2024), benchmarked H100 CC-On on vLLM v0.5.4 with the ShareGPT dataset on Llama-3.1-8B-Instruct and report that "for the majority of typical LLM queries, the overhead remains below 7%, with larger models and longer sequences experiencing nearly zero overhead" ^[46]. The dominant overhead source is the PCIe encrypted bounce buffer, capped at the ~4 GB/s CPU-encryption ceiling discussed in §5(b); large models amortise that cost across many tokens.

The "below 7%" overhead number is benchmarked on a specific stack (vLLM v0.5.4, ShareGPT dataset, Llama-3.1-8B-Instruct) and depends on sequence length and batch size in non-trivial ways ^[46]. Smaller models with short prompts and high batch turnover spend a larger fraction of wall-clock time on the bounce-buffer crossings; larger models with long context windows amortise that cost. Quoting "below 7%" without the workload qualification is misleading.

Axis 6: Network anonymization

This is the axis where the two architectures differ in kind.

PCC routes client requests through a third-party-operated Oblivious HTTP relay -- RFC 9458 ^[30] -- that strips the client IP address before the request reaches the PCC cluster. This implements one of Apple's five named core requirements, non-targetability: an attacker who compromises the PCC fleet cannot single out a specific user's traffic because the fleet does not know which IP issued which request ^[1]. Apple's Target Diffusion design layers RSA Blind Signatures (RFC 9474) ^[31] on top to issue single-use credentials, so even the relay cannot link two requests from the same client.

Azure has no equivalent operator-level anonymization layer. This is intentional in Azure's design: an enterprise customer who knows that traffic originates from their own employees generally does not want to anonymize that traffic from their own audit logs. But it is an axis the two architectures differ on in kind rather than in degree, and worth naming as such -- a procurement reader who needs operator-level anonymization will not get it from Azure CC-AI without building it themselves.

The six axes, side by side

The following table consolidates the comparison.

Axis	Apple Private Cloud Compute	Azure Confidential AI
Silicon control	Single-vendor end-to-end (Apple SoC, SEP, firmware, OS, runtime) ^[1]	Multi-vendor commodity composition (AMD EPYC, Intel Xeon, NVIDIA H100, Microsoft hypervisor) ^[15] ^[7]
Hardware root of trust	Per-node SEP bound to Apple-controlled CA ^[1]	vTPM + AMD PSP / VCEK + NVIDIA on-die RoT + MAA as verifier-of-record ^[8] ^[40]
Attestation surface	Per-device envelope cross-checked against public Transparency Log ^[6]	MAA-issued JWT with documented `x-ms-*` claims ^[8]
Key release / state	Stateless nodes; no customer keys; no release ceremony ^[1]	SKR from AKV Premium / Managed HSM gated on MAA JWT ^[15]
GPU TEE	Integrated Apple GPU in same SEP-rooted envelope ^[1]	NVIDIA H100 CC-On + SPDM + NRAS joined to MAA ^[26] ^[7]
Network anonymization	Third-party OHTTP relay strips client IP ^[30] ^[1]	No equivalent operator-level anonymization layer

Diagram source

flowchart LR
subgraph PCC["Apple PCC stack"]
P1[Apple SoC + integrated GPU]
P2["SEP per node<br/>Apple-controlled CA"]
P3["Transparency Log<br/>append-only public"]
P4["Stateless node<br/>no customer keys"]
P5["OHTTP relay<br/>third party"]
end
subgraph AZ["Azure CC-AI stack"]
A1["AMD EPYC + NVIDIA H100<br/>multi-vendor"]
A2["AMD PSP + vTPM<br/>NVIDIA on-die RoT"]
A3["MAA JWT<br/>x-ms claims"]
A4["SKR from AKV Premium<br/>customer-managed keys"]
A5["no operator-level<br/>anonymization layer"]
end

Where the six axes live in each architecture. PCC concentrates silicon, root of trust, attestation, key model, and GPU into a single Apple-controlled stack and adds an OHTTP relay at network ingress. Azure spreads silicon and root-of-trust across AMD plus NVIDIA plus Microsoft, with MAA as the verifier of record and SKR as the key release primitive.

Verifiable transparency (of the production fleet)

An architectural property whereby every production software image actually serving customer requests is committed in advance to a public, append-only log accessible to any third party. The property requires both that the cryptographic log be publicly auditable (a Certificate-Transparency-style Merkle tree, for example) and that the system refuse to serve requests against images not present in the log. Apple Private Cloud Compute ships verifiable transparency as a first-class architectural primitive; no other major-cloud confidential-AI substrate ships an architectural equivalent as of mid-2026 ^[1] ^[6].
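The log-inclusion half of this definition can be made concrete with a Certificate-Transparency-style Merkle inclusion proof. The sketch below follows the RFC 6962 / RFC 9162 hashing and proof-verification rules. Whether Apple's Transparency Log uses exactly this tree layout is an assumption, and the image names are hypothetical placeholders.

```javascript
const { createHash } = require('node:crypto');

// RFC 6962 domain separation: 0x00 prefixes leaves, 0x01 prefixes interior nodes.
const sha256 = (...parts) => {
  const h = createHash('sha256');
  for (const part of parts) h.update(part);
  return h.digest();
};
const leafHash = (data) => sha256(Buffer.from([0x00]), data);
const nodeHash = (left, right) => sha256(Buffer.from([0x01]), left, right);

// Largest power of two strictly smaller than n (n >= 2).
function splitPoint(n) {
  let k = 1;
  while (k * 2 < n) k *= 2;
  return k;
}

// Log-side helper: Merkle tree hash over a non-empty list of leaves.
function merkleRoot(leaves) {
  if (leaves.length === 1) return leafHash(leaves[0]);
  const k = splitPoint(leaves.length);
  return nodeHash(merkleRoot(leaves.slice(0, k)), merkleRoot(leaves.slice(k)));
}

// Log-side helper: audit path for the leaf at `index`.
function inclusionPath(index, leaves) {
  if (leaves.length === 1) return [];
  const k = splitPoint(leaves.length);
  return index < k
    ? [...inclusionPath(index, leaves.slice(0, k)), merkleRoot(leaves.slice(k))]
    : [...inclusionPath(index - k, leaves.slice(k)), merkleRoot(leaves.slice(0, k))];
}

// Client-side check: recompute the root from the leaf and its audit path.
// Uses 32-bit integer operations, so it suits tree sizes below 2^31.
function verifyInclusion(leaf, index, treeSize, path, root) {
  if (index < 0 || index >= treeSize) return false;
  let fn = index;
  let sn = treeSize - 1;
  let running = leafHash(leaf);
  for (const sibling of path) {
    if (sn === 0) return false;
    if ((fn & 1) === 1 || fn === sn) {
      running = nodeHash(sibling, running);
      while ((fn & 1) === 0 && fn !== 0) { fn >>= 1; sn >>= 1; }
    } else {
      running = nodeHash(running, sibling);
    }
    fn >>= 1;
    sn >>= 1;
  }
  return sn === 0 && running.equals(root);
}

// Hypothetical log of five released images.
const images = ['image-a', 'image-b', 'image-c', 'image-d', 'image-e'].map((s) => Buffer.from(s));
const logRoot = merkleRoot(images);
const proof = inclusionPath(2, images);

console.log(verifyInclusion(images[2], 2, images.length, proof, logRoot));
console.log(verifyInclusion(Buffer.from('unlogged-image'), 2, images.length, proof, logRoot));
```

A client that refuses to proceed when `verifyInclusion` returns false is enforcing the second half of the definition: an image absent from the log is never served.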

The two architectures differ in degree on five axes: silicon control, hardware root of trust, attestation surface, key release, and GPU TEE. On the sixth -- verifiable transparency of the production fleet -- they differ in kind. Apple's Transparency Log is not a slightly-better MAA. It is an architectural primitive Microsoft does not ship.

Six axes, two architectures, one axis where the divergence is in kind. But Apple PCC and Microsoft Azure are not the only games in town. Where do AWS Nitro Enclaves and Google Cloud Confidential Space fit on the same six axes?

7. Beyond the Two Headliners

If verifiable transparency is the architectural difference, the obvious question is why AWS and Google have not just shipped a Transparency Log too. The short answer is that the three other production substrates each chose a different epistemic model, and shifting any one of them to PCC's model would require rebuilding the trust root from scratch.

AWS Nitro Enclaves

AWS Nitro Enclaves does not anchor in a CPU-vendor TEE at all. Trust is rooted in AWS-as-signer through the Nitro Hypervisor and the Nitro Security Chip ^[47]. The Nitro System "provides enhanced security that continuously monitors, protects, and verifies the instance hardware and firmware" and offloads virtualization resources to dedicated hardware ^[47]. A Nitro Enclave is created from a parent EC2 instance and is "isolated from the parent EC2 instance through the Nitro Hypervisor"; per the AWS documentation verbatim, "the Nitro Hypervisor ensures that the parent instance has no access to the isolated vCPUs and memory of the enclave" ^[48].

The trust model is different in kind from SGX, SEV, or TDX. Attestation is rooted in AWS's signing key, not in a CPU-vendor key. The Nitro architecture is processor-agnostic across Intel, AMD, and AWS Graviton, which is a different posture again -- the enclave's confidentiality does not depend on a specific silicon vendor's TEE primitive. There is also no published GPU confidential-computing extension for Nitro Enclaves as of mid-2026.

Google Cloud Confidential Space

Google Cloud Confidential Space combines Intel TDX (and AMD SEV / SEV-SNP) with Google Cloud Attestation and Workload Identity Federation. Per the GCA documentation: "Google Cloud Attestation provides a unified solution for remotely verifying the trustworthiness of all Google confidential environments ... The service supports attestation of confidential environments backed by a Virtual Trusted Platform Module (vTPM) for SEV and the TDX Module for Intel TDX" ^[49]. The overview page describes the multi-party-collaboration use case for PII, PHI, IP, and LLM-interaction data ^[50].

Google added an interesting wrinkle in 2025: an Intel Trust Authority integration that lets a GCP customer use ITA as a second verifier alongside Google Cloud Attestation. Per the integration documentation: "GCP Confidential Space provides a method for isolating a workload and sensitive data ensuring that data is released only to authorized workloads ... Intel Trust Authority is used to validate the evidence" ^[51]. A second verifier is not the same architectural primitive as a public transparency log -- it provides cross-checking but not append-only public auditability -- but it is the closest move any other major-cloud confidential platform has made toward PCC's direction as of mid-2026.

Confidential Containers and the orchestration tier

Confidential Containers (CoCo) is a CNCF Sandbox project that wraps Kubernetes pods in confidential VMs running on AMD SEV-SNP, Intel TDX, or IBM Secure Execution ^[52]. Per the project: "Confidential Containers is an open source community working to enable cloud native confidential computing by ... Trusted Execution Environments to protect containers and data" ^[52]. CoCo composes on top of the same Generation-3 silicon Azure CC-AI uses; it does not compete with PCC architecturally because it is at a different layer of the stack.

Around CoCo and the underlying TEEs sits a small set of orchestration-tier vendors that take responsibility for what the raw SKUs do not. The procurement-relevant distinctions between them are sharper than the marketing copy suggests.

Anjuna Seaglass is the cross-cloud unified confidential-deployment plane. It packages AWS Nitro Enclave, Azure CVM, and GCP Confidential Space behind a single command and a customer-supplied policy ^[53], with the explicit value proposition of "any cloud, any region, with the only Universal Confidential Computing platform." Anjuna's Seaglass platform supplanted the older Anjuna Northstar nomenclature, but reads the same way to a procurement audit: a single control plane spanning three different silicon vendors' TEE primitives, with a uniform policy DSL on top.

Edgeless Systems' Contrast is the runtime and runtime-encryption layer for confidential Kubernetes. Contrast runs confidential container deployments on Kubernetes at scale, built on Kata Containers and the Confidential Containers concept, and provides PKI, mTLS, and encrypted state disks across the deployment ^[54]. The architecture documentation is explicit that "the Contrast Coordinator is the central remote-attestation service for a Contrast deployment" and that the Coordinator verifies the Contrast components inside a confidential VM ^[55] ^[56]. Contrast is the active successor to Edgeless Constellation, which is now archived ("This repository has been archived ... Edgeless Systems has shifted focus to Contrast, our solution for confidential containers, which addresses the modern needs of confidential cloud workloads" ^[57]). The procurement signal is that customers evaluating Constellation should be redirected to Contrast in any new deployment.

Fortanix is two distinct products that the marketing collapses into one. Fortanix Confidential Computing Manager (CCM) is the orchestration and policy management layer that "is used to securely deploy and manage confidential computing applications using Intel SGX, AMD SEV-SNP, and Intel TDX runtimes" ^[58]. Fortanix Data Security Manager (DSM) is the FIPS 140-2 Level 3 HSM that holds the keys; per Fortanix's DSM page, DSM "delivers Cryptographic Services, Key Management Services, Secrets Management, Tokenization, Code Signing ... powered by Confidential Computing" ^[59] and carries FIPS 140-2 Level 3 certification on the underlying platform ^[60]. Procurement teams that need a customer-managed-keys story almost always need both: CCM to orchestrate the confidential-workload deployment, DSM to custody the keys.

CCM is not DSM. CCM is the orchestration plane (which workload runs where, attested by what); DSM is the FIPS 140-2 Level 3 HSM (which holds the keys, releases them on attested workload verification, audits the access). A procurement that asks for "Fortanix" without specifying CCM or DSM is asking for two different products at two different price points with two different compliance postures. The two integrate but they are not the same SKU.

Vendor	Layer	Pick when...
Anjuna Seaglass	Cross-cloud confidential deployment control plane ^[53]	You run the same regulated workload on more than one cloud and need one policy DSL spanning AWS Nitro + Azure CVM + GCP Confidential Space
Edgeless Contrast	Confidential Kubernetes runtime with mTLS and encrypted state ^[55] ^[56]	You run confidential workloads as Kubernetes pods and want a remote-attestation Coordinator inside the deployment rather than an external SaaS verifier
Fortanix CCM	Confidential-app orchestration on SGX/SEV-SNP/TDX ^[58]	You need centralized policy for which signed confidential workloads run on which TEEs, with audit
Fortanix DSM	FIPS 140-2 Level 3 HSM with attested key release ^[59] ^[60]	You need customer-managed keys, FIPS 140-2 L3 custody, and attested-workload-gated release as a single SKU

The third-party tier exists because the raw cloud SKUs sell the substrate but not the operational pattern. Procurement decisions in this category typically pair a cloud SKU with one or two of these orchestration vendors to get something workable for a regulated workload.

Where these fit on the six axes

Substrate	Silicon	Root of trust	Transparency	GPU TEE
Apple PCC	Apple end-to-end ^[1]	SEP + Apple CA ^[12]	Public Transparency Log ^[6]	Integrated Apple GPU ^[1]
Azure CC-AI	AMD + Intel + NVIDIA + MS ^[15]	AMD PSP + NVIDIA RoT + vTPM + MAA ^[8] ^[40]	None (MAA claims only) ^[8]	NVIDIA H100 CC-On ^[26]
AWS Nitro Enclaves	AWS-signed, CPU-agnostic ^[47]	Nitro Hypervisor + Security Chip ^[48]	None	None at GA
GCP Confidential Space	Intel TDX + AMD SEV-SNP ^[50]	vTPM + TDX Module + GCA (+ optional ITA) ^[49] ^[51]	None (second verifier via ITA)	None at GA on Confidential Space
Third-party tier (CoCo / Contrast / Anjuna)	Composes on top of cloud SKUs ^[52] ^[54]	Inherits underlying TEE root	None	Inherits underlying GPU TEE

Five substrates, one rough trade-off space. But every one of them rests on silicon, and silicon has its own theoretical limits. What can no TEE-based confidential AI architecture do?

8. What No TEE Can Do

The Confidential Computing Consortium's "A Technical Analysis of Confidential Computing" v1.3 -- the vendor-neutral definitional document both Apple and Microsoft anchor on -- explicitly enumerates side-channels as a residual risk ^[10]. This is not a contestable empirical claim. It is the field's own lower bound on what TEE-based confidential AI can deliver. The CCC names what the architecture does not close, in plain text, in the same document that defines what it does.

There are roughly six classes of limit. With the partial exception of model supply chain, the architectures we have walked do not close them by construction.

1. Side-channels on shared silicon

The Foreshadow / L1TF, SgxPectre, and Plundervolt cascade ^[18] ^[19] ^[20] is the historical evidence. The principled extension is direct: any TEE built on shared microarchitectural state -- shared caches, shared branch predictors, shared functional units, shared voltage / frequency control -- inherits a side-channel surface that the architectural threat model does not name. Both Apple's SEP and the AMD-Intel-NVIDIA composition rest on silicon that does not have an architectural primitive that closes this surface. Wojtczuk and Rutkowska's 2009 paper on Intel TXT made the same point fifteen years earlier in a different generation, demonstrating that SMM-based bypasses of TXT were not addressed by TXT's own threat model ^[61]. The cycle keeps repeating.

"Even Intel SGX's memory encryption/authentication technology cannot protect against Plundervolt." -- the Plundervolt project page ^[20]

2. Trust-anchor compromise

Every vendor behind a hardware root of trust is itself a trust anchor, and nothing inside the architecture can defend against that anchor's compromise. The anchors are AMD-as-signer through the PSP and VCEK certificate chains ^[40]; Intel-as-signer for the TDX Module, SEAMLDR, and Provisioning Service; NVIDIA-as-signer for the on-die RoT and NRAS; Microsoft-as-signer for the MAA service ^[8]; and Apple-as-signer for the SEP-bound CA and the Apple-controlled Transparency Log ^[1]. If any of those signing infrastructures is compromised, the architecture cannot defend itself against the signer. PCC's trust root collapses to Apple; Azure's spreads across four vendors but each one is still a trust anchor for the workload that depends on it.

3. ROM-burned single-signer revocation

Fuse-burned silicon roots of trust are not field-revocable on a chip already deployed. If an attacker recovers a vendor signing key whose public half (or its hash) is burned into the boot ROM or fuses of millions of chips, the recovery path is fleet rotation, not credential revocation. This is not a flaw of any specific vendor; it is a property of how hardware roots of trust are physically anchored. The recovery model for a leaked AMD ARK key, an Intel SEAM key, or an Apple SEP signing key is the same: replace the silicon. That is a multi-quarter operation at fleet scale.

4. Supply-chain compromise of the AI model

Apple binds the model into the attested image hash. The same Transparency Log that proves what code is running also proves what model weights are running, because the model is part of the published image ^[1] ^[6]. PCC closes the model supply-chain question at the architecture level.

Azure shifts model integrity to customer-controlled SKR of model artefacts. The model weights become encrypted blobs that the workload unwraps inside the TEE using a customer-managed key released only on a satisfying MAA JWT ^[15] ^[43]. The customer is the trust anchor for the model's identity, not the cloud provider. This is a different trust-rooting model -- not stronger or weaker in the abstract, but routed through different organizations. It is not accurate to say only Apple defends against model supply-chain compromise.
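
The gating step can be sketched as a small policy evaluator. This is an illustration under stated assumptions, not Key Vault's implementation: the `anyOf` / `authority` / `allOf` policy shape and the `x-ms-isolation-tee` claim names follow the pattern of Azure's published SKR examples, while `getClaim`, `releaseAllowed`, and the `attest.example.invalid` endpoint are hypothetical. The sketch also assumes the MAA token's signature has already been verified.

```javascript
// Resolve a dotted claim path ("a.b") against nested claim objects.
// Returns undefined if any hop along the path is missing.
function getClaim(claims, dottedPath) {
  return dottedPath.split('.').reduce(
    (node, key) => (node !== null && typeof node === 'object' ? node[key] : undefined),
    claims
  );
}

// Release is allowed if at least one rule matches the token issuer and
// every claim condition in that rule holds exactly.
function releaseAllowed(policy, tokenIssuer, claims) {
  return policy.anyOf.some((rule) =>
    rule.authority === tokenIssuer &&
    rule.allOf.every((cond) => getClaim(claims, cond.claim) === cond.equals)
  );
}

const releasePolicy = {
  version: '1.0.0',
  anyOf: [{
    authority: 'https://attest.example.invalid', // placeholder attestation endpoint
    allOf: [
      { claim: 'x-ms-isolation-tee.x-ms-attestation-type', equals: 'sevsnpvm' },
      { claim: 'x-ms-isolation-tee.x-ms-compliance-status', equals: 'azure-compliant-cvm' },
    ],
  }],
};

const compliantClaims = {
  'x-ms-isolation-tee': {
    'x-ms-attestation-type': 'sevsnpvm',
    'x-ms-compliance-status': 'azure-compliant-cvm',
  },
};
const incompleteClaims = { 'x-ms-isolation-tee': { 'x-ms-attestation-type': 'sevsnpvm' } };

const keyReleased = releaseAllowed(releasePolicy, 'https://attest.example.invalid', compliantClaims);
const keyWithheld = releaseAllowed(releasePolicy, 'https://attest.example.invalid', incompleteClaims);
console.log(keyReleased, keyWithheld);
```

Because the customer authors the policy, the customer decides which attested environments may unwrap the model, which is the sense in which trust shifts to the customer.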

5. Prompt-output exfiltration via the model itself

The TEE protects the input boundary -- it can prove the cloud operator never saw the prompt. It does not constrain what the model puts in the output. A model that is fine-tuned, prompt-injected, or simply emits memorised data can exfiltrate information through its own output channel, and no architectural primitive in either PCC or Azure CC-AI prevents that. Both architectures are equally exposed on this axis. This is also why prompt-output safety, content filtering, and model-side privacy controls are separate work that confidential computing does not subsume.

6. Compelled vendor and lawful access

This is a property of the trust-rooting model, not of any one architecture. If a vendor is compelled by law to push a software update that exfiltrates user data, the architecture cannot defend itself against that vendor. PCC's compelled-vendor exposure is concentrated on Apple. Azure's is distributed across AMD, Intel, NVIDIA, and Microsoft, but a compelled Microsoft is sufficient to compromise an MAA-rooted workload; the diffusion does not multiply protections.

And one more: MAA-as-service compromise

Azure's centralised verifier is a control point Apple does not have, because Apple's verifier is the user's device itself. If MAA is compromised -- if an attacker controls the MAA signing key, or if the MAA policy-evaluation code is modified maliciously -- every relying party that trusts MAA-issued JWTs trusts the attacker.

Threat	Apple PCC	Azure CC-AI
Malicious cloud operator (passive memory disclosure)	Defended (SEP-rooted attestation, OHTTP relay) ^[1]	Defended (SEV-SNP / TDX guest measurement, MAA verifier) ^[8]
Compromised hypervisor (active remap / Iago attacks)	Defended (Apple-controlled kernel + SEP-rooted measured boot) ^[1]	Defended (SEV-SNP RMP enforces page ownership; TDX Module isolates) ^[15]
Supply-chain compromise of the AI model	Defended at architecture level (model bound into Transparency-Log-published image) ^[1]	Defended via customer-controlled SKR of model artefacts; trust shifts to customer ^[43]
Side-channels on shared silicon	Not closed by construction ^[10] ^[20]	Not closed by construction ^[10] ^[24]
Compelled-vendor / lawful access	Not closed by construction (trust collapses to Apple)	Not closed by construction (trust spreads across four vendors; compelled MAA suffices)
Verifier / signer compromise	Apple SEP-CA + Transparency Log signer is a control point	MAA signer + AMD / Intel / NVIDIA signers are control points
Prompt-output exfiltration via model	Not closed by construction	Not closed by construction

Trust diffusion (Azure's contribution) and verifiable transparency (Apple's contribution) close different trust-anchor gaps, and no production substrate as of mid-2026 closes both. A hypothetical Generation-7 design that combined Azure-style multi-vendor TEE composition with Apple-style append-only transparency of production images would close both. No vendor has shipped it.

Two architectures, two distinct upper bounds, each leaving open the gap the other closes. So what is the field actually working on?

9. Where Active Work Is Happening

September 5, 2024, arXiv. Ceren Kocaoğullar (University of Cambridge), Tina Marjanov (Cambridge), Ivan Petrov (Google), Ben Laurie (Google), Al Cutter (Google), Christoph Kern (Google), Alice Hutchings (Cambridge), and Alastair R. Beresford (Cambridge) post "A Confidential Computing Transparency Framework for a Trust Chain" ^[62]. The paper does not name MAA specifically. It generalises the question Apple PCC raises in concrete form: can the verifiable-transparency primitive be replicated on commodity multi-vendor silicon without collapsing to a single trust root? The authors propose "a three-level conceptual framework providing organisations with a practical pathway to incrementally improve Confidential Computing transparency" ^[62]. The inclusion of Ben Laurie -- one of the original architects of Certificate Transparency (RFC 6962) -- is not incidental. The paper is the direct architectural descendant of CT brought into the confidential-computing domain.

The v2 revision of the Kocaoğullar et al. paper, dated December 5, 2024, added an empirical study with more than 800 participants showing that greater transparency improves end-user trust in confidential computing services ^[62]. That empirical signal is the closest thing the field has, as of mid-2026, to a measurement of the procurement consequences of verifiable transparency versus verifier-as-a-service. The framework itself is conceptual; the empirical contribution is the part procurement teams should read.

Six open problems are visible in the current production work.

9.1 Verifiable transparency of the verifier itself

No major-cloud verifier ships a public append-only log of its own code. MAA does not; Google Cloud Attestation does not; AWS Nitro's hypervisor signer does not. The Intel Trust Authority integration on GCP introduces a second verifier, which is a partial cross-check, but a second verifier is not the same architectural primitive as a transparency log ^[51]. Where the work is happening: the CCC Attestation Special Interest Group on GitHub coordinates Formal Specifications of Attestation Mechanisms, an RA-TLS proof of concept, an interoperable RA-TLS effort, an IETF RATS terms cheat sheet, and a formal-spec-KBS (key broker service) project ^[63]. The IETF RATS Working Group continues to extend RFC 9334 with Entity Attestation Token (EAT) and Concise Reference Integrity Manifest (CoRIM) drafts ^[17].

9.2 GPU confidential-computing parity across vendors

NVIDIA H100 CC-On is the only confidential-GPU mode at GA on a major commercial cloud as of mid-2026 ^[26] ^[7]. AMD MI300X ships as compute across multiple clouds but has no production-equivalent SEV-TIO confidential-GPU mode at GA on a major commercial cloud. PCIe TDISP and SEV-TIO Linux support is landing in 2025-2026 kernels, but the GA gap is the load-bearing fact for any procurement that wants AMD silicon end-to-end. AMD's MI400X-class roadmap is forward-looking. Until a second confidential GPU is at GA, single-vendor lock-in at the accelerator tier is the unavoidable procurement reality for any cloud confidential-AI workload.

9.3 Cross-vendor attestation portability

IETF RFC 9334 standardises the vocabulary ^[17]; CoRIM and EAT, in active drafting in the IETF RATS WG, aim at portable claim formats. The vocabulary work matters because a confidential workload that wants to run unchanged on Azure SEV-SNP and Azure TDX and GCP TDX needs a single attestation parser that understands all three evidence formats. The MAA approach maps onto RFC 9334's Passport pattern; the GCA approach maps onto OIDC tokens that play well with federated-identity tooling. As of mid-2026 no single relying-party library handles all three production verifiers transparently, and that is one of the things the CCC Attestation SIG is working on ^[63].

9.4 Confidential inferencing for Azure OpenAI models

Microsoft's Azure-Samples/confidential-ai-workshop repository ^[64] is the cleanest procurement-grade reference for what confidential inferencing actually looks like in production on Azure today. It contains three end-to-end tutorials at three different points on the cost-versus-isolation curve, and reading them in sequence is the fastest way for a procurement team to map the abstract architecture to concrete SKU lines.

Tutorial 1: ML-training on a CPU-only confidential VM (Standard_DCasv5). The confidential-ml-training directory walks training of an XGBoost-class classical-ML model on a Standard_DCasv5 SKU, which is an AMD SEV-SNP confidential VM without a confidential GPU ^[65]. The workload posture is plaintext-data-and-model on a TEE-protected substrate, with the SEV-SNP attestation gating access to encrypted training data in Azure Storage via the standard MAA + SKR path. The deliberate choice of XGBoost over a deep-learning model is the architectural lesson: when the model and training data fit in CPU memory and TCB-sealed CPU compute is sufficient, the confidential GPU SKU is overkill. This is the lowest-cost on-ramp into the architecture.

Tutorial 2: LLM inferencing on a confidential GPU (Standard_NCC40ads_H100_v5). The confidential-llm-inferencing directory walks serving microsoft/Phi-4-mini-reasoning on a Standard_NCC40ads_H100_v5 SKU ^[43]. Phi-4-mini-reasoning is a 3.8 B-parameter dense decoder-only Transformer with a 128 K-token context window, MIT-licensed on Hugging Face ^[66], chosen because it fits comfortably in the H100 NVL's 94 GB HBM3 capacity with room for activation memory. The novel architectural feature here is double attestation: the tutorial's setup script uses Azure/az-cgpu-onboarding ^[45] to verify both the SEV-SNP CVM attestation (against AMD VCEK) and the NVIDIA H100 GPU attestation (against NVIDIA's on-die root of trust via NRAS) before model weights are released from Azure Key Vault Premium via SKR. This is the architectural pattern any production GPU-confidential workload should match.
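
The double-attestation pattern reduces to a conjunction: both verdicts must pass, and both must be fresh, before anything is released. The sketch below is a hypothetical illustration of that gate only. The verdict objects, their `verified`, `nonce`, and `ccMode` fields, and `mayReleaseModelKey` are assumed names, not the output format of az-cgpu-onboarding, MAA, or NRAS.

```javascript
// Hypothetical gate for the double-attestation pattern: model weights are
// released only if BOTH the CPU-TEE verdict and the GPU verdict pass and
// both were produced for the same fresh nonce. Field names are assumptions.
function mayReleaseModelKey(cvmVerdict, gpuVerdict, expectedNonce) {
  const failures = [];
  if (!cvmVerdict.verified) failures.push('CVM attestation failed');
  if (!gpuVerdict.verified) failures.push('GPU attestation failed');
  if (cvmVerdict.nonce !== expectedNonce) failures.push('CVM evidence is stale');
  if (gpuVerdict.nonce !== expectedNonce) failures.push('GPU evidence is stale');
  if (gpuVerdict.ccMode !== 'on') failures.push('GPU is not in confidential mode');
  return { release: failures.length === 0, failures };
}

const freshNonce = 'n-123';
const bothPass = mayReleaseModelKey(
  { verified: true, nonce: freshNonce },
  { verified: true, nonce: freshNonce, ccMode: 'on' },
  freshNonce
);
const gpuFails = mayReleaseModelKey(
  { verified: true, nonce: freshNonce },
  { verified: false, nonce: 'old-nonce', ccMode: 'on' },
  freshNonce
);
console.log(bothPass.release, gpuFails.failures);
```

Collecting every failure, rather than returning on the first, gives an operator a complete picture of why a release was withheld.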

Tutorial 3: Inferencing via the Confidential Whisper service (OHTTP + HPKE). Whisper, the speech-to-text model, is the publicly-demoed Microsoft Build 2024 confidential inferencing reference workload. The confidential-whisper-inferencing tutorial directory confirms the Azure AI Foundry Confidential Whisper service uses Oblivious HTTP with HPKE end-to-end encryption to keep audio encrypted until it reaches the TEE-protected Whisper model ^[27]. The reference OHTTP gateway implementation is microsoft/attested-ohttp-client and its server-side counterpart, "an Attested OHTTP gateway and client implementation by Microsoft" that "uses the Cloudflare OHTTP client/server implementation as a basis" ^[67]. This is the closest architectural pattern Azure has to PCC's non-targetability requirement -- a third-party-operated OHTTP relay strips the client IP before the request reaches the confidential inferencing endpoint, the same architectural primitive Apple uses for PCC at network ingress.

The three tutorials are the canonical references because they walk the wire-level flow. A procurement team that wants to know "what does confidential inferencing actually look like on Azure" can read the README files, the Bicep templates, the attestation-policy JSON, and the SKR-policy JSON, and answer the question without speculation. GPT-class confidential endpoints staged through 2024-2026 remain forward-looking roadmap items. No "Confidential GPT-4" reached GA in May 2024, but the three workshop tutorials cover the architectural primitives that such a GA would compose.

9.5 The Apple PCC node-chip transition

Apple has not publicly named the chip family used in PCC nodes. Firmware identifiers and independent analyses make the transition story concrete enough to reason about. At launch in June 2024 the PCC nodes ran on M2-Ultra-class silicon, identified by the firmware string ComputeModule14,1 visible in independent device-identifier databases ^[36]. During 2026 the PCC fleet transitioned to a new node generation identified as J226C and reported (independently, not by Apple) as built around M5-class silicon manufactured in Houston, Texas ^[37] ^[38]. The 9to5Mac report dated February 17, 2026 describes Apple's M5-based Private Cloud Compute servers tied to iOS 26.4 ^[37], and the parallel Winbuzzer coverage from the next day confirms a new "Private Cloud Compute Agent Worker" component running on M5-class node hardware ^[38].

What is architecturally interesting is not the chip identity. It is what the transition did not change. The Transparency Log architecture absorbs a generational chip change as a matter of routine policy because the log's verifier policy is a list of approved image hashes and the SEP-rooted attestation envelope structure, not a list of approved chip families. New node generation, new image hashes (visible in PrivateCloudCompute/Release.swift and validated by PrivateCloudCompute/NodeValidator.swift ^[68] ^[69]), same envelope structure, same client-side verification. From a procurement-trust perspective, the transition was an architectural non-event in exactly the way Apple's public commitments said it should be.

Two invariants held across the M2-Ultra to M5 node transition. First, the device-side envelope check is stable: the NodeValidator validates SEP-signed attestation against the SEPAttestationPolicy it parses from the release artefact ^[68] ^[70], and the policy schema did not change. Second, the public transparency log absorbed the transition without any client-side trust ceremony because the chip family is not in the verifier policy -- only the image hash is. A device that started talking to the M2-Ultra fleet in 2024 and woke up in 2026 talking to the M5 fleet did exactly one new thing: it fetched the new approved image hashes from the log.

Two things did change. First, the on-node software stack (firmware, kernel, OS, inference runtime) is rebuilt for the new silicon; that is why the image hashes change. Second, the routing policy may shift -- some workloads may schedule onto the new node generation preferentially.

One caveat applies throughout: the chip family itself is not publicly named by Apple; the M5 identification is inferential from independent reporting plus firmware identifiers, not from a primary Apple source. Procurement narratives should use "Apple-designed silicon, not publicly named" when precision matters, and reach for the inferential M5 identification only when chip-family granularity is load-bearing.

The architectural payoff of a public transparency log is precisely that it absorbs a generational chip transition without any client-side trust ceremony, because the chip family is not in the verifier policy -- only the image hash is. This is what "verifiable transparency" buys procurement teams in practice: the trust contract survives silicon turnover because the contract was never about silicon. It was about which bits the silicon ran.

9.6 Third-party PCC equivalents

Could AWS or Google replicate Apple's Transparency-Log model on commodity multi-vendor silicon? The architectural feasibility is open. The Kocaoğullar et al. framework provides a conceptual pathway ^[62]. The CCC Attestation SIG's interoperable-ra-tls work is one of several substrates that a multi-vendor transparency log could ride on top of ^[63]. Whether any major cloud will actually ship it is the architectural bet the next generation hinges on. No such product is at GA as of mid-2026.

The field is wide open. But the reader's procurement deadline is not. How do you actually choose between PCC and Azure today?

10. A Procurement Decision Tree

Six questions, asked in order. The first determines whether PCC is even in play; the rest sharpen the choice.

Question 1: Do you control the device that originates the request, and is it Apple-Intelligence-capable?

PCC requires Apple-Intelligence-capable client devices. The supported set as of mid-2026 is iPhone 15 Pro and later, iPads on M1 silicon or later, and Macs on M1 silicon or later ^[1]. If your end users are on Windows laptops, Android phones, browsers, or any non-Apple endpoint, PCC is out of scope by construction. Azure / GCP / AWS confidential AI workloads do not have an analogous client-side requirement -- they are client-agnostic, and the client can be any HTTPS-speaking device.

Question 2: Can you accept Apple-as-signer as the trust root?

PCC's trust collapses to Apple's signing infrastructure. The SEP-bound CA, the Apple-operated Transparency Log signer, the Apple bug-bounty program, and the Apple Security Engineering and Architecture team are the entire trust root ^[1]. Azure spreads trust across AMD plus Intel plus NVIDIA plus Microsoft as separate signers ^[8] ^[40] ^[26]. If your security posture explicitly requires multi-vendor trust diffusion -- for example, because your regulator does not accept single-vendor SBOMs as evidence -- Azure wins this axis (see §6 for the architectural reasoning).

Question 3: Do you need customer-managed key material?

Azure: yes, via SKR from Azure Key Vault Premium or Azure Managed HSM, with a release policy bound to MAA-issued claims ^[15] ^[8]. Apple: no by design, because PCC nodes are stateless and there is no customer key material on the node to be released ^[1]. Regulated buyers whose framework requires customer-held keys -- for example, a FIPS 140-3 Level 3 customer-key-escrow requirement -- cannot map PCC into that framework, because PCC does not have the architectural primitive the framework is asking for.

Question 4: Do you need verifiable transparency of the actually-running code?

Apple: yes, via the published Transparency Log ^[6]. Azure: not via the architecture itself. You can build a customer-side log of the MAA tokens you have observed, or you can accept MAA's claims at face value. There is no Azure architectural primitive that proves the bits MAA verified are the same bits the workload is actually executing today, in the way that PCC's Transparency Log proves the image hash served to you is the same one served to every other PCC user.

This is the one axis where the architectures differ in kind. If your threat model requires that you be able to confirm what code the cloud is running, not just that the cloud says it is running specific code, PCC is the only production answer.

Question 5: Do you need GPU-class confidential compute?

Both ship it. Pay attention to two facts. First, Azure's confidential GPU is H100 only at GA in mid-2026 ^[26] ^[7]. AMD MI300X has no confidential-GPU mode at GA on a major commercial cloud; NVIDIA H200 and Blackwell-class GB200 GPUs are GA on Azure as non-confidential SKUs. If you need confidential GPU compute, the only major-cloud answer is NCCads_H100_v5 (or its successor). Second, Apple's GPU is integrated on the SoC and is inside the SEP-rooted attestation envelope by construction; there is no separate cross-vendor GPU attestation step, which simplifies the trust analysis at the cost of being available only on the Apple stack.

Question 6: What does your auditor accept as evidence?

The MAA JWT is consumable by every off-the-shelf JWT verifier. It is also broadly accepted in regulated audits because the JWT format and the x-ms-* claim names are documented in publicly-fetchable Microsoft Learn pages ^[8], and auditors can map MAA tokens onto NIST SP 800-53 attestation evidence requirements without exotic tooling.
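
Inspecting such a token's claims takes only a few lines, which is part of why auditors find it approachable. The sketch below decodes a JWT payload and lists its `x-ms-*` claim names. It deliberately does not verify the signature, which any real relying party must do first against the attestation provider's published signing keys. The sample token, its issuer URL, and the helper names are hypothetical; `x-ms-attestation-type` is used as a representative claim name.

```javascript
// Decode (NOT verify) the payload segment of a JWT so its claims can be read.
// Signature validation is intentionally omitted from this sketch.
function decodeJwtPayload(jwt) {
  const parts = jwt.split('.');
  if (parts.length !== 3) throw new Error('expected header.payload.signature');
  // Convert base64url to standard base64 and restore padding.
  const base64 = parts[1].replace(/-/g, '+').replace(/_/g, '/');
  const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4);
  const bytes = Uint8Array.from(atob(padded), (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Build a throwaway, unsigned sample token (ASCII-only content) for the demo.
const toBase64Url = (obj) =>
  btoa(JSON.stringify(obj)).replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
const sampleToken = [
  toBase64Url({ alg: 'RS256', typ: 'JWT' }),
  toBase64Url({ iss: 'https://attest.example.invalid', 'x-ms-attestation-type': 'sevsnpvm' }),
  'signature-placeholder',
].join('.');

const decodedClaims = decodeJwtPayload(sampleToken);
const msClaimNames = Object.keys(decodedClaims).filter((name) => name.startsWith('x-ms-'));
console.log(decodedClaims.iss, msClaimNames);
```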

PCC's Transparency Log proof is newer. An audit that accepts a Merkle inclusion proof against an Apple-published log root as evidence is uncommon as of mid-2026; most regulated audit programs were designed before such a primitive existed in cloud AI. If your auditor needs PCC evidence, expect to write explainer documentation that translates "your image hash is in an append-only public log at Merkle position N with signed root R" into the language your audit framework uses.

JavaScript: Verify a Merkle inclusion proof for a Transparency Log entry

// Sketch of a Certificate-Transparency-style Merkle inclusion proof check.
// The PCC Transparency Log inherits this structural primitive from RFC 6962.
// This is educational -- a production verifier would use a maintained library.

const hexToBytes = (hex) => Uint8Array.from(hex.match(/.{2}/g).map((h) => parseInt(h, 16)));
const bytesToHex = (bytes) => [...bytes].map((b) => b.toString(16).padStart(2, '0')).join('');

const sha256 = async (bytes) => new Uint8Array(await crypto.subtle.digest('SHA-256', bytes));

const concat = (a, b) => {
  const out = new Uint8Array(a.length + b.length);
  out.set(a); out.set(b, a.length);
  return out;
};

// RFC 6962 prefixes internal node hashes with 0x01 (leaf hashes use 0x00).
const hashChildren = (left, right) =>
  sha256(concat(new Uint8Array([0x01]), concat(left, right)));

async function verifyInclusion(leafHashHex, leafIndex, treeSize, auditPath, rootHex) {
  // auditPath is the array of sibling node hashes (hex strings), leaf to root.
  if (leafIndex < 0 || leafIndex >= treeSize) return false;
  let node = hexToBytes(leafHashHex);
  let fn = leafIndex;     // index of the current node on its level
  let sn = treeSize - 1;  // index of the last node on that level
  for (const siblingHex of auditPath) {
    if (sn === 0) return false; // audit path is longer than the tree is deep
    const sibling = hexToBytes(siblingHex);
    if (fn % 2 === 1 || fn === sn) {
      // Right child, or last node on its level: the sibling sits on the left.
      node = await hashChildren(sibling, node);
      // A last node that is a left child has no sibling on the levels in
      // between, so skip those levels (trees need not be a power of two).
      while (fn % 2 === 0 && fn !== 0) {
        fn = Math.floor(fn / 2);
        sn = Math.floor(sn / 2);
      }
    } else {
      node = await hashChildren(node, sibling);
    }
    fn = Math.floor(fn / 2);
    sn = Math.floor(sn / 2);
  }
  // Every level must be consumed and the recomputed root must match.
  return sn === 0 && bytesToHex(node) === rootHex.toLowerCase();
}

// In production: fetch (signed log root, audit path) from the log
// and the leaf hash from the attestation envelope's image-hash field.
// If verifyInclusion returns true AND the signed root matches what your
// device trusts, the image you are about to talk to is in the public log.
console.log('Educational sketch only; use a maintained CT library in production.');

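To make the inclusion check concrete for an auditor, the following minimal Python sketch builds a small RFC 6962-style Merkle tree, derives an audit path, and verifies it. It assumes the RFC 6962 hashing convention (0x00 prefix for leaves, 0x01 for interior nodes). The leaf contents (`release-0` and so on) and all function names are hypothetical, and Apple's actual log encoding may differ.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # RFC 6962 leaf hash: SHA-256(0x00 || data)
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962 interior hash: SHA-256(0x01 || left || right)
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    # Largest power of two strictly less than n (requires n >= 2)
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves: list[bytes]) -> bytes:
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = split_point(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def audit_path(index: int, leaves: list[bytes]) -> list[bytes]:
    # Sibling hashes needed to recompute the root, ordered leaf to root
    if len(leaves) == 1:
        return []
    k = split_point(len(leaves))
    if index < k:
        return audit_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return audit_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]


def verify_inclusion(leaf: bytes, index: int, tree_size: int,
                     path: list[bytes], root: bytes) -> bool:
    if not 0 <= index < tree_size:
        return False
    fn, sn = index, tree_size - 1  # current node and last node on this level
    node = leaf_hash(leaf)
    for sibling in path:
        if sn == 0:
            return False  # path is longer than the tree is tall
        if fn % 2 == 1 or fn == sn:
            node = node_hash(sibling, node)
            # Skip levels where the last node is promoted without a sibling
            while fn % 2 == 0 and fn != 0:
                fn >>= 1
                sn >>= 1
        else:
            node = node_hash(node, sibling)
        fn >>= 1
        sn >>= 1
    return sn == 0 and node == root


# Hypothetical log with five entries; check that entry 2 is included.
leaves = [f"release-{i}".encode() for i in range(5)]
root = merkle_root(leaves)
proof = audit_path(2, leaves)
print(verify_inclusion(leaves[2], 2, len(leaves), proof, root))
```

A five-leaf tree is deliberately not a power of two, so the example exercises the case where a node is promoted without a sibling. The sketch does not check the signature on the log root, which a real verifier must also do.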

The decision tree in one diagram

Diagram source

flowchart TD
Q1{"Apple-Intelligence-capable
client device required?"}
Q2{"Single-vendor (Apple)
trust root acceptable?"}
Q3{"Customer-managed key
material required?"}
Q4{"Need public-log
verifiable transparency?"}
Q5{"Need GPU TEE
at fleet scale?"}
Q6{"Auditor accepts
Merkle inclusion proof?"}
Q1 -->|No| AZ[Azure / GCP / AWS]
Q1 -->|Yes| Q2
Q2 -->|No| AZ
Q2 -->|Yes| Q3
Q3 -->|Yes| AZ
Q3 -->|No| Q4
Q4 -->|Yes| Q5
Q4 -->|No| AZ
Q5 -->|Yes, Apple integrated GPU OK| PCC[Apple PCC]
Q5 -->|Yes, need NVIDIA H100| AZ
PCC --> Q6
Q6 -->|Yes| PCC2[PCC fits the audit posture]
Q6 -->|No| PCC3[Write explainer documentation,
or fall back to Azure JWT-based evidence]

The procurement decision tree as a flowchart. Question 1 is the only one that can drop PCC entirely; the rest sharpen the trade-off between PCC's single-vendor verifiable transparency and Azure's multi-vendor diffusion plus customer-managed keys.
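The same tree can be written as a short function, which some teams find easier to review than a diagram. This is a sketch only: the `Requirements` field names and the returned strings are illustrative. Because the flowchart draws no "No" edge for the GPU question, the sketch assumes a workload that does not need NVIDIA H100 is served by Apple's integrated GPU.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    apple_client_device: bool        # Q1: Apple-Intelligence-capable client required?
    apple_trust_root_ok: bool        # Q2: single-vendor (Apple) trust root acceptable?
    customer_managed_keys: bool      # Q3: customer-managed key material required?
    public_log_transparency: bool    # Q4: need public-log verifiable transparency?
    needs_nvidia_h100: bool          # Q5: GPU TEE must be NVIDIA H100 (assumed encoding)
    auditor_accepts_merkle: bool     # Q6: auditor accepts Merkle inclusion proofs?


AZURE = "Azure / GCP / AWS"


def recommend(req: Requirements) -> str:
    """Walk the flowchart top to bottom and return the leaf it lands on."""
    if not req.apple_client_device:
        return AZURE
    if not req.apple_trust_root_ok:
        return AZURE
    if req.customer_managed_keys:
        return AZURE
    if not req.public_log_transparency:
        return AZURE
    if req.needs_nvidia_h100:
        return AZURE
    if req.auditor_accepts_merkle:
        return "Apple PCC: fits the audit posture"
    return "Apple PCC: write explainer documentation, or fall back to Azure JWT-based evidence"


# Example: an Apple-client workload with no customer-managed-key requirement.
example = Requirements(
    apple_client_device=True,
    apple_trust_root_ok=True,
    customer_managed_keys=False,
    public_log_transparency=True,
    needs_nvidia_h100=False,
    auditor_accepts_merkle=False,
)
print(recommend(example))
```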

Azure is the first cloud provider to offer confidential computing with NVIDIA H100 GPUs. -- NVIDIA Blog, September 24, 2024 ^[4]

What the verifier actually does, on the wire

Once procurement has chosen the architecture, an engineer somewhere has to write the verifier. The two architectures end up being symmetric in this regard: each produces a cryptographic envelope, and a relying party has to parse, validate signatures, and check inclusion or claims. Three procurement-grade reference primitives anchor the choice -- two from Azure (already shown above), one from Apple PCC.

On Azure, the relying party walks an MAA JWT verification flow: decode the JWT, validate its signature against the MAA JWKS, and match its claims against an SKR release policy (the JavaScript reference appears in §6 Axis 3 alongside the MAA JWT decode) ^[8]. For customers who prefer not to trust MAA, the alternative path uses snpguest to fetch the AMD VCEK chain and verify the SEV-SNP attestation directly (the bash reference is also in §6 Axis 3) ^[41]. The two paths produce structurally equivalent confidence in the same evidence.
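The claim-matching step of the Azure flow can be sketched in Python as follows. The sketch assumes a release policy in the `anyOf` / `allOf` / `claim` / `equals` shape and dotted claim paths such as `x-ms-isolation-tee.x-ms-attestation-type`. The authority URL, the sample claims, and every function name are hypothetical, so confirm the exact claim names and policy grammar against current Microsoft documentation. The sketch deliberately skips signature validation, which must succeed against the MAA JWKS before any claim is trusted.

```python
import base64
import json

# Hypothetical MAA instance URL, used only for this illustration.
AUTHORITY = "https://example.attest.azure.net"

# Illustrative release policy: release the key only to a compliant SEV-SNP CVM.
POLICY = {
    "version": "1.0.0",
    "anyOf": [{
        "authority": AUTHORITY,
        "allOf": [
            {"claim": "x-ms-isolation-tee.x-ms-attestation-type", "equals": "sevsnpvm"},
            {"claim": "x-ms-isolation-tee.x-ms-compliance-status", "equals": "azure-compliant-cvm"},
        ],
    }],
}

# Illustrative token claims; a real MAA token carries many more.
SAMPLE_CLAIMS = {
    "iss": AUTHORITY,
    "x-ms-isolation-tee": {
        "x-ms-attestation-type": "sevsnpvm",
        "x-ms-compliance-status": "azure-compliant-cvm",
    },
}


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def make_unsigned_token(claims: dict) -> str:
    # Builds a JWT-shaped string with a placeholder signature, for the demo only.
    header = b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
    return f"{header}.{b64url(json.dumps(claims).encode())}.placeholder-signature"


def decode_jwt_payload(token: str) -> dict:
    # Decodes the payload WITHOUT verifying the signature.
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def lookup(claims: dict, dotted_path: str):
    # Follows a dotted path through nested claim objects; None if absent.
    node = claims
    for part in dotted_path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def policy_allows(policy: dict, claims: dict) -> bool:
    # A branch matches if the issuer is the named authority and every condition holds.
    for branch in policy.get("anyOf", []):
        if branch.get("authority") != claims.get("iss"):
            continue
        conditions = branch.get("allOf", [])
        if conditions and all(lookup(claims, c["claim"]) == c["equals"] for c in conditions):
            return True
    return False


token = make_unsigned_token(SAMPLE_CLAIMS)
print(policy_allows(POLICY, decode_jwt_payload(token)))
```

Treating an empty `allOf` list as a non-match is a conservative choice made for this sketch, not documented service behaviour.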

On Apple PCC, the relying-party verifier is PrivateCloudCompute/NodeValidator.swift and friends ^[68]. The flow is: parse the AttestationBundle from the response (the bundle structure is defined in SEPAttestation.swift ^[71]); call the SEP attestation context verifier (aks_attest_context_verify) to check the SEP signature against the per-die Apple-rooted certificate chain; parse the Release struct defined in Release.swift as ASN.1 DER and compute its SHA-256 digest ^[69]; check that the SEP attestation policy claims (SEPAttestationPolicy.swift ^[70]) constrain the release digest; then call SWTransparencyVerifier.verifyExpiringInclusion to verify the release digest's inclusion proof in the public transparency log ^[72] ^[73]. The full reference is the apple/private-cloud-compute repository's VerifiableReleasesExtension directory and the VerifiableReleasesExtension tutorial ^[74].

Python Sketch of a PCC attestation envelope verifier in Python

# This is a procurement-grade SKETCH, not production code. It walks the four
# verification steps a real PCC client performs (see PrivateCloudCompute/
# NodeValidator.swift for the canonical reference ^[68]).
# Each function is a stub showing the contract the caller must satisfy.

from dataclasses import dataclass
from hashlib import sha256


@dataclass
class AttestationBundle:
    """The Apple PCC AttestationBundle, parsed from the response envelope.
    Structure defined in SEPAttestation.swift ^[71]."""
    sep_signature: bytes
    sep_cert_chain: list
    release_der: bytes
    sep_attestation_policy_claims: dict
    transparency_inclusion_proof: dict


def aks_attest_context_verify(
    sep_signature: bytes,
    sep_cert_chain: list,
    apple_root_anchor: bytes,
) -> bool:
    """Step 1: verify the SEP signature against the per-die Apple-rooted
    certificate chain. In the real client this calls the Security framework's
    aks_attest_context_verify; the SEP cert chain is rooted at Apple's PCC CA.
    Returns True if the signature chains to the pinned anchor."""
    raise NotImplementedError("calls Security.framework in a real client")


def compute_release_digest(release_der: bytes) -> bytes:
    """Step 2: the Release struct is serialised as ASN.1 DER; the canonical
    release digest is SHA-256 over the DER bytes. See Release.swift for the
    schema ^[69]."""
    return sha256(release_der).digest()


def check_sep_attestation_policy(
    claims: dict,
    expected_release_digest: bytes,
) -> bool:
    """Step 3: the SEP attestation policy claims must constrain the release
    digest. See SEPAttestationPolicy.swift for the policy schema ^[70].
    A real client checks the policy version, the claimed release digest,
    and the attestation freshness window."""
    claimed_digest = claims.get("release_digest")
    return claimed_digest == expected_release_digest


def verify_expiring_inclusion(
    release_digest: bytes,
    inclusion_proof: dict,
    log_witness_root: bytes,
) -> bool:
    """Step 4: verify the release digest's inclusion in the public PCC
    transparency log against a witness-cosigned tree head. Reference impl:
    SWTransparencyVerifier.verifyExpiringInclusion ^[72] ^[73]."""
    raise NotImplementedError("merkle proof + cosigned witness check")


def verify_pcc_envelope(
    bundle: AttestationBundle,
    apple_root_anchor: bytes,
    log_witness_root: bytes,
) -> bool:
    """The four-step PCC verifier flow. Returns True only if every step
    passes. A real client refuses to send the user's prompt if this returns
    False."""
    if not aks_attest_context_verify(
        bundle.sep_signature, bundle.sep_cert_chain, apple_root_anchor
    ):
        return False
    release_digest = compute_release_digest(bundle.release_der)
    if not check_sep_attestation_policy(
        bundle.sep_attestation_policy_claims, release_digest
    ):
        return False
    if not verify_expiring_inclusion(
        release_digest, bundle.transparency_inclusion_proof, log_witness_root
    ):
        return False
    return True

The symmetry is the procurement point. Azure: validate JWT signature against MAA JWKS, match claims against SKR policy. Apple PCC: validate SEP signature against Apple PCC CA, validate inclusion proof against transparency log witness root. Both are cryptographic; both produce a yes/no decision against a hardware-anchored chain of trust. The architectural difference is what the relying party is allowed to know: with PCC, the relying party knows the exact image hash that ran (because the log says so); with Azure, the relying party knows the workload met an MAA policy (because the JWT says so). The two are not interchangeable evidence, but the verifier code-paths are roughly the same shape.

The decision tree handles the typical questions. The atypical questions, and the misconceptions, are next.

11. Frequently Asked Questions

Can the cloud provider really not see my prompt?

Yes, in both architectures, but only against the threats each architecture names. Apple PCC's SEP-rooted attestation envelope, combined with the client device's refusal to send requests to images absent from the Transparency Log, defends against a malicious Apple operator passively reading prompts ^[1]. Azure CC-AI's SEV-SNP RMP-enforced memory plus MAA-gated SKR defends against a malicious Microsoft operator on the SEV-SNP path ^[8]. Neither closes side-channels on shared silicon ^[10]; neither closes compelled-vendor or lawful-access exposure; neither closes prompt-output exfiltration via the model itself. The claim that "the cloud cannot see your prompt" is true against the named threat model and not against every conceivable threat.

Are TEE side-channels still a thing?

Yes. The SGX-era attacks of the 2018-2020 cascade -- Foreshadow / L1TF ^[18], SgxPectre ^[19], Plundervolt (CVE-2019-11157) ^[20] -- were each mitigated in turn, but the principled extension is that any TEE built on shared microarchitectural state inherits a similar surface. The CCC's "A Technical Analysis of Confidential Computing" v1.3 names this explicitly as a residual risk that the architecture does not close by construction ^[10]. CipherLeaks (USENIX Security 2021) demonstrated the same point on the AMD SEV side via a deterministic-ciphertext side channel ^[24]. Vendor microcode updates are an ongoing operational requirement, not a one-time fix.

Is Apple's source release the same as open source?

No. Per the apple/security-pcc README verbatim: "The publication of this code is intended for security research and verification purposes only" ^[28]. The publication's purpose is research-grade transparency -- so that an independent researcher can inspect what is running, exercise the architecture inside the Virtual Research Environment, and submit findings to the Apple Security Bounty program with rewards up to $1,000,000 ^[2]. It is not a typical open-source contribution model and the license and intended use are explicitly different. The substantive thing PCC ships is verifiable transparency of the running fleet, not community-driven development.

Does Azure require a Windows guest OS for confidential AI?

No. Both Linux and Windows guest OSes are supported on Azure confidential VMs, and the reference confidential-inferencing stack Microsoft publishes is Linux-based. The microsoft/confidential-ai-workshop repository contains three Linux-based tutorial directories: confidential-llm-inferencing, confidential-whisper-inferencing, and confidential-ml-training, with reusable modules for attestation, key management, key origin, model sourcing, and OS disk encryption ^[64]. The LLM inferencing tutorial deploys a Standard_NCC40ads_H100_v5 confidential VM with a vLLM-plus-Streamlit-plus-Caddy stack ^[43]. Windows is supported; Linux is the canonical reference.

How does this relate to Confidential Containers (CoCo)?

Confidential Containers is an orchestration-layer abstraction that maps Kubernetes pods onto Generation-3 confidential VMs running on AMD SEV-SNP, Intel TDX, or IBM Secure Execution ^[52]. It composes on top of the same substrate Azure CC-AI uses. It does not compete with Apple PCC architecturally -- they live at different layers of the stack. A CoCo deployment on Azure can use MAA and SKR for its attestation and key-release primitives, and orchestration vendors like Edgeless Systems' Contrast wrap that pattern into a workload-level confidential-computing primitive on Kubernetes ^[54].

Does either architecture defend against a vendor compelled by law?

No. Both rest on vendor-controlled signing infrastructure. PCC's compelled-vendor exposure is concentrated on Apple, because the signer of every PCC attestation chain is Apple. Azure's is distributed across AMD, Intel, NVIDIA, and Microsoft, but a compelled Microsoft is sufficient to compromise an MAA-rooted workload because MAA is the single verifier whose JWT every downstream relying party trusts ^[8]. Trust diffusion across multiple vendors makes the collapse harder, but it does not make any one vendor's compelled-update path architecturally impossible. This is a property of the trust-rooting model, not a flaw of either architecture, and neither closes it by construction.

Is Mark Russinovich's Confidential AI on Azure talk an OSDI 2024 paper?

No. The canonical late-2024 Mark Russinovich confidential-AI session is Microsoft Ignite 2024 BRK430, "Inside Azure Innovations with Mark Russinovich," also published on YouTube as "Confidential AI and Inference -- Inside Azure Innovations." Russinovich's "data in use" framing for confidential computing originally appeared in his September 14, 2017 Azure blog "Introducing Azure confidential computing," not in an academic OSDI venue ^[9]. Microsoft Build 2024's confidential-inferencing session was BRK227, "Inside AI Security with Mark Russinovich," which announced confidential inferencing for the Azure OpenAI Whisper speech-to-text model -- not for GPT-4, and not under the title "Confidential GPT" ^[27].

What to carry into the next conversation

Two architectures. One promise. One axis on which they differ in kind. The end-user pitch -- "the cloud cannot see your prompt" -- is now functionally identical across Apple Private Cloud Compute and Azure Confidential AI, but the architectural machinery underneath ships two genuinely different things. PCC ships verifiable transparency of the production fleet through an Apple-controlled stack and a public Transparency Log. Azure CC-AI ships multi-vendor trust diffusion plus customer-managed keys through AMD SEV-SNP plus NVIDIA H100 CC-On plus MAA plus SKR. Each closes a trust-anchor gap the other leaves open. Neither closes the gap the other closes. Neither closes the side-channel, compelled-vendor, or model-output exfiltration gaps -- the CCC's own v1.3 analysis names these as residual risks for any TEE-based design ^[10].

The next architectural generation -- the one that combines Azure-style multi-vendor TEE composition with Apple-style append-only transparency of production images -- would close the gap both leave open. The Kocaoğullar et al. transparency framework is the conceptual sketch ^[62]; the CCC Attestation SIG and the IETF RATS Working Group are where the production work is happening ^[63] ^[17]. No vendor has shipped it.

For now, the load-bearing decision is the one Question 4 in §10 asks. If your threat model requires that you be able to confirm what code the cloud is actually running -- and not just that the cloud says it is running specific code -- PCC is the only production answer in mid-2026. If your threat model is satisfied by multi-vendor trust diffusion and a managed-verifier JWT, Azure CC-AI gives you a richer key-management story and broader silicon optionality. The architectures are not better and worse. They are answers to different questions. The first useful step in any confidential-AI procurement is naming which question you are actually trying to answer.

Study guide

Key terms

Trusted Execution Environment (TEE): Hardware-isolated execution context that protects confidentiality and integrity of code and data even from the host OS, hypervisor, or peripheral firmware.
Secure Enclave Processor (SEP): Apple-designed separate processor core on the same SoC as the main application processor, with its own boot ROM, AES engine, and protected memory. Per-node hardware root of trust on every Apple PCC server.
Reverse Map Table (RMP): Hardware-maintained table in AMD SEV-SNP recording owner and validation state for every 4 KB physical page. Defends against SEVered-style hypervisor remap attacks by construction.
Microsoft Azure Attestation (MAA): Managed Microsoft verifier service that consumes hardware attestation evidence (SEV-SNP, TDX, SGX, vTPM) and issues a signed JWT whose claims downstream relying parties consume.
Secure Key Release (SKR): Azure Key Vault Premium / Managed HSM capability that gates release of a wrapped key on a successful MAA JWT verification against a customer-defined release policy.
Transparency Log (Apple PCC): Append-only public log of every production PCC node software image hash. The user's device refuses to forward a request to a node whose image hash is not in the log.
Security Protocol and Data Model (SPDM): DMTF DSP0274 standard for mutually-authenticated PCIe-endpoint sessions, used by the NVIDIA H100 CC-On architecture to bind the host CPU TEE to the GPU.
Oblivious HTTP (OHTTP, RFC 9458): IETF protocol for forwarding HTTP requests through a third-party relay that strips the client IP, preventing the origin or any single intermediary from linking requests to a client.

References

Apple Security Engineering and Architecture (2024). Private Cloud Compute: A new frontier for AI privacy in the cloud. https://security.apple.com/blog/private-cloud-compute/ - Apple's June 2024 PCC announcement; defines the five PCC requirements and the user-facing 'not even to Apple' promise. ↩
(2024). Security research on Private Cloud Compute. https://security.apple.com/blog/pcc-security-research/ - Apple's October 2024 source-release announcement; introduces the Virtual Research Environment and Apple Security Bounty extension to PCC. ↩
(2024). General Availability: Azure Confidential VMs with NVIDIA H100 Tensor Core GPUs. https://techcommunity.microsoft.com/blog/azureconfidentialcomputingblog/general-availability-azure-confidential-vms-with-nvidia-h100-tensor-core-gpus/4242644 - Microsoft's September 24, 2024 GA announcement of confidential VMs with NVIDIA H100 (NCCads_H100_v5). ↩
(2024). Azure is the first cloud provider to offer confidential computing with NVIDIA H100 GPUs. https://blogs.nvidia.com/blog/azure-confidential-vm-h100-general-availability/ - NVIDIA's blog companion to the Sept 24 2024 Azure GA event; 'first cloud provider to offer confidential computing with H100' framing. ↩
(2024). Microsoft Trustworthy AI: Unlocking human potential starts with trust. https://blogs.microsoft.com/blog/2024/09/24/microsoft-trustworthy-ai-unlocking-human-potential-starts-with-trust/ - Microsoft Trustworthy AI post (September 2024), the companion narrative to the H100-GA announcement. ↩
(2024). Release Transparency (Private Cloud Compute). https://security.apple.com/documentation/private-cloud-compute/releasetransparency - Apple's canonical 'Transparency Log / Release Transparency' documentation page. ↩
(2026). NCCads_H100_v5 size series (Azure). https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/gpu-accelerated/nccadsh100v5-series - Microsoft Learn product page for the NCCads_H100_v5 SKU; TEE spans CVM and attached H100 GPU. ↩
(2025). Microsoft Azure Attestation overview. https://learn.microsoft.com/en-us/azure/attestation/overview - Microsoft Azure Attestation product overview; documents the JWT-based attestation flow across SGX/TDX/SEV-SNP. ↩
Mark Russinovich (2017). Introducing Azure confidential computing. https://azure.microsoft.com/en-us/blog/introducing-azure-confidential-computing/ - Russinovich's September 2017 announcement of Azure confidential computing; introduces 'data in use' as the third protection state. ↩
Confidential Computing Consortium Technical Advisory Council (2022). A Technical Analysis of Confidential Computing v1.3. https://confidentialcomputing.io/wp-content/uploads/sites/10/2023/03/CCC-A-Technical-Analysis-of-Confidential-Computing-v1.3_Updated_November_2022.pdf - Confidential Computing Consortium 'A Technical Analysis of Confidential Computing' v1.3 (Nov 2022); vendor-neutral CC definition. ↩
(2025). About the Confidential Computing Consortium. https://confidentialcomputing.io/about/ - Confidential Computing Consortium institutional 'About' page; Linux Foundation project community. ↩
(2024). Secure Enclave (Apple Platform Security). https://support.apple.com/guide/security/secure-enclave-sec59b0b31ff/web - Apple Platform Security guide entry on the Secure Enclave Processor (SEP) architecture. ↩
Frank McKeen, Ilya Alexandrovich, Alex Berenzon, Carlos Rozas, Hisham Shafi, Vedvyas Shanbhogue, & Uday Savagaonkar (2013). Innovative Instructions and Software Model for Isolated Execution. https://dl.acm.org/doi/10.1145/2487726.2488368 - HASP 2013; ACM landing page is bot-gated to non-browser fetchers but reachable to readers. ↩
Victor Costan & Srinivas Devadas (2016). Intel SGX Explained. https://eprint.iacr.org/2016/086 - IACR ePrint 2016/086; landing page bot-gated to non-browser fetchers, reachable to readers. ↩
(2025). Azure confidential computing overview. https://learn.microsoft.com/en-us/azure/confidential-computing/overview - Azure confidential computing canonical overview page. ↩
(2020). SEV-SNP: Strengthening VM Isolation with Integrity Protection and More. https://www.amd.com/content/dam/amd/en/documents/epyc-business-docs/white-papers/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf - AMD SEV-SNP whitepaper (January 2020); canonical primary for Reverse Map Table, SEV-SNP firmware, and TCB version vector. ↩
Henk Birkholz, Dave Thaler, Michael Richardson, Ned Smith, & Wei Pan (2023). RFC 9334: Remote ATtestation procedureS (RATS) Architecture. https://datatracker.ietf.org/doc/rfc9334/ - RFC 9334 Remote ATtestation procedureS (RATS) Architecture; canonical RATS vocabulary (Attester/Verifier/Relying Party). ↩
Jo Van Bulck, Marina Minkin, Ofir Weisse, Daniel Genkin, Baris Kasikci, Frank Piessens, Mark Silberstein, Thomas F. Wenisch, Yuval Yarom, & Raoul Strackx (2018). Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-of-Order Execution. https://www.usenix.org/conference/usenixsecurity18/presentation/van-bulck - Van Bulck et al., 'Foreshadow', USENIX Security 2018; L1TF transient-execution attack on SGX (EPID key leakage). ↩
Guoxing Chen, Sanchuan Chen, Yuan Xiao, Yinqian Zhang, Zhiqiang Lin, & Ten H. Lai (2019). SgxPectre Attacks: Stealing Intel Secrets from SGX Enclaves via Speculative Execution. https://ieeexplore.ieee.org/document/8806720 - Chen et al., 'SgxPectre', IEEE EuroS&P 2019; Spectre-v1 inside SGX enclaves. ↩
Kit Murdock, David Oswald, Flavio D. Garcia, Jo Van Bulck, Daniel Gruss, & Frank Piessens (2020). Plundervolt (CVE-2019-11157). https://plundervolt.com/ - Murdock et al., 'Plundervolt'; software fault injection via privileged voltage control (CVE-2019-11157). ↩
Mathias Morbitzer, Manuel Huber, Julian Horsch, & Sascha Wessel (2018). SEVered: Subverting AMD's Virtual Machine Encryption. https://www.usenix.org/conference/woot18/presentation/morbitzer - Morbitzer et al., 'SEVered', WOOT 2018; hypervisor-mediated page remapping defeats SEV memory encryption without keys. ↩
(2026). Announcing general availability of Azure Intel TDX confidential VMs. https://techcommunity.microsoft.com/blog/azureconfidentialcomputingblog/announcing-general-availability-of-azure-intel%C2%AE-tdx-confidential-vms/4495693 - Microsoft's February 2026 announcement of GA Azure Intel TDX confidential VMs (v6 family). ↩
(2026). DCesv6-series (Azure Confidential VMs). https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/general-purpose/dcesv6-series - Microsoft Learn product page for the DCesv6 series (Intel TDX, Emerald Rapids). ↩
Mengyuan Li, Yinqian Zhang, Huibo Wang, Kang Li, & Yueqiang Cheng (2021). CipherLeaks: Breaking Constant-time Cryptography on AMD SEV via the Ciphertext Side Channel. https://www.usenix.org/conference/usenixsecurity21/presentation/li-mengyuan - Li et al., 'CipherLeaks', USENIX Security 2021; deterministic-ciphertext side channel on AMD SEV-ES. ↩
(2023). NVIDIA Hopper H100 Confidential Compute Whitepaper (WP-11459-001). https://images.nvidia.com/aem-dam/en-zz/Solutions/data-center/HCC-Whitepaper-v1.0.pdf - NVIDIA Hopper H100 Confidential Compute Whitepaper (WP-11459-001), July 2023; architectural primary. ↩
(2023). Confidential Computing on NVIDIA H100 GPUs for Secure and Trustworthy AI. https://developer.nvidia.com/blog/confidential-computing-on-h100-gpus-for-secure-and-trustworthy-ai/ - NVIDIA Developer Blog deep dive on H100 confidential computing (CC-Off / CC-On / CC-DevTools modes; SPDM; bounce buffer). ↩
(2025). confidential-whisper-inferencing tutorial (confidential-ai-workshop). https://github.com/microsoft/confidential-ai-workshop/tree/main/tutorials/confidential-whisper-inferencing - confidential-whisper-inferencing tutorial; documents the Build 2024 Confidential Whisper preview (not Confidential GPT-4). ↩
(2024). apple/security-pcc -- Private Cloud Compute source release. https://github.com/apple/security-pcc - Apple's PCC source release (apple/security-pcc) for security-research verification. ↩
(2024). Private Cloud Compute (Apple Security Documentation). https://security.apple.com/documentation/private-cloud-compute/ - Apple's PCC Security Guide DocC site; canonical reference for PCC architecture terminology. ↩
Martin Thomson & Christopher A. Wood (2024). RFC 9458: Oblivious HTTP. https://www.rfc-editor.org/info/rfc9458 - RFC 9458 Oblivious HTTP; the standardised primitive PCC's third-party OHTTP relay implements. ↩
(2023). RFC 9474: RSA Blind Signatures. https://www.rfc-editor.org/rfc/rfc9474.html - RFC 9474 RSA Blind Signatures; the primitive Apple PCC's Target Diffusion single-use credentials build on. ↩
(2025). NVIDIA/nvtrust -- Ancillary Software for NVIDIA Trusted Computing Solutions. https://github.com/NVIDIA/nvtrust - NVIDIA/nvtrust open-source attestation tooling; Apache 2.0 reference implementation for H100 CC verification. ↩
(2020). DMTF DSP0274 Security Protocol and Data Model (SPDM) Specification v1.1.0. https://www.dmtf.org/sites/default/files/standards/documents/DSP0274_1.1.0.pdf - DMTF DSP0274 Security Protocol and Data Model v1.1.0; the SPDM session protocol used by H100 CC mode for CPU↔GPU attestation. ↩
(2025). NVIDIA Attestation SDK (C++) Introduction. https://docs.nvidia.com/attestation/nv-attestation-sdk-cpp/latest/sdk-c/introduction.html - NVIDIA nv-attestation-sdk-cpp introduction; C++ attestation SDK successor to nvtrust Python SDK. ↩
(2025). NVIDIA Confidential Computing documentation. https://docs.nvidia.com/confidential-computing/index.html - NVIDIA Confidential Computing documentation hub; links to CC Deployment Guide, attestation, nvtrust, secure-AI matrix. ↩
(2025). AppleDB device identifier ComputeModule14,1. https://appledb.dev/device/identifier/ComputeModule14,1 - AppleDB device-identifier page for ComputeModule14,1; the M2-Ultra-class firmware identifier of original PCC hardware. ↩
(2026). Apple plans M5-based Private Cloud Compute architecture for Apple Intelligence. https://9to5mac.com/2026/02/17/apple-plans-m5-based-private-cloud-compute-architecture-for-apple-intelligence/ - 9to5Mac February 2026 report on Apple's M5-based PCC server transition (J226C, iOS 26.4, Houston manufacturing). ↩
(2026). Apple Upgrades Private Cloud Compute With M5 Chips. https://winbuzzer.com/2026/02/18/apple-upgrades-private-cloud-compute-m5-chips-xcxwbn/ - Winbuzzer corroborating M5-PCC report (February 2026); confirms J226C and the M3/M4 skip. ↩
(2024). Introducing Apple Foundation Models. https://machinelearning.apple.com/research/introducing-apple-foundation-models - Apple's introduction of Apple Foundation Models, confirming the server-side model runs on Apple-silicon PCC. ↩
(2024). AMD Key Distribution Service: VCEK Milan Certificate Chain. https://kdsintf.amd.com/vcek/v1/Milan/cert_chain - AMD Key Distribution Service endpoint; canonical AMD-hosted source for the VCEK certificate chain. ↩
(2025). virtee/snpguest -- AMD SEV-SNP guest CLI. https://github.com/virtee/snpguest - VirTEE snpguest CLI; canonical successor to AMDESE/sev-tool for SEV-SNP guest-side attestation operations. ↩
(2024). AMDESE/sev-tool (deprecated; superseded by virtee/snpguest and snphost). https://github.com/AMDESE/sev-tool - AMDESE/sev-tool repository (now deprecated); legacy SEV/SEV-ES tool superseded by VirTEE snpguest/snphost. ↩
(2025). confidential-llm-inferencing tutorial (confidential-ai-workshop). https://github.com/microsoft/confidential-ai-workshop/tree/main/tutorials/confidential-llm-inferencing - confidential-llm-inferencing tutorial; deploys Standard_NCC40ads_H100_v5 with vLLM + Streamlit + Caddy and SKR. ↩
(2026). AMD Instinct MI300X cloud pricing and provider list (ComputePrices). https://computeprices.com/gpus/mi300x - ComputePrices.com cloud-provider list for AMD Instinct MI300X (Hot Aisle, Seeweb, et al.); no production confidential-GPU mode at GA. ↩
(2025). Azure/az-cgpu-onboarding -- NVIDIA H100 CC-On onboarding for Azure NCC SKUs. https://github.com/Azure/az-cgpu-onboarding - Azure/az-cgpu-onboarding reference repository; end-to-end NCCads_H100_v5 + H100 CC-On attestation flow (Bash + PowerShell). ↩
Jianwei Zhu, Hang Yin, Pengfei Deng, Andrew Almeida, & Shunfan Zhou (2024). Confidential Computing on Nvidia H100 GPU: A Performance Benchmark Study. https://arxiv.org/abs/2409.03992 - Zhu et al., 'Confidential Computing on NVIDIA H100 GPU: A Performance Benchmark Study', arXiv 2409.03992; <7% LLM overhead, PCIe bounce buffer is the dominant cost. ↩
(2025). AWS Nitro System. https://aws.amazon.com/ec2/nitro/ - AWS Nitro System product page; Nitro Cards + Nitro Security Chip + lightweight Nitro Hypervisor. ↩
(2025). What is AWS Nitro Enclaves?. https://docs.aws.amazon.com/enclaves/latest/user/nitro-enclave.html - AWS Nitro Enclaves user guide; vsock-only parent channel, no persistent storage, KMS attestation integration. ↩
(2025). Google Cloud Attestation. https://docs.cloud.google.com/confidential-computing/docs/attestation - Google Cloud Attestation; unified verifier across SEV / SEV-SNP / Intel TDX confidential environments. ↩
(2025). Confidential Space overview (Google Cloud). https://docs.cloud.google.com/confidential-computing/confidential-space/docs/confidential-space-overview - Google Cloud Confidential Space overview; multi-party collaboration with TEE-released secrets. ↩
(2025). Intel Trust Authority: Integrate with GCP Confidential Space. https://docs.trustauthority.intel.com/main/articles/articles/ita/integrate-gcp-cs.html - Intel Trust Authority + GCP Confidential Space integration documentation. ↩
(2025). Confidential Containers (CoCo) on GitHub. https://github.com/confidential-containers - Confidential Containers (CoCo) GitHub org; CNCF Sandbox project for Kubernetes-pod-level CC across multiple TEEs. ↩
(2025). Anjuna Security. https://www.anjuna.io/ - Anjuna Security homepage; Seaglass runtime portability layer for AWS Nitro/Azure CVM/GCP confidential containers. ↩
(2025). edgelesssys/contrast -- confidential container deployments on Kubernetes. https://github.com/edgelesssys/contrast - Edgeless Contrast GitHub repository; active Kata-Containers-based Kubernetes confidential-containers product (successor to Constellation). ↩
(2025). Contrast architecture overview (Edgeless Systems). https://docs.edgeless.systems/contrast/architecture/overview - Edgeless Contrast architecture overview; custom RuntimeClass + Coordinator-as-CA + Initializer + mTLS service mesh. ↩
(2025). Contrast documentation (Edgeless Systems). https://docs.edgeless.systems/contrast - Edgeless Contrast documentation root; supports AMD SEV-SNP + Intel TDX, BSL license. ↩
(2025). edgelesssys/constellation (archived; superseded by Contrast). https://github.com/edgelesssys/constellation - edgelesssys/constellation (archived); CNCF-certified Kubernetes engine that wrapped a whole cluster in a single confidential context; superseded by Contrast. ↩
(2025). Fortanix Confidential Computing Manager. https://www.fortanix.com/platform/confidential-computing-manager - Fortanix Confidential Computing Manager product page; control plane spanning TDX/SEV-SNP CPUs and Hopper/Blackwell GPUs. ↩
(2025). Fortanix Data Security Manager. https://www.fortanix.com/platform/data-security-manager - Fortanix Data Security Manager product page; FIPS 140-2 L3 KMS/HSM-as-a-service (distinct architectural plane from CCM). ↩
(2019). Fortanix Awarded FIPS 140-2 Level 3 Certification. https://www.fortanix.com/company/pr/2019/10/fortanix-awarded-fips140-2-level-3-certification - Fortanix DSM FIPS 140-2 Level 3 certification press release (October 2019). ↩
Rafal Wojtczuk & Joanna Rutkowska (2009). Attacking Intel Trusted Execution Technology. https://www.invisiblethingslab.com/resources/bh09dc/Attacking%20Intel%20TXT%20-%20paper.pdf - Wojtczuk & Rutkowska, 'Attacking Intel TXT', Black Hat DC 2009; SMM-based TXT bypass. ↩
Ceren Kocaoğullar, Tina Marjanov, Ivan Petrov, Ben Laurie, Al Cutter, Christoph Kern, Alice Hutchings, & Alastair R. Beresford (2024). A Confidential Computing Transparency Framework for a Comprehensive Trust Chain. https://arxiv.org/abs/2409.03720 - Kocaoğullar et al., 'A Confidential Computing Transparency Framework for a Comprehensive Trust Chain', arXiv 2409.03720; three-level transparency framework with empirical study. ↩
(2025). CCC-Attestation GitHub Organization. https://github.com/CCC-Attestation - CCC Attestation SIG GitHub org; hosts formal specs, RATS cheat sheet, attested-TLS PoC. ↩
(2025). microsoft/confidential-ai-workshop. https://github.com/microsoft/confidential-ai-workshop - microsoft/confidential-ai-workshop repository; reference stack for confidential Whisper/LLM/ML training tutorials. ↩
(2025). confidential-ml-training tutorial (confidential-ai-workshop). https://github.com/microsoft/confidential-ai-workshop/tree/main/tutorials/confidential-ml-training - confidential-ml-training tutorial; CPU-only DCasv5 (SEV-SNP) ML training reference with MAA + AKV SKR. ↩
(2025). microsoft/Phi-4-mini-reasoning model card. https://huggingface.co/microsoft/Phi-4-mini-reasoning - Microsoft Phi-4-mini-reasoning model card on Hugging Face; 3.8B parameters, 128K context, MIT license. ↩
(2025). microsoft/attested-ohttp-client -- reference OHTTP client for Azure AI confidential inferencing. https://github.com/microsoft/attested-ohttp-client - microsoft/attested-ohttp-client reference implementation; Rust + Python + Docker attested-OHTTP client for Azure AI confidential inferencing. ↩
(2024). NodeValidator.swift (apple/security-pcc CloudAttestation). https://github.com/apple/security-pcc/blob/main/CloudAttestation/CloudAttestation/NodeValidator.swift - PCC CloudAttestation NodeValidator.swift; the device-side eight-policy-check chain implementation. ↩
(2024). Release.swift (apple/security-pcc CloudAttestation). https://github.com/apple/security-pcc/blob/main/CloudAttestation/CloudAttestation/Transparency/Release.swift - PCC Release.swift; the SHA-256-of-DER digest scheme for PCC release artefacts. ↩
(2024). SEPAttestationPolicy.swift (apple/security-pcc CloudAttestation). https://github.com/apple/security-pcc/blob/main/CloudAttestation/CloudAttestation/Policy/SEPAttestationPolicy.swift - PCC SEPAttestationPolicy.swift; the SEP-envelope verification policy implementation. ↩
(2024). SEP+Attestation.swift (apple/security-pcc CloudAttestation). https://github.com/apple/security-pcc/blob/main/CloudAttestation/CloudAttestation/SEP/SEP+Attestation.swift - PCC SEP+Attestation.swift extension; the aks_attest_context_verify SEP-envelope verification call site. ↩
(2024). SWTransparencyVerifier.swift (apple/security-pcc CloudAttestation). https://github.com/apple/security-pcc/blob/main/CloudAttestation/CloudAttestation/Transparency/SWTransparencyLog/SWTransparencyVerifier.swift - PCC SWTransparencyVerifier.swift; the software-transparency inclusion-proof check. ↩
(2024). TransparencyPolicy.swift (apple/security-pcc CloudAttestation). https://github.com/apple/security-pcc/blob/main/CloudAttestation/CloudAttestation/Policy/TransparencyPolicy.swift - PCC TransparencyPolicy.swift; the Release-Transparency policy enforcing Transparency-Log membership. ↩
(2024). Virtual Research Environment (Private Cloud Compute documentation). https://security.apple.com/documentation/private-cloud-compute/virtualresearchenvironment - Apple Virtual Research Environment DocC page; the load-bearing publication anchor for M5-PCC transition discovery. ↩

