# Above Ring Zero: How the Windows Hypervisor Became a Security Primitive

> A deep tour of the Windows hypervisor as the substrate of VBS, HVCI, Credential Guard, and Secure Launch -- its five primitives, the boundary it commits to, and the public failures that calibrate it.

*Published: 2026-05-10*
*Canonical: https://paragmali.com/blog/above-ring-zero-how-the-windows-hypervisor-became-a-security*
*License: CC BY 4.0 - https://creativecommons.org/licenses/by/4.0/*

---
<TLDR>
**The Windows hypervisor is the program that loaded before Windows did.** It runs at a privilege level the Windows kernel cannot reach and owns the page tables that decide which memory the Windows kernel may even see. Virtualization-Based Security, Credential Guard, HVCI, Application Control, VBS Enclaves, and System Guard Secure Launch are all built by composing five primitives the hypervisor exposes -- partitions, hypercalls, intercepts, SynIC, and per-VTL SLAT. The substrate is real, alive, and producing two to four public CVEs per year; the residual attack surface (firmware below, side channels above, IOMMU bypass beside, hypervisor rollback) is where Windows security still earns its hardest miles.
</TLDR>

## 1. Above Ring Zero

On a Windows 11 machine with VBS turned on, a kernel-mode driver running with full Ring-0 privilege cannot read a single byte of the LSASS process's credential cache. It cannot load an unsigned driver. It cannot patch `ntoskrnl.exe`. It cannot disable HVCI without a reboot. None of this is enforced by Windows. It is enforced by a different program -- one that loaded before Windows did, that runs at a privilege level the Windows kernel cannot reach, and that owns the page tables that say which memory the Windows kernel may even *see*. That program is the Windows hypervisor [@ms-hyperv-architecture, @ms-tlfs-vsm].

The intuition this fact violates is older than most readers' careers. "SYSTEM owns the box." Every introductory security course teaches it. Local administrator escalates to SYSTEM, SYSTEM loads a driver, the driver runs in the kernel, and the kernel can do anything to the machine. That model is correct for a Windows installation running without Virtualization-Based Security. It is wrong, in three specific and load-bearing ways, for a Windows installation that has VBS turned on.

<Definition term="Virtualization-Based Security (VBS)">
A Windows security architecture that uses the Hyper-V hypervisor to create a small, isolated execution environment alongside the normal Windows operating system. The hypervisor allocates a portion of memory, configures its second-level page tables to make that memory unreadable and unwritable from normal kernel mode, and runs Microsoft-signed code there -- the Secure Kernel and isolated user-mode trustlets -- that the regular NT kernel cannot reach. Credential Guard, HVCI, Application Control, and System Guard all sit on top of this primitive [@ms-tlfs-vsm].
</Definition>

The binary in question is named `hvix64.exe` on Intel hosts and `hvax64.exe` on AMD hosts.<Sidenote>Loose security writing sometimes calls the hypervisor's privilege level "Ring -1." That phrase is colloquial. Intel's manuals say "VMX root operation"; AMD's manuals say "SVM host mode." Both terms denote a CPU operating mode that sits architecturally outside the four-ring privilege stack the guest OS sees, not a fifth ring inside it.</Sidenote> It is brought up by the hypervisor loader, `hvloader`, which the Windows OS loader invokes before `ntoskrnl.exe` ever runs. By the time the boot chain hands control to the NT kernel, the hypervisor has already configured the CPU's virtualization extensions, allocated its own private memory, taken ownership of the IOMMU, and set up the per-partition second-level page tables that decide which physical pages each partition can see [@ms-tlfs-pdf]. From the NT kernel's point of view, the machine starts up already inside a guest partition. There is no escape upward.

This article is about the program that loaded first. The siblings in this series -- on the [Secure Kernel](https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/), on [Credential Guard and NTLMless](https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/), on [Secure Boot](https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/), and on [Adminless](https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/) -- all assume what this article explains. Each of them describes a policy: the Secure Kernel enforces code integrity; Credential Guard isolates LSASS; Adminless raises the bar on local administrator. None of those policies would be enforceable without a piece of software running at a privilege level the policy's adversary cannot reach. The hypervisor is that piece of software, and "security primitive" is how Microsoft, the security research community, and the bug-bounty market all describe its current role.

By the end of this article you will know five things. First, *why* the hypervisor became a security primitive -- the architectural failure of Ring-0 defenses that Microsoft fought for a decade and finally gave up on in 2015. Second, *how* it became one, in three steps: Popek and Goldberg's 1974 virtualizability theorem; Intel VT-x and AMD-V in 2005-2006; and David Hepkin and Arun Kishan's 2013 patent on hierarchical Virtual Trust Levels [@us9430642b2-patent]. Third, *what* it enforces, feature by feature, with the hypervisor primitive that backs each: HVCI rides on per-VTL SLAT; Credential Guard rides on SynIC plus the secure-call ABI; System Guard Secure Launch rides on DRTM [@ms-system-guard-secure-launch]. Fourth, *where* it has actually failed in public -- six worked CVEs across three distinct attack classes, all narrowly localized. Fifth, *what* is structurally outside its mandate: firmware below the hypervisor, microarchitectural side channels above it, IOMMU bypass beside it, and hypervisor rollback through the update pipeline.

The story is half engineering and half conceptual inversion. How did a server-consolidation hypervisor that shipped in 2008 with Windows Server 2008 -- a product whose original marketing pitch was "run more VMs per box" -- become the architectural substrate that protects every load-bearing Windows security boundary in 2026? The answer begins in 1974, with a paper that defined what a hypervisor even *is*. But the political and engineering thread begins five years before Hyper-V shipped, in San Mateo, California.

## 2. Origins -- Connectix to Viridian to Hyper-V

Microsoft entered the virtualization market three years late and by acquisition. On February 19, 2003, the company bought Connectix, a small San Mateo software house founded in 1988 that had built Virtual PC for Macintosh and, later, Virtual PC for Windows [@macrumors-connectix-2003, @zdnet-connectix-2003]. The Connectix engineers became the nucleus of what Microsoft would internally call the Windows Server Virtualization team. The acquired products shipped as Microsoft Virtual PC 2004 and Microsoft Virtual Server 2005. Both were Type-2 hypervisors -- user-mode applications that ran on top of Windows, using software techniques rather than CPU virtualization extensions, because the CPU virtualization extensions did not yet exist on shipping x86 hardware.

<Definition term="Type-1 hypervisor">
A hypervisor that runs directly on hardware rather than as an application on top of a host operating system. The hypervisor owns the CPU, the second-level page tables, and (in the security-relevant case) the IOMMU; guest operating systems run at a lower privilege level, in partitions or virtual machines that the hypervisor schedules and isolates. IBM's CP-67/CMS in 1968 is the genre's origin; VMware ESX, Xen, and the Microsoft hypervisor (`hvix64.exe`/`hvax64.exe`) are the modern examples [@wp-hypervisor].
</Definition>

In 2005, the team began a new project under the codename "Viridian." The goal was a Type-1 micro-kernelized hypervisor for x86-64 -- a fresh build, not a derivative of Virtual Server -- that required hardware virtualization extensions at install time. Intel's VT-x had shipped in November 2005 with the Pentium 4 662/672; AMD-V had shipped on May 23, 2006 across the Athlon 64 family (Athlon 64, Athlon 64 X2, and Athlon 64 FX). Both were now broadly enough deployed that Microsoft could make hardware virtualization a system requirement rather than a configuration option. Three years later, on June 26, 2008, Hyper-V reached RTM and was delivered as a Windows Server 2008 feature through Windows Update [@wp-hyperv].<Sidenote>Microsoft ships two hypervisor binaries: `hvix64.exe` for Intel hosts (using VT-x) and `hvax64.exe` for AMD hosts (using AMD-V). The instruction-set-architecture divergence is real -- Intel uses `vmcall` to enter the hypervisor; AMD uses `vmmcall` -- but the hypercall ABI surface above that single instruction is identical, so the rest of the Microsoft hypervisor codebase is shared between the two binaries.</Sidenote>

The 2008 design choices are worth naming individually because the ones that mattered for *server consolidation* turned out, twelve years later, to also be the ones that mattered for *security*. Three deserve flagging:

- **Micro-kernelized architecture.** The hypervisor binary contains only the minimum machinery needed to virtualize the CPU, schedule VMs, and enforce memory isolation. It does not contain device drivers. It does not contain a network stack. It does not contain a filesystem.
- **Root partition plus child partitions.** From the Microsoft architecture documentation: "*The Microsoft hypervisor must have at least one parent, or root, partition, running Windows. The virtualization management stack runs in the parent partition and has direct access to hardware devices. The root partition then creates the child partitions which host the guest operating systems*" [@ms-hyperv-architecture]. The root partition is a full Windows install; the child partitions are guest VMs.
- **VMBus, VSP, and VSC.** Inter-partition I/O happens over the VMBus -- a paravirtualized message channel. A Virtualization Service Provider (VSP) runs in the root partition and owns the real device; a Virtualization Service Client (VSC) runs in each child partition and talks to the VSP over VMBus. Device emulation lives in the root partition's user-mode and kernel-mode code, *not in the hypervisor binary itself*. This is the choice that, twelve years later, kept the hypervisor's Trusted Computing Base small enough to be defensible.

<Mermaid caption="Hyper-V's 2008 partition model. The hypervisor binary at the bottom owns only CPU virtualization, scheduling, and the second-level page tables. The root partition is a full Windows install that runs the Virtualization Service Providers and owns the real devices. Each child partition runs a guest OS that talks to its VSCs over the VMBus.">
flowchart TD
    subgraph Root["Root partition (Windows Server)"]
        RD["Real device drivers"]
        VSP["Virtualization Service Providers"]
        VMM["VM Worker Processes (vmwp.exe)"]
    end
    subgraph Child1["Child partition 1 (guest OS)"]
        VSC1["Virtualization Service Clients"]
        Guest1["Guest kernel + apps"]
    end
    subgraph Child2["Child partition 2 (guest OS)"]
        VSC2["Virtualization Service Clients"]
        Guest2["Guest kernel + apps"]
    end
    HV["Microsoft Hypervisor (hvix64.exe / hvax64.exe)"]
    HW["Hardware (CPU, RAM, NIC, disk)"]
    Root -. VMBus .- Child1
    Root -. VMBus .- Child2
    Root --> HV
    Child1 --> HV
    Child2 --> HV
    HV --> HW
</Mermaid>

The micro-kernel, root-plus-child, and VMBus choices were defensible *server* engineering. The rationale was that emulating a NIC, a SCSI controller, or a graphics adapter inside the hypervisor binary would balloon its size, lock its code-review cycles to those of every device the company shipped, and force the same security-critical code that scheduled CPUs to also parse Ethernet frames. Putting device emulation in a normal Windows process inside the root partition -- the VM Worker Process `vmwp.exe` -- meant the hypervisor binary could stay small enough to reason about.

The 2008 design goal was, again, server consolidation. Microsoft's positioning materials at the time named "run more VMs per box, get better hardware utilization" as the customer pitch. Nothing in the 2008 Hyper-V documentation describes the hypervisor as a security primitive for the host OS. The security re-purposing -- the moment Hyper-V's hardware-privilege isolation became the way Windows itself protected its own kernel from itself -- did not arrive until 2015. To understand why it arrived at all, we have to back up thirty-four years to a 1974 paper that defined what virtualization formally requires.

## 3. The Theoretical Anchor -- Popek, Goldberg, and SLAT

Before Microsoft could build a hypervisor that ran security-critical code at a higher privilege than the Windows kernel, two unrelated decisions had to land. One was made in 1974, by two researchers who would never see Windows. The other was made in 2005, by Intel.

In July 1974, Gerald Popek of UCLA and Robert Goldberg of Harvard published "Formal Requirements for Virtualizable Third Generation Architectures" in *Communications of the ACM* [@popek-goldberg-1974, @wp-popek-goldberg]. The paper laid down three properties any "true" virtual machine monitor must satisfy:

- **Equivalence.** Programs run on the VMM exhibit behavior essentially identical to behavior on the bare machine, except for differences due to timing and resource availability.
- **Resource control.** The VMM, not the guest, controls the system resources -- CPU time slices, memory, devices.
- **Efficiency.** A statistically dominant subset of the instruction stream executes directly on hardware, without VMM intervention.

The theorem that gave the paper its lasting reputation followed from those properties. Let a *sensitive instruction* be one that either reads or modifies privileged state (the processor's mode bits, page-table base register, interrupt mask). Let a *privileged instruction* be one that traps when executed in user mode. Then a sufficient condition for an ISA to be virtualizable is that every sensitive instruction is privileged. The intuition is simple: the VMM must get a chance to see -- and to handle -- every guest action that touches the machine's privileged state. If the CPU silently lets the guest do something privileged-feeling without trapping, the VMM cannot maintain equivalence and control simultaneously.

<Definition term="Popek-Goldberg virtualizability">
A property of a processor architecture: every sensitive instruction in the instruction set is privileged. An architecture with this property can be virtualized "classically" -- with a thin trap-and-emulate hypervisor whose only entry points are the traps the CPU raises on privileged-instruction violations. An architecture without this property requires software workarounds (binary translation, paravirtualization) or hardware extensions (VT-x, AMD-V) before a Popek-Goldberg-style VMM can be built [@popek-goldberg-1974, @wp-popek-goldberg].
</Definition>
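
The sufficiency condition is just a set-containment check, which a few lines make concrete. The sketch below is a toy model against a five-instruction pretend ISA; only `SGDT` and `POPF` are drawn from Robin and Irvine's real list of offenders.

<RunnableCode lang="js" title="Popek-Goldberg sufficiency check (toy model)">{`
// A toy model of the Popek-Goldberg sufficiency condition: an ISA is
// classically virtualizable if every sensitive instruction is privileged,
// i.e. traps when executed in user mode. The instruction list is illustrative.
const isa = [
  { name: 'MOV CR3', sensitive: true,  privileged: true  },
  { name: 'LGDT',    sensitive: true,  privileged: true  },
  { name: 'SGDT',    sensitive: true,  privileged: false }, // reveals GDTR without trapping
  { name: 'POPF',    sensitive: true,  privileged: false }, // silently ignores IOPL changes
  { name: 'ADD',     sensitive: false, privileged: false },
];

// Sensitive-but-unprivileged instructions are exactly the holes a classical
// trap-and-emulate VMM cannot see through.
function virtualizationHoles(instructions) {
  return instructions.filter(i => i.sensitive && !i.privileged);
}

const holes = virtualizationHoles(isa);
if (holes.length === 0) {
  console.log('Virtualizable: a trap-and-emulate VMM suffices');
} else {
  console.log('Not classically virtualizable. Offenders: ' + holes.map(i => i.name).join(', '));
}
// Output: Not classically virtualizable. Offenders: SGDT, POPF
`}</RunnableCode>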

For three decades, x86 was famously *not* virtualizable in the Popek-Goldberg sense. John Robin and Cynthia Irvine enumerated the problem in their 2000 USENIX Security paper: seventeen protected-mode instructions on the IA-32 architecture either read or modified privileged state without trapping from user mode.<Sidenote>The Robin and Irvine enumeration includes instructions like `SGDT` (store global descriptor table register), `SIDT` (store interrupt descriptor table register), `SLDT` (store local descriptor table register), `SMSW` (store machine status word), and `PUSHF/POPF` (push/pop flags including IOPL). Each of these silently returned or accepted privileged state from user mode without raising a fault. The aggregate effect was that no classical Popek-Goldberg VMM could correctly virtualize an unmodified x86 guest -- every one of those seventeen instructions was a hole the VMM could not see through.</Sidenote> VMware Workstation, released in 1999 by VMware Inc. (which had been founded the year prior by Mendel Rosenblum, Diane Greene, Scott Devine, Ellen Wang, and Edouard Bugnion), worked around the problem with *binary translation*: it dynamically rewrote each protected-mode guest instruction stream to substitute or trap the seventeen offenders. The technique imposed double-digit overhead, made debugging miserable, and was a security liability in its own right -- the binary translator itself was a parser of arbitrary attacker-controlled code.

Intel and AMD ended the problem in hardware. Intel VT-x (codename Vanderpool, November 2005) and AMD-V (codename Pacifica, May 2006) added a new CPU operating mode -- *VMX root operation* for Intel, *SVM host mode* for AMD -- and a new trap mechanism. A *VM exit* could be configured to fire on every sensitive instruction the hypervisor wished to intercept, transferring control to the host with a structured exit reason and a host-controlled snapshot of guest state. After 2006, x86-64 became Popek-Goldberg-virtualizable in hardware [@wp-x86-virtualization].

<Mermaid caption="VT-x / AMD-V round-trip. The guest issues a sensitive instruction; the CPU traps via VM-EXIT to the hypervisor running in VMX root mode; the hypervisor consults the VMCS execution-control bits, services the instruction, and resumes the guest via VM-ENTRY.">
sequenceDiagram
    participant Guest as Guest OS (VMX non-root)
    participant CPU as CPU hardware
    participant HV as Hypervisor (VMX root)
    Guest->>CPU: MOV CR3, rax  (sensitive instr)
    CPU->>HV: VM-EXIT (reason 28: CR access)
    HV->>HV: Read VMCS exit-qualification
    HV->>HV: Validate, emulate, update SLAT
    HV->>CPU: VMRESUME
    CPU->>Guest: Continue guest at next instruction
</Mermaid>

One architectural element more was needed before any of this could be a *security* primitive rather than just a virtualization primitive. Classical x86 paging maps a guest virtual address to a physical address through a single CPU-walked page table. In a virtualized system that single table cannot be enough, because the guest needs its own virtual-to-physical map and the host needs to remap the guest's "physical" address to a real machine-physical address. The first generations of VT-x simulated this two-level mapping in software through *shadow page tables*, which the hypervisor had to maintain alongside the guest's tables on every page-table edit. Shadow paging was correct but slow, and it gave the hypervisor no clean way to enforce a *different* memory map for different parts of the same guest.

Second-Level Address Translation (SLAT) -- Intel's Extended Page Tables (EPT, shipped with Nehalem in November 2008) and AMD's Nested Page Tables (NPT, shipped with the Barcelona-generation Opteron on September 10, 2007) -- solved both problems in hardware. The guest walks its own page table from virtual to "guest physical"; the CPU then walks a second, hypervisor-owned page table from "guest physical" to "system physical." Two key properties follow. First, the hypervisor has exclusive control of the second-level mapping; the guest cannot read, write, or even know that it exists. Second, because the second-level mapping is per-partition, the hypervisor can give two partitions different views of the same machine physical memory -- the same page can be readable in one partition and entirely absent in another.

<Definition term="Second-Level Address Translation (SLAT)">
A hardware feature on Intel (EPT) and AMD (NPT) CPUs that lets the hypervisor maintain a second page table mapping guest-physical addresses to system-physical addresses. The CPU walks the guest's own page table for the virtual-to-guest-physical mapping, then walks the hypervisor's table for the guest-physical-to-system-physical mapping. Because the second table is hypervisor-controlled and per-partition, the hypervisor can give different partitions -- and, in VBS, different Virtual Trust Levels inside the same partition -- different views of physical memory. SLAT is the bedrock of VTL memory protection [@ms-tlfs-pdf].
</Definition>
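
A minimal sketch of the two-stage walk, with invented, page-granular mappings and none of the real EPT/NPT entry formats -- only the shape of the walk and its two distinct failure modes are the point.

<RunnableCode lang="js" title="Two-stage walk: guest page table, then SLAT (toy model)">{`
// A toy model of two-stage address translation. Page-granular, no permission
// bits, invented page numbers -- only the shape of the walk is real.
const PAGE_SIZE = 0x1000;

// Stage 1: guest-owned page table -- guest virtual page -> guest physical page.
const guestPageTable = new Map([
  [0x7f000, 0x00042],
  [0x7f001, 0x00099],   // mapped by the guest, but not backed in the SLAT below
]);

// Stage 2: hypervisor-owned SLAT -- guest physical page -> system physical page.
const slat = new Map([
  [0x00042, 0x9a011],
]);

function translate(gva) {
  const gpaPage = guestPageTable.get(Math.floor(gva / PAGE_SIZE));
  if (gpaPage === undefined) return 'guest page fault (#PF) -- handled by the guest OS';
  const spaPage = slat.get(gpaPage);
  if (spaPage === undefined) return 'EPT/NPT violation -- VM-EXIT to the hypervisor';
  return 'SPA 0x' + (spaPage * PAGE_SIZE + (gva % PAGE_SIZE)).toString(16);
}

console.log(translate(0x7f000123)); // SPA 0x9a011123
console.log(translate(0x7f001123)); // EPT/NPT violation -- VM-EXIT to the hypervisor
`}</RunnableCode>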

Hyper-V required VT-x or AMD-V at install time from day one. SLAT became mandatory with Windows Server 2016 and Windows 10 1607 [@ms-hyperv-architecture].

Popek and Goldberg gave us the property. Intel and AMD gave us the hardware. Microsoft used both to build a server hypervisor in 2008. But for the first seven years of Hyper-V's life, none of that machinery protected Windows from itself. Microsoft hadn't yet noticed the architectural problem that made it necessary -- or rather, they had noticed the problem (PatchGuard's bypass record was public) and had not yet conceded that the problem was structural. The concession came in 2015. What forced it was the same-privilege paradox.

## 4. The Same-Privilege Paradox -- Why PatchGuard Was Never Enough

PatchGuard, which Microsoft shipped in 2005 with Windows Server 2003 SP1 x64, ran inside `ntoskrnl.exe` at Ring 0 and scanned a curated list of kernel structures -- the system service dispatch table, the interrupt descriptor table, the kernel image's `.text` section -- at randomized intervals to detect tampering. It was bypassed within months by Skywing's *Uninformed* writeups. Microsoft kept shipping it. Researchers kept bypassing it. The pattern lasted a decade. The reason is not that PatchGuard's authors were sloppy [@wp-kpp]. The reason is structural, and naming it correctly is the first of the three insights this article is built around.

> **Key idea:** Any defense reachable by `mov` from Ring 0 is defeasible by `mov` from Ring 0.

The intuition is simple. PatchGuard is a piece of code. It lives in the kernel's virtual address space at some page. It owns a timer that re-runs it periodically. It maintains a randomization seed for which structures it checks next. It has a callback path into `KeBugCheckEx` if it detects tampering. Every one of those four assets -- the code page, the timer callback, the randomization seed, the bug-check path -- is a kernel data structure or a kernel virtual address. An attacker with Ring-0 code execution can locate each of them by searching the same kernel address space PatchGuard searches. They can patch the callback so the timer no-ops. They can patch the seed so the randomization is predictable. They can patch the bug-check path so it reports success. They can do all of this with a sequence of plain `mov` instructions. PatchGuard cannot defend against this, because PatchGuard's defenses live in the same place its attacker's writes do.

<PullQuote>
PatchGuard and its attacker are colleagues, not adversaries. They share an office. The office is `ntoskrnl.exe`'s virtual address space, and there is no key on the door.
</PullQuote>

This is the *same-privilege paradox*. It is not an implementation bug. It does not yield to better obfuscation, more randomization, or harder-to-find timers. It is an architectural ceiling. A defense at privilege level $P$ cannot be enforced against an attacker who also runs at privilege level $P$, because the defender's state lives in the attacker's address space. The defender can be made *expensive* to find; it cannot be made impossible to find, because the attacker has the same instructions, the same address-space view, and the same MMU privileges as the defender.

> **Note:** The same-privilege paradox is a property of where the defense *lives*, not of how clever the defense is. PatchGuard's authors did add randomization. They did add multiple decoy callbacks. They did add cryptographically derived integrity checks. None of those reductions changes the basic fact that the attacker, holding the same Ring-0 privilege, can locate and edit each of them. The architectural fix is not better PatchGuard. The architectural fix is moving the defender to a privilege level the attacker cannot reach.
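
A toy model makes the paradox concrete. Nothing below resembles PatchGuard's real implementation; the point is only that a checker whose callback and state live in memory the attacker can write is one store away from becoming a no-op.

<RunnableCode lang="js" title="The same-privilege paradox as a toy model">{`
// A toy model of the same-privilege paradox. Defender and attacker operate on
// the same object graph -- a stand-in for a shared Ring-0 address space in
// which every structure is reachable by a plain mov.
const kernelSpace = {
  criticalTable: 'clean',
  integrityCheck() {
    if (kernelSpace.criticalTable !== 'clean') {
      throw new Error('BUGCHECK 0x109 (CRITICAL_STRUCTURE_CORRUPTION)');
    }
  },
};

// The defender's periodic scan, fired from a timer it also keeps in kernelSpace.
function timerTick() { kernelSpace.integrityCheck(); }

// The attacker runs at the same privilege: first neuter the check, then tamper.
kernelSpace.integrityCheck = function () { /* patched to a no-op */ };
kernelSpace.criticalTable = 'hooked';

timerTick();  // no bugcheck -- the scan now reports success by construction
console.log('tampered structure survives the scan:', kernelSpace.criticalTable);
`}</RunnableCode>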

Once the paradox is named, the defender's choice is binary. Either give up on having a defense at all -- treat Ring 0 as a free-fire zone where any malware that gets there has won -- or move the defender to a privilege level *above* Ring 0, at a hardware boundary the attacker's `mov` instructions cannot cross. Microsoft picked the second. It is the only architecturally honest choice.

To make it work, Microsoft needed three things. The first was a hypervisor already deployed on every Windows install. They had that since 2008. The second was a way to put a piece of Windows itself -- code, data, secrets -- *inside* the hypervisor's protection without spawning a separate VM, because spawning a separate VM doubles the system's resource cost and forces every Windows process to choose between living on the normal side or the secure side. That required an architectural idea that did not yet exist in 2010: a way to split a single partition into two privilege levels, each with its own SLAT mapping and its own register state. The third was a way to ensure the hypervisor itself could not be silently replaced or rolled back beneath the OS. That required a hardware-rooted measurement -- a DRTM event -- that the OS could attest to.

The architectural idea is the subject of section 6. The DRTM measurement is the subject of section 11. Both of them required a decade-long conversation about whether the *hypervisor itself* could be trusted at all -- a conversation that ran in parallel during the same years and that briefly seemed to argue the opposite case. We turn to that conversation next.

## 5. The Hyperjacking Era -- SubVirt, Blue Pill, and CloudBurst

While Microsoft was finishing Hyper-V, the security community was establishing that a hypervisor was not just a defense -- it was also the most powerful possible attacker against the OS sitting above it. Three demonstrations in three years made the point unmistakable.

**SubVirt.** In May 2006, Samuel King and Peter Chen at the University of Michigan, joined by Yi-Min Wang, Chad Verbowski, Helen Wang, and Jacob Lorch at Microsoft Research, presented "SubVirt: Implementing Malware with Virtual Machines" at IEEE S&P [@king-subvirt-2006]. Their construction was a *Virtual Machine Based Rootkit* (VMBR). A privileged installer running inside a legitimate OS installed a malicious VMM at boot time; on the next reboot, the malicious VMM ran first, brought up the original OS as a guest underneath it, and gained the privileged position of seeing every CPU instruction, every memory access, and every I/O the OS performed. The original OS had no architectural way to tell it was no longer the most-privileged software on the box. SubVirt was demonstrated against Windows XP (using Microsoft Virtual PC as the malicious VMM substrate) and against Linux (using VMware Workstation), specifically to show that the technique was not tied to any one operating system or any one hypervisor product.

**Blue Pill.** Three months later, at Black Hat USA 2006, Joanna Rutkowska of COSEINC demonstrated "Subverting Vista Kernel for Fun and Profit" [@wp-blue-pill]. Her tool, codenamed *Blue Pill*, took a step beyond SubVirt by doing the VMM insertion at *runtime* rather than at boot. The technique: a Ring-0 driver, running inside an already-booted Windows install on an AMD-V capable host, enabled SVM, built an attacker-controlled Virtual Machine Control Block (VMCB) whose initial state matched the running CPU, and executed `VMRUN`. The already-running OS resumed execution as a guest beneath the attacker's VMM and continued running normally, with no boot-loader modification and no reboot.

By 2007, Rutkowska and Alexander Tereshkin returned to Black Hat USA with the more polished "IsGameOver(,) Anyone?" presentation, refining the technique and addressing the early critics' detection ideas [@rutkowska-isgameover-2007, @wp-blue-pill].<Sidenote>Rutkowska's marketing claim that Blue Pill was "100% undetectable" attracted a public counter-effort: in 2007, Edgar Barbosa, Nate Lawson, Peter Ferrie, and Tom Ptacek all proposed detection techniques relying on side channels (timing artifacts of trapped instructions, TSC skew, structural differences in how `RDTSC` behaves under VT-x). The claim softened in subsequent publications, but the underlying point survived: a hostile thin hypervisor below a victim OS can be made arbitrarily difficult to detect from inside that OS, and the only architecturally clean way to know what you are running under is to measure the boot chain before the OS starts.</Sidenote>

**CloudBurst.** At Black Hat USA 2009, Kostya Kortchinsky of Immunity Inc. presented CLOUDBURST [@infocondb-bh-2009]. It was the first public guest-to-host escape against a commercial hypervisor: a heap overflow in VMware's emulated SVGA-II graphics adapter, tracked as CVE-2009-1244 [@nvd-cve-2009-1244]. A guest VM, executing entirely inside a VMware-managed user-mode process on the host, could overflow a buffer in that process and gain host code execution. CloudBurst's lasting operational lesson was not the specific bug but the *attack surface*: device emulation -- not the trap-and-emulate core of the hypervisor -- is the largest piece of guest-attacker-controlled code in any commercial VMM. Every Hyper-V guest-to-host escape Microsoft has shipped a patch for since 2018 lands in either this device-emulation surface or the hypercall input-validation surface that mediates the same kinds of structured guest-controlled input.

<Mermaid caption="The hyperjacking threat model. A privileged installer (SubVirt) or a runtime Ring-0 driver (Blue Pill) inserts a malicious VMM strictly below an already-installed OS. From the OS's point of view, the machine continues to operate; from the attacker's point of view, every CPU instruction, every memory access, and every interrupt is now under their control.">
flowchart TD
    subgraph Before["Before hyperjacking"]
        OS1["Victim OS"]
        FW1["Firmware (UEFI)"]
        HW1["Hardware"]
        OS1 --> FW1
        FW1 --> HW1
    end
    subgraph After["After hyperjacking"]
        OS2["Victim OS (now a guest)"]
        VMM["Hostile VMM (SubVirt / Blue Pill)"]
        FW2["Firmware (UEFI)"]
        HW2["Hardware"]
        OS2 --> VMM
        VMM --> FW2
        FW2 --> HW2
    end
</Mermaid>

The three demonstrations established a difficult dual truth. The hypervisor is the most powerful defender against an OS-level attacker, *and* it is the most powerful attacker against an OS-level defender. The same primitive can play either role; which role it plays in any given system depends only on *whose* hypervisor it is and whether the OS above it can prove that. SubVirt-style attacks only had to be *possible* to force a design constraint on Microsoft: any "hypervisor as security primitive" architecture has to start by being *the only* hypervisor on the box, with a measurement of the hypervisor binary recorded in a [TPM platform configuration register](/blog/the-tpm-in-windows-one-primitive-twenty-five-years-and-the-c/) so that any malicious VMBR underneath can be detected at attestation time. This is the role that System Guard Secure Launch (DRTM) plays in the architecture, and we will return to it in section 11.

<Aside label="The same primitive, two roles">
Blue Pill (offense) and VBS (defense) are architecturally identical. Each is a thin Type-1 hypervisor that interposes between firmware and OS. Each owns the CPU's virtualization mode, the second-level page tables, and the IOMMU. Each is invisible to the OS unless the OS can prove what is underneath it. The only differences between them are whose hypervisor it is, whether it was measured at load time, and what it does with its privilege. The defense is the offense, run by the right people, in the right order, and attested to.
</Aside>

By 2010 the security community had agreed: the hypervisor is the most powerful primitive in the system, and whoever owns the SLAT page tables owns the box. Joanna Rutkowska's Invisible Things Lab launched Qubes OS, an explicitly hypervisor-rooted security OS, on April 7, 2010 [@qubes-introducing-2010]. Microsoft owned the SLAT page tables. They had a hypervisor on every Windows install. They had a server-consolidation product. What they did not yet have was a *reason* to re-purpose any of it for security. The reason was already being filed at the United States Patent and Trademark Office. The priority date was September 17, 2013.

## 6. The Pivot -- VSM, VTLs, and the Hepkin-Kishan Patent

United States patent application 14/186,415 -- filed by David Hepkin and Arun Kishan, with a priority date of September 17, 2013 -- issued on August 30, 2016 as US Patent 9,430,642 B2 [@us9430642b2-patent]. The patent's title, "Providing virtual secure mode with different virtual trust levels," reads like marketing now because the words it introduced -- "Virtual Trust Level," "VTL," "Virtual Secure Mode" -- became Microsoft's own canonical terminology. In 2013 the words did not exist. The patent describes, in 2013, exactly what Microsoft shipped twenty-two months later in Windows 10 build 10240 [@ms-tlfs-vsm].

The patent's claim language is unusually specific. It teaches a virtual-machine manager that makes "*multiple different virtual trust levels available to virtual processors of a virtual machine*"; it teaches that "*different memory access protections (such as the ability to read, write, and/or execute memory) can be associated with different portions of memory (e.g., memory pages) for each virtual trust level*"; and it teaches that "*the virtual trust levels are organized as a hierarchy with a higher level virtual trust level being more privileged than a lower virtual trust level.*" Each of those phrases is now a feature of the shipping Microsoft hypervisor.

<Definition term="Virtual Trust Level (VTL)">
A hypervisor-managed privilege level inside a single partition. Each VTL has its own SLAT mapping (so the same machine page can be readable in one VTL and absent in another), its own virtual-processor register state (so a VTL transition is a context switch, not a procedure call), and its own interrupt subsystem (so interrupts targeted at one VTL do not preempt code running in another). VTLs are hierarchical: a higher VTL can read all of a lower VTL's memory, but not vice versa. The shipping Microsoft hypervisor implements two VTLs (VTL0 = Normal world, VTL1 = Secure world); the architecture admits up to sixteen [@ms-tlfs-vsm].
</Definition>

Windows 10 (generally available July 29, 2015) and, later, Windows Server 2016 shipped VBS atop the *existing* Hyper-V hypervisor [@wp-windows-10]. The architectural innovation -- the thing the patent was for -- was that VTL0 (Normal world, containing the NT kernel, user mode, and LSASS) and VTL1 (Secure world, containing the Secure Kernel and Isolated User Mode trustlets) ran *inside the same partition* rather than in two separate partitions. VBS is not a second VM. It is a per-VTL SLAT split inside the root partition, plus a per-VTL register-state snapshot, plus a per-VTL interrupt delivery surface. The hypervisor switches SLAT contexts on VTL transitions, exactly as it would switch SLAT contexts on a partition switch -- but the switch happens inside a single partition's address space, so there is no extra VM scheduling and no extra OS image to manage.

<Mermaid caption="The VTL split inside the root partition. Both VTL0 and VTL1 live in the same partition; the hypervisor switches SLAT mappings and register state on VTL transitions. VTL0 runs the normal NT kernel and user mode (including LSASS). VTL1 runs the Secure Kernel and isolated user-mode trustlets (LSAISO for credential isolation; VMSP/vTPM for the virtual TPM). Memory marked unreadable in VTL0's SLAT is the architectural basis for Credential Guard and HVCI.">
flowchart TD
    subgraph Root["Root partition"]
        subgraph VTL0["VTL0 -- Normal world"]
            NT["NT kernel (ntoskrnl.exe)"]
            User["User mode (lsass.exe, applications)"]
        end
        subgraph VTL1["VTL1 -- Secure world"]
            SK["Secure Kernel (securekernel.exe)"]
            IUM["Isolated User Mode trustlets"]
            LSAISO["LSAISO.EXE"]
            VTPM["vTPM trustlet"]
            IUM --- LSAISO
            IUM --- VTPM
        end
    end
    HV["Microsoft Hypervisor (hvix64 / hvax64)"]
    HW["Hardware (CPU, RAM, IOMMU, TPM)"]
    VTL0 -. "Secure call (hypercall + SynIC)" .-> VTL1
    VTL1 --> HV
    VTL0 --> HV
    HV --> HW
</Mermaid>

The Hyper-V Top-Level Functional Specification, chapter 15, names the architectural facts verbatim. "*VSM achieves and maintains isolation through Virtual Trust Levels (VTLs). VTLs are enabled and managed on both a per-partition and per-virtual processor basis.*" "*Virtual Trust Levels are hierarchical, with higher levels being more privileged than lower levels.*" "*Architecturally, up to 16 levels of VTLs are supported; however a hypervisor may choose to implement fewer than 16 VTL's. Currently, only two VTLs are implemented.*" The C-level definition `#define HV_NUM_VTLS 2` is published in the same specification [@ms-tlfs-vsm]. Two VTLs are what ships; the architecture has room for more.
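
The hierarchy rule reduces to a single comparison. A minimal sketch, with the two shipping VTLs and the sixteen-level architectural ceiling written in as constants:

<RunnableCode lang="js" title="Hierarchical VTL access rule (toy model)">{`
// A toy model of the hierarchical VTL rule: a higher (or equal) VTL may
// access memory owned by a lower VTL, never the reverse.
const HV_NUM_VTLS = 2;    // what ships today (TLFS chapter 15)
const HV_MAX_VTLS = 16;   // what the architecture admits

function canAccess(requestorVtl, ownerVtl) {
  if (requestorVtl >= HV_NUM_VTLS || ownerVtl >= HV_NUM_VTLS) {
    throw new RangeError('VTL not implemented by this hypervisor');
  }
  return requestorVtl >= ownerVtl;
}

console.log(canAccess(1, 0)); // true  -- the Secure Kernel can read normal-world memory
console.log(canAccess(0, 1)); // false -- the NT kernel cannot read secure-world memory
`}</RunnableCode>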

<PullQuote>
"VSM enables operating system software in the root and guest partitions to create isolated regions of memory for storage and processing of system security assets. Access to these isolated regions is controlled and granted solely through the hypervisor, which is a highly privileged, highly trusted part of the system's Trusted Compute Base (TCB)." -- Microsoft, *Hyper-V Top-Level Functional Specification*, chapter 15 [@ms-tlfs-vsm]
</PullQuote>

This is the second insight the article is built around: VBS is not a re-architecture. It is a re-purposing. The hypervisor was already on every Windows install for unrelated reasons. The 2015 pivot did not require new hardware, new VMs, or new CPUs. It required a new way to *organize* what was already there -- two SLAT mappings instead of one, two register snapshots instead of one, a secure-call ABI on top of the SynIC -- and a Windows-side Secure Kernel binary to run inside the new VTL1 view. The patent gave the design its formal expression; the engineering had been waiting since 2008 for the right architectural insight.<Sidenote>David Hepkin spent over a decade on the NT kernel architecture team before the VSM design; Arun Kishan was an NT kernel architect and is now Microsoft's Corporate Vice President for the Operating Systems Platform group. Neither is a virtualization specialist by background. Their patent is, in retrospect, a kernel-team idea about how to put a piece of the kernel itself behind a hardware boundary the kernel cannot cross -- exactly the kind of design that an architect who had lived inside `ntoskrnl.exe` for years would invent.</Sidenote>

Alex Ionescu's Black Hat USA 2015 deck "Battle of SKM and IUM: How Windows 10 Rewrites OS Architecture" reverse-engineered the entire VSM stack within four weeks of Windows 10 RTM [@ionescu-bh-2015]. The vocabulary Ionescu introduced has become the canonical research language for talking about VBS: VTL as "synthetic ring level managed by the hypervisor"; *trustlets* for the user-mode processes that run inside VTL1's Isolated User Mode; Signature Level 12 plus the IUM EKU `1.3.6.1.4.1.311.10.3.37` as the loader's signing requirement. Microsoft's own developer documentation now uses the same terms [@ms-iso-user-mode-trustlets].

The pivot, then, was not a sudden re-architecture. It was the cash-out of a deliberate multi-year engineering plan that began at least twenty-two months before Windows 10 RTM. To see what VBS actually enforces -- and which hypervisor primitive backs each piece of that enforcement -- we need to walk the hypervisor's public surface. There are five surfaces. They are the architectural body of the article.

## 7. Architecture Tour -- The Hypervisor's Public Surface

What does the Windows hypervisor actually look like as a piece of software? It is a small kernel, on the order of one to two hundred thousand lines of C and C++ by community estimate; Microsoft has not published a primary line count. It has five externally visible surfaces, all of which are documented in the Hyper-V Top-Level Functional Specification (TLFS) v6.0b [@ms-tlfs-pdf]. We walk them in turn.

### 7.1 Partitions, VMBus, and the VSP/VSC pair

A *partition* is the hypervisor's unit of isolation. From the Microsoft architecture page: "*The Microsoft hypervisor must have at least one parent, or root, partition, running Windows. The virtualization management stack runs in the parent partition and has direct access to hardware devices. The root partition then creates the child partitions which host the guest operating systems*" [@ms-hyperv-architecture]. The root partition is a full Windows install with privileged hypercalls and direct access to hardware; each child partition is a guest VM with only the hardware the root has chosen to expose.

A guest VM does I/O over the VMBus. A network packet, for example, travels from the guest application down to the guest's Windows NDIS stack; through the synthetic NIC miniport driver (the VSC) in the guest's kernel; over the VMBus message channel; into the network VSP in the root partition; into the root's real NDIS stack; into the physical NIC driver; out the wire. The hypervisor's role in this chain is structural: it owns the VMBus message channel, the SynIC interrupts that notify the VSP and VSC of new traffic, and the per-partition SLAT mappings that decide which bytes either side can read.

The architectural implication is that *device emulation lives in the root partition, not in the hypervisor binary*. The TCB the hypervisor binary itself has to protect is narrow. The TCB the root partition's drivers have to protect is much wider -- but those drivers live in normal Windows kernel mode, where Microsoft has thirty years of tooling. This is why almost every public Hyper-V CVE since 2018 has landed in `vmswitch.sys`, `storvsp.sys`, or the NT Kernel Integration VSP, rather than in `hvix64.exe` itself.

> **Note:** Putting device emulation in the root partition means the hypervisor binary does not need to parse Ethernet frames, SCSI commands, USB descriptors, or graphics-adapter command rings. The trade-off is that the root partition becomes part of the TCB -- a root-partition kernel-mode bug is a hypervisor-equivalent break -- but the small hypervisor binary itself can be reviewed, fuzzed, and reasoned about as a single piece of code.
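
The VSP/VSC split described above is easiest to see as two endpoints joined by a channel the hypervisor mediates. The sketch below is structural only -- real VMBus is a pair of shared-memory ring buffers signaled over SynIC, and none of these names exist in the driver kit -- but it keeps the topology: the guest's synthetic NIC never touches the physical device; it only ever sees the channel.

<RunnableCode lang="js" title="VSP/VSC over a VMBus channel (structural sketch)">{`
// A toy structural model of a VMBus channel between a Virtualization Service
// Client (guest kernel) and a Virtualization Service Provider (root partition).
class VmbusChannel {
  constructor(vsp) { this.vsp = vsp; }
  send(message) {
    // In hardware terms: write into the ring buffer, then signal a SynIC
    // interrupt so the VSP in the root partition is scheduled to drain it.
    return this.vsp.handle(message);
  }
}

// Runs in the root partition; owns the physical NIC through the real driver.
const networkVsp = {
  handle(message) {
    if (message.type !== 'tx-packet') return { status: 'unsupported' };
    console.log('root partition: forwarding ' + message.bytes + ' bytes to the physical NIC');
    return { status: 'ok' };
  },
};

// The synthetic NIC miniport (VSC) in the guest only ever holds the channel.
const channel = new VmbusChannel(networkVsp);
console.log(channel.send({ type: 'tx-packet', bytes: 1514 }));
`}</RunnableCode>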

### 7.2 The hypercall ABI

Hypercalls are how partitions request services from the hypervisor. The TLFS documents two flavors. A *fast* hypercall passes its parameters inline in CPU registers: on x64, `rcx` carries a 64-bit hypercall input value (the low 16 bits are the call code; the upper 48 bits are a control word with fields for the Fast flag, variable-header size, Rep Count, and Rep Start Index), `rdx` carries the first input parameter, and `r8` carries the second. A *slow* hypercall instead passes the GPA (guest physical address) of an input-parameter page in `rdx`, and the GPA of an output-parameter page in `r8`; the actual parameter content lives in those pages. The instruction that triggers the hypercall is `vmcall` on Intel and `vmmcall` on AMD; the hypervisor maps both onto the same internal entry point [@ms-tlfs-pdf].

<Definition term="Hypercall">
A guest-to-hypervisor call. The guest issues `vmcall` (Intel) or `vmmcall` (AMD); the CPU traps via VM-EXIT into the hypervisor in VMX root mode; the hypervisor reads the call code from `rcx`, reads the inputs from registers (fast) or from a GPA-pointed page (slow), services the request, writes outputs back, and returns via VM-ENTRY. Hypercalls are the only legitimate way for a partition to invoke hypervisor services [@ms-tlfs-pdf].
</Definition>

<RunnableCode lang="js" title="Hypercall input value layout (TLFS section 3)">{`
// A JavaScript model of the rcx hypercall input value layout.
// In a real hypercall the guest sets rcx, rdx, r8 and issues vmcall / vmmcall.
function packHypercallInput({ callCode, fastFlag, varHeaderSize, isNested, repCount, repStartIdx }) {
  // rcx layout (TLFS section 3 "Hypercall Interface", verbatim bit map)
  //   bits  0..15  Call Code
  //   bit      16  Fast (1 = inline params in rdx/r8)
  //   bits 17..26  Variable header size (in QWORDs)
  //   bits 27..30  RsvdZ
  //   bit      31  Is Nested
  //   bits 32..43  Rep Count
  //   bits 44..47  RsvdZ
  //   bits 48..59  Rep Start Index
  //   bits 60..63  RsvdZ
  let rcx = 0n;
  rcx |= BigInt(callCode) & 0xFFFFn;
  if (fastFlag) rcx |= 1n << 16n;
  rcx |= (BigInt(varHeaderSize) & 0x3FFn) << 17n;
  if (isNested) rcx |= 1n << 31n;
  rcx |= (BigInt(repCount) & 0xFFFn) << 32n;
  rcx |= (BigInt(repStartIdx) & 0xFFFn) << 48n;
  return rcx;
}
// Example: HvCallPostMessage (call code 0x005C, TLFS section 11), packed here
// with the Fast flag set purely to illustrate the control-word bit layout.
const rcx = packHypercallInput({
  callCode: 0x005C,
  fastFlag: 1,
  varHeaderSize: 0,
  isNested: 0,
  repCount: 0,
  repStartIdx: 0,
});
console.log('rcx = 0x' + rcx.toString(16).padStart(16, '0'));
// Output: rcx = 0x000000000001005c
`}</RunnableCode>

The call-code space is small and well-documented: a few hundred codes, each one a structured request with typed inputs and outputs. The hypercall path is also where the most consequential 2024 Hyper-V CVE lived. CVE-2024-21407 was a use-after-free in `hvix64.exe`'s handling of a specific file-operation hypercall, the rare case where the bug was in the hypervisor binary itself rather than in a root-partition driver [@nvd-cve-2024-21407].

### 7.3 Intercepts

Intercepts are how the hypervisor virtualizes guest behavior. The TLFS distinguishes four categories: *instruction* intercepts (`CPUID`, MSR reads/writes, I/O-port instructions), *exception* intercepts (page faults, general protection faults), *memory-access* intercepts (a guest tries to read or write a specific guest-physical-address region), and *partition-state* intercepts (a guest hits a state that the hypervisor wants to be notified about). Each is configured per-partition through the Intel VMCS execution-control bits or the AMD VMCB control fields [@ms-tlfs-pdf].

<Definition term="Intercept">
A configurable hypervisor notification on a specific guest event. The hypervisor programs the VMCS or VMCB to fire a VM-EXIT when the guest issues a particular instruction, raises a particular exception, accesses a particular memory region, or transitions to a particular state. Intercepts are the policy mechanism that lets the hypervisor implement device emulation, security checks, and VTL transitions [@ms-tlfs-pdf].
</Definition>

For VBS, the load-bearing intercept is the memory-access intercept. When VTL0 code tries to access a region whose VTL0 SLAT mapping is unreadable or unwritable, the access traps to the hypervisor with the offending GPA; the hypervisor can deliver the intercept to the VTL1 Secure Kernel as a *secure call*, letting VTL1 see what VTL0 was trying to do and decide whether to allow it. This is how HVCI's W^X enforcement is wired: a VTL0 page that is marked writable in VTL0's SLAT is marked non-executable in the same SLAT; an attempt to switch the same page to executable becomes a memory-access intercept that VTL1 must approve.
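
A sketch of that flow, in the same toy-model spirit as the hypercall example above: the page state and the VTL1 check are invented stand-ins, but the invariant they encode -- a VTL0 page is never simultaneously writable and executable -- is the one HVCI actually enforces.

<RunnableCode lang="js" title="HVCI's W^X decision over a memory-access intercept (toy model)">{`
// A toy model of W^X enforcement over memory-access intercepts. VTL0 requests
// a permission change; the hypervisor turns it into an intercept; a stand-in
// for the VTL1 code-integrity service decides whether +X may be granted.
function vtl1ApprovesExecute(page) {
  // Stand-in for the Secure Kernel's CI check: only verified, no-longer-writable
  // image pages may become executable in VTL0's SLAT.
  return page.verifiedImage === true && page.writable === false;
}

function requestVtl0PermissionChange(page, wants) {
  if (wants.writable && wants.executable) {
    return 'denied: W^X -- a page may be writable or executable, never both';
  }
  if (wants.executable) {
    // Memory-access intercept -> delivered to VTL1 as a secure call.
    return vtl1ApprovesExecute(page)
      ? 'granted: VTL0 SLAT entry marked executable'
      : 'denied by the VTL1 code-integrity service';
  }
  return 'granted';
}

const shellcodePage   = { verifiedImage: false, writable: true };
const signedImagePage = { verifiedImage: true,  writable: false };
console.log(requestVtl0PermissionChange(shellcodePage,   { writable: false, executable: true }));
console.log(requestVtl0PermissionChange(signedImagePage, { writable: false, executable: true }));
`}</RunnableCode>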

### 7.4 The Synthetic Interrupt Controller (SynIC)

The Synthetic Interrupt Controller, SynIC, is the hypervisor's per-virtual-processor event delivery surface. Each VP has 16 Synthetic Interrupt Source (SINT) lines, a message page (where the hypervisor places message-shaped events), an event-flag page (where it places bit-flag events), and a set of synthetic timers. SynIC is the bus on which VMBus traffic between VSP and VSC moves; it is also the bus on which VTL transitions between VTL0 and VTL1 are delivered inside the root partition [@ms-tlfs-pdf].

<Definition term="Synthetic Interrupt Controller (SynIC)">
A hypervisor-emulated interrupt controller, parallel to the hardware APIC, that delivers hypervisor-originated events to a virtual processor. Each VP has 16 SINT lines, a message page, an event-flag page, and synthetic timers. VMBus signaling rides on SynIC; secure-call delivery between VTL0 and VTL1 rides on SynIC; vTPM, virtual-PCI, and other paravirtualized device events ride on SynIC [@ms-tlfs-pdf].
</Definition>

For VBS, the secure-call ABI -- the way VTL0 code asks VTL1 to do something -- is built on SynIC. A VTL0 caller writes a request into a shared message page, signals a SINT, and yields the CPU; the hypervisor switches SLAT context to VTL1, delivers the message, and lets VTL1 read the request. When VTL1 finishes, it signals a SINT back to VTL0 and the hypervisor switches contexts again. Credential Guard's whole communication path between VTL0 LSASS and VTL1 LSAISO is one of these secure-call channels.
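
A structural sketch of one round trip. The service name and message-page fields below are invented for illustration, not the Secure Kernel's real service table; what the model keeps is the shape -- shared message page, SINT signal, hypervisor-mediated context switch, and a response that contains only what VTL1 chose to return.

<RunnableCode lang="js" title="One secure-call round trip over SynIC (structural sketch)">{`
// A toy model of a VTL0 -> VTL1 secure call mediated by SynIC.
const messagePage = { request: null, response: null };   // the only shared state

// Stand-ins for VTL1 services reachable over the secure-call channel.
const vtl1Services = {
  ProtectSecret(payload) {
    // The plaintext stays on the VTL1 side; only an opaque handle comes back.
    return { handle: 0x1234, note: 'plaintext for ' + payload.user + ' retained in VTL1 memory' };
  },
};

function secureCall(request) {
  messagePage.request = request;        // 1. VTL0 fills the shared message page
  // 2. VTL0 signals a SINT and yields; the hypervisor switches SLAT + register
  //    state to VTL1 and delivers the message.
  const service = vtl1Services[messagePage.request.service];
  messagePage.response = service ? service(messagePage.request.payload)
                                 : { error: 'no such VTL1 service' };
  // 3. VTL1 signals completion; the hypervisor switches context back to VTL0.
  return messagePage.response;          // 4. VTL0 sees only the returned fields
}

console.log(secureCall({ service: 'ProtectSecret', payload: { user: 'alice' } }));
`}</RunnableCode>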

### 7.5 Memory and per-VTL SLAT

The last surface is also the most important: memory. Guest physical addresses (GPAs) are translated to system physical addresses (SPAs) by per-partition SLAT page tables. The hypervisor has exclusive control of these tables; no partition, including the root, can read or modify them directly. For VBS specifically, the hypervisor maintains *two* SLAT mappings per partition -- one for VTL0 and one for VTL1 -- and switches between them on VTL transitions.

<Mermaid caption="Per-VTL SLAT enforcement. The same machine-physical page can have different mappings in VTL0's and VTL1's SLAT tables. A VTL1 trustlet's secret-key page is mapped read/write/execute in VTL1's SLAT but is entirely absent (or marked no-read, no-write, no-execute) in VTL0's SLAT. The hypervisor switches SLAT contexts on every VTL transition, so a VTL0 access to a VTL1-only page traps before the hardware MMU can satisfy it.">
flowchart LR
    GPA["Guest physical address (GPA)"]
    SLAT0["VTL0 SLAT mapping"]
    SLAT1["VTL1 SLAT mapping"]
    SPA["System physical address (SPA)"]
    HV["Hypervisor (owns both SLAT trees)"]
    GPA -->|VTL0 active| SLAT0
    GPA -->|VTL1 active| SLAT1
    SLAT0 -->|"normal pages"| SPA
    SLAT1 -->|"secret pages, +rwx"| SPA
    SLAT0 -.->|"VTL1 secret pages: not present"| SPA
    HV -.->|"switches context on VTL transition"| SLAT0
    HV -.->|"switches context on VTL transition"| SLAT1
</Mermaid>

This is the architectural reason VTL0 kernel mode, even with full Ring-0 code execution, cannot read or execute VTL1 memory. On a VTL0 load from a VTL1-only page, the guest page-table walk may resolve a guest physical address, but the active VTL0 SLAT has no mapping for it; the hardware MMU raises an EPT/NPT violation; the hypervisor handles the violation according to the VTL0 intercept policy. In the security-relevant case, the hypervisor delivers an access-denied result to VTL0 and continues. There is no kernel-mode `mov` instruction sequence that can defeat this, because the gating happens in hardware page-table walks that VTL0 kernel mode cannot influence.
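
The same two-stage walk sketched in section 3, now with the second stage selected by the active VTL. Page numbers are invented; the selection logic is the part that mirrors the hardware.

<RunnableCode lang="js" title="Per-VTL SLAT: two views of the same machine memory (toy model)">{`
// A toy model of per-VTL SLAT. One guest-physical page (a trustlet secret)
// is mapped only in VTL1's tree; the VTL0 walk for it finds no entry at all.
const slatByVtl = [
  /* VTL0 */ new Map([[0x00042, { spa: 0x9a011, perms: 'rw-' }]]),
  /* VTL1 */ new Map([
    [0x00042, { spa: 0x9a011, perms: 'rw-' }],
    [0x000a7, { spa: 0x9a0ff, perms: 'rwx' }],   // VTL1-only secret page
  ]),
];

function slatWalk(activeVtl, gpaPage) {
  const entry = slatByVtl[activeVtl].get(gpaPage);
  if (!entry) return 'EPT/NPT violation -> hypervisor intercept (denied to VTL' + activeVtl + ')';
  return 'SPA page 0x' + entry.spa.toString(16) + ' [' + entry.perms + ']';
}

console.log('VTL1 touches GPA 0xa7:', slatWalk(1, 0x000a7)); // resolves to the secret page
console.log('VTL0 touches GPA 0xa7:', slatWalk(0, 0x000a7)); // traps before the MMU can satisfy it
`}</RunnableCode>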

Five surfaces. Two of them -- the hypercall ABI and the device-emulation paths that surface over VMBus -- are where every public Hyper-V escape since 2018 has lived. The other three (intercepts, SynIC, per-VTL SLAT) are the substrate on which VBS, HVCI, Credential Guard, and System Guard Secure Launch are built. We turn to those next.

## 8. How the Hypervisor Enforces Each VBS Feature

The hypervisor itself does not know anything about credentials, code signing, application allowlisting, or DMA protection. It knows about partitions, VTLs, intercepts, SLAT entries, and hypercalls. Each Windows security feature is built by *composing* those primitives in a specific way. The mapping is precise and worth walking, because it is what makes the substrate a *security* primitive rather than just a virtualization product [@ms-hardware-root-of-trust].

**HVCI / Memory Integrity.** [Hypervisor-protected Code Integrity](/blog/when-system-isnt-enough-the-windows-secure-kernel-and-the-en/) is the most consequential VBS feature on a per-byte basis: it changes Windows from a system that lets the kernel execute any signed driver to one where the kernel cannot execute *any* page until VTL1 has approved it. VTL1's code-integrity service inspects every kernel-mode page mapping change request before the SLAT entry that would make the page executable in VTL0 is granted. The W^X invariant -- a single page can be writable or executable, but never both -- is enforced not by NT kernel cooperation but by the per-VTL SLAT, exactly as described in section 7.5. An NT-kernel attempt to mark a writable page executable becomes a memory-access intercept that VTL1's CI service evaluates [@ms-enable-vbs-hvci]. The hypervisor primitives composed: per-VTL SLAT + memory-access intercepts + secure-call ABI.

<Definition term="Trustlet">
A user-mode process that runs inside VTL1's Isolated User Mode (IUM). Trustlets must be signed with the Windows System Component Verification certificate (Signature Level 12) and carry the IUM EKU `1.3.6.1.4.1.311.10.3.37`. The shipping inbox trustlets include `LSAISO.EXE` (Credential Guard), `VMSP.EXE` (host side of virtual TPM), and the vTPM provisioning trustlet [@ms-iso-user-mode-trustlets, @ionescu-bh-2015].
</Definition>

**Credential Guard.** `LSAISO.EXE` -- the LSA-Isolated trustlet -- runs in VTL1 Isolated User Mode. NTLM password hashes and Kerberos Ticket-Granting Tickets that LSASS used to keep in normal VTL0 memory are moved to VTL1 memory that VTL0 cannot read. VTL0 LSASS performs credential operations by sending a request to LSAISO over a secure-call channel mediated by the hypervisor's SynIC; LSAISO does the cryptographic work and returns a result. The plaintext of the credential never leaves VTL1. This is why a Ring-0 attacker on a Credential Guard-enabled Windows install cannot dump LSASS hashes -- they aren't in LSASS [@ms-iso-user-mode-trustlets]. The hypervisor primitives composed: per-VTL SLAT (to hide LSAISO's memory) + SynIC (to deliver secure calls) + intercepts (to catch VTL0 attempts to access LSAISO memory). See the sibling [Credential Guard / NTLMless](https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/) article for VTL1 internals.

<Definition term="Secure Call">
The VTL0-to-VTL1 calling convention. A VTL0 caller fills in a shared parameter page, signals a SynIC interrupt configured for VTL transition, and yields. The hypervisor switches SLAT context to VTL1, delivers the message, and lets the Secure Kernel dispatch it via `IumInvokeSecureService` to a registered VTL1 service. On return, the hypervisor switches contexts back. The whole round-trip is mediated by hypervisor primitives the calling VTL cannot bypass [@ionescu-bh-2015].
</Definition>

**[Application Control (WDAC)](/blog/who-is-this-code----the-quiet-33-year-reinvention-of-app-ide/).** The same VTL1 code-integrity service that backs HVCI also evaluates user-mode policy. When VTL0 user mode tries to load a binary that is restricted by WDAC policy, the load becomes a secure call into VTL1; VTL1's policy engine evaluates the signature, the certificate chain, and the configured policy; the secure call returns approval or denial. WDAC policy lives in VTL1, the policy database lives in VTL1, and a VTL0 administrator who has been compromised cannot edit either. The hypervisor primitives composed: same as HVCI, plus a richer secure-call API for policy evaluation.

**VBS Enclaves.** A third-party application can load native code into a VTL1 IUM enclave. The enclave executes in VTL1, with its memory hidden from VTL0; the application talks to the enclave through a secure-call ABI exposed by the Secure Kernel. Architecturally parallel to Credential Guard but available to ordinary application developers. The hypervisor primitives composed: per-VTL SLAT (to hide enclave memory) + secure-call ABI (to invoke enclave code) + a Secure Kernel API for enclave creation, attestation, and destruction.

**System Guard Secure Launch (DRTM).** Intel TXT's `GETSEC[SENTER]` (and AMD's `SKINIT` on AMD platforms) performs a hardware-rooted dynamic measurement of the hypervisor and the Secure Kernel into TPM PCRs 17-22 *after* firmware initialization [@ms-system-guard-secure-launch]. This re-establishes the trust root post-firmware: a pre-boot firmware compromise that survived UEFI Secure Boot cannot silently poison the hypervisor's launch state without showing up as an unexpected measurement in a PCR that VTL1 can read. The hypervisor primitives composed: DRTM event registration with the hardware + TPM PCR extension + a VTL1-side attestation API. See the sibling [Secure Boot](https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/) article for the static-RTM half of the same story.
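
The primitive underneath the measurement is the TPM extend operation: a PCR only ever moves forward, by hashing its old value together with the new measurement, so any substitution earlier in the chain changes every later value. The sketch below uses a small stand-in hash (not SHA-256) and invented measurement strings; only the chaining semantics are the point.

<RunnableCode lang="js" title="PCR extend chaining under DRTM (toy model)">{`
// A toy model of TPM PCR extend semantics. fnv1a stands in for SHA-256 --
// never reuse it for anything real; only the chaining structure matters here.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, '0');
}

// PCR := H(PCR || measurement) -- the only operation allowed, besides the
// reset that the DRTM event performs on the dynamic PCRs (17-22).
function extend(pcr, measurement) {
  return fnv1a(pcr + measurement);
}

let pcr17 = '00000000';                                    // reset by the DRTM event
pcr17 = extend(pcr17, 'hvix64.exe#expected-hash');
pcr17 = extend(pcr17, 'securekernel.exe#expected-hash');
console.log('expected PCR17:', pcr17);

let rolledBack = '00000000';
rolledBack = extend(rolledBack, 'hvix64.exe#older-vulnerable-build');
rolledBack = extend(rolledBack, 'securekernel.exe#expected-hash');
console.log('observed PCR17:', rolledBack);                // differs -- attestation sees it
`}</RunnableCode>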

**Kernel DMA Protection.** External devices over Thunderbolt, USB4, or hot-plug PCIe can issue DMA to arbitrary physical addresses, bypassing the CPU's MMU entirely. The hypervisor configures the IOMMU (Intel VT-d / AMD-Vi) to deny DMA from externally-attached devices outside of explicitly-authorized memory regions, and to refuse DMA from any device before its kernel-mode driver has been loaded under a trusted policy [@ms-kernel-dma-protection]. The hypervisor primitives composed: hypervisor-owned IOMMU configuration + memory-access intercepts on the IOMMU configuration MMIO region.

The shape of the table is the point.

| Feature | Composed primitives | Hypervisor mechanism, in one line |
|---|---|---|
| HVCI | per-VTL SLAT + memory-access intercepts + secure-call ABI | VTL1 vets each VTL0 page-mapping change before granting +X |
| Credential Guard | per-VTL SLAT + SynIC + intercepts | LSAISO trustlet memory absent from VTL0 SLAT mapping |
| WDAC (AppControl) | secure-call ABI + VTL1 policy engine | VTL0 binary load = secure call into VTL1 CI service |
| VBS Enclaves | per-VTL SLAT + secure-call ABI | Third-party VTL1 IUM enclave invoked over secure call |
| System Guard Secure Launch | hardware DRTM (TXT/SKINIT) + TPM PCR extension | `SENTER` / `SKINIT` measures hypervisor into PCRs 17-22 |
| Kernel DMA Protection | hypervisor-owned IOMMU + MMIO intercepts | VT-d/AMD-Vi denies DMA outside authorized regions |

<Aside label="What the hypervisor does not know">
The hypervisor knows nothing about NTLM hashes, Kerberos tickets, code-signing certificates, WDAC policy XML, or DMA-region authorization. All of that policy lives in VTL1 -- in the Secure Kernel, in LSAISO, in the WDAC service. The hypervisor only provides the *mechanism* for one piece of policy to evaluate a request from another piece of policy in isolation. This is the architectural separation that lets the hypervisor binary stay small and the Windows-side security feature set keep growing.
</Aside>

The pattern: each feature is a different *composition* of the same five primitives (partitions, hypercalls, intercepts, SynIC, per-VTL SLAT). The hypervisor is genuinely a primitive in the formal sense -- a small set of mechanisms that compose into many security policies. If the hypervisor is the mechanism, the *boundary* the hypervisor enforces is the contract. Microsoft commits to servicing certain attacks against that boundary and explicitly excludes others. To know what we are getting, we need to read the contract.

## 9. The Security Boundary Microsoft Commits To

The Microsoft Security Servicing Criteria for Windows is a public document. It enumerates which classes of attack Microsoft will issue a CVE and an out-of-band patch for, and which it will not. For the hypervisor, the document is unusually specific [@ms-msrc-servicing-criteria].

The two relevant boundaries:

- **Hypervisor / virtualization boundary.** An L1-guest-to-host or guest-to-guest break is serviced. If a guest VM can execute code in the root partition or in another guest's address space, Microsoft will issue a CVE.
- **Virtual Secure Mode (VBS) boundary.** VTL0 kernel-mode code reading or writing VTL1 memory, or executing VTL1 code, is a serviced break. If a Ring-0 attacker in VTL0 can defeat the per-VTL SLAT, Microsoft will issue a CVE.

What the servicing criteria *does not* commit to is also worth naming. A same-VTL elevation of privilege inside a guest (a guest user becoming guest SYSTEM) is not a hypervisor break -- it is a Windows EoP, serviced under the Windows kernel boundary, not the hypervisor boundary. A denial-of-service of the host from a guest is generally not a serviced hypervisor break unless it produces a memory corruption that an attacker can ride to RCE. An administrator in the root partition reading guest memory is not a break at all -- the root partition is part of the hypervisor's TCB by definition, and root-partition admin is hypervisor-admin in the threat model.
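
Those distinctions compress into a small decision function. The sketch below is this article's distillation of the serviced / not-serviced lines drawn above -- the category strings and field names are ours, not the MSRC document's wording.

<RunnableCode lang="js" title="The serviced-boundary rules above, distilled into a classifier">{`
// Distills the serviced / not-serviced distinctions described above.
// Category strings are this article's shorthand, not MSRC's wording.
function classifyAttack(attack) {
  // Root-partition admin is inside the hypervisor's TCB: not a boundary at all.
  if (attack.from === 'root-admin') {
    return { serviced: false, boundary: 'none (root partition is in the TCB)' };
  }
  // A guest escaping to the host or to another guest: the hypervisor boundary.
  if (attack.from === 'guest' && (attack.to === 'host' || attack.to === 'other-guest')) {
    if (attack.effect === 'code-execution' || attack.effect === 'memory-corruption') {
      return { serviced: true, boundary: 'hypervisor / virtualization' };
    }
    // Pure DoS from a guest is generally not serviced unless it rides to RCE.
    return { serviced: false, boundary: 'hypervisor (DoS only, generally unserviced)' };
  }
  // VTL0 kernel reading, writing, or executing VTL1: the VBS / VSM boundary.
  if (attack.from === 'vtl0-kernel' && attack.to === 'vtl1') {
    return { serviced: true, boundary: 'Virtual Secure Mode (VBS)' };
  }
  // Same-VTL elevation inside a guest: a Windows kernel EoP, not a hypervisor break.
  if (attack.from === 'guest-user' && attack.to === 'guest-system') {
    return { serviced: true, boundary: 'Windows kernel (EoP), not hypervisor' };
  }
  return { serviced: false, boundary: 'unclassified' };
}

console.log(classifyAttack({ from: 'guest', to: 'host', effect: 'code-execution' }));
// -> { serviced: true, boundary: 'hypervisor / virtualization' }
console.log(classifyAttack({ from: 'vtl0-kernel', to: 'vtl1', effect: 'memory-read' }));
// -> { serviced: true, boundary: 'Virtual Secure Mode (VBS)' }
`}</RunnableCode>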

The dollar figures for these boundaries are documented in the Microsoft Hyper-V Bounty Program [@ms-msrc-bounty-hyperv]. The program ranges from \$5,000 for the lowest-impact qualifying submission up to \$250,000 for the highest. The eligibility language, quoted verbatim:

<PullQuote>
"An eligible submission includes a Remote Code Execution (RCE) vulnerability in Microsoft Hyper-V that enables a L1 guest virtual machine to compromise the hypervisor, escape from the guest virtual machine to the host, or escape to another L1 guest virtual machine." -- Microsoft Hyper-V Bounty Program [@ms-msrc-bounty-hyperv]
</PullQuote>

\$250,000 is the highest standing guest-to-host hypervisor bounty in the industry; the other major hypervisor vendors do not publish a comparable calibration. KVM is a community project with no vendor-paid bounty pool of equivalent size. Xen is a Linux Foundation project that runs a bug bounty through HackerOne but does not publicly attach a \$250,000 figure to a guest-to-host RCE. ESXi (Broadcom) does not publish a standing bounty program with a per-bug ceiling; bounty payments for ESXi RCEs typically flow through Pwn2Own and similar marketplaces, where Trend Micro's Zero Day Initiative sets the prize for any given competition.<Sidenote>The bounty calibration is itself a data point. If \$250,000 were too high, Microsoft would be drowning in submissions; if it were too low, the public CVE record would show more hypervisor breaks reported through Pwn2Own than directly to MSRC. The current equilibrium -- two to four Microsoft-direct Hyper-V CVEs per year, and no Hyper-V guest-to-host escape demonstrated at any Pwn2Own through Berlin 2025 [@zdi-pwn2own-day3] -- is consistent with the bounty being calibrated roughly correctly relative to the cost of finding a real bug.</Sidenote>

| Vendor | Hypervisor | Published bounty | Ceiling | Servicing-criteria boundary published |
|---|---|---|---|---|
| Microsoft | Hyper-V / `hvix64.exe` | Yes | \$250,000 | Yes, verbatim language |
| Xen Project | Xen | Yes (HackerOne) | Lower, varies | Yes, security policy |
| KVM | KVM (community) | No standing program | -- | No vendor-published criteria |
| Broadcom/VMware | ESXi | No standing public bounty | -- | Vendor advisories per CVE |
| seL4 Project | seL4 | No (proof-rooted argument) | -- | Functional-correctness proof [@sel4-whitepaper] |

The seL4 row is included because seL4 is the only hypervisor in the table whose claim to a security boundary is *mathematical* rather than operational. seL4 ships approximately ten thousand lines of C and assembly with a machine-checked proof of functional correctness against a higher-level specification. The proof took roughly twenty-five person-years and covers a microkernel that does not by itself ship the full surface area of Hyper-V. The Microsoft hypervisor is unverified, at a line count (estimated in §7) an order of magnitude larger; its security argument is operational (a small TCB, heavy fuzzing, a standing bounty, public servicing) rather than mathematical.

A serviced boundary is a contract. Contracts are not promises; they are obligations that come due when an attacker finds a way around them. To see what the contract has actually had to pay out, we read the public CVE record.

## 10. The Public Track Record -- Six Worked CVEs Across Three Classes

We do not need an exhaustive Hyper-V CVE catalog to understand the boundary's real shape. Six worked examples, drawn from three distinct attack classes, cover every public failure mode the boundary has produced since 2018. We walk them in order.

### Class A: Device emulation in the root partition

**CVE-2021-28476 (vmswitch.sys, May 2021, CVSS 9.9).** Discovered by Ophir Harpaz at Guardicore Labs and Peleg Hadar at SafeBreach Labs using Guardicore's `hAFL1` hypervisor fuzzer, this was a guest-controlled `OID_SWITCH_NIC_REQUEST` OID parameter passed to the host-side `vmswitch.sys` driver. The driver dereferenced an attacker-influenced object pointer in the root partition's kernel mode, and the guest gained RCE on the host. The CVSS 9.9 score (AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H) reflects guest-to-host RCE with Azure-scale blast radius: the vulnerable code had shipped in vmswitch builds well before the May 2021 patch, per the Guardicore Labs technical analysis [@nvd-cve-2021-28476, @securityweek-vmswitch]. The bug is the canonical anchor for "device emulation in the root partition is the largest Hyper-V attack surface."

**CVE-2025-21333 (NT Kernel Integration VSP, January 2025, CWE-122).** The first Hyper-V CVE publicly acknowledged as exploited in the wild. The "Hyper-V NT Kernel Integration VSP" is a relatively new component that ties the Windows kernel-mode container architecture to Hyper-V's VSP/VSC pattern. A guest-controlled input triggered a heap-based buffer overflow on the host side of the integration; the host's address space was corruptible from a guest [@nvd-cve-2025-21333, @helpnet-jan2025]. The operational pattern matches the vmswitch family: a host-side component receives structured, attacker-shaped input from a guest and overflows.

### Class B: The hypercall input-validation path

**CVE-2024-21407 (Hyper-V hypercall UAF, March 2024, CVSS 8.1, CWE-416).** The rare case where the bug is in `hvix64.exe` / `hvax64.exe` itself, not in a root-partition driver. A guest crafted specially-formed file-operation hypercalls; the hypervisor dereferenced freed memory; the guest gained arbitrary host code execution [@nvd-cve-2024-21407, @theregister-march2024].

**CVE-2024-30092 (Hyper-V RCE, October 2024, CWE-20 + CWE-829).** A Hyper-V remote code execution that combined improper input validation with inclusion of functionality from an untrusted control sphere -- another hypercall-path-class bug [@nvd-cve-2024-30092, @zdi-october-2024].

**CVE-2024-49117 (Hyper-V RCE, December 2024, CVSS 8.8).** A third 2024 Hyper-V RCE; the December Patch Tuesday entry rounded out a year with three publicly-disclosed Hyper-V RCEs, the most in any twelve-month span since the 2018 vmswitch family [@nvd-cve-2024-49117].

### Class C: VTL0-to-VTL1 (the VBS break, not the hypervisor break)

**CVE-2020-0917 and CVE-2020-0918 -- Amar and King, Black Hat USA 2020.** Saar Amar and Daniel King's "Breaking VSM by Attacking SecureKernel" disclosed two paired vulnerabilities discovered with their Hyperseed hypercall fuzzer retargeted at `securekernel!IumInvokeSecureService`, the secure-call entry point. Vulnerability #1 -- which maps to CVE-2020-0917 -- is an *out-of-bounds write* in `securekernel!SkmmObtainHotPatchUndoTable`, the function that parses the hot-patch undo table at secure-call invocation time.<Sidenote>The Black Hat USA 2020 deck (verified via pdftotext at the canonical MSRC-Security-Research GitHub URL) explicitly labels Vulnerability #1 as **OOB Write**, in slides titled "The Vulnerable Function" and "The OOB" in the "Hardening SK" section [@amar-king-bh-2020]. Several secondary writeups across the web have transcribed the bug class as "OOB read," which is incorrect; the deck itself is the primary source and says write. The functions involved are also commonly conflated: `IumInvokeSecureService` is the secure-call dispatcher Hyperseed retargets to reach the buggy code; the actual bug is in `SkmmObtainHotPatchUndoTable`. The NVD entries for both CVEs are tracked as CWE-269 (Improper Privilege Management).</Sidenote> Vulnerability #2 -- CVE-2020-0918 -- is a design flaw in `SkmmUnmapMdl` that lets VTL0 pass a fully attacker-controlled Memory Descriptor List to `SkmiReleaseUnknownPTEs`.

The Microsoft response is documented end-to-end in the same deck: the Secure Kernel pool was migrated to segment heap in mid-2019, four W+X regions were reduced to +X only, and `SkpgContext` -- a HyperGuard equivalent for Secure Kernel -- was introduced [@nvd-cve-2020-0917, @nvd-cve-2020-0918].

This is a different failure class than vmswitch RCE: not guest-to-host, but VTL0-to-VTL1 -- a Secure Kernel break reached through the hypervisor's secure-call dispatch from a privileged VTL0 attacker. Microsoft services it under the VBS / VSM boundary in the servicing criteria document, even though no guest VM is involved.

> **Key idea:** Every public Hyper-V CVE since 2018 lives in one of three narrow code paths -- device emulation, hypercall input validation, or VTL0-to-VTL1 secure-call dispatch. The TLFS-visible primitives (intercepts, SynIC, per-VTL SLAT) have produced none.

### The Pwn2Own dimension

Through Pwn2Own Berlin 2025, no Hyper-V guest-to-host escape has been demonstrated live at the competition. The cross-vendor analogue -- and the industry's best calibration of how hard a hypervisor escape is to find when a researcher has a public dollar incentive and a deadline -- is STAR Labs SG's ESXi escape at Pwn2Own Berlin 2025, the first in the contest's history, executed by Nguyen Hoang Thach on Day Two (May 16, 2025) using a single integer overflow vulnerability in the hypervisor's DMA-handling path. The award was \$150,000 plus 15 Master of Pwn points; STAR Labs went on to win overall Master of Pwn for the competition with \$320,000 across three days [@starlabs-pwn2own-2025, @zdi-pwn2own-day3].

The technique class is a TOCTOU on a length field read twice during a DMA operation: the first read validates the length, the second read uses it; race the second read and you write past a fixed-size buffer on the host heap. The exploit class is structurally the same as the vmswitch family, just landed in a different vendor's device-emulation path.
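
The double-fetch fits in a dozen lines. The sketch below is a generic illustration of the bug class described here -- a length validated on the first read and used on the second -- not a reconstruction of the ESXi code; every name in it is invented.

<RunnableCode lang="js" title="The double-fetch (TOCTOU) bug class, in miniature">{`
// Generic illustration of the double-fetch class described above: a length field
// in guest-shared memory is validated on one read and used on a second, and the
// guest flips it in between. Names are invented; this is not ESXi (or Hyper-V) code.
const BUF_SIZE = 16;

function makeRacingDescriptor() {
  let reads = 0;
  return {
    // First read (validation) sees a benign length; the second read (use) sees a huge one.
    get length() { return ++reads === 1 ? 8 : 4096; },
  };
}

function vulnerableCopy(desc) {
  if (desc.length > BUF_SIZE) return 'rejected';      // fetch #1: validate
  return 'copied ' + desc.length + ' bytes into a ' + // fetch #2: use -- out-of-bounds write
         BUF_SIZE + '-byte buffer (overflow)';
}

function fixedCopy(desc) {
  const len = desc.length;                            // single snapshot of the shared field
  if (len > BUF_SIZE) return 'rejected';
  return 'copied ' + len + ' bytes (bounded)';
}

console.log(vulnerableCopy(makeRacingDescriptor()));
// -> 'copied 4096 bytes into a 16-byte buffer (overflow)'
console.log(fixedCopy(makeRacingDescriptor()));
// -> 'copied 8 bytes (bounded)'
`}</RunnableCode>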

| CVE | Class | Year | CVSS | Location | Source |
|---|---|---|---|---|---|
| CVE-2021-28476 | A: device emulation | 2021 | 9.9 | `vmswitch.sys` (root partition) | [@nvd-cve-2021-28476] |
| CVE-2025-21333 | A: device emulation | 2025 | 7.8 | NT Kernel Integration VSP (root partition) | [@nvd-cve-2025-21333] |
| CVE-2024-21407 | B: hypercall path | 2024 | 8.1 | `hvix64.exe` / `hvax64.exe` (hypervisor binary) | [@nvd-cve-2024-21407] |
| CVE-2024-30092 | B: hypercall path | 2024 | 7.5 | Hyper-V hypercall validation | [@nvd-cve-2024-30092] |
| CVE-2024-49117 | B: hypercall path | 2024 | 8.8 | Hyper-V hypercall validation | [@nvd-cve-2024-49117] |
| CVE-2020-0917/0918 | C: VTL0-to-VTL1 | 2020 | 6.8 | `securekernel.exe` (VTL1, reached via secure call) | [@nvd-cve-2020-0917, @nvd-cve-2020-0918, @amar-king-bh-2020] |

<Mermaid caption="The three failure-mode regions on the Hyper-V attack surface. Class A: device emulation paths that surface inside the root partition (vmswitch.sys, NT Kernel Integration VSP). Class B: the hypervisor's own hypercall input-validation path inside hvix64.exe / hvax64.exe. Class C: the VTL0-to-VTL1 secure-call dispatch into the Secure Kernel. The TLFS-visible primitives -- SLAT, SynIC, intercepts -- have not produced a public CVE in any of these three classes since 2018.">
flowchart LR
    subgraph CA["Class A: device emulation (root partition)"]
        Vmswitch["vmswitch.sys -- CVE-2021-28476"]
        Vsp["NT Kernel Integration VSP -- CVE-2025-21333"]
    end
    subgraph CB["Class B: hypercall input validation (hypervisor binary)"]
        UAF["CVE-2024-21407 (UAF)"]
        Input["CVE-2024-30092"]
        Hpcall["CVE-2024-49117"]
    end
    subgraph CC["Class C: VTL0-to-VTL1 (secure call dispatch)"]
        Oob["CVE-2020-0917 (OOB write)"]
        Mdl["CVE-2020-0918 (SkmmUnmapMdl)"]
    end
    Guest["Guest VM"] --> CA
    Guest --> CB
    Vtl0["Privileged VTL0 (kernel)"] --> CC
</Mermaid>

This is the third insight the article is built around. The reader's prior model may have been "hypervisors fail in mysterious, deep ways; the boundary is fragile in unknown places." The new model is "every public Hyper-V escape since 2018 lives in one of three narrow code paths, and the TLFS-visible primitives have produced none." The narrowness of the failure space is itself a security argument. The hypervisor's micro-kernelized design has held; what has not always held are the components Microsoft chose to put *next to* the hypervisor, in the root partition's user mode and kernel mode, by deliberate architectural choice in 2008.

Six worked examples; three classes; one boundary; an unflinching public record. The boundary is alive and producing CVEs at roughly two to four per year. But every CVE so far has lived somewhere the hypervisor itself controls. The interesting question is what lives in places it does not control.

## 11. The Residual Attack Surface -- Beneath, Beside, and Around

The hypervisor enforces a clean boundary against everything *above* it -- the NT kernel, user mode, even other guest VMs. It cannot, by construction, enforce anything against what lives *below* it, *beside* it, or *around* it in the servicing pipeline. Four structural classes of residual attack matter. We walk each.

### 11.1 Firmware below the hypervisor

System Management Mode (SMM), the UEFI runtime, the Intel Management Engine (ME), and the AMD Platform Security Processor (PSP) all run at higher privilege than the hypervisor for parts of boot and runtime. SMM in particular is a CPU mode that is invoked through System Management Interrupts (SMIs) and has unrestricted access to all of physical memory, including the hypervisor's own pages. If the OEM-supplied SMM handler contains an exploitable bug, an SMI can run attacker code in a privilege mode strictly above the hypervisor's.

The threat is not hypothetical. The Binarly research team's 2023 LogoFAIL disclosures showed entire classes of image-parser bugs in UEFI firmware reachable from a privileged OS context; BootHole (CVE-2020-10713, a buffer overflow in GRUB2's `grub.cfg` parser) and BlackLotus (CVE-2022-21894, a UEFI Secure Boot bypass) showed that pre-boot bugs in widely-deployed bootloaders could ride past Secure Boot. None of these is a hypervisor bug; all of them are residual attack surface from the hypervisor's point of view.

Microsoft's mitigation is the *dynamic* root of trust for measurement -- System Guard Secure Launch -- which we touched on in section 8. After UEFI Secure Boot has done its static-RTM job, Intel TXT's `SENTER` (or AMD's `SKINIT`) executes a CPU-hardware-rooted late launch: the CPU resets to a known state, runs a vendor-authenticated launch module (Intel's signed SINIT ACM; on AMD, the secure loader that `SKINIT` measures), and measures the hypervisor binary into TPM PCRs 17-22 before transferring control to it. The result is that even if pre-boot firmware is compromised, the post-DRTM PCR values reflect the actual hypervisor binary; a compromised UEFI cannot silently substitute a different hypervisor without changing the attestation [@ms-system-guard-secure-launch, @ms-hardware-root-of-trust]. The residual after DRTM: OEMs that don't ship Secure Launch on their motherboards, or that ship buggy SMM handlers that can be invoked after launch.
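
The measurement itself is a one-way fold: a PCR is never written, only extended, so a component measured later cannot erase or rewrite what was measured before it. The sketch below shows the extend semantics with a toy stand-in hash so it runs anywhere -- real PCR banks use SHA-1 or SHA-256, and the event data is structured rather than strings; only the fold is the point.

<RunnableCode lang="js" title="PCR extend semantics: a fold, not an overwrite (toy hash standing in for SHA-256)">{`
// TPM PCR extend semantics: new = H(old || digest-of-event). A PCR can only be
// folded forward, never overwritten, which is why a later component cannot hide
// an earlier measurement. toyHash() stands in for SHA-256 so this runs anywhere;
// the component names are placeholders, not real DRTM event data.
function toyHash(str) {
  let h = 0x811c9dc5; // FNV-1a, standing in for a real cryptographic hash
  for (let i = 0; i < str.length; i++) {
    h = Math.imul(h ^ str.charCodeAt(i), 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, '0');
}

function extendPcr(pcr, eventData) {
  // Fold the digest of the measured component into the existing PCR value.
  return toyHash(pcr + toyHash(eventData));
}

let pcr17 = '00000000'; // reset to zero at the DRTM late-launch event
for (const component of ['drtm-launch-module', 'hvix64.exe', 'securekernel.exe']) {
  pcr17 = extendPcr(pcr17, component);
  console.log(component, '->', pcr17);
}
// Change any measured component (or the order) and every subsequent value differs,
// which is what a VTL1-side attestation check detects.
`}</RunnableCode>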

### 11.2 Hardware side channels

Microarchitectural side-channel attacks cross the VTL boundary at the level of CPU implementation, not at the level of architectural specification. The 2018 Spectre and Meltdown disclosures -- followed by the L1TF, MDS, Retbleed, and CacheWarp families in the years since -- showed that speculatively-executed code on a CPU can leak microarchitectural state across privilege boundaries that the architectural ISA promises to protect [@meltdownattack, @usenix-spectre-2018].

Microsoft's mitigation cadence has been in-tree and aggressive: Kernel Virtual Address Shadow (the Windows equivalent of KPTI) for Meltdown; IBRS, STIBP, and retpolines for Spectre v2; HyperClear for L1TF on Hyper-V hosts. Each Patch Tuesday since 2018 has shipped at least one microarchitectural mitigation; cumulatively the cost has been measurable but bounded.

> **Note:** The microarchitectural ceiling is hardware, not software. Intel TDX and AMD SEV-SNP -- the two confidential-computing architectures that move the trust root from the hypervisor to per-VM hardware encryption -- both explicitly *disclaim* resistance to this class. If the CPU leaks across a Spectre-class side channel, no software-level isolation primitive (VTL, partition, SEAM, SEV-SNP) can fully recover the property. The mitigation is hardware that doesn't leak, and that mitigation arrives one CPU generation at a time.

### 11.3 IOMMU and DMA bypass

The IOMMU -- Intel VT-d, AMD-Vi -- is the hardware that gates DMA from peripheral devices to physical memory. If the IOMMU is configured correctly, a Thunderbolt-attached device cannot read or write arbitrary memory; it can only DMA to regions the OS has explicitly mapped for it. If the IOMMU is disabled, configured permissively, or has firmware bugs of its own, DMA becomes an end-run around every architectural protection above it -- including the hypervisor's.

The threat is again not hypothetical. Bjorn Ruytenberg's Thunderspy disclosure in 2020 documented seven DMA-class vulnerabilities in Thunderbolt 3 firmware, demonstrating that an attacker with physical access could read or modify arbitrary memory on a powered-on system through a malicious peripheral [@thunderspy]. The Microsoft mitigation is Kernel DMA Protection (Windows 10 1803 and later): the hypervisor configures the IOMMU at boot to deny DMA from externally-attached devices outside of explicitly authorized regions, and DMA from any peripheral whose driver has not been loaded under a trusted policy is refused at the IOMMU [@ms-kernel-dma-protection]. The structural residual: pre-boot DMA, before Windows has finished configuring the IOMMU; client motherboards that still ship with VT-d or AMD-Vi disabled in BIOS; OEMs that disable Kernel DMA Protection by default.
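
The gate itself is a small lookup -- which is why configuration, not logic, is where this protection succeeds or fails. The sketch below models the two rules just described (no DMA at all before a trusted driver has claimed the device; DMA only inside regions explicitly mapped for that device). The data shapes are ours, not the VT-d or AMD-Vi programming interface.

<RunnableCode lang="js" title="The two Kernel DMA Protection rules, as a toy gate (not the VT-d / AMD-Vi interface)">{`
// Toy model of the DMA gate Kernel DMA Protection configures in the IOMMU:
// (1) a device with no trusted driver gets no DMA at all, and (2) a claimed device
// may only DMA inside regions explicitly mapped for it. Data shapes are illustrative.
function allowDma(device, access) {
  if (!device.trustedDriverLoaded) {
    return { allow: false, reason: 'device not yet claimed by a trusted driver' };
  }
  const insideAuthorizedRegion = device.authorizedRegions.some(
    (r) => access.addr >= r.base && access.addr + access.len <= r.base + r.len,
  );
  return insideAuthorizedRegion
    ? { allow: true, reason: 'inside an authorized region' }
    : { allow: false, reason: 'outside every authorized region' };
}

const externalSsd = {
  trustedDriverLoaded: true,
  authorizedRegions: [{ base: 0x10000000, len: 0x100000 }], // mapped for this device
};

console.log(allowDma(externalSsd, { addr: 0x10000400, len: 0x200 }));
// -> allowed: inside the mapped region
console.log(allowDma(externalSsd, { addr: 0x00100000, len: 0x40 }));
// -> denied: e.g. an attempt to touch hypervisor or Secure Kernel pages
`}</RunnableCode>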

### 11.4 Hypervisor downgrade and rollback

Alon Leviev's "Windows Downdate" at Black Hat USA 2024 disclosed a class of attack that the prior three sections do not cover: rollback of the hypervisor binary itself to a previously-vulnerable, but still validly-signed, build [@leviev-bh-2024, @nvd-cve-2024-21302].

The structural argument: UEFI Secure Boot prevents loading an *unsigned* `hvix64.exe`. It does *not* prevent loading an older `hvix64.exe` that is still validly signed and has not yet been revoked. If Microsoft fixes a Secure Kernel bug in build N+1 and a VTL0 attacker can convince the system to load build N at the next reboot, the patched bug is alive again. CVE-2024-21302 demonstrated exactly this rollback against both the hypervisor and the Secure Kernel through manipulation of the Windows Update servicing pipeline. The mitigation is mandatory-update servicing combined with proactive revocation list (`dbx`) hygiene -- once an older binary's hash is in the UEFI revocation list, Secure Boot will refuse to load it -- and Microsoft completed mitigations across Windows 10 1507 through Windows Server 2019 in the July 8, 2025 update wave [@nvd-cve-2024-21302, @leviev-bh-2024].
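
The gap is visible in the shape of the check itself: the load decision keys on signature trust and revocation, never on build recency. A minimal sketch under a simplified db/dbx model follows; real UEFI image verification (Authenticode hashing, certificate chains, per-variable authentication) is considerably more involved.

<RunnableCode lang="js" title="Why rollback works: the load decision checks signatures and dbx, not build numbers">{`
// Why rollback works: Secure Boot's load decision asks "is the signer trusted (db)"
// and "is this image revoked (dbx)", never "is this the newest build".
// Simplified model; real UEFI image verification is considerably more involved.
function secureBootWillLoad(image, db, dbx) {
  const signedByTrustedKey = db.includes(image.signer);
  const revoked = dbx.includes(image.sha256);
  return signedByTrustedKey && !revoked;
}

const db = ['Microsoft Windows Production PCA 2011'];
let dbx = []; // revocation list before any dbx hygiene

const patchedBuild = { build: 'N+1', signer: db[0], sha256: 'hash-of-N-plus-1' };
const oldVulnBuild = { build: 'N',   signer: db[0], sha256: 'hash-of-N' };

console.log(secureBootWillLoad(patchedBuild, db, dbx)); // true
console.log(secureBootWillLoad(oldVulnBuild, db, dbx)); // true -- the rollback gap

dbx = ['hash-of-N']; // after the older build's hash lands in the revocation list
console.log(secureBootWillLoad(oldVulnBuild, db, dbx)); // false -- gap closed
`}</RunnableCode>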

<Mermaid caption="Platform layering and residual attack classes. Each layer carries its own residual attack surface that the hypervisor cannot, by construction, enforce against. Hardware vulnerabilities and microarchitectural side channels live below the architectural ISA the hypervisor uses. Firmware and SMM live above the hypervisor in privilege for parts of runtime. The IOMMU is configured by the hypervisor but can be bypassed if it is disabled or buggy. The Windows Update servicing pipeline can re-introduce patched bugs through binary rollback.">
flowchart TD
    HW["Hardware (CPU, RAM, IOMMU, TPM)"]
    SM["System Management Mode (Ring -2) -- residual: SMM handler bugs"]
    FW["UEFI firmware -- residual: LogoFAIL, BootHole, BlackLotus"]
    DR["DRTM ACM (Intel TXT / AMD SKINIT)"]
    HV["Microsoft Hypervisor (hvix64 / hvax64)"]
    Iommu["IOMMU (VT-d / AMD-Vi) -- residual: Thunderspy, pre-boot DMA"]
    Vtl1["VTL1 (Secure Kernel + trustlets)"]
    Vtl0["VTL0 (NT kernel + user mode)"]
    Side["Microarchitectural side channels -- Spectre / Meltdown / MDS / Retbleed"]
    Update["Windows Update servicing -- residual: hypervisor rollback (CVE-2024-21302)"]
    HW --> SM
    SM --> FW
    FW --> DR
    DR --> HV
    HV --> Iommu
    HV --> Vtl1
    HV --> Vtl0
    Side -.->|"cross all boundaries"| HV
    Update -.->|"can roll hypervisor back"| HV
</Mermaid>

<Aside label="Necessary, not sufficient">
The hypervisor is necessary but not sufficient. The firmware-Secure-Boot-DRTM substrate beneath it, the microarchitectural ceiling above it, the IOMMU configuration beside it, and the Windows Update pipeline that decides which hypervisor build runs next are co-equal members of the same boundary. None of them is the hypervisor; all of them have to do their job for the hypervisor's guarantees to hold. The substrate is real, but the boundary is the combination of the substrate and what holds it up.
</Aside>

Necessary, not sufficient. That phrase is the article's honest answer to the question "how good is the substrate?" The answer is that the substrate is genuine, the boundary is published, the bounty calibration is the highest in the industry, the public CVE record is alive and narrow, and the residual attack surface lives in places the hypervisor cannot by construction control. The substrate is what we have explored in detail; what holds it up is what we have just sketched. The last section turns from theory to practice.

## 12. Practical Guide, FAQ, and Closing

If you have read this far, the natural next question is "is this on, on my machine, and how do I check?" The practical answer is short.

### 12.1 Enabling and verifying VBS

VBS is configurable through several paths: Group Policy (`Computer Configuration > Administrative Templates > System > Device Guard`), Intune, MDM CSPs (`DeviceGuard/EnableVirtualizationBasedSecurity`, `DeviceGuard/ConfigureSystemGuardLaunch`), or the Windows Security UI -- with `bcdedit /set hypervisorlaunchtype Auto` controlling the hypervisor-launch prerequisite underneath all of them. Verification is best done with three small commands.

- `msinfo32` -> the Device Guard / Virtualization-based Security row. "Services Configured" lists what policy has requested; "Services Running" lists what is actually active. Kernel DMA Protection and Secure Launch each appear as their own row.
- `Get-CimInstance -ClassName Win32_DeviceGuard` -> `VirtualizationBasedSecurityStatus` (0 = off, 1 = enabled but not running, 2 = running); `SecurityServicesRunning` array (HVCI, Credential Guard, etc.); `RequiredSecurityProperties` (the policy floor).
- `bcdedit /enum` -> `hypervisorlaunchtype Auto` is the default; `loadoptions DISABLE_VBS_*` is how an administrator can opt out (you should not see these flags on a properly-configured machine).

<RunnableCode lang="js" title="VBS health check (logic equivalent to a Get-CimInstance Win32_DeviceGuard run)">{`
// Given a parsed Win32_DeviceGuard object, compute whether VBS is healthy.
// The actual Win32_DeviceGuard schema is on Microsoft Learn; this is the
// decision logic an operator would write against it.
function checkVbsHealth(dg) {
  const result = { ok: false, reasons: [] };

  // VBS itself
  if (dg.VirtualizationBasedSecurityStatus !== 2) {
    result.reasons.push('VBS is not running (status != 2)');
  }

  // HVCI (Memory Integrity)
  if (!dg.SecurityServicesRunning.includes(2)) {
    result.reasons.push('HVCI / Memory Integrity is not running');
  }

  // Credential Guard
  if (!dg.SecurityServicesRunning.includes(1)) {
    result.reasons.push('Credential Guard is not running');
  }

  // Required floor properties: 1 = hypervisor support, 2 = Secure Boot, 3 = DMA protection
  const requiredFloor = [1, 2, 3]; // property codes per Win32_DeviceGuard AvailableSecurityProperties
  for (const r of requiredFloor) {
    if (!dg.AvailableSecurityProperties.includes(r)) {
      result.reasons.push('Missing required security property: ' + r);
    }
  }

  result.ok = result.reasons.length === 0;
  return result;
}

const example = {
  VirtualizationBasedSecurityStatus: 2,
  SecurityServicesRunning: [1, 2, 3],
  AvailableSecurityProperties: [1, 2, 3, 4, 5],
};
console.log(JSON.stringify(checkVbsHealth(example), null, 2));
// -> { ok: true, reasons: [] }
`}</RunnableCode>

> **Note:** Three commands, in order: `msinfo32` for the human-readable summary; `Get-CimInstance -ClassName Win32_DeviceGuard | Format-List *` for the structured detail; `bcdedit /enum {current}` to confirm `hypervisorlaunchtype Auto` and the absence of `DISABLE_VBS_*` load options. If all three agree that VBS, HVCI, and Credential Guard are running, you are in the configuration this article describes.

### 12.2 Operational pitfalls

Two operational realities are worth flagging. First, HVCI has a *[driver block list](/blog/who-is-this-code----the-quiet-33-year-reinvention-of-app-ide/)* and will refuse to enable Memory Integrity if any incompatible driver is installed; the usual offenders are older anti-cheat drivers, third-party virtualization clients (VMware Workstation pre-2021, VirtualBox pre-6.1), and certain disk-encryption or storage-filter drivers. Microsoft maintains a public block list; the Memory Integrity UI in Windows Security will report the specific blocking driver. Second, nested virtualization is supported for Hyper-V guests on Windows 10/11 client and Server 2016+, and is required by some development workflows (WSL2 with nested containers, certain Visual Studio device emulators). Nested virtualization changes the threat model only inside the guest: the L0 hypervisor still owns the box, and the L1 guest runs its own nested hypervisor with its own VTL split, so even a fully compromised L1 guest gives an attacker no new path to the L0 host.

### 12.3 The substrate cross-reference

This article is the substrate of the Windows security series at paragmali.com. The siblings build on what is here:

- [Secure Boot in Windows](https://paragmali.com/blog/secure-boot-in-windows-the-chain-from-sector-zero-to-userini/) -- the static-RTM half of the boot trust chain that hands off to the hypervisor.
- [VBS Trustlets: What Actually Runs in the Secure Kernel](https://paragmali.com/blog/vbs-trustlets-what-actually-runs-in-the-secure-kernel/) -- the VTL1 internals that the hypervisor's secure-call ABI delivers requests to.
- [NTLMless: The Death of NTLM in Windows](https://paragmali.com/blog/ntlmless-the-death-of-ntlm-in-windows/) -- the Credential Guard story from inside LSAISO.
- [Adminless: Administrator Protection in Windows](https://paragmali.com/blog/adminless-how-windows-finally-made-elevation-a-security-boun/) -- the user-mode admin trust model that the kernel-mode VBS boundary makes possible.
- [Can This Code Do This? Windows Access Control](https://paragmali.com/blog/can-this-code-do-this----twenty-five-years-of-attacks-on-the/) -- the access-control surface that VBS supplements but does not replace.

### 12.4 Frequently asked questions

<FAQ title="Frequently asked questions">
<FAQItem question="Doesn't enabling Hyper-V slow my PC by 10-30 percent?">
The 10-30 percent number is folklore from the pre-SLAT era or from systems running HVCI-incompatible drivers in compatibility mode. For typical workloads on modern hardware (post-2018 CPUs with VT-x or AMD-V and SLAT), the measured overhead of VBS plus HVCI plus Credential Guard sits in the low single digits. Gaming and high-throughput I/O workloads can show larger gaps, especially on systems where the BIOS forces nested virtualization off or where IOMMU is disabled. The trade-off for that overhead is the security-boundary set described in this article.
</FAQItem>

<FAQItem question="Is VBS the same thing as running Windows inside a Hyper-V VM?">
No. VBS is a Virtual Trust Level split *inside* the root partition. There are no extra VMs. The normal Windows install is VTL0; the Secure Kernel plus its trustlets is VTL1. Both VTLs live in the same partition, share the same physical CPU, and are scheduled by the hypervisor as separate VTL contexts -- not as separate VMs. A Hyper-V guest VM, by contrast, is a child partition entirely separate from the root partition. The two architectures share a hypervisor binary but use different parts of it.
</FAQItem>

<FAQItem question="If I am SYSTEM, am I above the hypervisor?">
No. SYSTEM is a high VTL0 user-mode token; the hypervisor sits architecturally above all of Ring 0, which is where SYSTEM-loaded kernel drivers ultimately run. The point of the entire article is that "SYSTEM owns the box" is wrong on a VBS-enabled Windows install. SYSTEM is the most privileged Windows identity; the hypervisor is the most privileged *software*, and the two are not the same thing.
</FAQItem>

<FAQItem question="Does Secure Boot prevent hypervisor rollback?">
No. Secure Boot prevents loading an *unsigned* `hvix64.exe`. It does not prevent loading an older, signed-but-vulnerable `hvix64.exe` that has not been added to the UEFI revocation list. That gap is what CVE-2024-21302 (Windows Downdate) exploited, and the mitigation is mandatory-update servicing combined with prompt revocation-list (`dbx`) hygiene [@nvd-cve-2024-21302].
</FAQItem>

<FAQItem question="Is the Microsoft hypervisor formally verified?">
No. seL4 is formally verified at approximately ten thousand lines of code with a roughly twenty-five-person-year proof effort. The Microsoft hypervisor is unverified at an estimated one to two hundred thousand lines of code. The hypervisor's security argument is operational -- a small TCB, heavy continuous fuzzing, a standing \$5K-\$250K bounty, public servicing criteria, an unflinching public CVE record -- rather than mathematical [@sel4-whitepaper, @ms-msrc-bounty-hyperv].
</FAQItem>

<FAQItem question="Is the hypervisor on Azure the same hypervisor as on Windows 11?">
Yes, in terms of binary identity, servicing criteria, and bounty eligibility. The Microsoft hypervisor that boots on a Windows 11 client laptop and the one that boots on an Azure host server are derived from the same codebase, ship with the same servicing commitments, and qualify for the same Hyper-V bounty. The threat model differs -- Azure adds multi-tenant guest-to-guest isolation, hardware confidential-VM extensions, and a different management surface -- but the substrate is shared.
</FAQItem>
</FAQ>

### 12.5 Closing

The reason SYSTEM on a Windows 11 box cannot read LSASS, load an unsigned driver, or patch `ntoskrnl.exe` is now fully accounted for. An `hvix64.exe` or `hvax64.exe` loaded by `hvloader.efi` before `winload.exe` ever ran. A VTL split inside the root partition, made possible by Hepkin and Kishan's 2013 patent and shipped with Windows 10 RTM in 2015. Per-VTL SLAT enforcement that the NT kernel architecturally cannot touch, because the SLAT tables live in pages the hypervisor never maps into a VTL0 view. A Microsoft-published security boundary and a \$5,000-\$250,000 bounty calibrating the boundary's value, both of which are unique in the industry at this writing. A public CVE record of six worked examples across three narrow classes that the boundary has had to pay out on since 2018. And a residual attack surface -- firmware below, side channels above, IOMMU bypass beside, hypervisor rollback through the update pipeline -- that the substrate cannot, by construction, eliminate.

The hypervisor is what every other article in this series sits on. Now you have the substrate in hand. The Secure Kernel article reads differently when you have walked the per-VTL SLAT yourself. The Credential Guard article reads differently when you know that LSAISO is invoked through a hypercall-mediated secure call. The Secure Boot article reads differently when you know that the hypervisor's DRTM measurement re-establishes the trust root *after* firmware. The Adminless article reads differently when you know that the privilege ceiling on Windows 11 is not Ring 0 but a hardware boundary above it.

Above Ring Zero is not a metaphor. It is an instruction-set state. The Windows hypervisor lives there, owns the page tables that say what the OS can see, and is the architectural reason "SYSTEM-on-Windows-11" cannot do things SYSTEM used to be allowed to do.

<StudyGuide slug="windows-hypervisor-security-primitive" keyTerms={[
  { term: "VBS", definition: "Virtualization-Based Security. A Windows architecture that uses the Hyper-V hypervisor to isolate security-critical code (the Secure Kernel and trustlets) from the regular NT kernel via per-VTL SLAT." },
  { term: "VTL", definition: "Virtual Trust Level. A hypervisor-managed privilege level inside a single partition; each VTL has its own SLAT mapping, register state, and interrupt subsystem. Two VTLs ship today (VTL0 = Normal world, VTL1 = Secure world); the architecture admits up to sixteen." },
  { term: "Hypercall", definition: "A guest-to-hypervisor call issued via vmcall (Intel) or vmmcall (AMD). The hypercall ABI is documented in the TLFS; rcx carries the call code and a control word, rdx/r8 carry parameters (fast) or GPA pointers to parameter pages (slow)." },
  { term: "SynIC", definition: "Synthetic Interrupt Controller. The hypervisor's per-virtual-processor event-delivery surface. SynIC carries VMBus traffic, secure-call signaling, and synthetic timers." },
  { term: "SLAT", definition: "Second-Level Address Translation. Hardware page-table support (Intel EPT, AMD NPT) that lets the hypervisor own a separate mapping from guest-physical to system-physical addresses." },
  { term: "DRTM", definition: "Dynamic Root of Trust for Measurement. A late-launch event (Intel TXT SENTER, AMD SKINIT) that measures the hypervisor binary into TPM PCRs after firmware initialization, re-establishing the trust root post-firmware." },
  { term: "Trustlet", definition: "A user-mode process that runs inside VTL1's Isolated User Mode (IUM). Signed with Signature Level 12 plus the IUM EKU. Inbox trustlets include LSAISO (Credential Guard) and VMSP (vTPM host side)." }
]} questions={[
  { q: "Why is the same-privilege paradox an architectural ceiling rather than an implementation bug?", a: "Because the defender at privilege level P shares an address space with an attacker at the same level. The attacker can locate and edit any state the defender maintains using ordinary load/store instructions. Better defenses at P do not change where the defender lives; only moving the defender to a privilege level above P does." },
  { q: "What 2013 patent describes the per-VTL design that Windows 10 shipped in 2015?", a: "US Patent 9,430,642 B2 by David Hepkin and Arun Kishan, priority date September 17, 2013, granted August 30, 2016. It teaches hierarchical Virtual Trust Levels with per-VTL memory access protections and per-VTL virtual-processor register state." },
  { q: "Name the three classes that all post-2018 public Hyper-V CVEs fall into.", a: "Class A: device emulation in the root partition (vmswitch.sys, NT Kernel Integration VSP). Class B: hypercall input-validation inside the hypervisor binary itself. Class C: VTL0-to-VTL1 secure-call dispatch into the Secure Kernel." },
  { q: "Which hypervisor primitive does HVCI's W^X enforcement ride on?", a: "Per-VTL SLAT. An NT-kernel attempt to mark a writable VTL0 page executable becomes a memory-access intercept routed to VTL1's code-integrity service; the hypervisor only grants the new SLAT entry if VTL1 approves." },
  { q: "Why does Secure Boot not prevent hypervisor rollback?", a: "Secure Boot validates signatures, not freshness. An older, validly-signed-but-vulnerable hypervisor binary that has not been added to the UEFI revocation list (dbx) will still load. Closing this gap requires proactive dbx hygiene plus mandatory-update servicing, which is what mitigated CVE-2024-21302 Windows Downdate." },
  { q: "What is the structural difference between Blue Pill (offense) and VBS (defense)?", a: "Architecturally there is none. Both are thin Type-1 hypervisors that interpose between firmware and OS, own the second-level page tables, and are invisible to the OS unless the OS can attest to what is underneath it. The differences are whose hypervisor it is, whether it was measured at load time, and what it does with its privilege." }
]} />

