# AD Is a Graph: How BloodHound Made Defenders Think Like Attackers

> From Lambert's 2015 essay to Microsoft Security Exposure Management in 2024 -- how the attack-path graph became the default model for Active Directory security.

*Published: 2026-05-26*
*Canonical: https://paragmali.com/blog/ad-is-a-graph-how-bloodhound-made-defenders-think-like-attac*
*License: CC BY 4.0 - https://creativecommons.org/licenses/by/4.0/*

---
<TLDR>
**AD is a graph.** In April 2015 John Lambert named the missing model in two sentences. In August 2016 Andy Robbins, Rohan Vazarkar, and Will Schroeder shipped the tool that made it operational: BloodHound treats Active Directory as a directed graph of privilege relationships, queries it with Neo4j's Cypher, and turns weeks of red-team whiteboard work into a 200 ms shortest-path lookup. By November 2024 Microsoft itself shipped the same mental model as Microsoft Security Exposure Management. This article is half tool history (1998 academic attack graphs through OpenGraph 2025) and half graph-theory exposition: property graphs, BFS shortest paths, the edge taxonomy, and the open problem of *weighting* what is currently treated as one BFS hop per privilege.
</TLDR>

## 1. How do I get from this user to Domain Admin?

In 2014, a red-team analyst with a help-desk account inside a 40,000-user Active Directory forest asks the question every red-team analyst asks: *is there a path from here to Domain Admins?* The answer takes two analysts five days of PowerView scripts, hand-drawn whiteboard diagrams, and per-host RDP probing. The same question in 2024 is a ninety-character Cypher query that returns in 200 milliseconds.

What happened in those ten years is the story of one sentence -- *defenders think in lists, attackers think in graphs* -- becoming, in turn, a tool, a discipline, and a Microsoft product.

The 2014 reality was a stack of CSV files. PowerView, the PowerShell enumeration toolkit Will Schroeder first published in August 2014 [@powersploit-repo], could dump every group membership, every [Access Control Entry](/blog/windows-access-control-25-years-of-attacks/), and every active session from a low-privilege account [@powertools-powerview]. The outputs were rows. Hundreds of thousands of rows. Composing them into a coherent attack path was a job for a marker and a whiteboard, and the join keys were Distinguished Names that wrapped twice across an analyst's notebook page. Five days to map a single 40,000-user forest was not unusual. It was the price of doing business.

The 2024 reality is a query. The analyst loads SharpHound's JSON dump into a Neo4j graph, opens the BloodHound web interface in a browser, types

```cypher
MATCH p = shortestPath(
  (u:User {name:'HELPDESK@CORP.LOCAL'})-[*1..]->(g:Group {name:'DOMAIN ADMINS@CORP.LOCAL'})
) RETURN p
```

and clicks Run. The graph renders. The shortest path is highlighted. The pivot points are circled. Time elapsed: 200 milliseconds on the database, plus a second for the browser to draw the SVG [@bloodhound-ce-repo].

Whatever happened in those ten years has to be more than a software release. It has to be a change in how the entire community models the problem. The change has a date and a sentence. Both arrived in April 2015. Let us start with the sentence.

## 2. From ACL lists to graphs -- why the model matters

*An access-control list is not the same as a graph -- and the difference is everything.*

Consider five accounts in a hypothetical `CORP.LOCAL` forest. Bob, a help-desk operator, has been granted `ForceChangePassword` on Carol's account by a long-departed administrator who once needed to delegate password resets. Carol is a member of `Server Operators`. `Server Operators`, by default, can log on locally to Domain Controllers and back up the directory database. The Domain Controller hosts the [`krbtgt` account](/blog/krbtgt-the-account-that-owns-active-directory/).

Three rows in three different audit reports. One attack path.

A per-object Access Control List audit looks at Bob's row, sees a `ForceChangePassword` ACE, and flags it as "an over-broad delegation." It looks at Carol's row, sees that she belongs to `Server Operators`, and flags it as "a privileged group membership." It looks at the Domain Controller and sees that `Server Operators` has logon rights, which is the default. Nothing in the audit composes the three facts. Reachability is not a property the report computes.

<Definition term="Directed graph">
A set of nodes (vertices) and directed edges (arrows) between them. Each edge points from one node to another. A *path* is a sequence of edges that can be traversed in the direction of their arrows. A node B is *reachable* from a node A if some path leads from A to B.
</Definition>

<Definition term="Reachability">
A property of two nodes A and B in a directed graph: B is reachable from A if there exists a sequence of edges leading from A to B, regardless of length. Reachability is fundamentally a graph property and cannot be answered by inspecting any single edge in isolation.
</Definition>

Now draw the same five accounts as a directed graph. Bob is a node. Carol is a node. `Server Operators` is a node. The Domain Controller is a node. The `krbtgt` account is a node. There is an edge from Bob to Carol labelled `ForceChangePassword`. There is an edge from Carol to `Server Operators` labelled `MemberOf`. There is an edge from `Server Operators` to the Domain Controller labelled `CanRDP`. There is an edge from the Domain Controller to `krbtgt` labelled [`DCSync`](/blog/two-checkmarks-and-the-keys-to-the-kingdom-how-active-direct/). Trace the arrows. Bob can reach `krbtgt`.

<Mermaid caption="The same five-account environment as a directed graph: one path from a help-desk account to the krbtgt key">
flowchart LR
    Bob([Bob, help-desk]) -->|ForceChangePassword| Carol([Carol])
    Carol -->|MemberOf| SO([Server Operators])
    SO -->|CanRDP| DC([Domain Controller])
    DC -->|DCSync| Krb([krbtgt])
</Mermaid>

The graph form makes reachability visually obvious. The list form does not. This is not a presentation difference. It is a data-model difference, and it is the difference that decides whether a tool can answer the question *"is there a path from A to B?"* at all.

> **Key idea:** Reachability is a property of the graph that the list does not, in general, express. This is not a UX difference. It is a data-model difference, and it is the entire reason BloodHound exists.

The sentence that named this gap appeared on April 26, 2015. John Lambert, then a Distinguished Engineer in the Microsoft Threat Intelligence Center, published a short essay to a personal GitHub repository [@lambert-2015-defenders-lists-attackers-graphs]. No peer review. No formal venue. Two declarative opening sentences that every defender attack-path product since has cited approvingly.

<PullQuote>
"Defenders don't have a list of assets -- they have a graph. Assets are connected to each other by security relationships. As long as defenders use a list and attackers use a graph, attackers win." -- John Lambert, April 26, 2015.
</PullQuote>

Lambert then enumerated five concrete classes of "security dependencies" that constitute edges in any real network: shared local-admin passwords; logon scripts on file servers; print-driver propagation from print servers; certificate authorities that mint smart-card logon certificates; and database administrators who run code as a privileged DB process. The essay closed with a defender prescription: *"The first step is to visualize your network by turning your lists into graphs."*

If the diagnosis is so obvious in retrospect, why was every mainstream AD audit tool from 2000 through 2015 list-shaped? The answer is that the academic literature had the right model but the wrong substrate, and the operator community had the right substrate but the wrong model. The two did not meet until April 2015.

## 3. Attack graphs before BloodHound, 1998 to 2015

The phrase *attack graph* is older than Active Directory itself.

In September 1998, two researchers at Sandia National Laboratories presented a paper titled *A Graph-Based System for Network-Vulnerability Analysis* at the New Security Paradigms Workshop in Charlottesville, Virginia. The authors, Cynthia Phillips and Laura Painton Swiler, proposed that a network's worst-case attack was a *graph traversal* problem [@phillips-swiler-1998-nspw]. Nodes encoded network states (the set of attacker privileges across the set of hosts). Edges encoded atomic attack steps (an exploit that, given prerequisite privileges, granted new ones). The shortest path from "attacker outside the network" to "attacker has goal asset" was the worst-case attack.

Phillips and Swiler observed in a now-canonical sentence that *"the security of a network is more than the sum of the security of its hosts."* It is the conceptual ancestor of every attack-path tool that followed.

Four years later, at IEEE Symposium on Security and Privacy 2002, Oleg Sheyner (then a CMU PhD student) along with Joshua Haines and Richard Lippmann (MIT Lincoln Lab), Somesh Jha (Wisconsin), and Jeannette Wing (CMU) made the construction automatic [@sheyner-et-al-2002-attack-graphs]. They encoded the network and attacker model as an NuSMV model-checker specification, treated the negation of the security goal as a temporal-logic property, and let the model checker generate every counterexample. The union of counterexamples was the attack graph.

They also proved that the *minimum* set of edges whose removal disconnects the attacker from the goal is NP-hard, but admits an O(log n) approximation -- the first asymptotic bound for the defender-hardening problem.

<Sidenote>Sheyner's companion 2004 thesis -- *Scenario Graphs and Attack Graphs*, CMU-CS-04-122 -- remains the most readable book-length treatment of the academic-attack-graph generation [@sheyner-2004-thesis].</Sidenote>

By 2005 the academic line had a scale solution. Xinming Ou, Sudhakar Govindavajhala, and Andrew Appel at Princeton released **MulVAL** at USENIX Security 2005 [@mulval-usenix-2005], encoding network state and attacker rules as Datalog facts and running them through XSB Prolog. From the paper's abstract: *"Once the information is collected, the analysis can be performed in seconds for networks with thousands of machines."* NetSPA at MIT Lincoln Lab (2006) and TVA / CAULDRON at George Mason (2003 to 2005) achieved similar scale through different mechanisms.

In parallel, on the Windows operator side, Sean Metcalf was running adsecurity.org and documenting AD misconfiguration patterns one writeup at a time [@adsecurity-org]. Microsoft was rolling out its Enhanced Security Administrative Environment ("Red Forest") tiered-administration model -- the predecessor that the Enterprise Access Model later replaced [@ms-enterprise-access-model] -- which was implicitly graph-aware (the entire ESAE prescription is a tier diagram) but never exposed the tier graph as a queryable structure. ESAE was a deployment blueprint, not a tool.

<Mermaid caption="Timeline of attack-path analysis 1998 to 2026: the academic CVE line, the Windows operator line, and Microsoft's defender stack run in parallel for fifteen years before converging">
gantt
    title Three parallel tracks converging on the attack-path graph
    dateFormat YYYY
    axisFormat %Y

    section Academic CVE line
    Phillips and Swiler NSPW   :a1, 1998, 1999
    Sheyner et al. IEEE SP     :a2, 2002, 2003
    MulVAL USENIX              :a3, 2005, 2006
    NetSPA TVA                 :a4, 2006, 2008

    section Windows operator line
    adsecurity.org writeups    :b1, 2014, 2016
    PowerView                  :b2, 2014, 2016
    Lambert essay              :milestone, b3, 2015-04-26, 1d
    BloodHound DEF CON 24      :b4, 2016, 2018
    AzureHound BloodHound 4.0  :b5, 2020, 2023
    BloodHound CE 5.0          :b6, 2023, 2024
    ADCS attack paths          :b7, 2024, 2025
    Butterfly v6.3             :b8, 2024, 2026
    OpenGraph v8.0             :b9, 2025, 2026

    section Microsoft defender stack
    ESAE Red Forest blueprint  :c1, 2015, 2018
    Azure ATP LMPs preview     :c2, 2018, 2020
    Defender CSPM attack paths :c3, 2022, 2024
    MSEM GA                    :milestone, c4, 2024-11-19, 1d
</Mermaid>

The April 2015 essay sat between the two tracks. Lambert was inside Microsoft. He had read the academic literature. He worked next to the threat-intelligence teams who watched real intrusions unfold. The essay was the moment the two communities' vocabularies met. The diagnosis was right. The cure was sixteen months away. It would not come from Microsoft, and it would not come from academia. It would come from a red-team consultancy and a free graph database, and it would debut on a Saturday in August at a hacker convention in Las Vegas.

## 4. Defenders evaluating Active Directory as a list of ACLs

If you were a Microsoft-aligned AD security engineer in 2013, your job was to read ACLs. One object at a time. Down a list.

The mainstream defender toolchain of the era was, almost without exception, list-shaped. Microsoft shipped `dsacls.exe` and PowerShell's `Get-Acl`; both produced row-oriented output that an analyst read sequentially. The commercial AD-audit market -- NetWrix Auditor [@netwrix-auditor], ManageEngine ADAudit Plus [@manageengine-adaudit-plus], Quest ActiveRoles [@quest-activeroles], and others -- produced HTML reports with one section per directory object, one row per non-default Access Control Entry, and severity classifications based on a fixed checklist of "dangerous rights" (`GenericAll`, `WriteDacl`, `WriteOwner`, and a handful of others).

Microsoft's own *Best Practices for Securing Active Directory* document codified this approach. Its central recommendation was *per-object delegation review*: walk the directory tree, evaluate each object's ACL against a hardening checklist, and remediate non-default ACEs that exceed the documented privilege model [@ms-best-practices-ad]. The document is excellent at what it sets out to do. What it does not do -- because the format does not permit it -- is compose multi-hop reachability.

This was the failure mode Lambert described. A help-desk operator with `ForceChangePassword` on a junior service account appears as one row in the audit. The junior service account's `MemberOf Server Operators` membership appears in a different report section. The `Server Operators` group's logon rights on the Domain Controller appear in the Domain Controller's own report. Three findings in three places, with no machinery to compose them. The data model cannot represent the question.

<Aside label="An aside for the academically minded">
A reader trained on Phillips and Swiler, Sheyner et al., MulVAL, NetSPA, and TVA might object that *the field already had* graph-based attack-path analysis a decade before BloodHound. True -- in academia. The academic line solved the scale problem (MulVAL: *"seconds for networks with thousands of machines"*) but spoke the wrong vocabulary. Its atomic attack step was a CVE exploit -- a buffer overflow, a format-string bug, a daemon remote code execution. The dominant AD attack primitive is not a CVE. A user with `WriteDacl` on a group is not exploiting any vulnerability; they are using the system as designed. None of MulVAL, NetSPA, or TVA developed an AD-style privilege-graph input format, and the operator community never adopted them. The academic line was substrate-mismatched and delivered as research-PDF tarballs rather than as `git clone`-able tools.
</Aside>

Two communities, both wrong in different ways. The academic one had the right algorithm with the wrong substrate. The operator one had the right substrate with no algorithm at all. The fix was to fuse them. That happened on August 6, 2016.

## 5. The breakthrough -- BloodHound and Six Degrees of Domain Admin

Saturday, August 6, 2016. 1:00 PM. DEF CON 24, Track 2, Paris and Bally's hotel-casinos in Las Vegas [@defcon-24-archive]. Three speakers from Veris Group's adaptive threat division step on stage: Andy Robbins, Rohan Vazarkar, Will Schroeder. The talk is titled *Six Degrees of Domain Admin: Using Graph Theory to Accelerate Red Team Operations* [@defcon24-bloodhound-slides]. The Veris Group team would spin out as SpecterOps the following year, but the August 2016 attribution -- Robbins (`@_wald0`), Vazarkar (`@CptJesus`), and Schroeder (`@harmj0y`) -- is the canonical one [@bloodhound-legacy-repo].

The talk demonstrated three design decisions that in retrospect look obvious and at the time were not.

First, **model Active Directory as a directed property graph**. Nodes are typed security principals (User, Computer, Group, Domain, OU, GPO). Edges are typed privilege relationships (MemberOf, AdminTo, HasSession, GenericAll, GenericWrite, WriteDacl, ForceChangePassword, and others). This was Lambert's framing made concrete: every ACE that grants a dangerous right becomes an edge from the trustee to the target.

Second, **reuse Neo4j as the engine**. Do not build a custom graph database; piggyback on Cypher's pattern-matching query language. The cost of building a path-finding engine from scratch was non-trivial; the cost of standing up Neo4j was a single Docker container.

Third, **ship a collector that emits typed JSON edges, not raw NTSecurityDescriptors**. The collector's value is the *interpretation* of an ACE as a graph edge -- the mapping from a binary security descriptor to the typed edge that says "trustee X has GenericWrite on object Y" is the hard part, and a defender re-creating that mapping per query would lose. SharpHound (initially `SharpHound.ps1`, then a C# binary) does the interpretation once at collection time and writes the edges to disk [@bloodhound-ce-repo].

<Aside label="The PowerView lineage">
SharpHound's enumeration calls are visibly descended from PowerView. The November 2020 SpecterOps blog announcing BloodHound 4.0 acknowledges the lineage explicitly, naming Schroeder's joint authorship of both projects and crediting PowerView as the data-collection precursor [@specterops-2020-bloodhound-4]. PowerView's August 2014 release was the substrate that made BloodHound's August 2016 synthesis possible. The chain is unbroken: enumeration in 2014, framing in April 2015, graph synthesis in August 2016.
</Aside>

<Sidenote>The five-day-of-whiteboard-work figure comes from the SpecterOps team's own internal benchmark from the original DEF CON 24 talk. The 200 ms query latency is the typical 2024 figure on a mid-size enterprise forest of roughly 10^5 nodes and 10^6 edges, retained as the company's marketing framing across subsequent blog posts.</Sidenote>

> **Key idea:** BloodHound's breakthrough was neither an algorithm nor an architecture. It was the decision to *interpret* Active Directory's access-control data as a typed graph and ship that interpretation as a tool the operator community could actually run.

Crucially, the choice to use Neo4j's free `shortestPath()` function rather than building a custom path-finder was a *delivery* decision as much as a *technical* one. Neo4j already did breadth-first shortest paths. The team did not need to invent anything. The hard work was in the edge taxonomy and the collector, not in the graph database.

The talk made a promise the rest of the article must now cash: that there is a real algorithm under the hood, that the algorithm has a name, and that the name is not Dijkstra.

## 6. The algorithmic core -- property graphs, Cypher, and shortest paths

If you have never written a Cypher query in your life, the next ninety seconds is the entirety of the syntax you need.

<Definition term="Property graph">
A graph in which both nodes and edges are typed and can carry key-value properties. A node might have type `User` and properties `name='BOB@CORP.LOCAL'`, `enabled=true`, `pwdlastset=1719234234`. An edge might have type `GenericWrite` and properties `source='ACE'`, `isacl=true`. This is the data model Neo4j implements and the model BloodHound uses.
</Definition>

A Cypher pattern is a parenthesised node, a bracketed edge, and a parenthesised node, with arrows showing direction. `(u:User)-[:MemberOf]->(g:Group)` reads "find a node `u` of type User connected by a MemberOf edge to a node `g` of type Group." A full query has four parts: `MATCH` for the pattern, optional `WHERE` for filters, `RETURN` for the output. That's it.<MarginNote>Cypher patterns visually mimic ASCII graph drawings: parentheses are nodes, square brackets are edges, and the arrow direction matches the edge direction. The syntax was deliberately designed to look like the diagram you would sketch on a whiteboard.</MarginNote>

<Definition term="Cypher">
The pattern-matching query language originally created for the Neo4j graph database. Cypher's syntax is declarative: you describe the shape of the data you want, and the engine plans the traversal. Since April 2024 Cypher has been the basis of the ISO/IEC 39075:2024 GQL standard -- the first ISO-standardised graph query language [@iso-39075-2024-gql] [@opencypher-home].
</Definition>

Here is the canonical BloodHound query, with annotations:

```cypher
MATCH p = shortestPath(
  (u:User {name:'BOB@CORP.LOCAL'})       // start node: Bob
    -[*1..]->                            // any number of typed edges
  (g:Group {name:'DOMAIN ADMINS@CORP.LOCAL'})  // end node: Domain Admins
)
RETURN p
```

The `shortestPath()` wrapper tells Neo4j to short-circuit at the first solution. The variable-length quantifier `[*1..]` says "one or more edges of any type." The `p =` binds the entire matched path to the variable `p`, which the `RETURN` then emits as a sequence of node-edge-node triples that the BloodHound frontend renders as an SVG.

What does `shortestPath()` actually run? Here is where the misconception that BloodHound uses Dijkstra needs to die. The current Neo4j Cypher manual is explicit that `shortestPath()` runs an unweighted **bidirectional traversal** -- BFS in the classical sense -- between the source and target nodes [@neo4j-cypher-shortest-paths]. Not Dijkstra. Not A*. BFS.

<Definition term="Breadth-first search (BFS)">
A graph-traversal algorithm that explores all nodes at the current depth before proceeding to the next depth. Starting from the source, it visits all 1-hop neighbours, then all 2-hop neighbours, and so on. For unweighted graphs, BFS is guaranteed to find a shortest path (in number of edges) the first time it reaches the target. Worst-case time is O(V + E) where V is the node count and E is the edge count -- the trivial information-theoretic lower bound for any algorithm that must read the input.
</Definition>

<Sidenote>Cypher does support weighted shortest paths via the Neo4j Graph Data Science library, but the BloodHound CE distribution does not enable it. There is no natural cost metric on Active Directory privilege edges in 2026; every edge is treated as one hop.</Sidenote>

Why BFS rather than Dijkstra? Dijkstra is BFS's generalisation to weighted graphs. If your edges have natural costs -- road distances, link latencies, dollar prices -- Dijkstra (with a Fibonacci heap, $O(E + V\log V)$) gives you shortest paths under that cost metric. Active Directory privilege edges do not have a natural cost metric. `MemberOf`, `GenericAll`, and `CanRDP` are all "the attacker can take this step." Some are easier than others, but quantifying *how much* easier is itself an unsolved problem (see Section 11). Treating every edge as one hop is the load-bearing simplification that makes the model tractable.

<Definition term="Transitive closure">
Of a binary relation R: the relation R+ that contains the pair (a, c) whenever there is a chain a R b R ... R c. For a graph, the transitive closure tells you, for every pair of nodes, whether one is reachable from the other. BloodHound's `shortestPath()` queries can be thought of as on-demand evaluation of the transitive closure restricted to one source-target pair.
</Definition>

The per-query complexity is $O(V + E)$, the standard BFS bound from any algorithms textbook. On a mid-size enterprise forest -- roughly 10^5 nodes and 10^6 edges -- a single user-to-group shortest path returns in sub-second wall-clock time.

The variable-length quantifier `[*1..N]` for general path enumeration is a different matter. Cyclic graphs admit exponentially many paths in N, and Neo4j's documentation explicitly warns that quantified path patterns can return exponentially many results in the worst case [@neo4j-cypher-variable-length]. The `shortestPath()` short-circuit avoids this by returning on the first hit; `allShortestPaths()` enumerates only paths tied for shortest; unbounded enumeration is intractable on any non-trivial graph.

A concrete demonstration is in order. The runnable snippet below is a 35-line implementation of unweighted BFS over a six-node toy graph. It returns the shortest path from a `helpdesk` user to the `domain-admin` group. This is, structurally, the same algorithm Neo4j's `shortestPath()` runs. The numerical answer ("path of length 4") is the same number BloodHound would report on the same graph.

<RunnableCode lang="js" title="Unweighted BFS shortest path (the same algorithm Neo4j's shortestPath() runs)">{`
// Edges modelled as a typed adjacency list.
const edges = {
  helpdesk:        [{ to: 'carol',          via: 'ForceChangePassword' }],
  carol:           [{ to: 'serverops',      via: 'MemberOf' }],
  serverops:       [{ to: 'dc01',           via: 'CanRDP' }],
  dc01:            [{ to: 'krbtgt',         via: 'DCSync' }],
  krbtgt:          [{ to: 'domain-admin',   via: 'GoldenTicket' }],
  domain_admin:    []
};

function shortestPath(start, goal) {
  const queue = [[start]];           // queue holds candidate paths
  const seen  = new Set([start]);    // each node enqueued at most once

  while (queue.length) {
    const path = queue.shift();      // BFS = FIFO
    const node = path[path.length - 1];
    if (node === goal) return path;
    for (const e of (edges[node] || [])) {
      if (!seen.has(e.to)) {
        seen.add(e.to);
        queue.push([...path, e.to]);
      }
    }
  }
  return null;                       // unreachable
}

const path = shortestPath('helpdesk', 'domain-admin');
console.log('Path:', path.join(' -> '));
console.log('Length:', path.length - 1, 'hops');
`}</RunnableCode>

The algorithm fits on a single screen. The hard work of BloodHound is not in this loop. The hard work is in deciding *which edges to insert into the graph in the first place*. That decision -- the edge taxonomy -- is what makes BloodHound a security tool rather than a graph-database demo.

<Mermaid caption="How Neo4j's shortestPath() executes a bidirectional breadth-first search: simultaneous frontiers expand from source and target, meeting in the middle">
sequenceDiagram
    participant Client as Cypher client
    participant Planner as Cypher planner
    participant Engine as BFS engine
    participant Graph as Property graph
    Client->>Planner: MATCH shortestPath(u to g)
    Planner->>Engine: plan(start=u, end=g, bidirectional=true)
    Engine->>Graph: expand 1-hop frontier from u
    Engine->>Graph: expand 1-hop frontier from g
    Graph-->>Engine: neighbours of u, neighbours of g
    Engine->>Graph: expand 2-hop frontier (both sides)
    Engine->>Graph: expand 3-hop frontier (both sides)
    Graph-->>Engine: frontiers intersect at node m
    Engine-->>Planner: reconstruct path u to m to g
    Planner-->>Client: return path
</Mermaid>

## 7. The edge taxonomy -- what Active Directory actually looks like as a graph

The BloodHound graph is not *every privilege relationship in Active Directory.* It is the set of relationships that the SpecterOps team has decided -- by iterative discovery, often in response to specific community-reported abuse primitives -- to model. The taxonomy has grown roughly monotonically since 2016; the rate has accelerated since the BloodHound CE 5.0 reboot in August 2023 [@specterops-2023-bloodhound-ce].

Eight families dominate the 2026 graph. They map, family by family, onto the substrates of a modern hybrid enterprise.

**1. Group membership.** `MemberOf` is the simplest edge. Cypher's variable-length quantifier (`[:MemberOf*1..]`) walks transitive memberships in one expression, which is why nested-group reachability is a one-liner.

**2. ACL write-equivalents.** `GenericAll`, `GenericWrite`, `WriteDacl`, `WriteOwner`, `Owns`, `AllExtendedRights`, `ForceChangePassword`, `AddSelf`, `AddMember`, and `AddKeyCredentialLink` (the shadow-credentials primitive). Each names a specific dangerous-right pattern on a directory object's security descriptor. SharpHound's interpreter scans the `nTSecurityDescriptor` attribute and emits one typed edge per matching ACE.

**3. Sessions.** `HasSession` is the dynamic edge that goes stale fastest. SharpHound enumerates active sessions via `NetSessionEnum` and `SAMR`; the resulting edges describe "user U is currently logged into computer C." The graph is whatever the most recent collection captured.

**4. Remote execution rights.** `AdminTo`, `CanRDP`, `ExecuteDCOM`, `CanPSRemote`, `SQLAdmin`. Each describes a code-execution primitive granted to a principal on a computer object.

**5. [Kerberos delegation](/blog/kerberos-in-windows-the-other-half-of-ntlmless/).** `AllowedToDelegate` (constrained delegation), `AllowedToAct` (resource-based constrained delegation, via `msDS-AllowedToActOnBehalfOfOtherIdentity`), unconstrained delegation surfaced as node properties on Computer objects, and -- from BloodHound CE v6.3 in December 2024 [@bloodhound-v6-3-release] -- a `CoerceToTGT` edge that replaces the older `UnconstrainedDelegation` finding for BHE customers.

**6. [ADCS](/blog/certified-pre-owned-ad-cs-and-active-directorys-second-trust/) edges (early-access January 2024).** `ADCSESC1` through `ADCSESC10`, plus the **`CoerceAndRelayNTLMToADCS` edge for ESC8** [@specterops-2024-adcs-bloodhound]. These edges land in BloodHound roughly thirty months after Will Schroeder and Lee Christensen first published the ESC1 to ESC8 catalog in *Certified Pre-Owned: Abusing Active Directory Certificate Services* [@specterops-2021-certified-preowned]. Each ADCS edge in BloodHound is the most complex in the taxonomy because each is composed from multiple raw facts (see the traversable / non-traversable discussion below).

**7. Azure / Entra ID edges** (via AzureHound, November 20, 2020) [@specterops-2020-bloodhound-4]. `AZGlobalAdmin`, `AZRoleAssignment`, `AZContains`, `AZOwns`, `AZUserAccessAdministrator`, `AZAddSecret`, `AZMGAddOwner`, plus AzureRM-side resource roles. Microsoft Entra [Privileged Identity Management (PIM)](/blog/privileged-identity-management-how-a-two-state-role-assignme/) role coverage was added in BloodHound v8.0 in July 2025 [@specterops-2025-opengraph].

**8. OpenGraph custom edges (v8.0, July 29, 2025).** User-defined edges for arbitrary substrates: GitHub, Snowflake, Microsoft SQL Server, ServiceNow, Tailscale, Duo. The schema is intentionally generic so that a community contributor can ship edges for any system whose privilege model can be drawn as a graph [@bloodhound-opengraph-library].

| Family | Representative edges | Underlying AD mechanism | What it gives the attacker |
|---|---|---|---|
| Group membership | `MemberOf` | `member` attribute on group object | Inherits all permissions held by the group |
| ACL write-equivalents | `GenericAll`, `GenericWrite`, `WriteDacl`, `WriteOwner`, `ForceChangePassword`, `AddKeyCredentialLink` | Specific dangerous-right ACE patterns in `nTSecurityDescriptor` | Take control of the target principal (reset password, modify object, plant shadow credentials) |
| Sessions | `HasSession` | `NetSessionEnum` and `SAMR` enumeration on member computers | Pivot via credential theft from the logged-in user's memory |
| Remote execution | `AdminTo`, `CanRDP`, `ExecuteDCOM`, `CanPSRemote`, `SQLAdmin` | Local-admin membership, RDP / DCOM / WinRM / SQL group rights | Run arbitrary code as the target principal on the target host |
| Kerberos delegation | `AllowedToDelegate`, `AllowedToAct`, `CoerceToTGT` | Constrained and resource-based delegation attributes | Forge service tickets and impersonate other accounts |
| ADCS composite | `ADCSESC1` through `ADCSESC10`, `CoerceAndRelayNTLMToADCS` | Certificate template misconfigurations plus CA trust plus enrollment ACEs | Obtain a certificate usable for authentication as a privileged account |
| Azure / Entra | `AZGlobalAdmin`, `AZRoleAssignment`, `AZAddSecret`, `AZOwns`, `AZMGAddOwner` | Entra role assignments, AzureRM RBAC | Cross the on-prem to cloud boundary; pivot via tenant or subscription privileges |
| OpenGraph | User-defined | Any substrate the contributor models | Anything the contributed schema encodes |

The ADCS family deserves a closer look because it introduced an important new modelling vocabulary.

<Definition term="Traversable and non-traversable edges">
A *traversable* edge is one the shortest-path query can step through directly: `MemberOf`, `ForceChangePassword`, `CanRDP`. A *non-traversable* edge is a precondition relationship that is only exploitable when several others appear together. A certificate template's `Enroll` ACE is non-traversable on its own; combined with eight other facts about the template, the issuing CA, and the domain's trust posture, it composes into `ADCSESC1`. The post-processor scans for the full pattern and synthesises a single traversable edge that the BFS can then treat as one hop [@specterops-2024-adcs-bloodhound].
</Definition>

For ESC1 the pattern has nine numbered prerequisites: six template and CA requirements, two enterprise-CA trust facts, and one implicit constraint. None of the nine raw facts is exploitable in isolation. All nine together are. The post-processor's job is to walk the candidate sub-graphs, check every requirement, and write the composed `ADCSESC1` edge when the pattern holds.

This is a non-trivial graph-modelling contribution because it gives the field a vocabulary for "an edge that is real only as a join over several facts." It also generalises beyond ADCS: any future attack primitive composed from a fixed pattern of raw facts can be modelled the same way.

ESC8 -- the [NTLM relay](/blog/ntlmless-the-death-of-ntlm-in-windows/) primitive against an HTTP-enrollment certificate authority -- is the most delicate case, and the one most commonly mis-modelled in early secondary writeups.

> **Note:** The `CoerceAndRelayNTLMToADCS` edge is a **Group(`Authenticated Users`) to Computer(coerced target)** edge, not a Computer to Computer edge as some early secondary writeups described. The relay-target CA and the certificate template are carried as *edge metadata*, not as additional graph nodes. The canonical edge documentation is explicit: *Source: Authenticated Users [Group] / Destination: Computer / Traversable: Yes* [@bloodhound-coerce-relay-edge].

<Sidenote>The schema correction matters because the source-principal choice affects every shortest-path query that crosses ESC8. If the edge is mis-modelled as Computer to Computer, queries that begin from a low-privilege user account miss the path entirely. The Group-to-Computer schema correctly captures that *any authenticated principal* can coerce.</Sidenote>

What the graph does not yet model is also worth naming. The `SpecterOps/TierZeroTable` README states it verbatim: *"DISCLAIMER: The table does not include all Tier Zero assets yet."* [@specterops-tier-zero-table] Several edge classes remain partially or fully out of scope; the full enumeration appears in Section 11 (open problems). Coverage expansion is iterative and community-fed; OpenGraph (Section 9) is the structural answer to "where does the graph end?"

<Mermaid caption="The BloodHound edge families by substrate. Composed ADCS edges and OpenGraph user-defined edges sit alongside the core AD and Entra families">
flowchart TD
    subgraph OnPrem["On-prem Active Directory"]
        Members["MemberOf, Owns"]
        ACL["ACL write-equivalents<br/>GenericAll, WriteDacl, ForceChangePassword,<br/>AddKeyCredentialLink"]
        Sessions["HasSession"]
        Exec["AdminTo, CanRDP, ExecuteDCOM,<br/>CanPSRemote, SQLAdmin"]
        Krb["Kerberos delegation<br/>AllowedToDelegate, AllowedToAct, CoerceToTGT"]
    end
    subgraph Entra["Entra ID and AzureRM"]
        EntraRoles["AZGlobalAdmin, AZRoleAssignment,<br/>AZAddSecret, AZUserAccessAdministrator,<br/>PIM roles"]
        AzureRM["AZContains, AZOwns,<br/>AzureRM resource roles"]
    end
    subgraph ADCS["ADCS composite edges"]
        ESC["ADCSESC1 to ADCSESC10,<br/>CoerceAndRelayNTLMToADCS"]
    end
    subgraph Open["OpenGraph user-defined"]
        Custom["GitHub, Snowflake, SQL Server,<br/>ServiceNow, Tailscale, Duo, custom"]
    end
    OnPrem --> Cypher((Cypher query layer))
    Entra --> Cypher
    ADCS --> Cypher
    Open --> Cypher
</Mermaid>

If this is what one community modelled, the natural question is: what did Microsoft model? And when?

## 8. The defender adoption -- Microsoft catches up, 2018 to 2024

The defender vendor whose product BloodHound was mapping is also a defender vendor with a graph product of its own. Three of them, in fact, shipped in three different years for three different substrates. They are easy to confuse; press releases sometimes do.

The first arrived on November 27, 2018, when Tali Ash (then a Program Manager on the Azure Advanced Threat Protection team) announced a preview feature called *Lateral Movement Paths* (LMPs) in a Microsoft tech-community post [@ms-azure-atp-lmp-2018]. LMPs were a graph-shaped visualisation, but a constrained one: restricted to "sensitive accounts" (a configurable set defaulting to Domain Admins and similar) plus non-sensitive accounts that had shared a session on the same host as a sensitive account.

The portal rendered one- and two-hop credential-theft pivots as a static SVG. There was no Cypher equivalent, no LMP-export API, and no way to write a custom query. Azure ATP was rebranded **Microsoft Defender for Identity** in 2020, and the LMP feature came along under the new name [@ms-mdi-lmp-docs].

<Sidenote>Several secondary sources date Lateral Movement Paths to "June 2019," which corresponds to the general-availability and rebrand window rather than the original preview announcement. The primary Microsoft tech-community post is November 27, 2018; treat the year-not-month for any third-party claim and prefer the November 2018 preview date as the canonical first ship.</Sidenote>

The second arrived in October 2022, when Microsoft Defender for Cloud's Defender CSPM plan added a *cloud security graph* with attack-path analysis (public preview at Ignite October 2022; generally available March 28, 2023) [@ms-defender-cloud-attack-path]. This product is a *cloud* attack-path graph: Azure plus AWS plus GCP asset inventory, with inferred edges for permissions, network reachability, vulnerability presence, and internet exposure. It is explicitly *not* the Active Directory identity graph; it covers the multi-cloud workload surface.

A common mistake conflates this 2022-2023 product (Defender for *Cloud*) with Microsoft Defender for *Identity* (the LMP product from November 2018). The substrate, the team, and the year are all different. Worth flagging here because secondary writeups repeat the confusion often.

<Aside label="An aside on Microsoft naming">
The Microsoft Defender XDR product family is a naming minefield. *Defender for Identity* (MDI) is the on-prem AD identity-threat product; LMPs are its graph view. *Defender for Cloud* (MDC) is the multi-cloud workload-protection product; its CSPM plan ships cloud-security-graph attack-path analysis. *Defender for Endpoint* (MDE) is the EDR product; it does not ship its own attack-path graph but feeds telemetry into MSEM. *Microsoft Security Exposure Management* (MSEM, GA November 19, 2024) is the unified exposure-graph layer that subsumes the others. Four products, four substrates, four ship dates. The naming overlap is unfortunate but the distinctions are real.
</Aside>

The third arrived on November 19, 2024, at the Ignite 2024 keynote in Chicago. Satya Nadella, in the opening keynote, announced that **Microsoft Security Exposure Management** (MSEM) had reached general availability [@ms-ignite-2024-msem]. MSEM is the product whose attack-path model is *structurally equivalent* to BloodHound's: cross-substrate (identity plus endpoint plus multi-cloud), first-class attack-path objects with choke-point and blast-radius dashboards, and continuous data feed via the Defender XDR signal plane.

<Definition term="Microsoft Security Exposure Management (MSEM)">
Microsoft's unified exposure-graph product, generally available November 19, 2024, at Ignite 2024 in Chicago. MSEM ingests telemetry from Defender for Endpoint, Defender for Identity, Defender for Cloud, Entra ID, and the Defender XDR plane into a single graph. Attack paths are first-class objects with three dashboard views: an attack-path list, choke-point analysis (small sets of nodes whose compromise enables disproportionately many downstream paths), and blast-radius (downstream reach of a selected node) [@ms-msem-attack-paths].
</Definition>

The MSEM docs page introduces the model verbatim: *"Attack paths in Microsoft Security Exposure Management help you to proactively identify and visualize potential routes that attackers can exploit using vulnerabilities, gaps, and misconfigurations across endpoints, cloud environments, and hybrid infrastructures."* And, on choke points: *"By focusing on these choke points, you can reduce risk by addressing high-impact assets."*

The query interface is the Defender XDR portal plus KQL (Kusto Query Language), not Cypher. The graph engine is proprietary; Microsoft does not publish per-query latency numbers or the underlying algorithms. But the model -- nodes, typed edges, attack paths as the unit of analysis, choke-point and blast-radius views -- is the model BloodHound shipped at DEF CON 24 in August 2016.

The arc takes eight years. From the August 6, 2016 BloodHound talk to the November 19, 2024 MSEM general-availability announcement is eight years and three months. The defender vendor whose product the original BloodHound was mapping ships a defender product whose attack-path model is structurally equivalent to the one a red-team consultancy shipped in a conference talk eight years earlier.

Microsoft adopted the model. The community kept extending it. By 2026 the frontier is no longer *"does the graph exist?"* It is *"how do we make the graph weighted, complete, and substrate-independent?"* That is the state of the art.

## 9. State of the art -- Tier Zero, ADCS edges, and OpenGraph, 2023 to 2026

By the time MSEM shipped, SpecterOps had already moved past *"is there a graph?"* and was asking three sharper questions. Where does the graph end? How do we model attack primitives that compose from raw facts? And does the AD-specific schema even matter?

The first question is what *"Tier Zero"* means. On June 22, 2023, Jonas Bülow Knudsen, Elad Shamir, and Justin Kohler at SpecterOps published *What is Tier Zero -- Part 1*, which reframed Microsoft's tiered-administration concept -- introduced in the 2012-2014 Securing Privileged Access guidance and renamed the Enterprise Access Model with the "Control Plane" vocabulary in December 2020 [@ms-enterprise-access-model] -- as a property *of the graph* [@specterops-2023-tier-zero].

A Tier Zero asset (see Definition below) reframes Microsoft's tiered concept from *the set of things in the high-privilege tier* to *the set of things from which the high-privilege tier is reachable*. Microsoft's own Tier 0 definition -- *"Direct Control of enterprise identities... and all the assets in it"* -- becomes a graph property. The two formulations are equivalent if and only if the graph is complete. If the graph is incomplete (which it is), the Tier Zero set computed from the graph is the floor, not the ceiling.

<Definition term="Tier Zero">
Any node in the attack-path graph whose compromise lets an attacker reach an administrative privilege in the forest. The companion `SpecterOps/TierZeroTable` GitHub project is the community-maintained inventory; the README discloses that the table is the floor, not the ceiling [@specterops-tier-zero-table].
</Definition>

The Tier Zero definition is the answer to *"shortest path to what?"* -- the target side of every BloodHound shortest-path query. Without a defined Tier Zero set, the question has no endpoint.

The second question is how to model attack primitives that compose. The January 24, 2024 SpecterOps blog by Knudsen formalised this with the traversable / non-traversable edge distinction discussed in Section 7. The mechanism generalises: any attack primitive whose exploitability is a conjunction of raw facts can be encoded as a composed edge that the post-processor synthesises when the pattern is present.

ESC1, walked through in Section 7, is the canonical example: nine numbered prerequisites that the post-processor checks before writing the composed `ADCSESC1` edge [@specterops-2024-adcs-bloodhound]. Subsequent posts in the series extended the same machinery to ESC3, ESC4, ESC6, ESC7, ESC8 (the `CoerceAndRelayNTLMToADCS` edge), ESC9, ESC10, and -- on February 14, 2024 -- ESC13 [@specterops-2024-esc13].

In December 2024 BloodHound v6.3 introduced an early-access "improved analysis algorithm" internally referred to as **Butterfly** [@bloodhound-v6-3-release]. Butterfly is the first production attempt at *bi-directional impact* analysis. Pre-v6.3 BloodHound Enterprise quantified risk as "*who can reach this node?*" (incoming attack-path count). v6.3 also quantifies "*who can this node reach if compromised?*" (outgoing blast radius).

The release notes describe the outcome but not the algorithm: *"Improve risk scoring fidelity for all finding types... Measure risk at each individual finding... Support the inclusion of hybrid paths in risk scoring (Azure assets will now contribute to measured risk in AD and vice versa)."*

The same release also announced that BloodHound Enterprise had begun migrating off Neo4j onto PostgreSQL as the *graph* database, with the release notes reporting *">50% improvement in the time it takes to perform post-processing during the Analysis process."* Cypher continues to be the query language; the engine underneath changed.

The third question -- whether the AD-specific schema matters -- got its answer on July 29, 2025, when SpecterOps released BloodHound v8.0 with **OpenGraph** [@specterops-2025-opengraph]. OpenGraph decouples the graph engine from the AD-specific schema. Users (and SpecterOps partners) define their own node and edge kinds and ingest attack-path data from arbitrary substrates. The initial release included GitHub organisations, Snowflake role hierarchies, Microsoft SQL Server logins, ServiceNow groups, and Tailscale ACLs. Subsequent community contributions extended the library.

<PullQuote>
"BloodHound OpenGraph is a foundational shift toward... identity risk management across the entire enterprise." -- Justin Kohler, SpecterOps Chief Product Officer, July 29, 2025.
</PullQuote>

OpenGraph is the closing observation of an arc that began with Phillips and Swiler in 1998: *the model is the abstraction; the substrate is whatever your enterprise runs.* The same `shortestPath()` that finds Active Directory attack paths now finds attack paths over a GitHub organisation, a Snowflake role hierarchy, or a Microsoft SQL Server login graph, with no engine change. The 2026 BloodHound release (v9.1.0, May 6, 2026, per the public release-notes index [@bloodhound-release-notes-index]) extends OpenGraph and adds incremental edge updates -- the first step toward a streaming graph rather than a snapshot.

## 10. Competing approaches -- BloodHound versus Microsoft versus the alternatives

In 2026 no single product covers every substrate. The field is plural.

A practitioner choosing among attack-path tools answers four questions, in order. What substrate do you need to cover? Self-host or SaaS? Snapshot or continuous? Open query language or vendor portal? The table below assembles the answers on the dimensions a 2026 practitioner actually uses.

| Tool | Substrate | Engine | Query language | Deployment | Licensing | Edge weighting | ADCS coverage | Best fit |
|---|---|---|---|---|---|---|---|---|
| BloodHound CE 9.x | AD + Entra + AzureRM + OpenGraph | Neo4j + Postgres app DB | Cypher | Self-hosted Docker Compose | Apache-2.0 | No (unweighted BFS) | Yes -- ESC1 through ESC10 + ESC8 composite | Authorised offensive testing + DIY blue team |
| BloodHound Enterprise | Same as CE | PostgreSQL-as-graph (in-progress migration off Neo4j) | Cypher | SaaS | Commercial | Bi-directional (Butterfly v6.3+); weighting function not public | Yes | Continuous AD/Entra attack-surface management at enterprise scale |
| Adalanche | AD (on-prem; LDIF or live LDAP) | In-memory Go | AQL (GQL-like) | Single Go binary | Open source | No | Yes (per README) | Offline / air-gapped analysis from LDIF |
| Microsoft Security Exposure Management | Defender XDR signal: identity + endpoint + multi-cloud + Entra | Proprietary | KQL + portal | SaaS | Microsoft licensing | Implicit (filter against exploitability oracle) | Indirect via MDI signals | Hybrid Microsoft-substrate unified exposure |
| MDI Lateral Movement Paths | On-prem AD (sensitive-account paths only) | Proprietary | None -- portal only | SaaS | Microsoft licensing | n/a | Implicit via separate MDI alerts | Default-on credential-hopping detection |
| Defender for Cloud CSPM attack-path analysis | Multi-cloud (Azure + AWS + GCP) | Proprietary | Cloud Security Explorer + KQL | SaaS | Microsoft licensing | Implicit | n/a | Multi-cloud workload protection |
| PingCastle / Semperis DSP / ADAudit Plus | On-prem AD (+ limited Entra) | None -- list-of-findings | None -- HTML / portal | Self-hosted or SaaS | Commercial / mixed | n/a | Single-finding hygiene only | Compliance auditing and change tracking |

A few rows deserve commentary.

> **Note:** BloodHound CE ships under **Apache-2.0** per the current repository [@bloodhound-ce-repo]. The GPL-3.0 license you may see in older treatments applies only to the deprecated BloodHound Legacy v4 repository [@bloodhound-legacy-repo], which was last updated in 2023 and is no longer maintained. The licensing difference is material: GPL-3.0 is copyleft, Apache-2.0 is permissive. Downstream use cases that need permissive licensing should rely on the current CE.

<Sidenote>Older blog posts and conference talks frequently call BloodHound CE GPL-3.0. The CE-Legacy LICENSE block does carry the GPL-3.0 copyright header, which is the source of the confusion. The *current* CE codebase at github.com/SpecterOps/BloodHound is Apache-2.0; the GPL-3.0 LICENSE applies only to the deprecated Legacy v4 repository.</Sidenote>

Adalanche, by Lars Karlslund, is the load-bearing counter-example to the claim that "the graph model requires Neo4j" [@adalanche-repo]. Adalanche reads AD data from an LDIF dump or live LDAP, builds the graph entirely in process memory, and exposes a web GUI plus an Adalanche Query Language (AQL) -- described in the README as *"a GQL-like language that allows for complex queries."*

The README's headline claim is verbatim: *"Adalanche gives instant results, showing you what permissions users and groups have in an Active Directory."* The trade is no continuous monitoring, no multi-user web app, and a smaller community in exchange for zero deployment friction. The model is identical; the engine is replaceable.

MSEM is the closest Microsoft analogue to BloodHound (see Section 8 for substrate and query interface). Reasonable defenders run *both* MSEM and BloodHound (CE or Enterprise) on the same forest. The tools are complementary rather than substitutionary: MSEM brings the EDR plus workload-protection telemetry that BloodHound does not natively ingest, while BloodHound brings the precise AD edge semantics that SpecterOps's research community has validated. Running both is not double-counting.

The hygiene scanners -- PingCastle [@pingcastle], Semperis DSP [@semperis-dsp], ManageEngine ADAudit Plus [@manageengine-adaudit-plus] -- are the surviving descendants of the per-object ACL-inspection generation, with risk-scoring layered on top. They are valuable for compliance auditing and change tracking. They do not expose a queryable attack-path graph. The compliance auditor and the attack-path analyst are different personas with different tools.

If the field is plural and every tool has a gap, what is the shape of the problem that no tool yet solves? The next section is the honest answer.

## 11. Theoretical limits and open problems

Some of the gaps in attack-path analysis are engineering gaps. Others are not.

The single most consequential open problem is **edge weighting**. BloodHound's BFS treats every edge as one hop. In reality, `MemberOf` is effectively free; `ForceChangePassword` requires the attacker to log in as the changed principal afterwards; `AddKeyCredentialLink` requires shadow-credential infrastructure; `CoerceAndRelayNTLMToADCS` requires an active SMB-coercion primitive, NTLM relay tooling, and an ESC8-vulnerable certificate authority. A shortest-*hop* path is not in general the shortest-*exploitation-cost* path.

BloodHound Enterprise v6.3 shipped the Butterfly analysis as the first production attempt to relax this assumption. As the v6.3 release notes acknowledge (see Section 9), Butterfly's weighting function is not publicly documented [@bloodhound-v6-3-release].

Academic intuition suggests weighting edges by an exploitation-success probability and computing *most-likely-exploited* paths via shortest paths under negative-log-probability weights. The complexity ceiling is well-known: Dijkstra (with non-negative weights) runs in $O(E + V\log V)$ time with a Fibonacci heap; Bellman-Ford handles negative weights at $O(VE)$. Either fits comfortably inside the per-query budget BloodHound already operates within.

The unsolved part is the *empirical calibration*: what numerical weight is the right weight on a `ForceChangePassword` edge versus a `CoerceAndRelayNTLMToADCS` edge? There is no published peer-reviewed answer.

The second open problem is **coverage**. The `TierZeroTable` README is the authoritative self-disclosure: file-system ACLs on member servers, fine-grained GPO delegation, on-host service-account permissions, some Entra conditional-access logic, and cross-tenant Entra B2B trust paths remain partially or fully out of scope [@specterops-tier-zero-table]. This is an *engineering* problem -- more collectors, more edge definitions -- rather than an algorithmic one. OpenGraph is the structural answer: shift coverage from "what edges has SpecterOps modelled?" to "what edges has the community contributed to the shared library?" [@bloodhound-opengraph-library].

The third is **graph privacy**. A continuously-collected, complete AD privilege graph shipped to a third-party SaaS backend is, in adversarial hands, a pre-computed attack plan for the customer's forest. Tenant isolation, encryption at rest, SOC 2 and FedRAMP attestation, and customer-managed key encryption do not eliminate the structural risk: a compromised SaaS backend yields the customer's graph regardless of compliance posture.

Cryptographic approaches -- homomorphic graph queries, secure multi-party computation for path enumeration -- exist in the theoretical literature but are not in production attack-path products at time of writing. Adalanche and self-hosted BloodHound CE remain the privacy-preserving options at the cost of forgoing continuous monitoring.

The fourth is the **the graph is alive** problem. Session edges (`HasSession`) go stale in hours. New ACEs, new group memberships, new sessions appear continuously. SharpHound's snapshot model is yesterday's view; continuous collectors (BloodHound Enterprise, MSEM agent streams) trade stealth for freshness. The May 6, 2026 release notes describe BloodHound CE's first move toward a streaming model: *"incremental edge updates that reduce unnecessary writes during post-processing"* [@bloodhound-release-notes-index]. No production attack-path tool yet ships a fully streaming graph.

The fifth is **combinatorial intractability**. These are not engineering gaps; they are complexity-theoretic facts.

- **Counting all attack paths is #P-complete.** Leslie Valiant's 1979 result on the complexity of counting solutions to combinatorial problems applies directly: counting the simple paths between two nodes in a general graph cannot be done in polynomial time unless P = #P [@valiant-1979-permanent]. BloodHound's path-count UI is necessarily an approximation or a length-truncation; this is the theoretical reason why.
- **The minimum-edge-cut "defender hardening" problem is NP-hard.** Choose the smallest set of edges whose removal disconnects the attacker from the goal. Sheyner et al. 2002 proved the result is NP-hard but admits an $O(\log n)$ approximation [@sheyner-et-al-2002-attack-graphs]. BHE choke-point ranking and MSEM choke-point analysis necessarily implement heuristic approximations.
- **Finding regular simple paths is NP-complete.** Mendelzon and Wood 1995 proved that finding a simple path matching a regular expression over edge labels in a graph database is NP-complete [@mendelzon-wood-1995-dblp]. Cypher's `shortestPath()` does not enforce simple-path semantics, which is why it remains in P; quantified path patterns with `DIFFERENT RELATIONSHIPS` semantics (available since Neo4j 5.x and documented in the current Cypher manual [@neo4j-cypher-variable-length]) do enforce simple paths and so cross into the NP-complete regime.

> **Key idea:** BloodHound's per-query algorithm (BFS, $O(V + E)$) is optimal up to constants. The frontier of the field is no longer the algorithm. It is what *question* we ask the algorithm: weighted? regular-path? simple-path? bi-directional? cross-substrate? Each open question is a different model, not a different implementation.

Three further honest framings deserve a mention.

First, the **standardisation of OpenGraph edge taxonomies** is unsettled. Without community convergence on edge naming, different contributors may model the same substrate with incompatible schemas. Historical precedent (MITRE ATT&CK technique IDs, CVE identifiers) suggests that convergence happens when a single high-trust curator becomes the de-facto registry; whether SpecterOps will operate OpenGraph as a vendor-neutral standards body or as a SpecterOps-owned artefact is a governance question, not an algorithmic one.

Second, the **adversarial robustness of the collector** is an open question: SharpHound runs as an authenticated principal, and an attacker with prior compromise can poison the collection. There is no closed-form defence.

Third, the **absence of any public head-to-head benchmark** of BloodHound CE versus BHE versus Adalanche versus MSEM on the same forest under controlled conditions is structural: Microsoft does not publish per-query latency, SpecterOps publishes only relative improvement claims, and the academic line uses 2005-era hardware figures that are not comparable.

## 12. Practical guide -- running BloodHound today

If the previous sections sold you on the model, the next few paragraphs are the minimum you need to stand it up.

Stand up BloodHound CE with the Docker Compose file in the repository [@bloodhound-ce-repo]. The stack is four containers: a PostgreSQL application database (users, roles, sessions, audit logs, saved queries); a Neo4j graph database holding the property graph; a Go REST API; and a React plus Sigma.js single-page frontend. Five minutes to first boot on a developer laptop. The repository README is the authoritative deployment reference.

Run **SharpHound** on a domain-joined Windows host as the collection identity. The default invocation -- `SharpHound.exe --CollectionMethods all,GPOLocalGroup` -- enumerates every group membership, every recognised ACL pattern, every active session, every local-admin relationship, and every Kerberos delegation. Run **AzureHound** with appropriate Entra ID credentials for Entra and AzureRM coverage. Both emit JSON dumps in the same envelope; the BloodHound CE upload tab in the web UI ingests both.

Open the web UI. The stock pre-built queries are a reasonable starting palette: *Find Shortest Paths to Domain Admins*, *Find Principals with DCSync Rights*, *Find Computers with Unsupported Operating Systems*, and *Shortest Paths from Owned Principals to High-Value Targets* (after marking some accounts as owned). Custom Cypher goes in the Cypher tab at the top right; the Section 6 query is a good template.

The most important interpretation discipline: treat the result as a *risk register*, not a vulnerability list. A finding is "Bob can reach Domain Admins via a four-hop path." The *edge* is "Bob has `GenericWrite` on Carol." Closing the edge breaks the finding *and* every other path that passed through it. Edges are the unit of remediation, not findings. SpecterOps's own 2021 customer-anonymous essay *Active Directory Attack Paths -- Is It Always This Bad?* reports findings from hundreds of engagements, and the recurring observation across forests is that a small number of high-blast-radius edges explain most of the discovered paths [@specterops-2021-ad-attack-paths].

A few pitfalls are worth naming. `HasSession` collection generates measurable LDAP and SAMR traffic that Microsoft Defender for Identity alerts on; coordinate with the blue team or expect detections. The stealth collection mode trades coverage for traffic volume.

Unconstrained variable-length Cypher queries (`MATCH p=(a)-[*]->(b)`) can pin Neo4j's heap; CE's "protected Cypher" cost limits help but do not eliminate the problem, so prefer `shortestPath()` or bound the path length explicitly. The wildcard-principal post-processing for `Authenticated Users` and `Everyone` requires v6.0 or later to be correct; older versions miscount these edges [@bloodhound-v6-0-release]. And the `CoerceAndRelayNTLMToADCS` edge is Group to Computer, not Computer to Computer, as discussed in Section 7.

<Spoiler kind="solution" label="A useful starter Cypher: enumerate all Tier Zero entry points">
```cypher
MATCH (n)
WHERE n.system_tags CONTAINS 'admin_tier_0'
WITH collect(n) AS tier_zero
MATCH p = shortestPath((u:User {enabled:true})-[*1..6]->(t))
WHERE t IN tier_zero AND NOT u.system_tags CONTAINS 'admin_tier_0'
RETURN u.name AS source, t.name AS target, length(p) AS hops
ORDER BY hops ASC
LIMIT 25
```

This returns up to 25 enabled non-Tier-Zero users with the shortest paths into Tier Zero. The `[*1..6]` bound prevents the pathological cyclic-graph cost explosion. Bound length aggressively until you have indexed your graph.
</Spoiler>

<Aside label="Authorised use only">
BloodHound is dual-use. Authorised defensive use on your own forest, or contracted penetration testing within written scope, is the standard legal posture. Running it against a directory you are not authorised to assess is unlawful in most jurisdictions: the Computer Fraud and Abuse Act in the United States; the Computer Misuse Act 1990 in the United Kingdom; equivalents in most EU jurisdictions. The dual-use posture is fundamental to the tool; the legal posture depends on you.
</Aside>

The tool is the easy part. The hard part is what you do with the answer.

## 13. Frequently asked questions

The misconceptions worth disposing of, in order of how often they recur.

<FAQ title="Common questions about BloodHound and attack-path graphs">

<FAQItem question="Is BloodHound 'every privilege relationship in AD'?">
No. BloodHound is the SpecterOps-maintained, evolving set of edge families. File-system ACLs on member servers, fine-grained GPO delegation, on-host service-account permissions, and some Entra conditional-access logic remain partially or fully out of scope. The `SpecterOps/TierZeroTable` README is explicit about this limitation [@specterops-tier-zero-table]. Coverage expansion is iterative and community-fed; OpenGraph is the structural answer to scope generalisation.
</FAQItem>

<FAQItem question="Why Neo4j? Why not Postgres or a custom engine?">
The original BloodHound (2016) shipped Neo4j only. The modern BloodHound CE uses *both*: PostgreSQL as the application database (users, roles, sessions, audit logs) and Neo4j as the graph layer [@bloodhound-ce-repo]. BloodHound Enterprise has begun migrating entirely off Neo4j onto PostgreSQL-as-graph (announced in v6.3, December 2024) [@bloodhound-v6-3-release]; Cypher continues to be the query language on the new backend. The model is engine-independent; Adalanche proves the same point by doing it all in process memory in Go [@adalanche-repo].
</FAQItem>

<FAQItem question="Is BloodHound legal?">
Authorised defensive use on your own forest, yes. Contracted penetration testing within written scope, yes. Running it against a directory you are not authorised to assess is unlawful in most jurisdictions: the Computer Fraud and Abuse Act in the United States, the Computer Misuse Act 1990 in the United Kingdom, and equivalents in most EU jurisdictions. The dual-use posture is fundamental to the tool; legal compliance is the operator's responsibility.
</FAQItem>

</FAQ>

## Epilogue

The 2014 analyst with the whiteboard and the 2024 analyst with the Cypher query are doing the same work. The unit of analysis has shifted, and once the unit shifts, the field does not go back.

John Lambert diagnosed it in two sentences in April 2015 [@lambert-2015-defenders-lists-attackers-graphs]. Andy Robbins, Rohan Vazarkar, and Will Schroeder shipped it as BloodHound in August 2016 [@bloodhound-legacy-repo]. SpecterOps extended it through AzureHound in 2020, the CE 5.0 web architecture in 2023, the Tier Zero formalisation in 2023, ADCS composed edges in 2024, Butterfly bi-directional analysis in 2024, and OpenGraph in 2025 [@specterops-2025-opengraph].

Microsoft validated the model with Lateral Movement Paths in 2018, cloud security graph attack-path analysis in 2022 to 2023, and Microsoft Security Exposure Management at Ignite in 2024 [@ms-msem-attack-paths]. The community that shipped the graph won; the community that kept shipping lists is selling compliance reports to the auditors.

The frontier in 2026 is not whether to model attacks as a graph -- that argument is settled. The frontier is how to make the graph weighted (so the shortest path approximates the easiest), how to make it complete (so the unmodelled edges shrink toward zero), and how to make it substrate-independent (so the next enterprise primitive worth modelling -- whatever it turns out to be -- can be ingested without changing the engine). Each of these is a research direction with its own asymptotic ceiling, its own engineering practice, and its own community of contributors.

What started as a sentence in a 1,100-word essay on a personal GitHub repository is now an ISO-standardised query language [@iso-39075-2024-gql], a shipped Microsoft product family, an open-source repository with hundreds of thousands of downloads, and a discipline taught at most major security conferences. The graph wins because the graph is the right model. The right model wins because, eventually, the right model always does.

<StudyGuide slug="bloodhound-attack-path-graph" keyTerms={[
  { term: "Attack-path graph", definition: "A directed graph whose nodes are security principals and resources and whose edges are privilege relationships an attacker can traverse. Reachability in the graph models multi-hop privilege escalation." },
  { term: "Directed property graph", definition: "A graph in which both nodes and edges have types and can carry key-value properties. The data model BloodHound uses." },
  { term: "Cypher", definition: "The pattern-matching query language for Neo4j and now the basis of the ISO/IEC 39075:2024 GQL standard." },
  { term: "Bidirectional BFS", definition: "Breadth-first search executed simultaneously from source and target, meeting in the middle. The algorithm Neo4j's shortestPath() runs and the algorithm BloodHound inherits." },
  { term: "Tier Zero", definition: "Any node in the attack-path graph whose compromise lets an attacker reach an administrative privilege in the forest. The endpoint of every BloodHound shortest-path query." },
  { term: "Traversable / non-traversable edge", definition: "A traversable edge can be stepped through directly by a path query; a non-traversable edge is a precondition fact that, with others, composes into a traversable edge during post-processing. ADCS edges are the canonical example." },
  { term: "OpenGraph", definition: "BloodHound's v8.0 (July 2025) generalisation that decouples the graph engine from the AD-specific schema, admitting user-defined node and edge kinds for arbitrary substrates." }
]} />
