# One Event, Three Portals: How a Single Sysmon Line Becomes a Microsoft Defender XDR Incident

> Trace a single Sysmon ProcessCreate event through six hops -- from Windows kernel emission to a unified Microsoft Defender XDR incident -- and see where the convergence stops.

*Published: 2026-06-04*
*Canonical: https://paragmali.com/blog/one-event-three-portals-how-a-single-sysmon-line-becomes-a-m*
*License: CC BY 4.0 - https://creativecommons.org/licenses/by/4.0/*

---
<TLDR>
A single Sysmon ProcessCreate event takes six observable hops to land in a Microsoft Defender XDR incident: kernel ETW emission, agent shipping through a Data Collection Rule, ingestion into a Log Analytics workspace, KQL detection in Microsoft Sentinel, optional alert correlation from Microsoft Defender for Cloud's CWPP plans, and finally entity-graph fan-in inside the unified Defender portal [@ms-learn-sysmon] [@ms-learn-ama-overview] [@ms-learn-mdc-xdr-concept] [@ms-learn-xdr-correlation]. Each hop adds latency, loses fidelity, or introduces a configuration cliff -- and one wrong word in a Data Collection Rule (`Microsoft-Event` instead of `Microsoft-WindowsEvent`) silently drops the entire pipeline [@ms-learn-ama-windows-events]. This article walks the full path with a concrete worked example, names where the convergence actually stops, and gives a six-step recipe to build the pipeline yourself.
</TLDR>

## 1. One event, three portals, six minutes

At 14:03:17 UTC on a Tuesday, `winword.exe` on the host `MAL-CONTOSO-PRD-04` spawns a child process: `powershell.exe -EncodedCommand JABwAD0AJwBoAHQAdABwADoALwAv...`. Sysmon, which loads early in the boot sequence as a boot-start kernel driver, writes a single ProcessCreate record (Event ID 1) to the Windows event log channel `Microsoft-Windows-Sysmon/Operational` [@ms-learn-sysmon]. The record is roughly two kilobytes of XML with a stable `ProcessGuid` field that uniquely identifies the new process across the host's lifetime [@ms-learn-defrag-tools-sysmon].

At 14:03:21 UTC, the same record appears in the `Event` table of an Azure Log Analytics workspace named `law-contoso-secops` [@ms-learn-event-table]. At 14:05:00 UTC, a Microsoft Sentinel scheduled analytics rule fires its five-minute KQL query, matches a parent-image heuristic (`winword.exe` -> `powershell.exe -EncodedCommand`), and produces a `SecurityAlert` row whose `Entities` JSON column names the host, the parent process, the child process, and the encoded command line [@ms-learn-sentinel-scheduled-rules] [@ms-learn-sentinel-entities]. At 14:07:42 UTC, a Microsoft Defender for Cloud (MDC) **alert** -- emitted by the MDC for Servers cloud workload protection plan, which sits on top of the Microsoft Defender for Endpoint (MDE) sensor on that same host -- shows up in the workspace's `SecurityAlert` table with the title `Suspicious PowerShell command line` [@ms-learn-mdc-defender-servers] [@ms-learn-mdc-mde-integration]. And at 14:09:30 UTC -- six minutes and thirteen seconds after the kernel call -- a single incident appears in the Microsoft Defender XDR portal at `security.microsoft.com`. Its title: `Multi-stage incident on one endpoint`. Its alert tab lists three rows: one from Sentinel, one from MDC, and (because MDE was also installed) one from Defender for Endpoint's native detection engine [@ms-learn-defender-xdr-incidents] [@ms-learn-xdr-correlation].
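The gaps between those timestamps are easier to reason about as arithmetic. Here is a minimal Python sketch that uses only the times quoted in the walkthrough; the hop labels and the function name are illustrative shorthand, not product terminology.

```python
from datetime import datetime

# Timestamps (UTC) quoted in the walkthrough; the labels are shorthand.
HOPS = [
    ("kernel emission", "14:03:17"),
    ("row in Event table", "14:03:21"),
    ("Sentinel rule fires", "14:05:00"),
    ("MDC alert in SecurityAlert", "14:07:42"),
    ("incident in Defender XDR", "14:09:30"),
]


def hop_latencies(hops):
    """Return (label, seconds since previous hop, seconds since first hop)."""
    times = [datetime.strptime(t, "%H:%M:%S") for _, t in hops]
    rows = []
    for i, (label, _) in enumerate(hops):
        step = (times[i] - times[i - 1]).total_seconds() if i else 0.0
        total = (times[i] - times[0]).total_seconds()
        rows.append((label, int(step), int(total)))
    return rows


for label, step, total in hop_latencies(HOPS):
    print(f"{label:<28} +{step:>4}s  (t+{total}s)")
```

Most of the elapsed time sits in the detection and correlation hops, not in shipping: the row reaches the workspace four seconds after the kernel call.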

Three independent detection systems, three different timestamps, three different alert grammars, one incident. How?

That question is the spine of this article. It is not a marketing question -- "look how unified it is" -- because the convergence is partial and the seams are load-bearing. It is an engineering question: which hops happen where, what does each hop cost in latency and money, and where does the unification actually stop?

> **Key idea:** Microsoft Defender XDR is not a single product. It is a correlation surface that fans in three structurally different pipelines: Sentinel's KQL analytics rules over Log Analytics, Defender for Cloud's cloud-workload-protection (CWPP) alerts from MDC plans (servers, containers, SQL, storage, App Service), and the native Defender stack (Endpoint, Identity, Office, Cloud Apps). The fan-in is real but partial: Sentinel cross-workspace correlation, MDC posture findings, and most third-party connectors stay outside the unified incident graph [@ms-learn-defender-xdr-overview] [@ms-learn-mdc-xdr-concept].

Here is the full path the Sysmon record takes from kernel to portal. Each numbered box is a real component with its own owner team, deployment lifecycle, and failure mode:

<Mermaid caption="The six hops a Sysmon ProcessCreate event takes to reach a Microsoft Defender XDR incident. Hops 1-4 are owned by Azure Monitor and Sentinel; hop 5 is the Defender for Cloud CWPP path; hop 6 is the unified correlation surface in the Defender portal.">
&#123;`flowchart LR
  A["1 Sysmon kernel ETW provider on host"]
  B["2 Azure Monitor Agent + Data Collection Rule"]
  C["3 Log Analytics workspace Event/SecurityEvent tables"]
  D["4 Sentinel scheduled or NRT analytics rule -- KQL"]
  E["5 MDC alert via Defender for Servers + MDE sensor"]
  F["6 Defender XDR correlation engine -- security.microsoft.com"]
  A --> B --> C --> D --> F
  C --> E --> F
  classDef src fill:#e8f4ff,stroke:#2b6cb0,color:#1a365d
  classDef sink fill:#fffaf0,stroke:#dd6b20,color:#7b341e
  class A,B,C src
  class F sink`}
</Mermaid>

The diagram understates how separate these hops are. Box 2 lives on the host. Box 3 is a multi-tenant Azure Data Explorer cluster [@ms-learn-adx-docs]. Box 4 runs on Sentinel's serverless query engine inside the workspace's home region. Box 5 is a Defender for Cloud plan with its own SKU, scoped to an Azure subscription. Box 6 is a separate web portal in a separate Microsoft 365 tenant scope. Each one rolled out at a different time, was renamed at least once, and absorbed a different earlier product. The next section recovers the lineage that explains why.

## 2. Three lineages that became one portal

The three pipelines that converge at hop 6 did not start as siblings. They started as three separate Microsoft product lines aimed at three different buyer personas: an Azure subscription owner who wanted posture scoring, a Windows engineer who wanted endpoint detection, and a SOC analyst who wanted a SIEM. Reading the path right-to-left -- from the unified portal back to its three roots -- is the only honest way to understand why the seams look the way they do.

<Definition term="SIEM (Security Information and Event Management)">
A platform that ingests security-relevant logs from many sources, normalizes them into a queryable schema, runs correlation rules to produce alerts, and groups related alerts into incidents that a SOC analyst triages. Microsoft Sentinel is a SIEM [@ms-learn-sentinel-overview].
</Definition>

<Definition term="SOAR (Security Orchestration, Automation, and Response)">
A platform (often packaged with a SIEM) that runs playbooks in response to alerts -- isolating a host, disabling an account, opening a ticket. In Microsoft's stack, SOAR is implemented as Azure Logic Apps invoked from Sentinel automation rules [@ms-learn-sentinel-soar].
</Definition>

<Definition term="EDR (Endpoint Detection and Response)">
A sensor that runs on a single endpoint, collects rich process / file / network / registry telemetry, applies behavioural detections locally and in the cloud, and exposes response actions (terminate process, isolate machine, collect investigation package). Microsoft Defender for Endpoint is an EDR [@ms-learn-mde-landing] [@ms-learn-mde-eda].
</Definition>

<Definition term="XDR (Extended Detection and Response)">
A correlation layer that fans in alerts and entities from multiple Microsoft-or-vendor detection products (endpoint, identity, email, cloud apps, cloud workloads) and merges related alerts into a single incident graph. Microsoft Defender XDR is Microsoft's XDR; the term was popularized by Palo Alto Networks in 2018 [@ms-learn-defender-xdr-overview] [@pan-blog-xdr-journey].
</Definition>

The CSPM line started first. In **December 2015**, Microsoft put Azure Security Center (ASC) into public preview as a per-subscription posture dashboard that scored Azure resources against a baseline of hardening recommendations [@azure-blog-asc-preview-2015]. ASC went generally available in **July 2016** alongside JIT VM access [@ms-security-blog-asc-ga-2016]. <Sidenote id="S1">Public sources frequently report ASC GA as "October 2015" or "October 2016." The primary Azure blog from December 2015 explicitly says "Azure Security Center -- now in public preview," and the July 2016 Microsoft Security blog announces the GA wave of new capabilities. The December 2015 preview / mid-2016 GA framing matches both authoritative announcements [@azure-blog-asc-preview-2015] [@ms-security-blog-asc-ga-2016].</Sidenote> Over the next five years ASC absorbed runtime protection plans -- Defender for Servers, SQL, Storage, App Service, Containers -- and was renamed **Microsoft Defender for Cloud** at Ignite Fall 2021, the same wave that renamed Microsoft Cloud App Security to Microsoft Defender for Cloud Apps (MDCA) [@ms-learn-mdc-introduction] [@ms-learn-mdca-rename-2021].

The SIEM line is much younger. Microsoft announced **Azure Sentinel** in public preview on **February 28, 2019** as the first cloud-native SIEM from a hyperscaler, built on top of Azure Log Analytics and the Kusto Query Language [@ms-blog-sentinel-preview-2019]. It went GA on **September 24, 2019** [@ms-security-blog-sentinel-ga-2019]. It was renamed **Microsoft Sentinel** in November 2021 (same Ignite wave). Sentinel inherited every Log Analytics integration that Azure Monitor already had, which meant it could ingest Windows event logs, syslog, Office 365 audit, Microsoft Entra ID sign-ins, and anything you could shove into a workspace with a custom collector [@ms-learn-sentinel-data-connectors-ref].

The XDR line landed last. In **September 2020** Microsoft announced "Microsoft unified SIEM and XDR" as a direction, and rolled the Office 365 ATP and Microsoft Defender ATP detection surfaces into a single portal called **Microsoft 365 Defender** [@ms-security-blog-unified-siem-xdr-2020]. The portal was renamed **Microsoft Defender XDR** at Ignite in November 2023, the same event at which the SIEM and XDR portals were merged; the unified Microsoft security operations platform went generally available in July 2024 [@ms-blog-ignite-2023] [@ms-security-blog-unified-secops-2024]. The Sentinel experience inside the Azure portal will be **retired on March 31, 2027** (a deadline extended from its original July 1, 2026 target); after that date, Sentinel lives only inside `security.microsoft.com` [@ms-learn-sentinel-azure-portal-retiring] [@helpnetsec-sentinel-defender-timeline].

<Mermaid caption="Roughly thirteen years (2014 to 2027 on the chart) of the three lineages that converge in today's Defender XDR portal. Note that the Sysmon to Sentinel to Defender XDR path is barely seven years old end-to-end.">
&#123;`gantt
    title Three lineages converging at security.microsoft.com
    dateFormat YYYY-MM
    axisFormat %Y

    section EDR line
    Sysmon v1 (Sysinternals)         :done, 2014-08, 12M
    Microsoft Defender ATP (EDR)     :done, 2016-03, 54M
    Renamed Microsoft Defender for Endpoint :done, 2020-09, 24M

    section CSPM and CWPP line
    Azure Security Center preview    :done, 2015-12, 8M
    Azure Security Center GA         :done, 2016-07, 64M
    Renamed Microsoft Defender for Cloud :done, 2021-11, 36M

    section SIEM line
    Azure Sentinel preview           :done, 2019-02, 7M
    Azure Sentinel GA                :done, 2019-09, 26M
    Renamed Microsoft Sentinel       :done, 2021-11, 24M

    section XDR convergence
    Microsoft 365 Defender portal    :done, 2020-09, 38M
    Sentinel merged into Defender portal :done, 2023-11, 8M
    Unified secops GA                :done, 2024-07, 24M
    Sentinel Azure portal retires    :crit, 2027-03, 1M`&#125;
</Mermaid>

Three things matter about this timeline for the rest of the article. First, the **CSPM/CWPP line is older** than either SIEM or XDR -- which is why the Defender for Cloud team owns its own alert format, its own subscription-scoped permissions model, and its own portal at `portal.azure.com/#blade/Microsoft_Azure_Security`, none of which fully merge into the unified Defender experience even today. Second, **Sentinel inherited Log Analytics**, not the other way around -- so the storage substrate, the agent (Azure Monitor Agent), and the query language (KQL) all predate Sentinel by years and serve far more workloads than security. Third, **the unified portal is the new arrival**, not the foundation. The convergence is grafted on top of three pre-existing pipelines, and that grafting -- not the products themselves -- is what makes the architecture interesting.

## 3. The pre-cloud SIEM bottleneck

To understand why Sentinel was built the way it was, hold the question in mind that every SIEM buyer asked their finance team between roughly 2008 and 2018: *"Why does each new server cost me a license-tier upgrade?"*

Classic on-premises SIEMs -- Splunk Enterprise, ArcSight, QRadar -- priced by **ingested gigabytes per day**, billed as a perpetual or annual license tied to a tier. Crossing a tier boundary triggered a forklift purchase. Storage was on-prem disk, and retention was constrained by how much steel you bought; compute was on the same hardware, so peak query load contended with peak ingest. The cost shape was step-wise, and the constraint that bound it most painfully was peak ingest rate.

| Cost dimension | Classic on-prem SIEM | Cloud-native SIEM (Sentinel) |
|---|---|---|
| Ingest billing unit | License tier (GB/day, stepped) | Per-GB ingest (continuous) [@ms-learn-sentinel-billing] |
| Storage billing unit | Bundled with license tier | Per-GB-month retention (continuous) [@ms-learn-sentinel-billing] |
| Compute billing unit | Bundled / hardware capex | Per-query bytes scanned (serverless) [@ms-learn-adx-docs] |
| Capacity planning | Estimate peak GB/day a year out | None -- pay for what you ingested last hour |
| New data source onboarding | Re-tier and order disks | Add a Data Collection Rule [@ms-learn-dcr-overview] |

The reframe Sentinel proposed -- and that the Kusto/Log Analytics substrate enabled -- was to **separate the three cost axes**: ingest, storage retention, and query compute. Each axis bills continuously and independently. There is no tier to cross. Adding a new data source is a Data Collection Rule edit, not a procurement event. Retaining last quarter's logs another year is a per-GB-month flag, not a disk purchase [@ms-learn-sentinel-billing].
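The difference between a stepped tier and independently billed axes shows up in a few lines of arithmetic. This is a sketch only: every price and tier boundary below is a hypothetical placeholder chosen for illustration, not a real vendor or Azure rate, and the query-compute axis is left out to keep the comparison to two variables.

```python
from bisect import bisect_left

# HYPOTHETICAL numbers for illustration only; not real vendor or Azure prices.
TIERS_GB_PER_DAY = [50, 100, 250, 500]            # license tier ceilings
TIER_ANNUAL_COST = [60_000, 110_000, 240_000, 450_000]

INGEST_PER_GB = 4.00            # hypothetical per-GB ingest rate
RETENTION_PER_GB_MONTH = 0.10   # hypothetical per-GB-month retention rate


def tiered_annual_cost(peak_gb_per_day: float) -> int:
    """Stepped model: buy the smallest tier that covers *peak* ingest."""
    i = bisect_left(TIERS_GB_PER_DAY, peak_gb_per_day)
    if i == len(TIERS_GB_PER_DAY):
        raise ValueError("peak exceeds the largest tier")
    return TIER_ANNUAL_COST[i]


def unbundled_annual_cost(avg_gb_per_day: float, retention_months: int) -> float:
    """Continuous model: ingest and retention bill independently."""
    ingest_cost = avg_gb_per_day * 365 * INGEST_PER_GB
    # Steady state: about `retention_months` of data is held at any moment.
    held_gb = avg_gb_per_day * 30 * retention_months
    retention_cost = held_gb * RETENTION_PER_GB_MONTH * 12
    return ingest_cost + retention_cost


# One extra GB/day across a tier boundary versus the same growth billed per GB.
tier_jump = tiered_annual_cost(101) - tiered_annual_cost(100)
continuous_delta = unbundled_annual_cost(101, 3) - unbundled_annual_cost(100, 3)
print(f"tiered jump: {tier_jump}, continuous delta: {continuous_delta:.0f}")
```

Under these made-up numbers, the same one-gigabyte increase is a tier-sized jump in the stepped model and a small proportional change in the continuous one. That shape, not the specific figures, is the point.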

> **Note:** **Aha #1 -- the economic reframe.** What looked like a *pricing* change ("SaaS billing") was actually an *architectural* change. Classic SIEMs bundled ingest, storage, and compute because the hardware bundled them. Once each axis lives on a different cloud service (Event Hubs / DCR for ingest, ADX for storage, KQL serverless query for compute), there is no bundle to defend. The SaaS bill is downstream of the deconstructed architecture, not the cause of it.

This deconstruction is what makes the Sentinel pipeline interesting upstream of the SOC. When ingest is a separately-billed continuous variable, the *Data Collection Rule* becomes the most important security artifact in the deployment: it determines what flows in and therefore both what costs you incur and what you can possibly detect. (The accuracy-report follow-up that drives section 10 hinges on exactly one wrong word in a DCR.) When query compute is serverless and per-byte, a long-running threat hunt over a year of process-creation events is a question of dollars, not of capacity-plan slack. And when storage retention is a per-GB-month flag, the question "should we retain this for compliance?" decouples from "do we have rack space?"

<PullQuote attribution="Microsoft Sentinel pricing documentation">
"Sentinel offers a flexible and predictable pricing model. Pay-as-you-go pricing lets you pay for what you use, while commitment tiers provide guaranteed discounts." [@ms-learn-sentinel-billing]
</PullQuote>

That is the pricing-page sales line. The architectural truth underneath it is that the three pre-cloud bundles unbundled, and once they unbundled, the SIEM was free to grow horizontally with the rest of the cloud workload. That is exactly what happened with Sentinel between 2019 and 2024: it accumulated **300+ data connectors** for every Azure service, every Microsoft 365 surface, every major SaaS log feed, and a long tail of third-party security tools [@ms-learn-sentinel-data-connectors-ref]. None of that catalog would have been economically sane on a per-GB/day license tier.

But the unbundle was not free. The price of separately-billed continuous axes is that you have to *measure* on all three axes. You now need to know your steady-state ingest rate, your retention policy, and your hunt query patterns. The next section steps inside the substrate that makes those measurements -- and the queries on top of them -- possible.

## 4. The cloud-native SIEM substrate: KQL on Log Analytics

Microsoft Sentinel is a thin layer over a much older substrate. That substrate is **Azure Monitor Log Analytics**, which itself is a security-and-multitenancy wrapper around **Azure Data Explorer (ADX)**, the cluster engine that runs **Kusto Query Language (KQL)** [@ms-learn-adx-docs]. Understanding the stack matters because almost everything Sentinel can or cannot do is determined by what Log Analytics and KQL can or cannot do, not by anything Sentinel itself implements.

<Definition term="Log Analytics workspace">
A multi-tenant namespace inside Azure Monitor that stores ingested telemetry in typed tables and exposes them for KQL query. Each workspace lives in a specific Azure region and Azure subscription, has its own access controls, and bills ingest and retention independently. Sentinel "is enabled" on a workspace; the workspace is the storage and query unit [@ms-learn-sentinel-overview].
</Definition>

<Definition term="KQL (Kusto Query Language)">
A read-only, pipe-composed query language for time-series and tabular log data, originally developed for Azure Data Explorer. KQL is the lingua franca of Azure Monitor Logs, Microsoft Sentinel analytics, Microsoft Defender XDR advanced hunting, and several other Microsoft data services [@ms-learn-adx-docs] [@ms-learn-advanced-hunting].
</Definition>

The layering is shown below. Notice that KQL itself spans **four** Microsoft surfaces, of which Sentinel is just one. <MarginNote>KQL's polymorphism -- one query language across Monitor, Sentinel, Defender XDR advanced hunting, and ADX itself -- is the single most under-appreciated decision in the Microsoft security stack. It is also the reason your KQL skills move across teams.</MarginNote>

<Mermaid caption="The Kusto / Log Analytics / Sentinel layering. Sentinel sits on top of a generic monitoring substrate; the same substrate powers Application Insights, infrastructure metrics, and any custom telemetry the customer chooses to send.">
&#123;`flowchart TB
  subgraph L1["Layer 1 -- storage cluster"]
    ADX["Azure Data Explorer (Kusto engine)"]
  end
  subgraph L2["Layer 2 -- managed namespace"]
    LA["Log Analytics workspace -- typed tables, RBAC, regional"]
  end
  subgraph L3["Layer 3 -- query surfaces"]
    AZM["Azure Monitor logs -- ops + perf"]
    SEN["Microsoft Sentinel -- SIEM analytics rules"]
    XDR["Defender XDR -- advanced hunting"]
    ADXQ["ADX direct -- analytics + BI"]
  end
  ADX --> LA
  LA --> AZM
  LA --> SEN
  LA --> XDR
  ADX --> ADXQ
  classDef stor fill:#e8f4ff,stroke:#2b6cb0,color:#1a365d
  classDef ns fill:#fff5d6,stroke:#b7791f,color:#5f370e
  classDef ui fill:#e6fffa,stroke:#319795,color:#234e52
  class ADX stor
  class LA ns
  class AZM,SEN,XDR,ADXQ ui`}
</Mermaid>

The substrate predates Sentinel by years. **Log Analytics** was the rebranded form of *Operations Management Suite (OMS)*, which Microsoft introduced in 2015 as a cloud companion to System Center Operations Manager. The agent that fed OMS -- the **Microsoft Monitoring Agent (MMA)**, sometimes also called the *Log Analytics agent* -- shared its agent lineage with the System Center Operations Manager agent and ran on Windows and Linux servers to ship event logs and performance counters to the workspace [@ms-learn-laa-deprecated] [@lunavi-oms-azure-monitor]. ADX (Kusto) was productised externally in 2018 after years of internal Microsoft use as the engine behind Bing telemetry, Office 365 ops, and Azure monitoring [@ms-learn-adx-docs].

<Aside type="historical" id="A1">
The naming continuity is worth pausing on. *Log Analytics* (2016) replaced *OMS* (2015), which replaced *Application Insights workspaces* (2014), which absorbed parts of *Operations Manager* (2007). The data store underneath was *Kusto* the whole time. By the time Azure Sentinel launched in 2019 [@ms-blog-sentinel-preview-2019], the substrate had been hardened for four years at hyperscale, mostly for non-security workloads. Sentinel did not have to invent the storage; it inherited it. This is also why the same KQL skill maps onto application telemetry and infrastructure metrics, not just security.
</Aside>

Two consequences of the substrate inheritance shape every hop downstream:

1. **Schema is per-table, not per-product.** A Log Analytics workspace exposes typed tables like `Event` (Windows event log records), `SecurityEvent` (Windows Security channel), `Syslog`, `Heartbeat`, `SecurityAlert`, `DeviceProcessEvents` (mirrored from Defender XDR's advanced hunting schema), `Perf`, and any number of `Custom_CL` tables [@ms-learn-event-table] [@ms-learn-securityevent-table]. KQL queries are written against tables, not against products. A Sentinel analytics rule is just a saved KQL query that runs on a schedule and emits a row into `SecurityAlert`.

2. **Cross-workspace and cross-table joins are first-class.** Because the substrate is a real query engine, you can `join` between `SecurityEvent` and `SigninLogs` and `DeviceProcessEvents` in a single rule. You can use `workspace("law-other").Event` to reach into a separate workspace. You can call `externaldata()` to read from a blob. This expressive power is the source of both Sentinel's flexibility and its operational complexity: the rule that worked in test stops working in prod because the test workspace did not have a `SigninLogs` table or because the cross-workspace permission is missing [@ms-learn-sentinel-threat-detection].

For the Sysmon worked example: the kernel record will land in the `Event` table (because Sysmon's channel is treated as a generic Windows event log, not as the `SecurityEvent` Security channel). The detection KQL will live as a Sentinel scheduled analytics rule that reads from `Event`, filters to `Source == "Microsoft-Windows-Sysmon"` and `EventID == 1`, parses the XML payload (the next section will show the exact pattern), and emits a `SecurityAlert` row. That `SecurityAlert` row is what hop 6 ultimately fans in. The substrate did all the heavy lifting; Sentinel just wrote the rule.
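The rule's logic can be mirrored outside KQL to make the filter-then-parse order concrete. The sketch below is Python, not the rule itself. The `Source` and `EventID` names come from the text above; the `EventData` column name, the simplified XML layout, and both helper names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, ASSUMED payload shape: <Data Name="..."> children inside an
# EventData element. Real rows may wrap or namespace this differently.
SAMPLE_EVENTDATA = r"""
<EventData xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <Data Name="ProcessGuid">{62b9c5cf-7c64-67ab-2e00-000000003200}</Data>
  <Data Name="Image">C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe</Data>
  <Data Name="CommandLine">powershell.exe -EncodedCommand JABwAD0AJwBoAHQAdABwADoALwAv...</Data>
  <Data Name="ParentImage">C:\Program Files\Microsoft Office\root\Office16\winword.exe</Data>
</EventData>
"""


def parse_event_data(xml_text: str) -> dict:
    """Flatten <Data Name="k">v</Data> elements into {k: v}, ignoring namespaces."""
    fields = {}
    for el in ET.fromstring(xml_text.strip()).iter():
        if el.tag.rsplit("}", 1)[-1] == "Data" and "Name" in el.attrib:
            fields[el.attrib["Name"]] = el.text or ""
    return fields


def is_office_spawning_encoded_powershell(row: dict) -> bool:
    """Cheap column filters first, then the parent/child heuristic on parsed XML."""
    if row.get("Source") != "Microsoft-Windows-Sysmon" or row.get("EventID") != 1:
        return False
    f = parse_event_data(row["EventData"])
    parent = f.get("ParentImage", "").lower()
    child = f.get("Image", "").lower()
    cmd = f.get("CommandLine", "").lower()
    return (parent.endswith("\\winword.exe")
            and child.endswith("\\powershell.exe")
            and "-encodedcommand" in cmd)


row = {"Source": "Microsoft-Windows-Sysmon", "EventID": 1, "EventData": SAMPLE_EVENTDATA}
print(is_office_spawning_encoded_powershell(row))
```

Filtering on the typed columns before touching the XML keeps the expensive parse off rows that could never match, and the same ordering is worth preserving in the KQL version.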

## 5. The XDR reframe: from per-product portals to a single incident graph

If the SIEM substrate is "many tables, one query engine," the XDR reframe is "many alert sources, one incident graph." Microsoft Defender XDR exists because by 2019 a typical Microsoft enterprise customer had four or five separate Microsoft security portals -- one for Office 365 ATP, one for Microsoft Defender ATP, one for Microsoft Cloud App Security, one for Azure AD Identity Protection, and the Azure Security Center / Sentinel pair. Each portal had its own alert grammar, its own console, and its own analyst workflow. **The XDR reframe is to keep the alert sources but merge the analyst surface.**

<Definition term="Microsoft Defender XDR">
A correlation surface at `security.microsoft.com` that fans in alerts and entity data from the Microsoft Defender product family (Endpoint, Identity, Office 365, Cloud Apps), Microsoft Sentinel, and Microsoft Defender for Cloud's runtime CWPP plans, then merges related alerts into incidents using shared entity identifiers (user, device, file hash, IP, URL) [@ms-learn-defender-xdr-overview] [@ms-learn-defender-xdr-incidents].
</Definition>

The mechanism the merge uses is the entity graph. When any of the source pipelines emits an alert, it is required to attach a set of typed entities (e.g., `Host = MAL-CONTOSO-PRD-04`, `Process = winword.exe`, `Account = CONTOSO\jdoe`) to that alert [@ms-learn-sentinel-entities]. The Defender XDR correlation engine reads incoming alerts, normalizes the entity values, and groups alerts whose entities overlap in time and identity into a single incident [@ms-learn-xdr-correlation]. That is the entire trick. It is conceptually simple. Operationally it has many edge cases, which section 8 returns to.
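A toy version of entity-overlap grouping makes the idea tangible. This is a sketch and explicitly not Microsoft's correlation algorithm: the ten-minute window, the lowercase normalization, the greedy join rule, and the `HYPOTHETICAL-HOST-09` alert are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # HYPOTHETICAL window; the real engine's is not stated here


def at(hms: str) -> datetime:
    """Build a timestamp on the worked example's date."""
    return datetime.strptime("2026-06-02 " + hms, "%Y-%m-%d %H:%M:%S")


def correlate(alerts, window=WINDOW):
    """Greedy grouping: an alert joins the first incident that shares at least one
    (type, value) entity with it and whose latest alert is within `window`."""
    incidents = []
    for alert in sorted(alerts, key=lambda a: a["time"]):
        ents = {(k, v.lower()) for k, v in alert["entities"]}  # crude normalization
        for inc in incidents:
            if inc["entities"] & ents and alert["time"] - inc["last_seen"] <= window:
                inc["alerts"].append(alert["source"])
                inc["entities"] |= ents
                inc["last_seen"] = alert["time"]
                break
        else:
            incidents.append({"alerts": [alert["source"]], "entities": set(ents),
                              "last_seen": alert["time"]})
    return incidents


ALERTS = [
    {"source": "MDE", "time": at("14:08:11"),
     "entities": [("host", "MAL-CONTOSO-PRD-04")]},
    {"source": "Sentinel", "time": at("14:05:04"),
     "entities": [("host", "mal-contoso-prd-04"), ("process", "powershell.exe")]},
    {"source": "MDC", "time": at("14:07:42"),
     "entities": [("host", "MAL-CONTOSO-PRD-04")]},
    {"source": "Other", "time": at("14:06:00"),          # unrelated, hypothetical
     "entities": [("host", "HYPOTHETICAL-HOST-09")]},
]

for inc in correlate(ALERTS):
    print(inc["alerts"])
```

Even this toy shows where edge cases come from. Without normalization the mixed-case host names would not match, and a greedy pass never goes back to merge two incidents that a later alert turns out to bridge.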

For the worked example, the three alert sources (Sentinel KQL rule, MDC for Servers, MDE) each emit a separate alert. Each alert lists `Host = MAL-CONTOSO-PRD-04` and (for two of the three) `ProcessGuid = {abc-...}`. The correlation engine merges them on the host entity within a sliding time window. Result: one incident with three correlated alerts, not three separate incidents. The temporal fan-out is shown below; the fan-in geometry returns in section 6.6.

<Mermaid caption="Temporal fan-out for the worked example. A single kernel event produces three alerts at three different timestamps via three different pipelines. The Defender XDR correlation engine performs the fan-in at hop 6.">
&#123;`sequenceDiagram
    autonumber
    participant K as Host kernel (Sysmon)
    participant LA as Log Analytics workspace
    participant SEN as Sentinel scheduled rule
    participant MDC as MDC for Servers alert
    participant MDE as MDE native detection
    participant XDR as Defender XDR correlation
    K->>LA: 14:03:21 -- Event row (ProcessGuid abc)
    LA->>SEN: 14:05:00 -- 5-min query fires
    SEN->>XDR: 14:05:04 -- SecurityAlert from KQL
    K->>MDE: 14:03:17 -- local EDR sensor signal
    MDE->>MDC: 14:06:30 -- MDE telemetry surfaces MDC alert
    MDC->>XDR: 14:07:42 -- SecurityAlert from MDC plan
    MDE->>XDR: 14:08:11 -- DeviceAlertEvents direct
    XDR->>XDR: 14:09:30 -- merge on host + ProcessGuid -> Incident I-7842`&#125;
</Mermaid>

Two things in the diagram deserve to be noticed. First, the three alerts arrive in a window that is small but not synchronous: about three minutes from the earliest (14:05:04) to the latest (14:08:11), with the Sentinel alert gated by its five-minute query schedule rather than by the event itself. Second, **MDE shows up twice**: once as the source that feeds MDC's CWPP plan (hop 5 in the master diagram), and once as a native Defender XDR alert source. The two are the same sensor data routed through two different alert grammars to the same correlation surface. That the correlation engine deduplicates them on `ProcessGuid` is not accidental: `ProcessGuid` is the load-bearing identifier that makes the unification work for endpoint events. For non-endpoint sources (cloud-control-plane alerts from MDC for Storage, for example), there is no equivalent shared identifier, and the deduplication has to fall back on weaker entity matches like account name or IP. That is where the convergence frays.

The next section walks the six hops in order, naming the artifact at each hop and the failure mode that lives there. Hops 1 through 4 are the SIEM lineage. Hop 5 is the CWPP lineage. Hop 6 is the XDR fan-in.

## 6. Walking the six hops

### 6.1 Hop 1 -- The kernel emission

The Sysmon driver -- `SysmonDrv.sys` -- is registered as a Windows **boot-start driver** under `HKLM\SYSTEM\CurrentControlSet\Services\SysmonDrv` with `Start=0`, which means the I/O manager loads it during the early-boot phase before the bulk of user-mode services start; it also registers as an Event Tracing for Windows (ETW) provider. The driver registers a process-creation callback through the kernel's `PsSetCreateProcessNotifyRoutineEx` routine; on every process creation the kernel invokes that callback, and the driver builds an event record and writes it to the Windows event log channel `Microsoft-Windows-Sysmon/Operational` [@ms-learn-sysmon] [@ms-learn-defrag-tools-sysmon]. The record carries roughly thirty fields, including the parent and child image paths, the command lines, the user SID, the integrity level, the hashes (configurable: MD5, SHA1, SHA256, IMPHASH), the parent and child `ProcessGuid` values, and the kernel-side timestamp.

<Sidenote id="S1a">A common slip: Sysmon's driver is *not* an Early Launch Anti-Malware (ELAM) driver. ELAM is a separate, stricter Windows category for anti-malware vendors whose drivers must be certified by Microsoft and registered under `HKLM\SYSTEM\CurrentControlSet\Control\EarlyLaunch`. Sysmon ships as an ordinary boot-start driver (`Start=0` under its `Services\SysmonDrv` key); it loads early enough to observe most user-mode activity from the start, but it does not occupy the ELAM slot. A reader who internalizes the wrong classification will go looking for a `SysmonDrv` entry under `EarlyLaunch` and not find one [@ms-learn-sysmon].</Sidenote>

<Definition term="ProcessGuid">
A 128-bit identifier Sysmon assigns to every new process. Unlike the OS-assigned PID, which the kernel can recycle as processes exit, ProcessGuid is unique across the host's lifetime and lets downstream tooling reassemble a process tree even after PIDs have been reused. The Microsoft Sysmon page documents the property -- "a unique value for this process across a domain to make event correlation easier" -- but does not document how the GUID is constructed; downstream KQL queries and Defender XDR's advanced hunting schema rely only on its uniqueness, not on its internal composition [@ms-learn-sysmon].
</Definition>
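The practical payoff of `ProcessGuid` over PID is easiest to see when a PID gets recycled. In this sketch the `winword.exe` and `powershell.exe` GUIDs and PIDs come from the worked example; the `explorer.exe` and `notepad.exe` GUIDs, the PID-reuse event, and the helper names are hypothetical, invented only to show the collision.

```python
from collections import defaultdict

WINWORD = "{62b9c5cf-7b21-67ab-2c00-000000003200}"
POWERSHELL = "{62b9c5cf-7c64-67ab-2e00-000000003200}"
EXPLORER = "{00000000-0000-0000-0000-00000000e001}"   # hypothetical
NOTEPAD = "{00000000-0000-0000-0000-00000000e002}"    # hypothetical

EVENTS = [
    {"ProcessGuid": WINWORD, "ProcessId": 6210, "Image": "winword.exe",
     "ParentProcessGuid": EXPLORER},
    {"ProcessGuid": POWERSHELL, "ProcessId": 8124, "Image": "powershell.exe",
     "ParentProcessGuid": WINWORD},
    # Hypothetical: after powershell.exe exits, the OS reuses PID 8124.
    {"ProcessGuid": NOTEPAD, "ProcessId": 8124, "Image": "notepad.exe",
     "ParentProcessGuid": EXPLORER},
]


def build_tree(events):
    """Index ProcessCreate events by ProcessGuid and link children to parents."""
    by_guid = {e["ProcessGuid"]: e for e in events}
    children = defaultdict(list)
    for e in events:
        children[e["ParentProcessGuid"]].append(e["ProcessGuid"])
    return by_guid, children


def lineage(by_guid, guid):
    """Walk parent links upward; return image names from oldest known ancestor down."""
    chain, seen = [], set()
    while guid in by_guid and guid not in seen:
        seen.add(guid)
        chain.append(by_guid[guid]["Image"])
        guid = by_guid[guid]["ParentProcessGuid"]
    return chain[::-1]


by_guid, children = build_tree(EVENTS)
by_pid = {e["ProcessId"]: e for e in EVENTS}   # the naive index: later rows overwrite

print(lineage(by_guid, POWERSHELL))
print(len(by_guid), "processes by GUID;", len(by_pid), "by PID")
```

The PID-keyed index silently loses `powershell.exe` once PID 8124 is reused, while the GUID-keyed index keeps all three processes and still resolves the `winword.exe` parent.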

<Sidenote id="S2">There is a subtle field nuance worth knowing. Sysmon also emits `LogonGuid`, `LogonId`, and `User` on a ProcessCreate event. These three are *post-impersonation* values -- they reflect the security context the new process was created under, which can differ from the token of the parent. For service-impersonation chains (a service spawning a child under a different account), ignoring this distinction will mislead an analyst on who "owned" the process. KQL detection queries should `project` both parent and child user/SID and reconcile them explicitly.</Sidenote>

For the worked example, the kernel emission at 14:03:17 UTC contains, among other fields:

```text
EventID:       1
TimeCreated:   2026-06-02T14:03:17.412Z
Computer:      MAL-CONTOSO-PRD-04
ProcessGuid:   {62b9c5cf-7c64-67ab-2e00-000000003200}
ProcessId:     8124
Image:         C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe
CommandLine:   powershell.exe -EncodedCommand JABwAD0AJwBoAHQAdABwADoALwAv...
ParentProcessGuid:   {62b9c5cf-7b21-67ab-2c00-000000003200}
ParentProcessId:     6210
ParentImage:   C:\Program Files\Microsoft Office\root\Office16\winword.exe
ParentCommandLine:   "winword.exe" /n "C:\Users\jdoe\Downloads\invoice.docm"
User:          CONTOSO\jdoe
IntegrityLevel: Medium
Hashes:        SHA256=04ED...
```
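The `-EncodedCommand` argument is Base64 over UTF-16LE text, so even the truncated prefix in the record above is partly readable. A minimal Python sketch, assuming only that encoding; the helper name is hypothetical, and the elided remainder of the command (the trailing `...`) is left out rather than guessed at.

```python
import base64


def decode_encoded_command_prefix(b64_prefix: str) -> str:
    """Decode as much of a truncated -EncodedCommand value as is unambiguous."""
    # Keep only whole 4-character Base64 groups.
    usable = len(b64_prefix) - (len(b64_prefix) % 4)
    raw = base64.b64decode(b64_prefix[:usable])
    # Keep only whole 2-byte UTF-16 code units.
    raw = raw[: len(raw) - (len(raw) % 2)]
    return raw.decode("utf-16-le")


# The visible prefix from the CommandLine field, without the trailing "...".
VISIBLE_PREFIX = "JABwAD0AJwBoAHQAdABwADoALwAv"
print(decode_encoded_command_prefix(VISIBLE_PREFIX))
```

The visible prefix decodes to the opening of a variable assignment that begins with an `http:` URL. Everything after it stays unknown until the full command line is read from the event.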

Nothing further happens at hop 1 until someone reads the channel. Nothing pushes the event off the host: the record simply sits in the local event log, rotating by size or age, until an agent picks it up. That is hop 2.

### 6.2 Hop 2 -- Azure Monitor Agent shipping via a Data Collection Rule

The agent that reads the Sysmon channel and ships it to the workspace is the **Azure Monitor Agent (AMA)**. AMA replaced the older **Microsoft Monitoring Agent (MMA)** / **Log Analytics agent**, which Microsoft retired effective **August 31, 2024** [@ms-learn-laa-deprecated]. Customers still running MMA past that date are in unsupported territory, and -- this is the critical operational fact -- AMA does **not** automatically pick up where MMA left off. AMA requires explicit migration: a Data Collection Rule (DCR) describing which events to collect and which workspace to send them to [@ms-learn-ama-migration].

<Definition term="Azure Monitor Agent (AMA)">
A modern Microsoft agent that runs on Windows and Linux servers (Azure VM, Arc-enabled, or on-prem) and ships event logs, performance counters, syslog, and custom text files to one or more Log Analytics workspaces, driven entirely by Data Collection Rule (DCR) configurations [@ms-learn-ama-overview].
</Definition>

<Definition term="Data Collection Rule (DCR)">
An ARM-managed configuration object that names a data source type (e.g., `windowsEventLogs`), an XPath-based subscription (which channels and which event IDs), and one or more destinations (typically a `logAnalyticsWorkspace` + `streams` mapping such as `Microsoft-Event` for the generic `Event` table or `Microsoft-WindowsEvent` for the more recent typed Windows event ingestion path). DCRs are assigned to one or more agents via a Data Collection Rule Association (DCRA) [@ms-learn-dcr-overview] [@ms-learn-ama-windows-events].
</Definition>

<Mermaid caption="The Azure Monitor Agent control flow. The agent reads channels named in the DCR's xPathQueries list, transforms records into the named stream's schema, and POSTs them to the workspace's ingestion endpoint over HTTPS.">
&#123;`flowchart LR
  CH["Windows event channels (XPath subscription)"]
  AMA["Azure Monitor Agent process"]
  DCR["Data Collection Rule (cached locally)"]
  ING["Log Analytics ingestion endpoint -- regional HTTPS"]
  TBL["Workspace table -- Event / SecurityEvent / WindowsEvent"]
  CH --> AMA
  DCR --> AMA
  AMA --> ING
  ING --> TBL
  classDef cfg fill:#fff5d6,stroke:#b7791f,color:#5f370e
  classDef agent fill:#e8f4ff,stroke:#2b6cb0,color:#1a365d
  classDef sink fill:#e6fffa,stroke:#319795,color:#234e52
  class DCR cfg
  class AMA,CH agent
  class ING,TBL sink`}
</Mermaid>

> **Note:** **The MMA-to-AMA silent-miss trap.** A workspace that is still in transition between MMA and AMA can have both agents running on the same host, each shipping the same `Event` row and producing double counts. Worse, a host whose MMA has been uninstalled but whose DCR is mis-assigned will stop shipping entirely -- and because Sysmon writes to the local event log no matter what, no alert fires on the host itself. The first signal of the gap is silence in the `Event` table for that `Computer` value, which a Sentinel "stale data source" watchdog rule must explicitly detect. Microsoft retired MMA effective August 31, 2024 [@ms-learn-laa-deprecated].

For the Sysmon channel specifically, AMA needs a DCR whose `windowsEventLogs` block names the XPath subscription `Microsoft-Windows-Sysmon/Operational!*[System[(EventID=1)]]` (or a broader filter that includes EventIDs 1, 3, 7, 8, 10, 11). The stream name in the destination block determines which table the record lands in: a DCR that names `Microsoft-Event` ships into the generic `Event` table; one that names `Microsoft-WindowsEvent` ships into the newer `WindowsEvent` table; and naming anything else silently emits nothing [@ms-learn-ama-windows-events] [@ms-learn-sentinel-data-connectors-ref]. AMA does not log a hard error in this case; the events simply never appear, and the analyst is left with a dashboard that shows no Sysmon data and no indication of why.
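Because a wrong stream name fails silently, a check before deployment is cheap insurance. The sketch below lints a simplified DCR-shaped object. The `dataSources.windowsEventLogs[].streams` and `xPathQueries` layout is a trimmed-down assumption based on the fields named above, and `lintSysmonDcr`, `goodDcr`, and `typoDcr` are hypothetical names, not part of any Microsoft SDK.

```javascript
// Hypothetical pre-deployment lint for a simplified DCR fragment.
// Only the two stream names discussed in this section are treated as valid.
const VALID_STREAMS = new Set(['Microsoft-Event', 'Microsoft-WindowsEvent']);
const SYSMON_CHANNEL = 'Microsoft-Windows-Sysmon/Operational';

function lintSysmonDcr(dcr) {
  const problems = [];
  const sources = (dcr.dataSources && dcr.dataSources.windowsEventLogs) || [];
  if (sources.length === 0) problems.push('no windowsEventLogs data source');

  // Flag any stream name that would silently ship nothing.
  for (const src of sources) {
    for (const stream of src.streams || []) {
      if (!VALID_STREAMS.has(stream)) problems.push('unknown stream: ' + stream);
    }
  }

  // At least one XPath query must subscribe to the Sysmon channel.
  const hasSysmon = sources.some(src =>
    (src.xPathQueries || []).some(q => q.startsWith(SYSMON_CHANNEL + '!')));
  if (sources.length > 0 && !hasSysmon) {
    problems.push('no XPath query on the Sysmon channel');
  }
  return problems;
}

const goodDcr = {
  dataSources: { windowsEventLogs: [{
    streams: ['Microsoft-Event'],
    xPathQueries: [SYSMON_CHANNEL + '!*[System[(EventID=1)]]']
  }] }
};
// Illustrative typo: one extra letter in the stream name.
const typoDcr = {
  dataSources: { windowsEventLogs: [{
    streams: ['Microsoft-Events'],
    xPathQueries: [SYSMON_CHANNEL + '!*[System[(EventID=1)]]']
  }] }
};

console.log(lintSysmonDcr(goodDcr)); // []
console.log(lintSysmonDcr(typoDcr)); // [ 'unknown stream: Microsoft-Events' ]
```

A check of this kind belongs in whatever pipeline deploys the DCR, since the agent itself gives no feedback when the stream name is wrong.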

Hop 2 finishes at about 14:03:19 UTC for the worked example -- two seconds after the kernel emission. The record is now in the workspace's ingest buffer.

### 6.3 Hop 3 -- Workspace ingestion and the table-choice question

The ingestion endpoint validates the record against the named stream's schema, applies any DCR-side transformations, and persists the row into the destination table. From here on the record is queryable via KQL with end-to-end ingestion latency typically in the low minutes [@ms-learn-event-table]. For the Sysmon channel the destination table is almost always `Event`, because the `SecurityEvent` table is the Windows *Security* channel only (the AMA `securityEvents` data source), and the Sysmon channel is a separate operational channel [@ms-learn-securityevent-table].

The table choice matters because it changes the shape of the row and the cost of querying it. The two relevant tables for Windows event data behave as follows:

| Property | `Event` (Microsoft-Event stream) | `WindowsEvent` (Microsoft-WindowsEvent stream) |
|---|---|---|
| Source | AMA `windowsEventLogs` data source [@ms-learn-ama-windows-events] | AMA `windowsEventLogs` data source (newer typed path) [@ms-learn-ama-windows-events] |
| EventData shape | XML in `EventData` column (string) | Pre-parsed JSON in `EventData` (dynamic) |
| Cost characteristic | Standard ingest pricing [@ms-learn-sentinel-billing] | Standard ingest pricing |
| Best for | Mixed sources, simple filters | Channels with deep parsing needs |
| KQL parse pattern | `parse_xml(EventData)` per row | Direct property access |

<Sidenote id="S3">In production, most Sysmon-on-Windows pipelines run on the older `Event` table with a `parse_xml(EventData)` shim. The parse is not cheap -- it allocates per row -- but it is the most common pattern because the older table predates the typed `WindowsEvent` path and customer queries already exist against it. New deployments should consider the newer table if their detection logic touches many fields per row [@ms-learn-ama-windows-events].</Sidenote>

A representative KQL detection that runs against the older `Event` table for the worked example looks like the snippet below. Show it to a SOC analyst and they will read it left-to-right; show it to a Kusto engineer and they will tell you the `parse_xml` is the expensive part.

The KQL that parses a Sysmon event out of the older `Event` table follows a four-step idiom that is worth walking explicitly, because the same shape appears in every detection a SOC writes against XML-shaped Windows event data:

1. **Parse.** `parse_xml(EventData)` reads the entire EventData payload (a string column) and returns a dynamic JSON tree whose root is `DataItem.EventData` and whose interesting children are an array of `<Data Name="...">value</Data>` elements [@ms-learn-kusto-parse-xml].

2. **Expand.** `mv-expand ev = ...DataItem.EventData.Data` flattens that array so each `<Data>` child becomes its own row -- a long-form representation where one event becomes thirty rows, one per field.

3. **Project.** `extend Field = tostring(ev["@Name"]), Value = tostring(ev["#text"])` projects the XML attribute and text payload into two typed columns named `Field` and `Value`.

4. **Pivot.** `evaluate pivot(Field, take_any(Value), TimeGenerated, Computer)` invokes the Kusto `pivot` plugin, which rotates the long-form (Field, Value) rows back into a wide row with one column per field name -- so after the pivot, `CommandLine`, `Image`, `ParentImage`, and `ProcessGuid` become first-class columns the detection can filter on as if they had been typed all along [@ms-learn-kusto-pivot-plugin].

The same chain adapts to any other EventID (3 / NetworkConnect, 11 / FileCreate, etc.) and, with one fewer step, to the typed `WindowsEvent` table where `EventData` is already pre-parsed JSON.

<Sidenote id="S3a">Quick reference, in margin form: `parse_xml(EventData)` -> dynamic JSON tree; `mv-expand ev = ...EventData.Data` -> one row per `<Data>` element; `extend Field/Value` -> typed Field/Value columns; `evaluate pivot(Field, take_any(Value), ...)` -> wide row, one column per field. The pivot step is what turns "thirty long-form rows" into "one wide row with named columns"; without it the detection has to filter on the Field/Value pairs directly, which is much harder to write and to read [@ms-learn-kusto-pivot-plugin].</Sidenote>
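For readers who think more easily in a general-purpose language than in KQL, the same parse, expand, and pivot shape can be sketched in JavaScript. This is an analogue, not what Kusto does internally: the regular expression stands in for a real XML parser, the XML string is abbreviated sample data shaped like the worked example, and `toLongForm` and `pivot` are hypothetical helper names.

```javascript
// Toy re-implementation of the parse -> expand -> pivot idiom on a single record.
// The XML is abbreviated sample data; a real EventData payload carries more fields.
const eventData =
  '<DataItem><EventData>' +
  '<Data Name="Image">C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe</Data>' +
  '<Data Name="ParentImage">C:\\Program Files\\Microsoft Office\\root\\Office16\\winword.exe</Data>' +
  '<Data Name="User">CONTOSO\\jdoe</Data>' +
  '</EventData></DataItem>';

// Steps 1-3: pull each <Data Name="..."> element out as a long-form {Field, Value} row.
// A regex is a shortcut that is good enough for this flat sample, not a general XML parser.
function toLongForm(xml) {
  const rows = [];
  const re = /<Data Name="([^"]+)">([^<]*)<\/Data>/g;
  let m;
  while ((m = re.exec(xml)) !== null) {
    rows.push({ Field: m[1], Value: m[2] });
  }
  return rows;
}

// Step 4: rotate the long-form rows into one wide object, one property per field.
// take_any returns an arbitrary value per group; keeping the first is one valid choice.
function pivot(rows) {
  const wide = {};
  for (const { Field, Value } of rows) {
    if (!(Field in wide)) wide[Field] = Value;
  }
  return wide;
}

const longForm = toLongForm(eventData);
const wideRow = pivot(longForm);
console.log(longForm.length);                                   // 3
console.log(wideRow.User);                                      // CONTOSO\jdoe
console.log(wideRow.ParentImage.endsWith('winword.exe'));       // true
```

Once the row is wide, a detection reduces to plain property comparisons, which is exactly what the `where` clause in the KQL version does.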

<Spoiler summary="Show the working Sentinel scheduled-rule KQL">
```kql
Event
| where TimeGenerated > ago(5m)
| where Source == "Microsoft-Windows-Sysmon" and EventID == 1
| extend ev = parse_xml(EventData).DataItem.EventData.Data
| mv-expand ev
| extend Field = tostring(ev["@Name"]), Value = tostring(ev["#text"])
| evaluate pivot(Field, take_any(Value), TimeGenerated, Computer)
| where ParentImage endswith "winword.exe"
  and Image endswith "powershell.exe"
  and CommandLine contains "-EncodedCommand"
| project
    TimeGenerated, Computer, User,
    ParentImage, ParentProcessGuid,
    Image, ProcessGuid, CommandLine, Hashes
| extend
    HostCustomEntity = Computer,
    AccountCustomEntity = User,
    ProcessCustomEntity = ProcessGuid
```
</Spoiler>

The three-line `where` clause after the `pivot` is the actual detection: an Office process spawning PowerShell with `-EncodedCommand`. The three `*CustomEntity` columns at the bottom are what wire this alert into the Defender XDR correlation engine at hop 6 -- they become typed entities on the resulting `SecurityAlert` row [@ms-learn-sentinel-entities].

> **Note:** **Why the row of CustomEntity columns matters.** A Sentinel analytics rule that produces a `SecurityAlert` without entity mappings will still alert -- and will still be readable by an analyst -- but it will *not* participate in cross-pipeline correlation at hop 6. The XDR fan-in matches on entity values, and an alert with no entities has nothing to match on. This is a common oversight when migrating older queries into Sentinel from on-prem SIEMs that did not have an equivalent concept.

Hop 3 finishes at about 14:03:21 UTC: four seconds after kernel emission, with the row written to the workspace's `Event` table and indexed for KQL query.

### 6.4 Hop 4 -- Sentinel analytics rule emits a SecurityAlert

Microsoft Sentinel supports several detection-rule shapes. The five that matter for understanding the Sysmon pipeline are summarized below, with the timing characteristics that drive end-to-end latency for hop 4.

<Definition term="Sentinel scheduled analytics rule">
A KQL query that Sentinel runs on a fixed schedule (as often as every 5 minutes, which is the cadence used in this article). When the query returns rows, each row -- subject to grouping configuration -- becomes a `SecurityAlert` row in the workspace and an alert object in Sentinel and in Defender XDR [@ms-learn-sentinel-scheduled-rules].
</Definition>

<Definition term="Entity mapping">
The Sentinel-rule configuration that names which output columns of the KQL detection map to which typed entities (Account, Host, Process, IP, URL, FileHash, etc.). Without entity mappings, an alert is "orphan" with respect to the Defender XDR correlation engine [@ms-learn-sentinel-entities].
</Definition>

The five rule shapes and where they fire in the Sysmon path:

| Rule type | Query cadence | Typical end-to-end latency | Sysmon use |
|---|---|---|---|
| Scheduled analytics [@ms-learn-sentinel-scheduled-rules] | Every 5+ min | 5-8 min from ingest | The default for ProcessCreate detections |
| Near-real-time (NRT) [@ms-learn-sentinel-nrt-rules] | Every 1 min | 1-2 min from ingest | High-priority single-event matches |
| Microsoft security (parent-product) | Tied to source product | Sub-minute | Pass-through for MDE / MDC / MDCA alerts |
| Fusion (multistage) [@ms-learn-sentinel-fusion] | ML-driven, continuous | Hours | Cross-source attack-pattern detection |
| Threat-intelligence map [@ms-learn-sentinel-threat-detection] | Continuous | Sub-minute | IOC matching on `Event`-derived hashes |

For the worked example, the detection runs as a **scheduled analytics rule** at five-minute cadence. The rule fires at 14:05:00 UTC, the query returns one row matching `winword.exe -> powershell.exe -EncodedCommand`, and a `SecurityAlert` is emitted at 14:05:04 UTC. The alert carries the `HostCustomEntity`, `AccountCustomEntity`, and `ProcessCustomEntity` mappings that the rule defined.
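The gap between ingestion and rule execution is simple arithmetic, and it is where most of hop 4's latency comes from. The sketch below computes the next run for a row that became queryable at 14:03:21 UTC. It assumes the rule fires on wall-clock multiples of its cadence, which matches the worked example but simplifies real scheduling; `nextRuleRun` is a hypothetical helper.

```javascript
// Assumption: the rule fires on wall-clock boundaries of its cadence
// (e.g. :00, :05, :10 for a five-minute rule). Real schedules may be offset.
function nextRuleRun(ingestedAtIso, cadenceMinutes) {
  const cadenceMs = cadenceMinutes * 60 * 1000;
  const t = Date.parse(ingestedAtIso);
  // Round up to the next boundary; a row landing exactly on a boundary
  // is treated as visible to that same run.
  return new Date(Math.ceil(t / cadenceMs) * cadenceMs).toISOString();
}

const ingested = '2026-06-02T14:03:21Z';
console.log(nextRuleRun(ingested, 5)); // 2026-06-02T14:05:00.000Z
console.log(nextRuleRun(ingested, 1)); // 2026-06-02T14:04:00.000Z
```

The one-minute case shows why an NRT rule shortens the wait: the same row is picked up a minute earlier, before any query execution time is added.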

<RunnableCode language="javascript" caption="Sketch of the entity-to-alert join the Defender XDR correlator does conceptually. Run this in any JS console to see the merge logic in miniature.">
{`// Three alerts arriving from three pipelines, each with entities.
const sentinelAlert = {
  source: 'Sentinel',
  time: '14:05:04Z',
  entities: { Host: 'MAL-CONTOSO-PRD-04',
              Process: '{62b9c5cf-7c64-67ab-2e00-000000003200}' }
};
const mdcAlert = {
  source: 'MDC for Servers (via MDE)',
  time: '14:07:42Z',
  entities: { Host: 'MAL-CONTOSO-PRD-04',
              File: 'powershell.exe' }
};
const mdeAlert = {
  source: 'MDE native',
  time: '14:08:11Z',
  entities: { Host: 'MAL-CONTOSO-PRD-04',
              Process: '{62b9c5cf-7c64-67ab-2e00-000000003200}' }
};
// windowMin is accepted for illustration only; this toy groups by Host and ignores time.
function correlate(alerts, windowMin = 30) {
  const byHost = new Map();
  for (const a of alerts) {
    const k = a.entities.Host;
    if (!byHost.has(k)) byHost.set(k, []);
    byHost.get(k).push(a);
  }
  return [...byHost.entries()].map(([host, alts]) => ({
    incidentKey: 'host:' + host,
    alerts: alts.map(a => a.source)
  }));
}
console.log(correlate([sentinelAlert, mdcAlert, mdeAlert]));
// -> [{ incidentKey: 'host:MAL-CONTOSO-PRD-04',
//        alerts: ['Sentinel','MDC for Servers (via MDE)','MDE native'] }]
`}
</RunnableCode>

The toy correlator above only keys on `Host`. The real one also keys on `Process` (ProcessGuid where present), `Account`, `IP`, `URL`, and `FileHash`, and uses a sliding window plus a confidence-weighted merge that allows weak entities (file name) to participate when strong entities (ProcessGuid) overlap [@ms-learn-xdr-correlation]. The result is the same: three alerts in, one incident out.

<Sidenote id="S4">Two other Sentinel detection paths deserve a mention even though they did not fire for this specific worked example. **UEBA anomalies** -- when enabled, Sentinel writes per-user and per-host baselines into `BehaviorAnalytics` and `IdentityInfo` tables; analytics rules can `join` these to flag a normally-quiet jdoe spawning encoded PowerShell as anomalous independent of any specific signature [@ms-learn-sentinel-threat-detection]. **Fusion** is an ML-driven multistage detector that operates over the broader alert + event corpus and emits Fusion-named incidents when it sees a chain that resembles an attack pattern (e.g., a phishing alert followed by a credential-access alert followed by a process-spawn anomaly within an hour on the same identity) [@ms-learn-sentinel-fusion]. Fusion's strength is correlation across products you would not have thought to correlate manually; its weakness is opacity, which §9 returns to.</Sidenote>

There is one further detection family worth introducing here because §10's recipe will explicitly avoid it: **Defender XDR Custom Detections**. These are KQL queries authored not in Sentinel but in the unified portal's advanced hunting surface, and they emit alerts directly into Defender XDR rather than via the SIEM analytics-rule pipeline [@ms-learn-sentinel-custom-detections]. Custom detections can read `DeviceProcessEvents` and the rest of the Defender advanced hunting schema, which is fed by the MDE sensor independent of Sysmon. For the worked example, a Custom Detection equivalent to the Sentinel scheduled rule would also have fired -- but it would have fired against MDE's `DeviceProcessEvents` table, not against Log Analytics `Event`. The two paths are not interchangeable. Microsoft's documentation is explicit that custom detections operate over the Defender XDR-internal advanced hunting schema, not over arbitrary Log Analytics tables [@ms-learn-sentinel-custom-detections] [@ms-learn-advanced-hunting].

<PullQuote attribution="Microsoft Defender XDR documentation -- 'Manage custom detection rules'">
"Custom detection rules are rules you can design and tweak using advanced hunting queries. These rules let you proactively monitor various events and system states, including suspected breach activity and misconfigured endpoints." [@ms-learn-sentinel-custom-detections]
</PullQuote>

That is the policy line that decides where to put a new rule: if your query reads from `DeviceProcessEvents` (MDE feed), it belongs as an advanced-hunting custom detection inside Defender XDR; if your query reads from Sentinel `Event` or `SecurityEvent` (Log Analytics feed), it belongs as a Sentinel analytics rule. The recipe in §10 picks the Sentinel side because the worked example begins in Sysmon, not in MDE -- and Sysmon flows to Log Analytics, not to the MDE advanced-hunting schema.
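That placement rule is mechanical enough to write down. The sketch below routes a detection by the tables its query reads. The table sets contain only the names mentioned in this article, not the full schemas, and `pickEngine` and its return strings are hypothetical.

```javascript
// Hypothetical helper: choose a detection engine from the tables a query reads.
// Only table names that appear in this article are listed here.
const ADVANCED_HUNTING_TABLES = new Set(['DeviceProcessEvents']);
const LOG_ANALYTICS_TABLES = new Set(['Event', 'SecurityEvent', 'WindowsEvent']);

function pickEngine(tablesRead) {
  const usesHunting = tablesRead.some(t => ADVANCED_HUNTING_TABLES.has(t));
  const usesWorkspace = tablesRead.some(t => LOG_ANALYTICS_TABLES.has(t));
  // A query spanning both schemas does not fit the simple rule; flag it for a human.
  if (usesHunting && usesWorkspace) return 'mixed schemas: review manually';
  if (usesHunting) return 'Defender XDR custom detection';
  if (usesWorkspace) return 'Sentinel analytics rule';
  return 'unknown table: check which schema it belongs to';
}

console.log(pickEngine(['Event']));               // Sentinel analytics rule
console.log(pickEngine(['DeviceProcessEvents'])); // Defender XDR custom detection
```

For the Sysmon worked example the input is `['Event']`, so the rule lands on the Sentinel side.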

### 6.5 Hop 5 -- Microsoft Defender for Cloud as the CWPP alert source

This hop is the most architecturally interesting and the most operationally misunderstood. It is also where the previous iteration of this article had to be corrected on its single most load-bearing detail, so the framing here is deliberate.

> **Key idea:** **Only Microsoft Defender for Cloud's CWPP alerts flow into Defender XDR -- not its CSPM posture findings.** A Secure Score recommendation that "VMs should have endpoint protection installed" or "Storage accounts should restrict public access" is a *posture finding*. A "Suspicious PowerShell command line detected on MAL-CONTOSO-PRD-04" emitted by the Defender for Servers runtime plan is an *alert*. Defender XDR ingests the alerts; the posture findings stay in the MDC blade [@ms-learn-mdc-xdr-concept] [@ms-learn-mdc-xdr-ingest].

The vocabulary first, because everything in this section depends on it.

<Definition term="CSPM (Cloud Security Posture Management)">
Continuous assessment of cloud-resource configuration against a baseline of best practices (Microsoft cloud security benchmark, CIS, NIST 800-53, etc.). Output is *recommendations* and a *Secure Score*. CSPM does not see runtime telemetry. In Microsoft's stack, foundational CSPM is the base layer of Microsoft Defender for Cloud and is free to enable; the more advanced Defender CSPM plan is paid [@ms-learn-mdc-introduction] [@ms-learn-secure-score].
</Definition>

<Definition term="CWPP (Cloud Workload Protection Platform)">
Runtime detection on a deployed cloud workload -- a VM, a container, a SQL database, a storage account, an App Service. CWPP sees actual events (process spawns, network connections, control-plane API calls) and emits *alerts*. In MDC, CWPP is delivered as paid plans: Defender for Servers, Containers, SQL, Storage, App Service [@ms-learn-mdc-introduction] [@ms-learn-mdc-cwpp-features].
</Definition>

<Definition term="MCSB (Microsoft cloud security benchmark)">
The default CSPM control framework that ships with Microsoft Defender for Cloud. MCSB is Microsoft's interpretation of CIS, NIST 800-53, and PCI DSS controls mapped to Azure, AWS, and GCP resource types. Recommendations are scored against MCSB by default; other frameworks can be added [@ms-learn-mcsb-overview].
</Definition>

The CSPM-versus-CWPP distinction has direct operational consequences for what shows up at hop 6:

| What MDC emits | Where it lives | Flows to Defender XDR? |
|---|---|---|
| **Recommendation** (CSPM) -- e.g., "Endpoint protection should be installed" | Recommendations blade in MDC + `SecurityRecommendation` table | **No** [@ms-learn-mdc-xdr-concept] |
| **Secure Score** (CSPM) -- aggregate over recommendations | Secure Score blade in MDC | **No** [@ms-learn-secure-score] |
| **Compliance assessment** (CSPM) -- per-framework rollup | Regulatory compliance blade | **No** |
| **Alert** (CWPP) -- e.g., "Suspicious PowerShell command line" | Alerts blade in MDC + `SecurityAlert` table | **Yes** [@ms-learn-mdc-xdr-ingest] |
| **Container runtime alert** -- e.g., "Web shell detected in pod" | MDC Alerts + `SecurityAlert` | **Yes** [@ms-learn-mdc-containers] |
| **Storage runtime alert** -- e.g., "Anomalous access from Tor IP" | MDC Alerts + `SecurityAlert` | **Yes** [@ms-learn-mdc-storage] |

The CWPP alerts come from MDC's five priced runtime plans. Each plan has its own data path, but they all converge on the same `SecurityAlert` table in Log Analytics and on the same XDR ingestion path:

| MDC plan | Workload | Data source | Reference |
|---|---|---|---|
| Defender for Servers | Windows / Linux VMs, Arc | MDE sensor + agent telemetry | [@ms-learn-mdc-defender-servers] [@ms-learn-mdc-mde-integration] |
| Defender for Containers | AKS, EKS, GKE pods | runtime sensor + Kubernetes audit | [@ms-learn-mdc-containers] |
| Defender for SQL | Azure SQL, Arc SQL | Azure SQL Advanced Threat Protection signals | [@ms-learn-mdc-sql] [@ms-learn-azuresql-atp] |
| Defender for Storage | Storage accounts | Control plane + blob access patterns | [@ms-learn-mdc-storage] |
| Defender for App Service | App Service apps | Process + network signal from the worker | [@ms-learn-mdc-appservice] |

For the worked example, the relevant plan is Defender for Servers. Because MDE is installed on the host (Defender for Servers Plan 2 includes the MDE license), the MDE sensor's runtime telemetry feeds into MDC's detection engine and emits the `Suspicious PowerShell command line` MDC alert at 14:07:42 UTC [@ms-learn-mdc-mde-integration] [@ms-learn-mde-onboard-windows]. That alert flows to Defender XDR via the MDC-to-XDR alert-ingestion integration that reached general availability on **March 13, 2024** [@ms-learn-mdc-xdr-ingest] [@ms-learn-mdc-xdr-concept].

> **Note:** **Do not assume MDC posture findings will appear in your Defender XDR incident.** The MDC-to-XDR integration ingests **alerts only**, not recommendations and not Secure Score deltas. If a SOC analyst wants posture context on an incident-affected host (e.g., "was this host's endpoint protection missing per Secure Score?"), they must pivot to the MDC blade or join `SecurityRecommendation` from KQL. There is no automatic incident-side enrichment for posture findings as of the documented integration scope [@ms-learn-mdc-xdr-concept] [@ms-learn-mdc-xdr-ingest].
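The alerts-only rule can be expressed as a small lookup. The sketch below partitions a mixed list of MDC outputs into what is forwarded to Defender XDR and what stays in the MDC blade, mirroring the table above. The `kind` labels and `partitionForXdr` are hypothetical names chosen for illustration, not MDC API values.

```javascript
// Lookup mirroring the CSPM-versus-CWPP table: only CWPP alerts are forwarded.
// The keys are illustrative labels, not identifiers from any MDC API.
const MDC_OUTPUTS = {
  recommendation:       { layer: 'CSPM', flowsToXdr: false },
  secureScore:          { layer: 'CSPM', flowsToXdr: false },
  complianceAssessment: { layer: 'CSPM', flowsToXdr: false },
  alert:                { layer: 'CWPP', flowsToXdr: true }
};

function partitionForXdr(items) {
  const forwarded = [];
  const staysInMdc = [];
  for (const item of items) {
    const rule = MDC_OUTPUTS[item.kind];
    // Unknown kinds are conservatively treated as not forwarded.
    (rule && rule.flowsToXdr ? forwarded : staysInMdc).push(item.title);
  }
  return { forwarded, staysInMdc };
}

const sample = partitionForXdr([
  { kind: 'alert', title: 'Suspicious PowerShell command line' },
  { kind: 'recommendation', title: 'Endpoint protection should be installed' },
  { kind: 'secureScore', title: 'Secure Score delta' }
]);
console.log(sample.forwarded);  // [ 'Suspicious PowerShell command line' ]
console.log(sample.staysInMdc); // [ 'Endpoint protection should be installed', 'Secure Score delta' ]
```

Anything in `staysInMdc` is context the analyst has to fetch deliberately, either from the MDC blade or with a KQL join.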

The CSPM/CWPP separation also explains the multi-cloud story. MDC's CSPM scope spans Azure, AWS, and GCP via cloud connectors -- you can onboard an AWS account through the AWS connector and see your S3 buckets reflected in the Secure Score [@ms-learn-mdc-onboard-aws]. The CWPP plans for non-Azure clouds are narrower: Defender for Servers works on AWS EC2 and on-prem via Azure Arc, Defender for Containers works on EKS and GKE, but several plans (Storage, App Service) are Azure-only. The result is a posture surface that is genuinely multi-cloud and a runtime surface that is mostly Azure-plus-Arc -- and the runtime surface is the layer that actually flows to XDR at hop 6 [@ms-learn-mdc-introduction].

### 6.6 Hop 6 -- The Defender XDR correlation engine and the fan-in

The last hop is the merge. The Defender XDR correlation engine reads incoming alerts from all source pipelines, normalizes the entity values they carry, and groups alerts whose entities overlap within a sliding time window into a single incident. The grouping is asymmetric: a higher-confidence alert (e.g., an MDE process-tree alert with a strong `ProcessGuid`) can pull in lower-confidence alerts (e.g., a Sentinel rule whose only entity is `Host`), but not vice-versa [@ms-learn-xdr-correlation].

<Definition term="Correlation engine (Defender XDR)">
The server-side service that reads alerts from connected sources, computes entity overlap and temporal proximity, and merges related alerts into incidents. The engine is not user-configurable in detail; merge thresholds, time windows, and entity-priority rules are Microsoft-managed defaults [@ms-learn-xdr-correlation] [@ms-learn-defender-xdr-incidents].
</Definition>

The geometry of the fan-in for the worked example is the mirror image of the fan-out in §5. The same three alerts that arrived at three different timestamps now converge on a single incident object I-7842:

<Mermaid caption="Hop 6 fan-in. Three alerts arrive at the Defender XDR correlation engine from three pipelines, share a Host entity (and ProcessGuid where available), and merge into one incident.">
&#123;`sequenceDiagram
    autonumber
    participant SEN as Sentinel SecurityAlert
    participant MDC as MDC SecurityAlert
    participant MDE as MDE DeviceAlertEvents
    participant COR as Defender XDR correlator
    participant INC as Incident I-7842
    SEN->>COR: Host MAL-... ProcessGuid abc at 14:05:04
    MDC->>COR: Host MAL-... File powershell.exe at 14:07:42
    MDE->>COR: Host MAL-... ProcessGuid abc at 14:08:11
    Note over COR: match window ≤ 30 min
    COR->>INC: open incident, attach Sentinel alert
    COR->>INC: merge: MDE matches on ProcessGuid
    COR->>INC: merge: MDC matches on Host within window
    INC-->>SEN: backlink to source alert
    INC-->>MDC: backlink to source alert
    INC-->>MDE: backlink to source alert`&#125;
</Mermaid>

Three things deserve explicit attention in this fan-in:

1. **The strong-entity priority.** The MDE alert and the Sentinel alert share `ProcessGuid`. Microsoft documents that field as a unique value designed to make event correlation easier across hosts and domains [@ms-learn-sysmon]. The merge between them is unambiguous. The MDC-from-Servers alert only carries `Host` and `File` -- the MDC plan's alert grammar does not necessarily emit `ProcessGuid` even though the underlying MDE sensor knows it. The MDC alert merges into the incident on the weaker `Host` match within the time window.

2. **The Microsoft-managed thresholds.** The correlation window, the entity-priority rules, and the merge logic are not exposed for customer tuning. They are documented at the policy level -- "alerts that share entities within a time window" -- but the exact heuristics are part of the Defender XDR service [@ms-learn-xdr-correlation]. §9 returns to this opacity as an open problem.

3. **What does NOT merge.** Some categories of source data stay outside the incident graph even when they ought to: cross-workspace Sentinel rules (alerts in a workspace other than the Defender-XDR-connected "primary" one), third-party connector alerts that lack entity mappings, and -- as already underlined -- MDC posture findings of every kind [@ms-learn-mdc-xdr-concept].
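The three points above can be combined into a slightly richer toy than a Host-only grouping. The sketch below is illustrative only: the real merge heuristics are Microsoft-managed and not public. It assumes `ProcessGuid` is a strong entity, `Host` a weak one, and a 30-minute window, and it adds a hypothetical fourth alert ("Unmapped rule") with no entities to show what fails to merge.

```javascript
// Illustrative only: not the Defender XDR algorithm.
// Assumptions: ProcessGuid is strong, Host is weak, the window is 30 minutes.
const WINDOW_MS = 30 * 60 * 1000;

function correlate(alerts) {
  const incidents = [];
  const orphans = [];
  const byTime = [...alerts].sort((a, b) => Date.parse(a.time) - Date.parse(b.time));
  for (const alert of byTime) {
    const { Host, Process } = alert.entities || {};
    if (!Host && !Process) {            // no entity mapping: nothing to match on
      orphans.push(alert.source);
      continue;
    }
    const t = Date.parse(alert.time);
    const inWindow = incidents.filter(inc => t - inc.lastSeen <= WINDOW_MS);
    // Try the strong entity first, then fall back to the weak one.
    let incident = Process ? inWindow.find(inc => inc.processes.has(Process)) : undefined;
    let matchedOn = incident ? 'ProcessGuid' : null;
    if (!incident && Host) {
      incident = inWindow.find(inc => inc.hosts.has(Host));
      if (incident) matchedOn = 'Host';
    }
    if (!incident) {
      incident = { hosts: new Set(), processes: new Set(), alerts: [], lastSeen: t };
      incidents.push(incident);
      matchedOn = 'opened incident';
    }
    if (Host) incident.hosts.add(Host);
    if (Process) incident.processes.add(Process);
    incident.lastSeen = t;
    incident.alerts.push({ source: alert.source, matchedOn });
  }
  return { incidents: incidents.map(inc => inc.alerts), orphans };
}

const H = 'MAL-CONTOSO-PRD-04';
const P = '{62b9c5cf-7c64-67ab-2e00-000000003200}';
const result = correlate([
  { source: 'Sentinel', time: '2026-06-02T14:05:04Z', entities: { Host: H, Process: P } },
  { source: 'MDC', time: '2026-06-02T14:07:42Z', entities: { Host: H, File: 'powershell.exe' } },
  { source: 'MDE', time: '2026-06-02T14:08:11Z', entities: { Host: H, Process: P } },
  { source: 'Unmapped rule', time: '2026-06-02T14:06:00Z', entities: {} } // hypothetical
]);
console.log(JSON.stringify(result, null, 2));
```

With this input the Sentinel alert opens the incident, the MDC alert joins on `Host`, the MDE alert joins on `ProcessGuid`, and the unmapped alert is left outside the incident.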

<Sidenote id="S5">The "primary workspace" constraint matters for multi-workspace customers. A Defender XDR tenant connects to exactly one Sentinel primary workspace for the unified secops experience. Sentinel alerts from secondary workspaces still exist as alerts, can still trigger automation rules, and are still queryable via cross-workspace KQL -- but they do not appear in the unified incident graph at security.microsoft.com [@ms-learn-unified-secops] [@ms-learn-move-to-defender]. Customers with regional workspace topologies (e.g., one per Azure region for data-residency reasons) need to plan which workspace is the XDR-connected one.</Sidenote>

For the worked example, hop 6 completes at 14:09:30 UTC: the SOC analyst sees a single incident in their queue, titled `Multi-stage incident on one endpoint`, with three correlated alerts on its alerts tab, a unified entity graph showing the host, the user, the parent and child processes, the file hash, and the URL embedded in the encoded command line, and one-click pivots to the MDE timeline, the Sentinel investigation graph, and the MDC alert detail. Three pipelines, one analyst surface, six minutes thirteen seconds from the 14:03:17 kernel emission to the incident.

That is the full path. The next three sections compare it to what other vendors do, name the theoretical limits any such pipeline has to live with, and walk the open problems that even the best-tuned version of this pipeline still faces.

## 7. Competing approaches: inside and outside the Microsoft fence

The architecture in §6 is one answer to "how do I turn endpoint telemetry into a SOC incident." It is not the only answer. Other detection engines exist both inside Microsoft and outside, with materially different design choices that are useful to compare side-by-side.

Inside Microsoft, six detection engines can observe roughly the same activity on the same estate, though not always from the same tables -- and an architect picking where to put a new detection has to know what each one optimizes for.

| Engine | Where the query runs | Latency | Best fit |
|---|---|---|---|
| Sentinel scheduled rule | Log Analytics KQL, every 5+ min | 5-8 min | Cross-source SIEM detections, free-form KQL [@ms-learn-sentinel-scheduled-rules] |
| Sentinel NRT rule | Log Analytics KQL, every 1 min | 1-2 min | High-priority single-row detections [@ms-learn-sentinel-nrt-rules] |
| Sentinel Fusion | ML, multi-source | Hours | Multistage attack patterns, low-signal corroboration [@ms-learn-sentinel-fusion] |
| Defender XDR custom detection | Advanced hunting KQL, periodic | 5-30 min | Detections over `DeviceProcessEvents` / MDE schema [@ms-learn-sentinel-custom-detections] |
| MDE built-in detections | In-product behavioural | Seconds-to-minutes | Endpoint-local process / file / network signatures [@ms-learn-mde-landing] |
| MDC plan built-in detections | Per-plan engines | Seconds-to-minutes | Per-workload runtime detection (containers, SQL, storage) [@ms-learn-mdc-introduction] |

The takeaway is that **Sentinel and Defender XDR custom detections are not interchangeable**. They read from different schemas (Log Analytics tables vs MDE advanced-hunting tables), they have different governance models (Azure RBAC vs Defender role-based access), and they emit alerts via different paths. The right engine depends on where your telemetry lives. For the worked example, Sysmon in `Event` is reached by Sentinel, not by Custom Detections; MDE's `DeviceProcessEvents` for the same host is reached by Custom Detections, not by Sentinel scheduled rules.

Outside Microsoft, the six widely-deployed alternative stacks each make different trade-offs:

| Stack | Storage | Query language | Strength | Cost shape |
|---|---|---|---|---|
| Splunk Enterprise Security | Splunk indexers | SPL | Long-installed, deep app catalog, mature SOAR | License-tier (GB/day) or workload-based |
| Splunk Cloud + ES | Splunk-managed cloud | SPL | Same SPL, SaaS-managed | Per-ingest workload-priced |
| Elastic Security | Elasticsearch | EQL + ES\|QL | Open-source community, full-text strength | Per-node infra + paid features |
| Google SecOps (Chronicle) | Google-internal columnar | YARA-L 2 + UDM | Petabyte-scale retention, fixed per-employee pricing | Per-employee (no per-GB) |
| AWS Security Lake + Athena | S3 + OCSF | Athena SQL | Open-schema, bring-your-own-detection | Per-ingest + per-query |
| Sigma + open-source SIEM | Depends on the target SIEM | Sigma YAML (vendor-neutral, translated to the SIEM's native language) | Portable detection rules | Free format; SIEM cost varies |

<Sidenote id="S6">**Sigma** deserves a special mention because it is a *rule format*, not a SIEM. Sigma rules describe detections in a vendor-neutral YAML schema and are translated by a converter (`sigma-cli`, built on pySigma, which replaced the older `sigmac`) into the target SIEM's native query language -- KQL for Sentinel, SPL for Splunk, ES|QL for Elastic, YARA-L for Google SecOps [@sigmahq-sigma]. The result is that a single Sigma rule for "Office process spawns PowerShell with encoded command" can be deployed across multiple SIEMs without rewriting. The trade-off is that Sigma compiles to the lowest common denominator of expressiveness; complex multi-table joins do not translate cleanly. Microsoft Sentinel supports Sigma rule import via the analytics-rule wizard [@sigmahq-sigma].</Sidenote>

The structural difference that matters most across these stacks is **where the storage and query engine live**. Splunk on-prem owns its full stack and bills on ingest. Elastic gives you the stack and lets you self-host or buy SaaS. Google SecOps removes the per-GB axis entirely and bills per employee, betting that the value of the SOC is the analyst's time, not the byte count. AWS Security Lake decomposes further than Microsoft does, exposing S3 directly so you can bring any analytics engine. Microsoft's design point -- KQL over Log Analytics with grafted XDR correlation -- sits in the middle: more managed than AWS, more opinionated than Elastic, billed per-GB like Splunk but with separable axes.

There is also a **migration option** worth knowing about. Microsoft introduced a Sentinel SIEM migration experience in 2024 that uses generative AI to translate detection rules from Splunk SPL to KQL [@ms-learn-sentinel-siem-migration]. The tool is not a complete replacement for human review of every translated rule, but it materially shortens the migration spike that has historically blocked SOCs from switching SIEMs. The existence of such a tool is itself evidence that the SIEM market is becoming more substitutable than it once was -- a SOC's investment in detection logic is no longer locked to one vendor's query language.

For the worked example specifically, every one of the alternative stacks could in principle deliver the same end result -- one incident for a parent-child process-spawn detection. The differences are in the operating model: who owns the storage, who owns the agent, how the ingest is priced, and how easily the analyst can pivot from the incident into raw telemetry. Microsoft's pitch with the unified secops platform is that "all of the above are in one portal." The honest reading is "the Microsoft-side ones are in one portal, and the third-party feeds you stream into Sentinel still participate via the same `SecurityAlert` table."

## 8. Theoretical limits

The six-hop pipeline is mostly an engineering object. But it inherits a few genuinely theoretical limits that no amount of clever product design can defeat. Naming them precisely is the difference between an architect who knows what the system cannot do and a buyer who is surprised.

<Definition term="Entity resolution">
The general problem of deciding when two records in different data sources refer to the same real-world entity. In the SIEM context, the entities are users, hosts, files, processes, IPs, URLs, and email recipients. Strong identifiers (a hardware-rooted DeviceId, a Microsoft Entra ObjectId, a SHA256 hash) make the problem tractable; weak identifiers (an account name, an IP address, a file name) make it probabilistic [@ms-learn-sentinel-entities].
</Definition>

The first hard limit is that **entity resolution across pipelines is structurally probabilistic** whenever the strong identifiers are missing. The Defender XDR correlator depends on entity overlap; the worked example merged cleanly because `ProcessGuid` was shared between MDE and Sentinel. Take that identifier away and the merge falls back on `Host`, which is shared but ambiguous (hostnames are reused, machine accounts get recycled), and ultimately on weaker identifiers like file name or command-line substring. The table below names which identifiers each source pipeline can be relied upon to carry.

| Entity type | Strong identifier (when available) | Weak fallback | Pipelines that emit the strong form |
|---|---|---|---|
| Host | DeviceId (MDE GUID), Azure resourceId | Hostname, FQDN | MDE, MDC for Servers, Sentinel (if mapped) |
| Process | ProcessGuid (Sysmon/MDE) | Image path + start time | Sysmon, MDE, advanced hunting |
| Account | Microsoft Entra ObjectId | UPN, samAccountName | Microsoft Entra ID logs, MDI |
| File | SHA256 | Filename, MD5 | MDE, Sentinel rules that include hash |
| IP | n/a (probabilistic by definition) | IP literal | All |
| URL | Normalized URL with scheme | Bare host | MDE, Defender for Office, threat-intel feeds |

> **Note:** **Aha #3 -- entity resolution is information-theoretic, not engineering.** Two records can be matched to the same entity only if their identifiers carry enough joint information to pick that entity out of the space of all entities. When the entity space is small (a few thousand hosts) and the identifier is strong (a DeviceId), the match is determined. When the entity space is large (every IP on the public internet) and the identifier is weak (the bare IP), the match is probabilistic and false positives accumulate. No correlation engine, however clever, can manufacture information that the source pipeline did not record. The architectural lesson is to *invest in strong identifiers upstream* -- in agents, in DCR schemas, in alert grammars -- not to lean on correlator cleverness downstream.
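
The table can be read as a lookup: given two alerts, the strongest identifier they share decides how much the merge can be trusted. The sketch below is a minimal illustration of that idea, assuming a hypothetical alert shape (plain dicts) and a strong/weak split lifted from the table. It does not mirror the Defender XDR correlator, whose heuristics are not public.

```python
from typing import Optional, Tuple

# Illustrative only: the strong/weak split follows the table above; the
# alert dict shape and field names are hypothetical, not a product schema.
STRONG = ("DeviceId", "ProcessGuid", "EntraObjectId", "SHA256")
WEAK = ("Hostname", "AccountName", "FileName", "IP")


def match_strength(a: dict, b: dict) -> Tuple[str, Optional[str]]:
    """Return (tier, identifier) for the best identifier both alerts share."""
    for tier, fields in (("strong", STRONG), ("weak", WEAK)):
        for field in fields:
            if a.get(field) is not None and a.get(field) == b.get(field):
                return tier, field
    return "none", None


sentinel_alert = {"Hostname": "MAL-CONTOSO-PRD-04", "ProcessGuid": "{guid-1}"}
mde_alert = {"Hostname": "MAL-CONTOSO-PRD-04", "ProcessGuid": "{guid-1}",
             "DeviceId": "dev-42"}
firewall_alert = {"Hostname": "MAL-CONTOSO-PRD-04", "IP": "10.0.0.7"}

print(match_strength(sentinel_alert, mde_alert))       # shares ProcessGuid
print(match_strength(sentinel_alert, firewall_alert))  # shares only Hostname
```

The first pair matches on a strong identifier; the second falls through to `Hostname`, which is exactly the case where a reused hostname would produce a false merge.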

The second hard limit is **normalization lossiness**. ASIM (Advanced Security Information Model), Microsoft's effort to normalize Sentinel data into common schemas like `_Im_ProcessCreate`, makes cross-source queries dramatically easier -- but the normalization is lossy. Fields that exist only in Sysmon (such as the Sysmon-specific `IntegrityLevel` value, or the `OriginalFileName` read from the PE version resource) get dropped on the way into the normalized schema [@ms-learn-sentinel-asim-normalization]. The trade-off is honest and inescapable: a normalized schema is a projection from a richer per-source schema, and projections lose data by construction.

We can sketch this formally. If $$S$$ is the per-source schema (a set of fields), $$N$$ is the normalized schema, and $$\pi: S \to N$$ is the projection (the ASIM mapping), then the information loss on a single record $$r$$ is

$$
L(r) = H(r) - H(\pi(r))
$$

where $$H$$ is the Shannon entropy in bits, taken over the distribution of records the source emits. Because $$\pi$$ is a deterministic function, $$H(\pi(r)) \le H(r)$$, so $$L(r) \ge 0$$. For a Sysmon ProcessCreate row, $$H(r)$$ is bounded above by the sum of the per-field entropies over a schema of roughly thirty fields (call it ~150-200 bits of effective entropy once correlated fields are accounted for, as a rough estimate rather than a measurement); $$H(\pi(r))$$ is around half that after mapping into the much smaller normalized `_Im_ProcessCreate` schema. The dropped bits are exactly the fields you cannot query in the normalized form. ASIM is good for cross-source detections that need only common fields; per-source detections that need the long tail of source-specific fields must query the raw source table directly.
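
A toy calculation makes the projection loss concrete. The sketch below estimates empirical Shannon entropy over four made-up records, before and after dropping two source-specific fields. The field subset, the sample rows, and the `project` function are all hypothetical stand-ins; the numbers say nothing about real Sysmon data.

```python
import math
from collections import Counter


def empirical_entropy(rows):
    """Shannon entropy (bits) of the empirical distribution over whole rows."""
    counts = Counter(rows)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical records: (Image, IntegrityLevel, OriginalFileName)
source_rows = [
    ("powershell.exe", "Medium", "PowerShell.EXE"),
    ("powershell.exe", "High", "PowerShell.EXE"),
    ("ps.exe", "Medium", "PowerShell.EXE"),  # renamed binary
    ("cmd.exe", "Medium", "Cmd.Exe"),
]


def project(row):
    """Stand-in for the projection pi: keep only the common field (Image)."""
    image, _integrity, _original = row
    return (image,)


h_source = empirical_entropy(source_rows)
h_normalized = empirical_entropy([project(r) for r in source_rows])
loss = h_source - h_normalized
print(f"H(r)={h_source:.2f} bits, H(pi(r))={h_normalized:.2f} bits, L={loss:.2f} bits")
```

Four distinct source rows carry 2.00 bits; after projection two of them collapse into the same value, leaving 1.50 bits. The missing 0.50 bits is the distinction between a Medium- and High-integrity `powershell.exe`, which the projected rows can no longer express.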

The third limit is **temporal alignment**. Each pipeline has its own clock: Sysmon timestamps come from the host kernel, MDC alerts from the MDC service back-end, Sentinel `TimeGenerated` from the workspace ingestion. Within a single host these clocks are usually close (NTP-synced), but across hosts and across pipelines they can drift by seconds or minutes. The correlator's "within a time window" merge has to tolerate this drift, which means the window has to be larger than the worst-case clock skew. A larger window means more false-positive merges. There is no way out of this trade-off; only operational tuning between sensitivity and specificity.
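
The window trade-off can be shown with three timestamps. This is a minimal sketch, assuming a hypothetical 90-second worst-case skew and an arbitrary event time; `should_merge` is an illustrative helper, not the correlator's actual rule.

```python
from datetime import datetime, timedelta


def should_merge(t_a: datetime, t_b: datetime, window: timedelta) -> bool:
    """Merge two alerts when their timestamps fall within the window."""
    return abs(t_a - t_b) <= window


true_event = datetime(2026, 6, 2, 14, 3, 17)   # arbitrary illustrative time
skew = timedelta(seconds=90)                   # hypothetical worst-case skew
sysmon_ts = true_event                         # host clock
mdc_ts = true_event + skew                     # same event, skewed clock
unrelated_ts = true_event + timedelta(minutes=4)  # different event, same host

for window in (timedelta(seconds=60), timedelta(minutes=5)):
    same = should_merge(sysmon_ts, mdc_ts, window)
    other = should_merge(sysmon_ts, unrelated_ts, window)
    print(f"window={window}: same-event merged={same}, unrelated merged={other}")
```

With a 60-second window the two views of the same event fail to merge because the skew exceeds the window. With a 5-minute window they merge, and so does the unrelated alert four minutes later.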

The fourth limit is the **rule expressiveness ceiling**. KQL is expressive enough that most practical detections can be written in it if you are willing to write enough of it -- but Sentinel scheduled rules cap the lookback period (14 days at most), query duration, query result size, and join cardinality. Detections that conceptually want to scan a year of data and join against a separately-changing IOC list are *expressible* in KQL but *not runnable* under Sentinel rule limits. Custom ADX clusters or Spark-on-Synapse can run such queries, at the cost of leaving the unified portal entirely.

These are the limits any honest architecture has to live with. The Microsoft pipeline does well on the first (when strong identifiers exist), is honest about the second (ASIM is documented as a normalization, not a transparent overlay), tolerates the third (windowed merge), and surfaces the fourth as a Sentinel pricing-and-scope conversation. None of them is a Microsoft-specific defect. They are properties of the problem.

## 9. Open problems

The pipeline is fast enough, accurate enough, and -- in the worked example -- correct. It is not finished. Eight open problems remain, in roughly decreasing order of how much they hurt a working SOC today.

**1. The Sentinel-Azure-portal cutover is on a hard date.** Microsoft has announced the retirement of the Microsoft Sentinel experience in the Azure portal effective **March 31, 2027** (extended from the original July 1, 2026 target) [@ms-learn-sentinel-azure-portal-retiring] [@helpnetsec-sentinel-defender-timeline]. After that date, Sentinel can only be operated through the unified Defender portal at `security.microsoft.com`. The cutover affects analytics-rule authoring (the Azure-portal rule wizard goes away), automation rules, watchlists, and the investigation graph. Customers with custom dashboards, ARM templates, or automation that targets the Azure-portal Sentinel surface must port them. This is the most concrete migration deadline in this article.

**2. CWPP-to-XDR coverage is still expanding.** As of the documented integration scope, MDC for Servers, Containers, SQL, Storage, and App Service alerts flow to Defender XDR [@ms-learn-mdc-xdr-concept] [@ms-learn-mdc-xdr-ingest]. New CWPP plans (e.g., Defender for APIs as it matures) tend to land first in the MDC blade and only later in the unified incident graph. Customers operationalizing a new MDC plan should check the integration documentation for that specific plan rather than assuming XDR ingestion is automatic.

**3. Posture-finding context still lives in a separate blade.** As §6.5 established, MDC posture findings do not flow to Defender XDR. A SOC analyst looking at an incident on a host has no incident-side way to see "this host also has a CSPM finding for missing endpoint protection." The workaround is to `join` `SecurityRecommendation` against the incident-affected resources via KQL, or to pivot manually to the MDC blade. A first-class "posture context on incident" feature does not exist as of the documented surface area.

**4. The correlation engine's heuristics are not user-tunable.** The Defender XDR correlation engine merges alerts using a Microsoft-managed set of thresholds: time window, entity priority, confidence weighting [@ms-learn-xdr-correlation]. These are not exposed for customer override. A SOC that wants to widen the merge window (because their telemetry has long ingest tails) or tighten the entity-priority (because they distrust hostname matches for shared-name VMs) has no knob to turn. The correlation behaviour is whatever Microsoft ships; tuning happens by raising support cases against perceived false-merges or false-splits.

**5. Custom detection semantics are subtly different from Sentinel rule semantics.** A KQL detection authored as a Defender XDR Custom Detection runs over the advanced-hunting schema (`DeviceProcessEvents`, `DeviceFileEvents`, etc.), not over Log Analytics tables [@ms-learn-sentinel-custom-detections] [@ms-learn-advanced-hunting]. The two schemas overlap (you can write conceptually similar detections over both), but the field names, the freshness windows, and the result-size caps differ. An organization with parallel teams authoring detections in both surfaces can end up with two near-duplicate detections that drift apart over time. There is no first-class deduplication or "promote this Sentinel rule to a Custom Detection" workflow.

**6. Logic Apps write-back from Sentinel to MDC has rough edges.** Sentinel automation rules can invoke Logic Apps playbooks to take response actions [@ms-learn-sentinel-logic-apps-playbooks] [@ms-learn-sentinel-soar]. Writing back to MDC -- for example, suppressing an alert in MDC or creating a Defender for Cloud assessment programmatically -- is possible but requires the playbook to call the MDC REST API directly [@ms-learn-mdc-assessments-rest]. There is no native "MDC action" connector with the breadth of the MDE actions connector. Customers building bidirectional response automation between Sentinel and MDC end up writing HTTP-action playbooks by hand. <Sidenote id="S7">The MDC REST API for assessments lets you create and update assessment results programmatically, but the surface area for *writing back* to MDC (e.g., dismissing or recategorizing an alert) is smaller than the read API and is not symmetric with Sentinel's native alert-lifecycle actions [@ms-learn-mdc-assessments-rest] [@ms-learn-mdc-custom-recs]. Closing this gap with a first-class connector is on most enterprise customers' wish lists.</Sidenote>

**7. Multi-workspace and multi-tenant topologies remain awkward.** The unified secops experience connects Defender XDR to exactly one Sentinel primary workspace per tenant. Customers with multiple workspaces -- common in regulated industries with data-residency boundaries -- must choose which workspace is the XDR-connected one, and accept that the other workspaces' alerts are visible only inside Sentinel, not in the unified incident graph [@ms-learn-unified-secops] [@ms-learn-move-to-defender]. Multi-tenant MSSPs and customers with subsidiaries on separate Azure tenants face an even harder design problem: there is no single pane across tenants in the unified portal, only the cross-workspace KQL pattern from §4.

**8. Multicloud entity resolution: the EC2-on-AWS case.** A Windows VM running as an AWS EC2 instance can be brought into Microsoft's stack through two layers, neither of which produces a single shared identifier. Defender for Cloud's multicloud connector ingests AWS CloudTrail and EC2 metadata into MDC's posture surface (CSPM coverage) [@ms-learn-mdc-onboard-multicloud]; Defender for Servers' Arc-based provisioning then installs Azure Monitor Agent and Microsoft Defender for Endpoint on the EC2 host, projecting the box into the Azure tenant's resource graph as a `Microsoft.HybridCompute/machines` Arc resource. Three identifiers therefore describe the same workload but never coincide on a single strong identifier:

- **EC2 ARN** -- `arn:aws:ec2:<region>:<account>:instance/<instance-id>`, which is what AWS CloudTrail and the AWS console use.
- **Arc machine resource ID** -- `/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.HybridCompute/machines/<arc-machine-name>`, which is what the Log Analytics `_ResourceId` column carries when AMA forwards the Sysmon event.
- **MDE `DeviceId`** -- a GUID assigned at MDE first-onboarding, which is what the Defender for Servers CWPP alert and the `DeviceInfo` advanced-hunting table key on.

Bridging the three at query time requires bespoke KQL: lift the Arc machine name from `_ResourceId` via `extend ArcMachine = tostring(split(_ResourceId, "/")[-1])`, look up the corresponding `DeviceId` in `DeviceInfo` keyed by `DeviceName`, and `join` to a customer-maintained `Watchlist` (or external CMDB) that maps Arc machine name -> EC2 instance-id -> EC2 ARN. The pattern works, but every join is a place where the inventory can drift; a renamed EC2 instance or a reimaged host that picks up a new MDE `DeviceId` will silently break correlation until the watchlist is refreshed.
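
The three-way bridge can be simulated outside KQL to see where it breaks. In the sketch below, `DEVICE_INFO` and `WATCHLIST` are hypothetical in-memory stand-ins for the `DeviceInfo` table and the customer-maintained watchlist, and the machine name `ec2-web-01` is invented for illustration; the placeholders in the IDs are left unfilled on purpose.

```python
def arc_machine_name(resource_id: str) -> str:
    """Last path segment of an Arc machine resource ID (mirrors split(...)[-1])."""
    return resource_id.rstrip("/").split("/")[-1]


# Hypothetical lookup data standing in for DeviceInfo and a customer watchlist.
DEVICE_INFO = {"ec2-web-01": "device-guid-placeholder"}
WATCHLIST = {"ec2-web-01": "arn:aws:ec2:<region>:<account>:instance/<instance-id>"}


def bridge(resource_id: str) -> dict:
    """Resolve one _ResourceId to the other two identifiers, or None on drift."""
    name = arc_machine_name(resource_id).lower()
    return {
        "ArcMachine": name,
        "DeviceId": DEVICE_INFO.get(name),  # None when MDE inventory has drifted
        "Ec2Arn": WATCHLIST.get(name),      # None when the watchlist is stale
    }


rid = ("/subscriptions/<sub>/resourceGroups/<rg>/providers/"
       "Microsoft.HybridCompute/machines/ec2-web-01")
print(bridge(rid))
print(bridge(rid.replace("ec2-web-01", "ec2-web-02")))  # renamed host
```

The second call shows the failure mode described above: a renamed host resolves to `None` on both lookups, and nothing raises an error unless the caller checks for it.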

<Aside type="scope" id="A2">
The EC2 sub-example above is the tip of the iceberg. Multi-cloud is its own open problem and worth a separate article. MDC's CSPM and parts of the CWPP plans (Servers, Containers) cover AWS and GCP via Azure Arc and cloud connectors, but the depth of integration for non-Azure workloads in the unified XDR experience is less than for native Azure workloads. The honest summary is "Azure-first, AWS/GCP-supported, on-prem via Arc." Designs that are AWS-primary should evaluate AWS Security Lake + a SIEM (Sentinel, Splunk, or Athena) against MDC-on-AWS specifically; the choice is not obvious.
</Aside>

None of these problems is fatal to the architecture. Each is the kind of structural friction that comes from grafting three pre-existing pipelines into one analyst surface in fewer than three years. The cutover date is the only one with a deadline; the rest are roadmap items.

## 10. Recipe: building the pipeline yourself in six steps

This section walks the six setup steps that produce the worked example end-to-end, in the order an engineer should actually do them. Each step names the artifact, the documentation reference, and the single most common mistake that will silently break the step.

### Step 1 -- Install Sysmon with a curated configuration

Install Sysmon on the host (Azure VM, Arc-enabled server, or on-prem Windows) with a configuration that emits the events you actually need [@ms-learn-sysmon]. The default Sysmon config is essentially empty; a curated config is what makes it useful. Many teams start with the SwiftOnSecurity `sysmon-config` or Olaf Hartong's `sysmon-modular` public baselines and prune from there [@swiftonsecurity-sysmon-config] [@hartong-sysmon-modular].

> **Note:** **Don't reinvent the Sysmon config.** Two community-maintained baselines do most of the work: the SwiftOnSecurity `sysmon-config` template ("a Sysmon configuration file for everybody to fork...with default high-quality event tracing") and Olaf Hartong's `sysmon-modular` framework ("a Sysmon configuration repository for everybody to customise") cover the common cases with years of community tuning [@swiftonsecurity-sysmon-config] [@hartong-sysmon-modular]. Pick one, version-control it in your config-management tool (DSC, Ansible, Chef), and ship it via your existing host-config pipeline. The single most common mistake is shipping a default Sysmon install and then wondering why detections fire on noise.

Validate that Sysmon is emitting by reading the local event log on the host: `Get-WinEvent -LogName "Microsoft-Windows-Sysmon/Operational" -MaxEvents 5`. If you see ProcessCreate (Event ID 1) records, hop 1 works.

### Step 2 -- Deploy the Azure Monitor Agent with a Data Collection Rule

Install AMA on the host (via Azure Policy for Azure VMs, the Arc agent for non-Azure, or the standalone installer) [@ms-learn-ama-overview]. Then create a Data Collection Rule that names the Sysmon channel and ships it to your Sentinel-enabled workspace. The ARM snippet below is the load-bearing artifact: the `streams` value must be exactly `Microsoft-WindowsEvent` (or, for the older `Event` table path, `Microsoft-Event`), not a variant. **This is the silent-failure cliff §6.2 named: get this string wrong and the agent ships nothing, returning no error.**

```json
{
  "type": "Microsoft.Insights/dataCollectionRules",
  "apiVersion": "2022-06-01",
  "name": "dcr-sysmon-to-sentinel",
  "location": "eastus",
  "properties": {
    "dataSources": {
      "windowsEventLogs": [
        {
          "name": "sysmonOperational",
          "streams": ["Microsoft-WindowsEvent"],
          "xPathQueries": [
            "Microsoft-Windows-Sysmon/Operational!*[System[(EventID=1 or EventID=3 or EventID=7 or EventID=10 or EventID=11)]]"
          ]
        }
      ]
    },
    "destinations": {
      "logAnalytics": [
        { "name": "lawDest",
          "workspaceResourceId":
            "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.OperationalInsights/workspaces/law-contoso-secops" }
      ]
    },
    "dataFlows": [
      { "streams": ["Microsoft-WindowsEvent"],
        "destinations": ["lawDest"] }
    ]
  }
}
```

The silent-miss bug is real: a DCR that names `"Microsoft-Event"` ships into the older `Event` table; a DCR that names `"Microsoft-WindowsEvent"` ships into the newer typed `WindowsEvent` table; **a DCR that names anything else (typo, copy-paste from another data source, or a name that does not exist) emits nothing, returns no validation error at deploy time, and produces a silent dashboard hole** [@ms-learn-ama-windows-events] [@ms-learn-sentinel-data-connectors-ref]. The fix is to validate post-deploy by checking that rows are arriving in the destination table within ~5 minutes.

Validation KQL to run in the workspace:

```kql
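// The DCR above uses the Microsoft-WindowsEvent stream, so rows land in WindowsEvent.
// If your DCR uses Microsoft-Event instead, query Event and filter on Source.
WindowsEvent
| where TimeGenerated > ago(10m)
| where Provider == "Microsoft-Windows-Sysmon" and EventID == 1
| summarize count() by Computer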
```

If you see a row per Sysmon-emitting host, hop 2 and hop 3 work.
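
A pre-deploy check can complement the post-deploy validation. The sketch below lints a DCR document for the two stream names this article discusses and for data sources whose stream has no matching `dataFlows` entry. `lint_dcr` and `KNOWN_STREAMS` are hypothetical helpers, not part of any Azure SDK, and the check only covers `windowsEventLogs` sources.

```python
import json
from typing import List

# Stream names taken from the article; any other value is treated as suspect.
KNOWN_STREAMS = {"Microsoft-WindowsEvent", "Microsoft-Event"}


def lint_dcr(dcr: dict) -> List[str]:
    """Return human-readable problems found in a DCR's Windows event streams."""
    problems = []
    props = dcr.get("properties", {})
    source_streams = set()
    for src in props.get("dataSources", {}).get("windowsEventLogs", []):
        for stream in src.get("streams", []):
            source_streams.add(stream)
            if stream not in KNOWN_STREAMS:
                problems.append(
                    f"data source '{src.get('name')}': unknown stream '{stream}'")
    flow_streams = {s for flow in props.get("dataFlows", [])
                    for s in flow.get("streams", [])}
    for stream in sorted(source_streams - flow_streams):
        problems.append(f"stream '{stream}' is collected but has no dataFlow")
    return problems


# Deliberately broken sample: note the trailing "s" in the data source stream.
dcr_text = """
{"properties": {
  "dataSources": {"windowsEventLogs": [
    {"name": "sysmonOperational", "streams": ["Microsoft-WindowsEvents"]}]},
  "dataFlows": [{"streams": ["Microsoft-WindowsEvent"], "destinations": ["lawDest"]}]
}}
"""
for problem in lint_dcr(json.loads(dcr_text)):
    print("DCR lint:", problem)
```

On the sample, the linter reports two problems: the misspelled stream on the data source, and the fact that no data flow carries it. Running the same function over the ARM snippet from this step returns an empty list.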

### Step 3 -- Author the Sentinel scheduled analytics rule

Inside the Defender portal's Sentinel section (or the Azure-portal Sentinel blade until the March 31, 2027 cutover), create a new scheduled analytics rule [@ms-learn-sentinel-scheduled-rules] [@ms-learn-sentinel-azure-portal-retiring]. Paste the KQL from the Spoiler in §6.3. Configure entity mappings: `Host` from `Computer`, `Account` from `User`, `Process` from `ProcessGuid`. Schedule: run every 5 minutes over the last 5 minutes. Severity: Medium. MITRE ATT&CK tactic: `Execution`; technique: T1059.001 (PowerShell).

The single most common mistake at this step is **omitting the entity mappings**. The rule will fire and produce a `SecurityAlert` row, but the alert will not participate in cross-pipeline correlation at hop 6 because there are no entities to merge on. Always configure at least Host, Account, and -- when available -- Process or FileHash entity mappings on a Sentinel rule.
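
Because a rule without entity mappings still deploys and still fires, the omission is easy to miss in review. The sketch below checks an exported rule definition for the entity types recommended above. The `entityMappings` / `fieldMappings` shape is assumed from the ARM representation of scheduled rules, and `missing_entity_types` plus the sample rule are hypothetical.

```python
from typing import List

REQUIRED_ENTITY_TYPES = {"Host", "Account"}  # per the guidance above


def missing_entity_types(rule: dict) -> List[str]:
    """Entity types the rule should map but does not."""
    mappings = rule.get("properties", {}).get("entityMappings") or []
    mapped = {m.get("entityType") for m in mappings if m.get("fieldMappings")}
    return sorted(REQUIRED_ENTITY_TYPES - mapped)


rule = {
    "properties": {
        "displayName": "Office spawns encoded PowerShell",  # hypothetical name
        "entityMappings": [
            {"entityType": "Host",
             "fieldMappings": [{"identifier": "HostName",
                                "columnName": "Computer"}]},
        ],
    }
}
print(missing_entity_types(rule))  # the Account mapping is absent
```

A check like this fits naturally in a CI step over rules kept in source control, so a rule that cannot participate in correlation is caught before it ships.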

<Aside type="scope" id="A3">
This recipe sets up the Sysmon-to-Sentinel-to-XDR path only. Adjacent surfaces -- Microsoft Defender for Office 365 for email alerts, Microsoft Defender for Identity for on-prem AD signals, Microsoft Defender for Cloud Apps (MDCA) for SaaS-app signals -- have their own onboarding paths and are out of scope for this six-step recipe. The convergence point in Defender XDR is the same; the upstream setup differs per source.

Five other adjacent surfaces are worth knowing about as a map of the broader Microsoft SecOps surface, even though this article does not walk any of them:

- **Sentinel watchlists** -- name-value reference tables (e.g., critical-asset inventory, terminated-user list, custom IOC list) stored in the `Watchlist` table and cached for low-latency enrichment joins in KQL analytics rules and hunts [@ms-learn-sentinel-watchlists].
- **Sentinel threat intelligence integration** -- ingest IOCs from TAXII feeds, Microsoft Defender Threat Intelligence, MISP, or platform connectors into the `ThreatIntelligenceIndicator` table, and use the built-in TI map rule type to fire on matches against your telemetry [@ms-learn-sentinel-threat-intel].
- **MSTICPy + Sentinel Jupyter notebooks** -- the Microsoft-maintained MSTICPy Python library plus Sentinel's notebook integration give hunters a programmable workspace for incident investigation, IOC pivoting, and ML-driven analysis on Sentinel data outside the rule-authoring surface [@ms-learn-sentinel-notebooks].
- **Sentinel Content Hub and the solutions marketplace** -- the in-product distribution surface for prepackaged detections, parsers, workbooks, hunting queries, and playbooks delivered as Microsoft-signed or partner-signed solutions [@ms-learn-sentinel-solutions].
- **Microsoft Defender External Attack Surface Management (Defender EASM)** -- the adjacent posture surface that discovers and maps an organization's internet-facing assets from the outside-in; explicitly out of scope for this article's CSPM/CWPP/SIEM/XDR spine, but worth knowing exists [@ms-learn-defender-easm].
</Aside>

### Step 4 -- Enable Defender for Servers (Plan 2) for MDC alerts

On the Azure subscription that owns the VM (or the Arc-enabled resource group), enable Microsoft Defender for Cloud's Defender for Servers Plan 2 [@ms-learn-mdc-defender-servers]. Plan 2 includes the MDE license and the runtime detection engine that emits the MDC alert at hop 5. Enabling the plan automatically deploys MDE to the in-scope hosts and configures the MDC-to-XDR alert-ingestion integration that reached general availability in March 2024 [@ms-learn-mdc-mde-integration] [@ms-learn-mdc-xdr-ingest].

Validation: trigger a benign test pattern (e.g., `powershell -EncodedCommand` of a harmless script) on a test host. Within ~5 minutes, you should see an MDC alert in the MDC Alerts blade titled `Suspicious PowerShell command line` (or similar), and a corresponding alert in `security.microsoft.com`.
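
`-EncodedCommand` expects Base64 over the UTF-16LE bytes of the script, which is easy to get wrong by hand. The sketch below builds a harmless test command line; the script text is an arbitrary example, and whether any particular payload trips the MDC detection is up to the service.

```python
import base64

script = "Write-Output 'pipeline test'"  # harmless payload for a test host
encoded = base64.b64encode(script.encode("utf-16-le")).decode("ascii")
print(f"powershell.exe -NoProfile -EncodedCommand {encoded}")

# Round-trip check: decoding must give back the original script.
assert base64.b64decode(encoded).decode("utf-16-le") == script
```

Run the printed command only on a designated test host, and record the time so the resulting alerts can be matched back to the test.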

### Step 5 -- Connect Sentinel to the unified Defender portal

Inside the Defender portal, enable the Sentinel connection that designates your Log Analytics workspace as the **primary workspace** for the unified secops experience [@ms-learn-sentinel-defender-portal] [@ms-learn-move-to-defender]. This step is what makes the Sentinel `SecurityAlert` rows flow into the Defender XDR incident graph at hop 6 and become merge candidates with the MDC and MDE alerts.

One-tenant, one-primary-workspace constraint: as §6.6 noted, a Defender XDR tenant has exactly one primary Sentinel workspace. If you have multiple workspaces (regional residency reasons, MSSP topology, etc.), choose deliberately which one is the XDR-connected one. Alerts in secondary workspaces remain queryable via Sentinel but do not participate in the unified incident graph.

### Step 6 -- Write a watchdog rule that fires on telemetry silence

The pipeline can fail silently in multiple places: AMA stops on a host, the DCR is removed or mis-edited, Sysmon is uninstalled, the workspace fills its daily cap. None of these failures produces an alert on its own. Write a Sentinel scheduled rule that fires when **expected** telemetry is *absent*: for each host in your inventory, alert if rows from that host stop appearing in the destination table (`Event` or `WindowsEvent`, depending on the DCR stream) for more than N minutes.

<RunnableCode language="python" caption="Inventory drift detector. Compares the set of hosts that emitted Sysmon events in the last 24 hours against an expected inventory; reports hosts that have gone silent.">
{`# Run via Azure Monitor REST API or the az monitor cli; here we simulate
# the comparison logic that an analytics rule would express in KQL.

EXPECTED_INVENTORY = {
    'MAL-CONTOSO-PRD-01',
    'MAL-CONTOSO-PRD-02',
    'MAL-CONTOSO-PRD-03',
    'MAL-CONTOSO-PRD-04',
    'MAL-CONTOSO-PRD-05',
}

# In a real deployment this list comes from KQL against the Event table:
#   Event | where TimeGenerated > ago(24h)
#         | summarize by Computer
RECENTLY_EMITTING_HOSTS = {
    'MAL-CONTOSO-PRD-01',
    'MAL-CONTOSO-PRD-02',
    # PRD-03 absent: agent down? DCR removed?
    'MAL-CONTOSO-PRD-04',
    'MAL-CONTOSO-PRD-05',
}

silent_hosts = EXPECTED_INVENTORY - RECENTLY_EMITTING_HOSTS
if silent_hosts:
    print(f"ALERT: telemetry silence on {len(silent_hosts)} host(s):")
    for h in sorted(silent_hosts):
        print(f"  - {h}")
else:
    print("OK: all expected hosts emitted Sysmon events in the last 24h.")
`}
</RunnableCode>

The equivalent in Sentinel is a scheduled rule that reads a static `Watchlist` of expected hosts (via `_GetWatchlist`), compares it against `Event | summarize by Computer` over the last 24 hours (or the same query over `WindowsEvent`, depending on the DCR stream), and alerts on the set difference. This watchdog is the only thing standing between an architectural diagram of perfect convergence and the operational reality of one host's silent agent.
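
The 24-hour set difference treats the whole day as one bucket, while the rule described at the top of this step alerts after N minutes of silence. A per-host last-seen comparison covers that case. The sketch below is a minimal illustration, assuming a hypothetical 30-minute threshold and a `last_seen` map that would in practice come from something like `summarize max(TimeGenerated) by Computer`.

```python
from datetime import datetime, timedelta, timezone

SILENCE_THRESHOLD = timedelta(minutes=30)  # the "N minutes"; pick your own


def silent_hosts(last_seen: dict, expected: set, now: datetime) -> list:
    """Hosts never seen, or last seen longer ago than the threshold."""
    stale = []
    for host in sorted(expected):
        seen = last_seen.get(host)
        if seen is None or now - seen > SILENCE_THRESHOLD:
            stale.append(host)
    return stale


now = datetime(2026, 6, 2, 15, 0, tzinfo=timezone.utc)  # arbitrary "now"
last_seen = {
    "MAL-CONTOSO-PRD-01": now - timedelta(minutes=2),
    "MAL-CONTOSO-PRD-04": now - timedelta(minutes=45),  # stale
}
expected = {"MAL-CONTOSO-PRD-01", "MAL-CONTOSO-PRD-03", "MAL-CONTOSO-PRD-04"}
print(silent_hosts(last_seen, expected, now))
```

Here PRD-03 is flagged because it never reported and PRD-04 because its last event is 45 minutes old; a 24-hour set difference would have caught only PRD-03.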

<Aside type="scope" id="A4">
This recipe addresses detection-and-response only. Compliance framing -- mapping detections to MITRE ATT&CK tactics, mapping posture findings to MCSB controls, reporting against PCI-DSS or NIST 800-53 -- is a separate concern handled by MCSB and the MDC regulatory-compliance blade [@ms-learn-mcsb-overview]. Most enterprise SOCs end up doing both, but a working detection pipeline can ship without the compliance layer attached.
</Aside>

With these six steps the Sysmon record from §1 reaches `security.microsoft.com` in roughly nine minutes, three alerts merged into one incident. The pipeline is real. The next section addresses the questions that show up in every architecture-review meeting once the pipeline is built.

## 11. FAQ

<FAQ>
  <FAQItem question="If I already have Defender for Endpoint, do I still need Sentinel?">
It depends on whether you need a SIEM. MDE alone gives you endpoint detection, response actions on the endpoint, and a native incident view inside Defender XDR. It does not give you a place to ingest non-endpoint log sources (firewall, identity provider that is not Microsoft Entra, custom application logs) and run cross-source correlation against them. Sentinel is the SIEM substrate that does that [@ms-learn-mde-landing] [@ms-learn-sentinel-overview]. A small organization whose telemetry is entirely MDE-and-Microsoft-365 can run without Sentinel; one whose threat model includes anything outside that envelope generally needs it.
  </FAQItem>

  <FAQItem question="Posture findings from MDC -- do they appear in Defender XDR incidents?">
No. As §6.5 established and as the Microsoft documentation is explicit about, only MDC's **CWPP alerts** (from Defender for Servers, Containers, SQL, Storage, App Service plans) flow into the Defender XDR incident graph [@ms-learn-mdc-xdr-concept] [@ms-learn-mdc-xdr-ingest]. CSPM-side artifacts -- recommendations, Secure Score deltas, regulatory-compliance findings -- stay in the Microsoft Defender for Cloud blade. If you want posture context attached to an incident, you have to pivot manually to MDC or join `SecurityRecommendation` against the incident's affected resources via KQL.
  </FAQItem>

  <FAQItem question="What is the actual end-to-end latency from kernel event to analyst inbox?">
For the documented Sysmon-to-Sentinel-to-XDR path: roughly 5 to 10 minutes typical. The dominant factor is the Sentinel scheduled-rule cadence (minimum 5 minutes) [@ms-learn-sentinel-scheduled-rules]. NRT rules cut it to 1-2 minutes for single-row matches [@ms-learn-sentinel-nrt-rules]. MDE's native path through Defender XDR is sub-minute for the endpoint detection itself; the cross-pipeline merge happens in the correlation engine within a sliding window after the slowest pipeline reports. Don't promise sub-minute for the SIEM path; do promise sub-minute for the EDR-direct path.
  </FAQItem>

  <FAQItem question="Why does the same host show up twice with different identifiers?">
Because each pipeline names hosts in its own grammar. MDE uses a `DeviceId` (a Microsoft-generated GUID). Sentinel uses `Computer` (the hostname as Windows reports it). MDC uses the Azure `resourceId` for the underlying VM. Microsoft Entra ID uses a directory `ObjectId`. The Defender XDR correlation engine normalizes these where it can [@ms-learn-xdr-correlation] [@ms-learn-sentinel-entities], but in raw KQL queries you have to `join` across the identifier spaces explicitly. The `IdentityInfo` and `DeviceInfo` tables are the join helpers; the entity-resolution problem from §8 is what makes this non-trivial.
  </FAQItem>

  <FAQItem question="When does the Sentinel-in-Azure-portal experience actually go away?">
**March 31, 2027** (extended from the original July 1, 2026 target). After that date, Microsoft Sentinel can only be accessed via the unified Defender portal at `security.microsoft.com` [@ms-learn-sentinel-azure-portal-retiring] [@helpnetsec-sentinel-defender-timeline]. Customers with custom dashboards, automation, or ARM templates targeting the Azure-portal Sentinel surface need to plan migration. The underlying Log Analytics workspace and KQL queries do not change; the analyst UI does.
  </FAQItem>

  <FAQItem question="Custom Detections vs Sentinel scheduled rules -- which should I use?">
It depends on where your telemetry lives. **Sentinel scheduled rules** read from Log Analytics tables (`Event`, `SecurityEvent`, `Syslog`, custom tables) and are the right answer when your detection covers data ingested via DCRs or Sentinel connectors. **Defender XDR Custom Detections** read from the advanced-hunting schema (`DeviceProcessEvents`, `DeviceFileEvents`, `EmailEvents`, etc.) and are the right answer when your detection covers MDE / Defender for Office / Defender for Identity-native telemetry [@ms-learn-sentinel-custom-detections] [@ms-learn-advanced-hunting]. The two are not interchangeable; the field names and result-size caps differ. A common operational pattern is "Sentinel for everything Sysmon and third-party, Custom Detections for everything MDE-native."
  </FAQItem>

  <FAQItem question="Can I write back from Sentinel to MDC -- e.g., suppress an MDC alert from a Sentinel automation rule?">
Partially, and only by hand. Sentinel automation rules invoke Azure Logic Apps playbooks, and those playbooks can call the Microsoft Defender for Cloud REST API directly to take actions like creating an assessment or (with limited surface area) acknowledging an alert [@ms-learn-sentinel-logic-apps-playbooks] [@ms-learn-mdc-assessments-rest] [@ms-learn-mdc-custom-recs]. There is no first-class "MDC alert action" Logic Apps connector with the same breadth as the MDE connector. Customers building bidirectional Sentinel-MDC response automation write HTTP-action playbooks against the MDC REST API and accept that the integration is less native than the MDE side.
  </FAQItem>
</FAQ>

<StudyGuide
  terms={[
    'SIEM',
    'SOAR',
    'EDR',
    'XDR',
    'CSPM',
    'CWPP',
    'MCSB',
    'KQL',
    'Log Analytics workspace',
    'Azure Monitor Agent (AMA)',
    'Data Collection Rule (DCR)',
    'ProcessGuid',
    'Sentinel scheduled analytics rule',
    'Entity mapping',
    'Defender XDR correlation engine',
  ]}
  questions={[
    'Trace a single Sysmon ProcessCreate event through the six hops named in §6. At each hop, state the artifact that does the work and the most common silent-failure mode.',
    'Why do MDC posture findings not appear in Defender XDR incidents, while MDC CWPP alerts do? Cite the architectural reason, not just the documented behaviour.',
    'You inherit a Sentinel deployment whose Sysmon detections "used to work." The Event table is empty for half the inventory. Name three places to check, in priority order.',
    'Compare Sentinel scheduled rules and Defender XDR Custom Detections along three axes: schema read, latency, governance. When would you choose each?',
    'A SOC analyst says "the unified incident graph is missing alerts from our European workspace." What is the most likely cause, and what is the workaround?',
    'Explain why the AMA DCR streams value "Microsoft-Event" vs "Microsoft-WindowsEvent" vs a typo all produce different outcomes, and what validation step catches the silent miss.',
  ]}
/>
