DumpDetective.Cli 2.2.0


DumpDetective

A command-line tool for understanding .NET production incidents from dumps and traces.

DumpDetective analyzes .dmp / .mdmp memory dumps and .nettrace / .etl traces, then generates human-readable reports that help you answer practical questions quickly:

  • Why is memory growing?
  • What is retaining objects?
  • Are we seeing thread pool pressure, deadlocks, or heavy contention?
  • Are exception rates, allocations, or GC pauses abnormal?

Every command writes an HTML report alongside the dump file by default. Use --output report.bin to save a compact structured report, then DumpDetective render report.bin to convert it to any format at any time without re-opening the dump.

Features

  • One-command health report (analyze) with a score and prioritized findings.
  • Deep memory diagnostics (memory-leak, high-refs, gc-roots, object-inspect).
  • Combined trace diagnostics (trace-analyze) plus focused trace commands (cpu-trace, alloc-trace, gc-trace, contention-trace, exceptions-trace, threadpool-starvation).
  • Multi-dump trend analysis for comparing behavior over time.
  • Interactive HTML reports with grouped navigation, charts, dark mode, and paged tables.
  • Export and replay support across HTML, Markdown, text, JSON, and compressed binary.
  • Optional BFS cache (.bfs.idx) for faster repeated retained-size analysis on large heaps.

Start Here

If you are new, use this path:

  1. Run one full report on your dump: DumpDetective analyze app.dmp --full.
  2. Open the generated HTML and check top findings, memory-leak, and high-refs sections first.
  3. If needed, zoom in with targeted commands like object-inspect, gc-roots, or trace commands on .nettrace / .etl files.


Requirements

Software

Requirement          Version
.NET SDK             10.0+
Target dump runtime  .NET Framework 4.x / .NET Core / .NET 5+
OS                   Windows (WinDbg-style dumps)

Hardware

Hardware requirements scale with the dump you are analysing. The numbers below are based on measured runs.

Minimum (small dumps, < 4 GB)

Component  Minimum
RAM        4 GB free
Storage    SSD required: the dump is memory-mapped with random I/O patterns, so an HDD will be severely slow
CPU        4 physical cores (8 logical): the heap walk uses 8 parallel workers, so fewer cores will time-slice them and increase wall-clock time significantly

Recommended (production dumps, 4–30 GB)

RAM
  Recommended: 16 GB free minimum; 24 GB preferred for analyze --full on very large dumps
  Why: Heap walk, BFS cache load, and the heaviest sub-reports can temporarily push peak working set into the 15-17 GB range on 100M+ object dumps; newer BFS caching intentionally trades more RAM for less repeated retained-size work
Storage
  Recommended: NVMe SSD
  Why: Random I/O across the entire dump file; a faster SSD means a faster heap walk
CPU
  Recommended: 8 physical cores (16 logical)
  Why: Heap walk uses 8 workers; a second concurrent walk (event-analysis or heap-fragmentation) can spin up another 8, so 16 logical cores prevent contention

Rule of thumb: free RAM should scale with both dump size and analysis mode. Lightweight or single-command runs are usually much cheaper than analyze --full. For very large dumps, keep at least 16 GB free; 24 GB+ is safer if you want all sub-reports, BFS-heavy retention analysis, and fragmentation metrics in one run.

SSD vs HDD: ClrMD memory-maps the dump and accesses it with highly random I/O during the heap walk, BFS, and fragmentation scan. An NVMe SSD completes a 25 GB / 110 M object dump in ~6–8 minutes. A spinning disk will typically take 20–40 minutes for the same dump and may cause the OS to thrash swap.


Installation

Install as a global .NET tool from NuGet.org:

dotnet tool install --global DumpDetective.Cli --version 2.2.1

Once installed, the tool is available as:

DumpDetective <command>

To update to the latest version:

dotnet tool update --global DumpDetective.Cli

To uninstall:

dotnet tool uninstall --global DumpDetective.Cli

Build

dotnet build

For a self-contained, AOT-compiled single executable:

dotnet publish DumpDetective.Cli -r win-x64 -c Release

The output is a single native binary: DumpDetective.Cli.exe.


Quick Start

Choose the path that matches your input type.

Memory Dump Quick Start (.dmp, .mdmp)

  1. Run one full report: DumpDetective analyze app.dmp --full
  2. Open the generated HTML report.
  3. Use targeted dump commands only where needed.

# Optional: set a default dump path
$env:DD_DUMP = "C:\dumps\w3wp.dmp"

# First run: full combined report
DumpDetective analyze app.dmp --full

# Include peak memory diagnostics
DumpDetective analyze app.dmp --full --debug

# Save as .bin for replay later (recommended)
DumpDetective analyze app.dmp --full --output report.bin
DumpDetective render report.bin

# Write both HTML and .bin in one pass
DumpDetective analyze app.dmp --full -o report.html -o report.bin

# Run a focused dump command
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained

# Trend across multiple dumps
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --full --output week1.bin
DumpDetective trend-analysis d4.dmp d5.dmp d6.dmp --full --output week2.bin
DumpDetective diff week1.bin week2.bin -o delta.html

.NET Trace Quick Start (.nettrace, .etl)

  1. Start with a combined trace report: DumpDetective trace-analyze app.nettrace
  2. Open the HTML report to identify hotspots.
  3. Run a single trace command for deeper drill-down.

# Combined trace analysis (CPU + alloc + GC + exceptions + contention + starvation)
DumpDetective trace-analyze app.nettrace

# Focused trace commands
DumpDetective cpu-trace app.nettrace --output cpu-report.html
DumpDetective gc-trace perf.etl --process w3wp --top 50 --output gc-report.html
DumpDetective threadpool-starvation perf.etl --top 50 --output starvation.html

Environment Variables

Variable  Description
DD_DUMP   Default dump file path. Used automatically when no .dmp argument is provided.
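
The fallback described above can be pictured with a short Python sketch. It is only an illustration of the lookup order, not the tool's actual argument parser; the function name `resolve_dump_path` is hypothetical, and treating `.mdmp` arguments the same as `.dmp` is an assumption.

```python
import os
from typing import Mapping, Optional, Sequence


def resolve_dump_path(args: Sequence[str],
                      env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first dump-file argument, else DD_DUMP, else None.

    Hypothetical helper: the real lookup lives inside DumpDetective.
    """
    env = os.environ if env is None else env
    for arg in args:
        # Assumption: .mdmp is accepted wherever .dmp is.
        if arg.lower().endswith((".dmp", ".mdmp")):
            return arg
    # No dump argument on the command line: fall back to the variable.
    return env.get("DD_DUMP") or None
```

With `DD_DUMP` set, a call such as `resolve_dump_path(["analyze", "--full"])` would return the configured path, while an explicit dump argument always wins.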

Commands

Command Families

Health / orchestration
  Commands: analyze, trend-analysis
  Input: dump files
Report replay / comparison
  Commands: render, diff
  Input: saved .json / .bin
Memory dump analysis
  Commands: heap-stats, gen-summary, memory-leak, gc-roots, object-inspect, build-bfs, and related dump commands
  Input: .dmp, .mdmp
Trace analysis
  Commands: trace-analyze, cpu-trace, alloc-trace, gc-trace, contention-trace, exceptions-trace, threadpool-starvation
  Input: .nettrace, .etl

The sections below follow that same split: high-level workflows first, then dump-only commands, then trace-only commands. The one exception is build-bfs, a dump command that is documented after the trace commands.

analyze

Scored health report for a single dump.

DumpDetective analyze <dump-file> [options]

Options:
  --full               Full combined report (scored summary + all sub-reports in parallel)
  --debug              Print peak working set / managed heap / private bytes at exit
  -o, --output <file>  Write report to file (.html / .md / .txt / .json / .bin)
                       Repeatable: -o report.html -o report.bin  writes both files
  --format <fmt>       Output format shorthand: html|md|json|bin
                       Repeatable: --format html --format bin  writes both files
                       Combined: -o report.html --format bin  auto-adds report.bin
  Default: writes <dump-name>.html alongside the dump

What it covers:

  • Health score (0-100) with per-finding deductions
  • Findings grouped as Critical / Warning / Info with recommendations
  • Memory: heap by generation (SOH / LOH / POH), fragmentation
  • Threads: blocked, async backlog, thread pool saturation
  • Exceptions, finalizer queue, GC handles (pinned / strong / weak)
  • Event handler leaks, string duplication, timer objects
  • WCF channels, DB connections, top types by size
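
To make "health score with per-finding deductions" concrete, here is a minimal Python sketch. The scoring rules are not documented in this README, so the deduction values, the `Finding` fields, and the clamping behaviour are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Finding:
    severity: str   # "Critical", "Warning", or "Info"
    headline: str
    deduction: int  # points removed from the score (hypothetical values)


def health_score(findings: List[Finding]) -> int:
    """Start at 100, subtract every deduction, clamp to the 0-100 range."""
    total = sum(f.deduction for f in findings)
    return max(0, min(100, 100 - total))


def group_by_severity(findings: List[Finding]) -> Dict[str, List[Finding]]:
    """Bucket findings the way the report groups them."""
    groups: Dict[str, List[Finding]] = {"Critical": [], "Warning": [], "Info": []}
    for finding in findings:
        groups[finding.severity].append(finding)
    return groups
```

Under these assumptions, one 25-point critical finding plus one 10-point warning yields a score of 65.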

Examples:

DumpDetective analyze app.dmp
DumpDetective analyze app.dmp --full
DumpDetective analyze app.dmp --full --output full-report.html
DumpDetective analyze app.dmp --full --output full-report.html --format bin
DumpDetective analyze app.dmp --full --output full-report.html --debug
DumpDetective analyze app.dmp --format bin     # Brotli-compressed output

trend-analysis

Cross-dump trend report comparing two or more snapshots over time.

DumpDetective trend-analysis <dump1> <dump2> [<dump3>...] [options]
DumpDetective trend-analysis <directory> [options]
DumpDetective trend-analysis --list <file.txt> [options]

Options:
  --full                   Full collection per dump (event leaks, string duplicates,
                           and per-dump sub-reports embedded in .json/.bin output)
  --baseline <n>           1-based index of the dump to use as the trend baseline (default: 1)
  --prefix <p>             Prefix for dump labels (default: D → D1, D2, D3).
                           E.g. --prefix W → W1, W2, W3
  --ignore-event <type>    Exclude publisher types whose name contains <type> (repeatable)
  -o, --output <f>         Write report to file (.html / .md / .txt)
                           .json  -- saves raw snapshot data (re-render any time with 'render')
                           .bin   -- saves Brotli-compressed raw snapshot data
                           Repeatable: -o trends.html -o trends.bin  writes both files
  --format <fmt>           Format shorthand: html|md|json|bin
                           Repeatable: --format html --format bin  writes both files
                           Combined: -o trends.html --format bin  auto-adds trends.bin
  Default: writes <command>.html in the current directory
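
The --prefix and --baseline options above can be illustrated with a small Python sketch. It assumes that a trend is a simple difference against the baseline dump's value; the README does not say how DumpDetective actually computes trends, and both function names are hypothetical.

```python
from typing import List


def dump_labels(count: int, prefix: str = "D") -> List[str]:
    """Label dumps in order: D1, D2, D3 by default, or W1, W2, W3 with prefix W."""
    return [f"{prefix}{i}" for i in range(1, count + 1)]


def growth_vs_baseline(values: List[int], baseline: int = 1) -> List[int]:
    """Difference of each dump's metric from the baseline dump (1-based index).

    Assumption: plain subtraction; the real tool may use ratios or rates.
    """
    if not 1 <= baseline <= len(values):
        raise ValueError("baseline must be between 1 and the number of dumps")
    base = values[baseline - 1]
    return [value - base for value in values]
```

For heap sizes of 100, 150, and 210 MB, the default baseline gives deltas of 0, 50, and 110, while `baseline=2` gives -50, 0, and 60.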

Report sections:

#  Section
0  Dump Timeline
1  Incident Summary -- signal status table, per-dump findings accordions, executive paragraph
2  Overall Growth Summary
3  Thread and Application Pressure
4  Event Leak Analysis
5  Finalizer Queue Detail
6  Highly Referenced Objects
7  Rooted Objects Analysis
8  Duplicate String Analysis

Examples:

DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --output trends.html
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --baseline 2 --output report.html
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --full --output snapshots.json
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --full --output snapshots.bin  # compressed
DumpDetective trend-analysis C:\dumps\ --full --output report.html
DumpDetective trend-analysis --list dumps.txt --full --output report.md
DumpDetective trend-analysis d1.dmp d2.dmp --full --ignore-event SNINativeMethodWrapper
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --prefix W --output week1.html

diff

Compares two saved report files (.json or .bin) and produces a diff report. No dump file required.

DumpDetective diff <before.json|before.bin> <after.json|after.bin> [options]

Supported input formats:
  report     Produced by any single-dump command with -o *.json or -o *.bin
  trend-raw  Produced by trend-analysis -o *.json or -o *.bin

What is diffed:
  Tables       Rows matched by key column (default: col 0). Changed cells: before → after.
               Per-dump tables (Dump Timeline, Rooted Objects, etc.) are matched positionally.
  Alerts       Matched by title. Level and detail changes highlighted.
  Key-Values   Matched by key. Changed values: before → after.
  Details      Accordion blocks included from the "after" file as-is.

Options:
  --key-col <n>        Column index (0-based) used as the row key for tables (default: 0)
  --changed-only       Omit chapters/sections with no changes
  --show-same          Include unchanged rows in diff tables (default: omitted)
  --command <name>     For trend-raw: diff only this command's sub-report chapters (repeatable).
                       Dumps matched by filename; if no filenames overlap (different dump sets),
                       falls back to positional matching (Dump 1 ↔ Dump 1, etc.)
  --ignore-event <t>   For trend-raw: exclude event publisher types containing <t> (repeatable)
  -o, --output <file>  Output path (.html / .md / .txt / .json / .bin)
                       Default: <before>-vs-<after>.html
  -h, --help           Show this help
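
The table-matching rule under "What is diffed" (rows matched by a key column, changed cells shown as before → after, unchanged rows omitted unless --show-same is given) might look roughly like this in Python. This is a sketch of the idea only; `diff_table` and its return shape are hypothetical and not the tool's report model.

```python
from typing import List, Sequence, Tuple


def diff_table(before: Sequence[Sequence], after: Sequence[Sequence],
               key_col: int = 0, show_same: bool = False) -> List[Tuple]:
    """Match rows by key column and classify them as added/removed/changed."""
    before_by_key = {row[key_col]: row for row in before}
    after_by_key = {row[key_col]: row for row in after}
    # Keep "before" order, then append keys that only exist in "after".
    keys = list(before_by_key) + [k for k in after_by_key if k not in before_by_key]
    result = []
    for key in keys:
        old, new = before_by_key.get(key), after_by_key.get(key)
        if old is None:
            result.append(("added", key, new))
        elif new is None:
            result.append(("removed", key, old))
        elif old != new:
            cells = [a if a == b else f"{a} → {b}" for a, b in zip(old, new)]
            result.append(("changed", key, cells))
        elif show_same:
            result.append(("same", key, new))
    return result
```

Passing `key_col=1` would match rows on the second column instead, which mirrors what --key-col is described as doing.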

Examples:

# Single-dump report diff
DumpDetective analyze before.dmp --full -o before.bin
DumpDetective analyze after.dmp  --full -o after.bin
DumpDetective diff before.bin after.bin -o delta.html

# Trend-raw diff (week-over-week)
DumpDetective diff week1.bin week2.bin -o trend-delta.html
DumpDetective diff week1.bin week2.bin --changed-only -o delta.html

# Diff only the memory-leak sub-report across two trend files
DumpDetective diff week1.bin week2.bin --command memory-leak -o memleak-delta.html

# Multiple commands at once
DumpDetective diff week1.bin week2.bin --command memory-leak --command heap-stats -o subset.html

render

Converts any DumpDetective JSON or compressed binary file to HTML, Markdown, or plain text -- no dump file required. (Previously also available as trend-render; that alias has been removed — use render for all file conversions.)

DumpDetective render <file.json|file.bin> [options]

Accepted input:
  report     JSON or .bin produced by any single-dump command with --output *.json / *.bin
  trend-raw  JSON or .bin produced by trend-analysis --output *.json / *.bin

Options:
  --baseline <n>         Trend baseline (trend-raw only; default: 1 = first dump)
  --ignore-event <type>  Filter event types (trend-raw only; repeatable)
  --mini                 Trend summary only -- suppress per-dump sub-reports even
                         when they are present in the file (trend-raw only)
  --from <n>             Extract dump #N's full sub-report as a standalone file.
                         Requires the file to have been saved with --full. 1-based.
  --command <name>       Extract only the named command's chapter(s).
                         Combine with --from to target a single dump.
                         Repeatable: --command memory-leak --command heap-stats
                         Valid names: any command that runs in analyze --full
  -o, --output <file>    Output file (.html / .md / .txt / .json / .bin)
                         Repeatable: -o report.html -o report.bin  writes both files
  --format <fmt>         Format shorthand: html|md|json|bin
                         Repeatable: --format html --format bin  writes both files
                         Combined: -o report.html --format bin  auto-adds report.bin
  Default: writes <input-name>.html

Examples:

# Default: writes <input-name>.html (here report.html and snapshots.html)
DumpDetective render report.bin
DumpDetective render snapshots.json

# Explicit output format
DumpDetective render snapshots.bin --output report.html
DumpDetective render snapshots.bin --format md

# Trend summary only (no per-dump sub-reports)
DumpDetective render snapshots.bin --mini --output trend-only.html

# Re-render at a different baseline
DumpDetective render snapshots.bin --baseline 2 --output report-d2base.html

# Extract dump #4's full sub-report as a standalone file
DumpDetective render snapshots.bin --from 4 --output d4-full.html

# Extract just the memory-leak chapter from dump #4
DumpDetective render snapshots.bin --from 4 --command memory-leak --output d4-memleak.html

# Extract memory-leak from every dump, stacked in one file
DumpDetective render snapshots.bin --command memory-leak --output all-memleak.html

# Multiple commands from dump #2
DumpDetective render snapshots.bin --from 2 --command memory-leak --command heap-stats --output d2-subset.html

# JSON is still supported when needed
DumpDetective render snapshots.json --output report.html

# Convert a single-dump report BIN / JSON to HTML
DumpDetective render heap-stats.bin
DumpDetective render heap-stats.json

Note: --from and --command require trend-raw data saved with --full (from .json or .bin). If the source was saved without --full, sub-reports are not present and extraction will fail with a clear error message.


Memory Dump Commands

Each command accepts <dump-file> and --help. By default every command writes <dump-name>.html alongside the dump file. Use --output <file> or --format <fmt> to change this. Both -o and --format are repeatable: -o report.html -o report.bin or --format html --format bin writes both files simultaneously. You can also mix them: -o report.html --format bin auto-adds report.bin.

Command             Incl. in --full  Description
heap-stats          Yes              Heap object counts and sizes grouped by type
gen-summary         Yes              Object counts and sizes by GC generation
heap-fragmentation  Yes              Segment free space and fragmentation percentage
large-objects       Yes              Large objects on LOH / POH / Gen heap
pinned-objects      Yes              Pinned GC handles causing heap fragmentation
memory-leak         Yes              Suspect types with root-chain BFS traces
high-refs           Yes              Highly-referenced "hub" objects -- caches, shared state
string-duplicates   Yes              Duplicate strings and wasted memory
finalizer-queue     Yes              Objects waiting in the finalizer queue
handle-table        Yes              GC handles grouped by kind
static-refs         Yes              Non-null static reference fields with retained-size analysis
weak-refs           Yes              WeakReference handles -- alive vs collected
thread-analysis     Yes              Thread states, blocking objects, stack traces
thread-pool         Yes              ThreadPool state and queued work items
deadlock-detection  Yes              Deadlock cycles in the wait graph
async-stacks        Yes              Suspended async state machines at await points
exception-analysis  Yes              Exception objects on heap and active threads
event-analysis      Yes              Event handler leaks -- publisher types, field names, subscriber counts, retained memory
http-requests       Yes              In-flight HTTP request objects
connection-pool     Yes              Database connection objects and leak detection
wcf-channels        Yes              WCF service/channel objects and their state
timer-leaks         Yes              Timer objects and their callback targets
module-list         Yes              Loaded assemblies with path and size
gc-roots            No               GC roots and referrers for a given type (too slow for --full)
type-instances      No               All instances of a given type (--type <name> required)
object-inspect      No               All field values of an object with optional retained-size BFS (--address <hex> required)
build-bfs           No               Pre-build the BFS retained-size index cache (.bfs.idx) for a dump file or every dump in a directory

object-inspect

Inspects all fields of a single managed object. With --retained it computes the exclusive retained size of every reference field using BFS.

DumpDetective object-inspect <dump-file> --address <hex> [options]

Options:
  -x, --address <hex>        Object address (hex, e.g. 0x00000276DB084170)  [required]
  -d, --depth <N>            Recursion depth into references (default: 1)
  --max-array <N>            Max array elements to show (default: 10)
  --retained, -r             Compute retained size per reference field
  --retained-cap <N>         Max BFS nodes per field (0 = unlimited, default: 0)
  --no-cache                 Ignore existing .bfs.idx cache; use in-request BFS
  --no-save                  Do not save a new .bfs.idx cache after building
  -h, --help                 Show this help

Examples:

# Inspect a single object (no retained sizes)
DumpDetective object-inspect app.dmp -x 0x00000276DB084170

# Inspect with retained-size BFS per field (builds cache on first run)
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained

# Run the same command again: the saved cache is loaded, so there is no BFS rebuild
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained

# Recurse 3 levels deep, all fields use cache
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained -d 3

# Cap BFS per field to 1M nodes (fast estimate for very deep graphs)
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained --retained-cap 1000000

Tip: Run build-bfs once before object-inspect --retained so the first retained-size run loads the saved cache instead of building it.


Trace Commands

These commands accept a trace file, not a memory dump. Supported trace inputs are .nettrace and .etl.

Command                Description
trace-analyze          Combined trace report that opens the trace once and runs the supported trace analyzers in sequence
cpu-trace              CPU hot path, top methods, and call tree analysis
alloc-trace            Allocation hotspot analysis based on GCAllocationTick events
gc-trace               GC pause analysis, trigger reasons, and per-collection heap metrics
exceptions-trace       First-chance exception volume, type breakdown, and flood detection
contention-trace       Lock contention hotspot and wait-time analysis
threadpool-starvation  ThreadPool starvation detection from wait and adjustment events

trace-analyze

Opens a trace file once and runs the trace analyzers as a single combined report. This is the trace equivalent of a combined dump analysis run.

DumpDetective trace-analyze <trace-file> [options]

Supported input formats:
  .nettrace    EventPipe trace collected with a suitable profile
  .etl         Windows ETW trace

Options:
  -n, --top <N>            Top N items per section (default: 20)
  --process <name>         Filter to a specific process name
  --show-system            Include system/kernel frames in CPU tree (default: hidden)
  -o, --output <file>      Write report to file (.html / .md / .txt / .json / .bin)
  --format <fmt>           Output format shorthand: html|md|json|bin
  -h, --help               Show this help

Included sub-reports:

  • cpu-trace for CPU hot paths and call trees.
  • alloc-trace for allocation hotspots.
  • gc-trace for GC pause timing and trigger reasons.
  • exceptions-trace for exception flood detection.
  • contention-trace for lock hotspots and wait times.
  • threadpool-starvation for ThreadPool starvation signals.

Examples:

DumpDetective trace-analyze app.nettrace
DumpDetective trace-analyze perf.etl --process w3wp --output trace-report.html
DumpDetective trace-analyze app.nettrace --top 30 --show-system

Individual trace analyzers

Use these when you want a single signal instead of the combined trace-analyze report:

DumpDetective cpu-trace app.nettrace --top 40 --output cpu.html
DumpDetective alloc-trace app.nettrace --process w3wp --output alloc.html
DumpDetective gc-trace perf.etl --process w3wp --top 50 --output gc.html
DumpDetective exceptions-trace app.nettrace --output exceptions.html
DumpDetective contention-trace perf.etl --process w3wp --output contention.html
DumpDetective threadpool-starvation perf.nettrace --top 50 --output starvation.html

Common trace use cases:

  • Use cpu-trace when you need hot methods, hot paths, and a call tree.
  • Use alloc-trace when the problem is allocation churn or GC pressure.
  • Use gc-trace when you need pause distributions, trigger reasons, or explicit GC.Collect() detection.
  • Use exceptions-trace when a service is throwing at high volume or hiding error floods.
  • Use contention-trace when threads are blocked on locks and you need hotspot call sites.
  • Use threadpool-starvation when the runtime is under worker-thread pressure or sync-over-async blocking is suspected.

build-bfs

Pre-builds and saves a BFS forward-reference index (.bfs.idx) alongside each dump file. Once built, object-inspect --retained loads it in seconds instead of re-walking the entire heap.

Accepts either a single dump file or a directory containing multiple dumps. When a directory is given, each .dmp/.mdmp file is processed sequentially — one at a time so peak memory stays bounded.

DumpDetective build-bfs <dump-file-or-directory> [options]

Options:
  --force, -f    Rebuild even if a valid cache already exists
  --recurse, -r  When input is a directory, also search subdirectories
  -h, --help     Show this help
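
The directory handling described above (one file or a directory, .dmp/.mdmp only, optional recursion, files processed one at a time) can be sketched in Python as follows. `find_dumps` is a hypothetical helper, and the sorted order is an assumption; the check that skips dumps with an already valid cache is left out.

```python
from pathlib import Path
from typing import List

DUMP_SUFFIXES = {".dmp", ".mdmp"}


def find_dumps(target, recurse: bool = False) -> List[Path]:
    """Return the dump files to process for a file or directory argument."""
    path = Path(target)
    if path.is_file():
        return [path]
    # "**/*" descends into subdirectories, "*" stays at the top level.
    pattern = "**/*" if recurse else "*"
    return sorted(p for p in path.glob(pattern)
                  if p.is_file() and p.suffix.lower() in DUMP_SUFFIXES)
```

A caller would then loop over the returned list and build one cache at a time, which is what keeps peak memory bounded.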

How it works:

The builder runs a parallel 3-pass algorithm over the managed heap:

Pass             What it does
1 (enumerate)    Assigns a stable integer index to every live object; records shallow size
2 (count edges)  Counts outbound references per node (determines CSR array sizes)
3 (fill edges)   Fills the CSR edge arrays with child node indices

The resulting graph is a Compressed Sparse Row (CSR) structure stored as a Brotli-compressed binary file next to the dump (<dump>.bfs.idx). The cache is validated against the dump's file size and last-write timestamp — a stale or mismatched cache is automatically ignored and rebuilt.
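
The staleness check described above can be sketched in Python. How the size and timestamp are stored inside the .bfs.idx file is not documented here, so the two `recorded_*` parameters are hypothetical stand-ins for whatever the cache header holds.

```python
import os


def cache_is_valid(dump_path: str, recorded_size: int, recorded_mtime_ns: int) -> bool:
    """True when the dump still matches the size and last-write time
    that were recorded when the cache was built (hypothetical header values)."""
    stat = os.stat(dump_path)
    return stat.st_size == recorded_size and stat.st_mtime_ns == recorded_mtime_ns
```

If either value differs, the cache would be treated as stale and rebuilt rather than trusted.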

Once loaded, ComputeRetained runs a pure in-memory BFS with zero ClrMD I/O; on the 22 GB heap timed below, each field completes in under 2 s.
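
The three passes and the in-memory BFS can be illustrated with a small, single-threaded Python sketch. It is not the tool's implementation: the input shape is invented for the example, the real builder runs in parallel, and the traversal below sums everything reachable from a node, which over-counts objects that are also reachable from elsewhere. How ComputeRetained treats such shared objects is not described in this README. The `cap` parameter loosely mirrors --retained-cap.

```python
from collections import deque


def build_csr(objects):
    """objects: list of (address, shallow_size, [child addresses]) (hypothetical shape)."""
    # Pass 1: stable integer index and shallow size per object.
    index = {addr: i for i, (addr, _, _) in enumerate(objects)}
    sizes = [size for _, size, _ in objects]
    # Pass 2: count outbound edges so the CSR arrays can be sized.
    offsets = [0] * (len(objects) + 1)
    for i, (_, _, children) in enumerate(objects):
        offsets[i + 1] = offsets[i] + sum(1 for c in children if c in index)
    # Pass 3: fill the edge array with child node indices.
    edges = [0] * offsets[-1]
    for i, (_, _, children) in enumerate(objects):
        pos = offsets[i]
        for c in children:
            if c in index:
                edges[pos] = index[c]
                pos += 1
    return index, sizes, offsets, edges


def reachable_size(start, sizes, offsets, edges, cap=0):
    """Sum shallow sizes of nodes reachable from start; cap=0 means unlimited."""
    seen = {start}
    queue = deque([start])
    total = 0
    while queue:
        node = queue.popleft()
        total += sizes[node]
        for child in edges[offsets[node]:offsets[node + 1]]:
            if child in seen:
                continue
            if cap and len(seen) >= cap:
                break  # node budget used up: stop discovering new nodes
            seen.add(child)
            queue.append(child)
    return total
```

The CSR layout keeps each node's children in one contiguous slice of `edges`, so the traversal touches only flat arrays and never needs to go back to the dump.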

This cache is now intentionally optimized for report speed rather than minimum memory footprint. A small change in the retained-size caching path improved estimate precision and made repeated retained-size work much cheaper, but the loaded cache can consume noticeably more RAM on very large heaps.

Observed tradeoff on a large heap:

Metric                                   Example value
Managed objects                          ~100 M
.bfs.idx file size                       ~725 MB
Additional RAM after cache load          ~3 GB
Repeated retained-size work avoided      ~3-7 min
Typical full-report improvement          ~600-800s → ~200-400s

If you have enough memory headroom, this is usually a net win: faster reports, less repeated BFS work, and better retained-size estimate accuracy. If RAM is tight, use targeted commands or skip cache loading with --no-cache.

Typical timings (22 GB / 63 M node heap):

Phase                                       Time
Pass 1 (enumerate, 8 parallel segments)     ~35 s
Pass 2 (count edges, 8 parallel segments)   ~40 s
Pass 3 (fill edges, 8 parallel segments)    ~38 s
Save (Brotli Optimal, chunked)              ~30 s
Total build                                 ~2.5 min
Load (subsequent runs)                      ~6 s
Retained BFS per field (post-load)          < 2 s

Examples:

# Build and save for a single dump (one-time setup)
DumpDetective build-bfs app.dmp

# Force rebuild (e.g. after a code update)
DumpDetective build-bfs app.dmp --force

# Build caches for all dumps in a directory (skips already-valid caches)
DumpDetective build-bfs D:\dumps

# Build recursively, rebuild all even if caches exist
DumpDetective build-bfs D:\dumps --recurse --force

# Then use instantly in object-inspect
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained

Output Formats

Specify an output file with -o / --output, or use --format without a filename:

Extension / keyword  --format value  Format
.html                html            Interactive HTML: sticky sidebar nav, grouped/collapsible sub-report navigation, built-in charts, sortable/filterable paged tables, dark mode toggle, styled alert cards
.md                  md              Markdown, suitable for wiki pages or GitHub
.json                json            Structured JSON: full report data, re-renderable to any other format with render
.bin                 bin             Brotli-compressed JSON: same structure as .json, ~50–70% smaller, non-human-readable
.txt                 txt             Plain text

Default output

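When -o / --output and --format are both omitted, single-dump commands write <dump-name>.html alongside the dump file. trend-analysis writes <command>.html in the current directory, render writes <input-name>.html, and diff writes <before>-vs-<after>.html.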

Multi-output and --format

Both -o / --output and --format are repeatable — you can write multiple formats in one run:

# Two explicit output files
DumpDetective heap-stats app.dmp -o report.html -o report.bin

# Two formats — files auto-named from dump name
DumpDetective heap-stats app.dmp --format html --format bin   # -> app.html + app.bin

# Mix -o and --format: explicit path plus extra format(s)
DumpDetective analyze app.dmp --full -o report.html --format bin   # -> report.html + report.bin

# trend-analysis: write snapshot data AND the rendered report in one pass
DumpDetective trend-analysis d1.dmp d2.dmp --full --format html --format bin  # -> trend-analysis.html + trend-analysis.bin

# render: convert to two formats at once
DumpDetective render snapshots.json -o report.html -o report.md

--format without a full filename auto-names the file after the dump (or input file for render):

DumpDetective heap-stats app.dmp --format md      # -> app.md
DumpDetective heap-stats app.dmp --format bin     # -> app.bin
DumpDetective render snapshots.json --format md   # -> snapshots.md
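
The naming rules shown in these examples can be summarised in a short Python sketch. It is inferred only from the examples above for single-input commands; `resolve_outputs` is a hypothetical name, trend-analysis (which names files after the command) is not covered, and taking the base name from the first -o path when mixing -o with --format is an assumption.

```python
from pathlib import Path
from typing import List, Sequence


def resolve_outputs(input_path: str, outputs: Sequence[str] = (),
                    formats: Sequence[str] = ()) -> List[str]:
    """Work out which files a run would write (illustrative only)."""
    paths = [Path(o) for o in outputs]
    # Extra formats borrow their name from the first -o path, else from the input.
    base = paths[0] if paths else Path(input_path)
    for fmt in formats:
        candidate = base.with_suffix(f".{fmt}")
        if candidate not in paths:
            paths.append(candidate)
    if not paths:
        # Neither -o nor --format: default to <input-name>.html.
        paths.append(Path(input_path).with_suffix(".html"))
    return [str(p) for p in paths]
```

For instance, `resolve_outputs("app.dmp", outputs=["report.html"], formats=["bin"])` yields report.html and report.bin, matching the mixed example above.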

Dark mode (HTML output)

The HTML report includes a 🌙 Dark mode toggle button in the sidebar. Your preference is saved in localStorage and respected on subsequent opens. The initial theme follows your OS prefers-color-scheme setting.

HTML report UX

The HTML renderer is designed for large real-world dumps and full combined reports. Current capabilities include:

  • Grouped sub-report navigation in the sidebar for analyze --full and trend-style combined outputs.
  • Collapsible nav groups with stable active-section highlighting while scrolling through large reports.
  • Self-contained charts and summary visuals embedded directly into the generated HTML.
  • Sortable, filterable tables with paging.
  • Global rows-per-page control plus per-table override for especially large sections.
  • A single output file with embedded CSS and JavaScript, so reports remain portable and easy to share.

JSON / binary output and re-rendering

Use --output report.json or --output report.bin (Brotli-compressed) with any command to capture a fully structured report. Both formats preserve all report data — chapters, sections, tables, key-value pairs, alerts, findings, and details accordions — including chapter nav levels and polymorphic element types.

There are two structured formats:

"report"
  Produced by: any single-dump command with --output *.json / *.bin
  Contents: full rendered report document
"trend-raw"
  Produced by: trend-analysis --output *.json / *.bin
  Contents: raw snapshot metrics + optional captured sub-reports

Both are handled transparently by DumpDetective render — it auto-detects the format and decompresses .bin automatically:

DumpDetective render heap-stats.json
DumpDetective render heap-stats.bin
DumpDetective render analyze-full.json  --output report.md
DumpDetective render snapshots.json     --baseline 2 --output report.html
DumpDetective render snapshots.bin      --baseline 2 --output report.html

The trend-raw format is especially useful: save once with --full, then re-render at any baseline, format, or time without touching the original dump files. Use .bin for long-term archival — it is typically 50–70% smaller than the equivalent .json.
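
If you post-process saved .json files in your own scripts, the format field described above is the natural thing to branch on. The Python sketch below assumes the field sits at the top level of the JSON document, which this README does not state explicitly; .bin files would need Brotli decompression first, which is omitted here.

```python
import json


def detect_format(text: str) -> str:
    """Return "report" or "trend-raw" for a saved DumpDetective JSON document.

    Assumption: "format" is a top-level key.
    """
    doc = json.loads(text)
    kind = doc.get("format")
    if kind not in ("report", "trend-raw"):
        raise ValueError(f"unsupported format field: {kind!r}")
    return kind
```

A script could use the result to decide, for example, whether options such as --baseline make sense for a given file.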


Project Structure

DumpDetective.slnx

DumpDetective.Core/               Models, interfaces, shared utilities
  Interfaces/
    ICommand.cs                   Name, Description, IncludeInFullAnalyze, Run, BuildReport
    IRenderSink.cs                Format-agnostic output interface
    IHeapObjectConsumer.cs        Heap-walk consumer interface
  Models/
    DumpSnapshot.cs               All collected metrics for one dump (AOT JSON-serialisable)
    Finding.cs                    Scored finding (severity, category, headline, advice)
    ReportDoc.cs                  Replayable report document model (chapters > sections > elements)
    ThresholdConfig.cs            Configurable scoring / trend thresholds
  Runtime/
    DumpContext.cs                ClrMD DataTarget + ClrRuntime wrapper
    HeapSnapshot.cs               TypeStats, InboundCounts, StringGroups, gen counters
  Utilities/
    CliArgs.cs                    Shared argument parser (--help, --output, DD_DUMP, flags)
    CommandBase.cs                Execute lifecycle, TryHelp, RunStatus
    ExecutionContext.cs           Thread-local verbose suppression and parameter overrides
    OperationTrace.cs             Per-thread operation timing for full-analyze progress display
    DumpHelpers.cs                FormatSize, IsSystemType, OpenDump, SegmentKindLabel
    HealthScorer.cs               Score(DumpSnapshot, ScoringThresholds) -> (Findings, score)
    ProgressLogger.cs             Live spinner + [SCAN] completion lines via Spectre.Console
    SinkFactory.cs                Creates single or multi-output IRenderSink from path lists
    ThresholdLoader.cs            Lazy-loads dd-thresholds.json; silent fallback to defaults

DumpDetective.Analysis.Memory/    ClrMD data collection and heap walking
  DumpCollector.cs                CollectFull / CollectLightweight orchestration
  HeapWalker.cs                   Single EnumerateObjects() call feeding all consumers
  HeapObjectCollector.cs          Manages consumer registration and walk execution
  RuntimeSubCollectors.cs         Thread, handle, module, finalizer-queue sub-collectors
  SnapshotPopulator.cs            Writes consumer results back into DumpSnapshot
  TrendRawSerializer.cs           DumpSnapshot JSON storage for trend analysis
  BfsIndexBuilder.cs              3-pass parallel CSR graph builder for .bfs.idx cache
  BfsIndexCache.cs                Load / validate / save the .bfs.idx cache
  Consumers/                      IHeapObjectConsumer implementations (one concern each)
    TypeStatsConsumer.cs
    InboundRefConsumer.cs
    StringGroupConsumer.cs
    GenCounterConsumer.cs
    AsyncMethodConsumer.cs
    ExceptionCountConsumer.cs
    HttpRequestsConsumer.cs
    ThreadNameConsumer.cs
    ThreadPoolConsumer.cs
    LightweightStatsConsumer.cs
    ReferrerConsumer.cs
    FragmentationConsumer.cs
    EventDetailConsumer.cs
    ConditionalWeakTableConsumer.cs
  Analyzers/                      Per-command analysis logic (pure POCO in / POCO out)
    HeapStatsAnalyzer.cs          ... (one file per memory-dump command)
    SharedReferrerCache.cs        Reverse-reference graph shared by MemoryLeak + HighRefs

DumpDetective.Analysis.Trace/     .nettrace / ETL data collection
  Analyzers/
    CpuTraceAnalyzer.cs           ... (one file per trace command)

DumpDetective.Reporting/          Output format implementations
  Sinks/
    HtmlSink.cs                   Self-contained HTML; inline CSS/JS; sticky nav; virtual scroll
    MarkdownSink.cs
    TextSink.cs
    ConsoleSink.cs
    JsonSink.cs
    BinSink.cs                    Brotli-compressed JSON
    CaptureSink.cs
  Reports/                        Per-command report builders (one file per command)
    AnalyzeReport.cs
    TrendAnalysisReport.cs
    HeapStatsReport.cs            ... (one file per command)
  ReportDocReplay.cs              Replays a ReportDoc through any IRenderSink
  ReportDiffer.cs                 Produces diff ReportDoc from two ReportDoc inputs
  ReportDocSlicer.cs              Extracts sub-chapters by command name or dump index
  ObjectInspectRenderer.cs        Field-level retained-size table builder
  ToolMemoryDiagnostic.cs         Peak working-set / managed-heap / private-bytes poller

DumpDetective.Commands/           ICommand implementations
  Memory/                         Memory-dump commands (namespace DumpDetective.Commands.Memory)
    AnalyzeCommand.cs             Orchestrator; runs FullAnalyzeCommands in parallel
    TrendAnalysisCommand.cs       Cross-dump trend report
    HeapStatsCommand.cs
    MemoryLeakCommand.cs          ... (one file per memory command; 28 total)
  Trace/                          Trace commands (namespace DumpDetective.Commands.Trace)
    TraceAnalyzeCommand.cs        Orchestrator; runs all six trace sub-analyzers
    CpuTraceCommand.cs
    AllocTraceCommand.cs          ... (one file per trace command; 7 total)
  RenderCommand.cs                Convert any JSON/BIN report to any output format
  DiffCommand.cs                  Compare two saved report files
  BuildBfsCommand.cs              Pre-build BFS retained-size index cache

DumpDetective.Cli/                Entry point -- the AOT executable
  Program.cs                      Top-level statements; --debug flag; default HTML output injection
  CommandRegistry.cs              Single source of truth for all ICommand instances
  HelpPrinter.cs                  Formats --help output

DumpDetective.Tests/              xUnit test project (no AOT)

dd-thresholds.json                Override default scoring/trend thresholds (place next to exe)

Dependency graph

Cli ──────────────────────────────────────► Commands
 │                                              │
 │                                              ▼
 │                               Analysis.Memory ──────┐
 │                               Analysis.Trace  ──────┤
 │                                                     │
 └──────────────────► Reporting ──────────► Core ◄─────┘

Health Score

The analyze command produces a health score from 0 to 100 for the dump. The score starts at 100 and loses points for each finding:

| Signal | Deduction |
| --- | --- |
| Event leak > 1000 subscribers on a single field | -20 |
| Thread pool saturated | -15 |
| Heap > 2 GB | -15 |
| Finalizer queue > 500 objects | -15 |
| Async backlog > 500 continuations | -10 |
| Heap fragmentation >= 40% | -10 |
| DB connections > 50 | -10 |
| LOH > 500 MB | -10 |
| WCF faulted channels | -10 |
| Event leaks (moderate) | -10 |
| Blocked threads > 20 | -10 |
| Heap fragmentation 20-40% | -5 |
| Blocked threads 5-20 | -5 |
| Finalizer queue 100-500 | -5 |
| Async backlog 100-500 | -5 |
| Thread pool near capacity | -5 |
| Exception threads > 5 | -5 |
| String duplication > 100 MB | -5 |
| Pinned handles > 2000 | -5 |
| Timer objects > 500 | -5 |

Score labels: Healthy (>=85) · Stable (>=70) · Degraded (>=50) · Critical (<50)

Thresholds are fully configurable via dd-thresholds.json placed alongside the executable.
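
The scoring rule can be sketched in a few lines of Python. This is a minimal illustration and not the tool's HealthScorer: the signal keys are hypothetical labels, only a subset of the deductions is included, and clamping at 0 is an assumption based on the stated 0-100 range.

```python
# Deduction values copied from the table above; the keys are hypothetical labels.
DEDUCTIONS = {
    "event_leak_severe": 20,
    "thread_pool_saturated": 15,
    "heap_over_2gb": 15,
    "finalizer_queue_over_500": 15,
    "async_backlog_over_500": 10,
    "fragmentation_40_plus": 10,
    "blocked_threads_over_20": 10,
    "fragmentation_20_40": 5,
    "string_duplication_over_100mb": 5,
}


def health_score(signals):
    """Start at 100, subtract each finding's deduction, never go below 0 (assumed)."""
    total = sum(DEDUCTIONS[name] for name in signals)
    return max(0, 100 - total)


def score_label(score):
    """Map a score to the labels listed above."""
    if score >= 85:
        return "Healthy"
    if score >= 70:
        return "Stable"
    if score >= 50:
        return "Degraded"
    return "Critical"


findings = ["heap_over_2gb", "finalizer_queue_over_500", "fragmentation_20_40"]
score = health_score(findings)
print(score, score_label(score))  # 65 Degraded
```

Three moderate findings are enough to move a dump from Healthy to Degraded, which is why the report lists findings in priority order.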


Performance & Resource Expectations

DumpDetective processes dumps by walking every managed object on the heap. Run time and memory scale with object count, not dump file size. Dump file size is only a rough guide — a largely native-memory process can produce a multi-GB dump file with very few managed objects.

Measured Full-Analyze Benchmark

The numbers below come from a real DumpDetective analyze --full run on a production-style IIS worker dump of ~25 GB with 110,472,530 managed objects.

End-to-end timings

| Stage | Time |
| --- | --- |
| Dump load | ~1s |
| Collection total | 112.9s |
| BFS index load | 12.3s |
| All 23 sub-reports (parallel) | 281.5s |
| Total execution time | 409.0s |

Collection breakdown

| Step | Objects | Time | Throughput |
| --- | --- | --- | --- |
| Thread scan | 155 | 419ms | ~369/s |
| Handle scan | 19,418 | 2.5s | ~7,782/s |
| Heap walk | 110,472,530 | 77.0s | ~1,434,559/s |
| Finalizer queue scan | 4,273,410 | 32.9s | ~129,839/s |

Peak tool memory usage

| Metric | Start | Peak | Growth |
| --- | --- | --- | --- |
| Working set | 10.6 MB | 15.53 GB | +15.52 GB |
| Managed heap | 245.6 KB | 15.65 GB | +15.65 GB |
| Private bytes | 5.9 MB | 16.68 GB | +16.68 GB |

Memory growth by stage

| Stage | Working Set Delta | Working Set After | Managed Delta |
| --- | --- | --- | --- |
| Load dump | +733.8 MB | 746.4 MB | +719.5 MB |
| Heap walk + scoring (full) | +8.25 GB | 8.97 GB | +4.94 GB |
| BFS cache load | +4.18 GB | 13.15 GB | +5.71 GB |
| Sub-reports (all) | +1.82 GB | 14.97 GB | -2.55 GB |

Slowest analyzers in this run

| Analyzer | Time |
| --- | --- |
| memory-leak | 281.5s |
| high-refs | 280.6s |
| event-analysis | 265.4s |
| heap-fragmentation | 264.4s |
| large-objects | 204.0s |
| finalizer-queue | 200.1s |

What this means in practice

  • On a ~25 GB dump, the single heap walk is fast enough to process ~110.5M objects in about 77 seconds on a healthy machine.
  • analyze --full is dominated by the retention-heavy analyzers (memory-leak, high-refs, event-analysis, heap-fragmentation, large-objects, finalizer-queue), not by dump load time.
  • BFS cache load is a major but predictable memory spike. If you are short on RAM, prefer targeted commands before running the full combined report.
  • The current BFS cache path is intentionally more aggressive about caching forward-graph data up front. In exchange for a larger in-memory cache, retained-size work is faster and size-estimate precision is better.
  • On very large heaps, this tradeoff is material: a roughly 100M-object dump can produce a .bfs.idx file around 725 MB, add about 3 GB of RAM while loaded, and remove roughly 3-7 minutes of repeated retained-size work from the overall report.
  • In practice, that change can pull a large full-report run from roughly 600-800 seconds down into the 200-400 second range, while also improving retained-size estimate accuracy.
  • For a dump of this scale (~25 GB), NVMe storage and at least 16 GB free RAM are strongly recommended. If you regularly run full analysis on similar dumps, plan for 24 GB+ free RAM.

Heap walk throughput

The single-pass heap walk (which feeds all analysis consumers simultaneously) typically runs at roughly 1,000,000–2,000,000 objects/second on production machines, with the lower end more representative for very large heaps.

See the measured benchmark above for a concrete large-dump example.
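
As a rough planning aid, the quoted throughput range can be turned into a heap-walk time estimate. The sketch below is plain arithmetic on the numbers in this section; the function name is hypothetical and the result covers only the heap walk, not the sub-reports.

```python
def heap_walk_range(object_count, low_rate=1_000_000, high_rate=2_000_000):
    """Return (best_case, worst_case) heap-walk seconds for the quoted throughput range."""
    return object_count / high_rate, object_count / low_rate


objects = 110_472_530  # object count from the measured benchmark
best, worst = heap_walk_range(objects)
print(f"estimated heap walk: {best:.0f}-{worst:.0f} s")  # estimated heap walk: 55-110 s

# The measured 77.0 s walk falls inside that window.
measured_rate = objects / 77.0
print(f"measured throughput: {measured_rate:,.0f} objects/s")
```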

Combined estimates per dump

| Dump file size | Typical object count | analyze --full (wall clock) | Peak working set |
| --- | --- | --- | --- |
| < 500 MB | < 1 M | < 5 s | < 300 MB |
| 500 MB – 4 GB | 1 – 15 M | 10–30 s | < 2 GB |
| 4 – 15 GB | ~15 – 50 M | 1–3 min | 2–6 GB |
| 15 – 30 GB | ~50 – 120 M | 5–8 min | 12–17 GB |

Object count is what actually drives analysis time, not file size. Use --debug on a first run to see the exact object count for your dump.

analyze --full includes all 23 sub-reports. analyze without --full finishes right after collection — the table above shows --full times.

What drives --full time

analyze --full runs all 23 sub-reports in parallel, so the wall-clock time of the sub-report phase is set by the slowest sub-report, not by the sum of all of them. The slow ones listed below each do additional heap traversals:

| Sub-report | ~10 M objects | ~100 M objects |
| --- | --- | --- |
| static-refs | ~6 s | ~5.8 min |
| heap-fragmentation | ~5 s | ~4.0 min |
| event-analysis | ~6 s | ~4.0 min |
| memory-leak / high-refs (shared BFS) | ~6 s | ~4.4 min |
| finalizer-queue | ~0.5 s | ~2.5 min |
| All others | < 0.3 s | usually < 30 s (large-objects can exceed that on very large heaps) |

Recent measured sub-report timings on a 110 M object production dump:

| Sub-report | Time | Notes |
| --- | --- | --- |
| static-refs | 349.9 s | Exact full BFS retained-size traversal |
| high-refs | 262.0 s | Builds shared referrer map |
| memory-leak | 262.5 s | GC roots map + shared referrer map + root tracing |
| event-analysis | 240.1 s | Static root map + detailed event scan |
| heap-fragmentation | 239.4 s | Fragmentation measurement dominates |
| finalizer-queue | 149.6 s | 134.9 s queue read + 14.7 s resurrection scan |
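
To see why parallel execution matters here, compare the slowest of these measured sub-reports with what the same six would cost if run one after another. The snippet is arithmetic on the timings listed above; it ignores the remaining sub-reports and any contention between parallel workers.

```python
# Measured sub-report timings in seconds, copied from the 110 M object run above.
timings = {
    "static-refs": 349.9,
    "high-refs": 262.0,
    "memory-leak": 262.5,
    "event-analysis": 240.1,
    "heap-fragmentation": 239.4,
    "finalizer-queue": 149.6,
}

parallel_wall_clock = max(timings.values())  # bounded by the slowest sub-report
sequential_total = sum(timings.values())     # cost if they ran back to back

print(f"parallel:   {parallel_wall_clock:.1f} s")  # parallel:   349.9 s
print(f"sequential: {sequential_total:.1f} s")     # sequential: 1503.5 s
```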

Memory usage

Peak working set is driven by the referrer-graph BFS (memory-leak + high-refs), which builds a full reverse-reference map in memory while the other sub-reports run in parallel.

Rule of thumb: plan for roughly 0.6× the dump file size in free RAM when running --full, with extra headroom on very large dumps (the measured ratios below are 0.57× and 0.62×).

Verified against real dumps:

| Dump size | Object count | Peak working set | Ratio |
| --- | --- | --- | --- |
| 3.65 GB | 10.7 M | 2.09 GB | 0.57× |
| ~25 GB | 110.5 M | 15.53 GB | 0.62× |

The ratio stays well below 1× because:

  • ClrMD memory-maps the dump rather than loading it — only touched pages are resident.
  • The BFS map stores only 1 parent address per object (16 B/entry) rather than the full object graph.
  • Large structures (InboundCounts, StringGroups, BfsMap) are released as soon as their last consumer finishes, not held for the full run.
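
The ratio column and the rule of thumb can be reproduced with a small calculation. This is a sketch for capacity planning only; the function names and the default ratio of 0.6 are illustrative choices, not values read from the tool.

```python
def observed_ratio(peak_working_set_gb, dump_size_gb):
    """Peak working set as a fraction of dump file size."""
    return peak_working_set_gb / dump_size_gb


def estimate_peak_ram_gb(dump_size_gb, ratio=0.6):
    """Rough free-RAM budget for analyze --full, using an assumed ratio."""
    return dump_size_gb * ratio


print(round(observed_ratio(2.09, 3.65), 2))    # 0.57
print(round(observed_ratio(15.53, 25.0), 2))   # 0.62
print(f"{estimate_peak_ram_gb(25.0):.1f} GB")  # 15.0 GB
```

Because the larger dump measured slightly above 0.6×, it is safer to treat the estimate as a floor and add headroom.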

trend-analysis --full across multiple dumps

Each dump is processed sequentially — loaded, analysed, fully released — before the next one begins. Peak memory therefore equals the most expensive single dump in the set, not the sum of all dumps.

Example runtimes from real runs:

| Scenario | Dump size | Object count | Total time | Peak RAM |
| --- | --- | --- | --- | --- |
| Load-test w3wp | 3.65 GB | 10.7 M | 12.5 s | 2.09 GB |
| Production w3wp | ~25 GB | 110.5 M | 409 s (~6.8 min) | 15.53 GB |

For large production dumps with roughly 100 M+ managed objects, budget roughly 5–8 minutes and 15–17 GB RAM at peak.

Offline render

render on a pre-saved .json completes in under a second for any output format. No dump file or ClrMD overhead is involved.

BFS retained-size cache (.bfs.idx)

object-inspect --retained computes exclusive retained sizes per reference field using BFS. Without a cache, it re-walks the full reference graph for every field. On a 22 GB heap with 63 M objects and 168 M edges, that takes several minutes.

Run build-bfs once to build and save a Brotli-compressed CSR graph index alongside the dump. On subsequent object-inspect --retained runs the index loads in ~6 s and each per-field BFS completes in under 2 s regardless of heap depth or object count.

| Scenario | Time |
| --- | --- |
| First run (no cache), 22 GB heap, 63 M objects | ~5–15 min |
| build-bfs (one-time, 8 parallel workers) | ~2.5 min |
| Load existing cache | ~6 s |
| Per-field retained BFS (post-load) | < 2 s |
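
For readers unfamiliar with CSR (compressed sparse row) graphs, the sketch below shows why a prebuilt index makes each BFS cheap: every object's outgoing references sit in one contiguous slice of a flat array. It is an illustration under stated assumptions, not the BfsIndexBuilder implementation. The three passes shown (count, prefix sum, fill) are a common way to build CSR and may differ from the tool's own passes, and the traversal sums plain reachable size, which is simpler than exclusive retained size.

```python
from collections import deque


def build_csr(num_nodes, edges):
    """Build CSR arrays (offsets, targets) from (src, dst) pairs in three passes."""
    # Pass 1: count outgoing edges per node.
    offsets = [0] * (num_nodes + 1)
    for src, _ in edges:
        offsets[src + 1] += 1
    # Pass 2: prefix sum turns counts into slice boundaries.
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]
    # Pass 3: write each edge target into its node's slice.
    targets = [0] * len(edges)
    cursor = offsets[:-1]  # slicing copies, so offsets stays intact
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def reachable_size(start, offsets, targets, sizes):
    """Sum the sizes of all nodes reachable from start using BFS over the CSR arrays."""
    seen = {start}
    queue = deque([start])
    total = 0
    while queue:
        node = queue.popleft()
        total += sizes[node]
        for i in range(offsets[node], offsets[node + 1]):
            nxt = targets[i]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return total


# Toy heap: 5 objects, node indices stand in for object addresses.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (4, 0)]
sizes = [24, 32, 40, 16, 8]
offsets, targets = build_csr(5, edges)
print(offsets, targets)                             # [0, 2, 3, 4, 4, 5] [1, 2, 3, 3, 0]
print(reachable_size(0, offsets, targets, sizes))   # 112
```

Once the two arrays are loaded, a traversal touches only integer slices, which is consistent with the per-field times in the table above being independent of how the graph was originally discovered.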

Thresholds

Place a dd-thresholds.json file next to DumpDetective.Cli.exe to override any scoring or trend threshold. Missing or invalid files silently fall back to built-in defaults. See the included dd-thresholds.json for the full schema.
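
The silent-fallback behaviour described above can be sketched as follows. This is not the ThresholdLoader code: the key names are hypothetical placeholders (the real schema is in the shipped dd-thresholds.json), and per-key merging with unknown keys ignored is an assumption.

```python
import json
import os
import tempfile

# Hypothetical keys for illustration only; consult the shipped dd-thresholds.json for real names.
DEFAULTS = {"finalizer_queue_warn": 100, "finalizer_queue_critical": 500}


def load_thresholds(path, defaults):
    """Return defaults overlaid with valid overrides; fall back silently on any problem."""
    try:
        with open(path, encoding="utf-8") as handle:
            overrides = json.load(handle)
    except (OSError, ValueError):
        # Missing, unreadable, or malformed file: keep the built-in defaults.
        return dict(defaults)
    if not isinstance(overrides, dict):
        return dict(defaults)
    merged = dict(defaults)
    # Assumption: unknown keys are ignored rather than treated as errors.
    merged.update({k: v for k, v in overrides.items() if k in defaults})
    return merged


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "dd-thresholds.json")
    print(load_thresholds(path, DEFAULTS))  # file missing, defaults returned
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"finalizer_queue_critical": 1000}, handle)
    print(load_thresholds(path, DEFAULTS))  # one value overridden
```

A silent fallback keeps the tool usable when the file is broken, at the cost of hiding typos, so it is worth checking a report's findings after editing thresholds.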

Product Compatible and additional computed target framework versions.
.NET net10.0 is compatible.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 

This package has no dependencies.

| Version | Downloads | Last Updated |
| --- | --- | --- |
| 3.3.0 | 93 | 5/25/2026 |
| 3.1.0 | 106 | 5/16/2026 |
| 3.0.1 | 110 | 5/14/2026 |
| 2.2.0 | 100 | 5/2/2026 |
| 2.1.2 | 113 | 4/26/2026 |
| 2.1.1 | 105 | 4/25/2026 |
| 2.1.0 | 107 | 4/24/2026 |