DumpDetective.Cli
2.2.0
DumpDetective
A command-line tool for understanding .NET production incidents from dumps and traces.
DumpDetective analyzes .dmp / .mdmp memory dumps and .nettrace / .etl traces, then generates human-readable reports that help you answer practical questions quickly:
- Why is memory growing?
- What is retaining objects?
- Are we seeing thread pool pressure, deadlocks, or heavy contention?
- Are exception rates, allocations, or GC pauses abnormal?
Every command writes an HTML report alongside the dump file by default. Use --output report.bin to save a compact structured report, then DumpDetective render report.bin to convert it to any format at any time without re-opening the dump.
Features
- One-command health report (`analyze`) with a score and prioritized findings.
- Deep memory diagnostics (`memory-leak`, `high-refs`, `gc-roots`, `object-inspect`).
- Combined trace diagnostics (`trace-analyze`) plus focused trace commands (`cpu-trace`, `alloc-trace`, `gc-trace`, `contention-trace`, `exceptions-trace`, `threadpool-starvation`).
- Multi-dump trend analysis for comparing behavior over time.
- Interactive HTML reports with grouped navigation, charts, dark mode, and paged tables.
- Export and replay support across HTML, Markdown, text, JSON, and compressed binary.
- Optional BFS cache (`.bfs.idx`) for faster repeated retained-size analysis on large heaps.
Start Here
If you are new, use this path:
- Run one full report on your dump: `DumpDetective analyze app.dmp --full`.
- Open the generated HTML and check top findings, memory-leak, and high-refs sections first.
- If needed, zoom in with targeted commands like `object-inspect`, `gc-roots`, or trace commands on `.nettrace` / `.etl` files.
Contents
- Start Here
- Requirements
- Installation
- Build
- Quick Start
- Environment Variables
- Commands
- Output Formats
- Project Structure
- Health Score
- Performance & Resource Expectations
- Thresholds
Requirements
Software
| Requirement | Version |
|---|---|
| .NET SDK | 10.0+ |
| Target dump runtime | .NET Framework 4.x / .NET Core / .NET 5+ |
| OS | Windows (WinDbg-style dumps) |
Hardware
Hardware requirements scale with the dump you are analysing. The numbers below are based on measured runs.
Minimum (small dumps, < 4 GB)
| Component | Minimum |
|---|---|
| RAM | 4 GB free |
| Storage | SSD required — dump is memory-mapped with random I/O patterns; HDD will be severely slow |
| CPU | 4 physical cores (8 logical) — the heap walk uses 8 parallel workers; fewer cores will time-slice them and increase wall-clock time significantly |
Recommended (production dumps, 4–30 GB)
| Component | Recommended | Why |
|---|---|---|
| RAM | 16 GB free minimum, 24 GB preferred for analyze --full on very large dumps | Heap walk, BFS cache load, and the heaviest sub-reports can temporarily push peak working set into the 15-17 GB range on 100M+ object dumps; newer BFS caching intentionally trades more RAM for less repeated retained-size work |
| Storage | NVMe SSD | Random I/O across entire dump file; faster SSD = faster heap walk |
| CPU | 8 physical cores (16 logical) | Heap walk uses 8 workers; a second concurrent walk (event-analysis or heap-fragmentation) can spin up another 8 — 16 logical cores prevents contention |
Rule of thumb: free RAM should scale with both dump size and analysis mode. Lightweight or single-command runs are usually much cheaper than `analyze --full`. For very large dumps, keep at least 16 GB free; 24 GB+ is safer if you want all sub-reports, BFS-heavy retention analysis, and fragmentation metrics in one run.
SSD vs HDD: ClrMD memory-maps the dump and accesses it with highly random I/O during the heap walk, BFS, and fragmentation scan. An NVMe SSD completes a 25 GB / 110 M object dump in ~6–8 minutes. A spinning disk will typically take 20–40 minutes for the same dump and may cause the OS to thrash swap.
Installation
Install as a global .NET tool from NuGet.org:
dotnet tool install --global DumpDetective.Cli --version 2.2.1
Once installed, the tool is available as:
DumpDetective <command>
To update to the latest version:
dotnet tool update --global DumpDetective.Cli
To uninstall:
dotnet tool uninstall --global DumpDetective.Cli
Build
dotnet build
For a self-contained, AOT-compiled single executable:
dotnet publish DumpDetective.Cli -r win-x64 -c Release
The output is a single native binary: DumpDetective.Cli.exe.
Quick Start
Choose the path that matches your input type.
Memory Dump Quick Start (.dmp, .mdmp)
- Run one full report: `DumpDetective analyze app.dmp --full`
- Open the generated HTML report.
- Use targeted dump commands only where needed.
# Optional: set a default dump path
$env:DD_DUMP = "C:\dumps\w3wp.dmp"
# Fast first run
DumpDetective analyze app.dmp --full
# Include peak memory diagnostics
DumpDetective analyze app.dmp --full --debug
# Save as .bin for replay later (recommended)
DumpDetective analyze app.dmp --full --output report.bin
DumpDetective render report.bin
# Write both HTML and .bin in one pass
DumpDetective analyze app.dmp --full -o report.html -o report.bin
# Run a focused dump command
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained
# Trend across multiple dumps
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --full --output week1.bin
DumpDetective trend-analysis d4.dmp d5.dmp d6.dmp --full --output week2.bin
DumpDetective diff week1.bin week2.bin -o delta.html
.NET Trace Quick Start (.nettrace, .etl)
- Start with a combined trace report: `DumpDetective trace-analyze app.nettrace`
- Open the HTML report to identify hotspots.
- Run a single trace command for deeper drill-down.
# Combined trace analysis (CPU + alloc + GC + exceptions + contention + starvation)
DumpDetective trace-analyze app.nettrace
# Focused trace commands
DumpDetective cpu-trace app.nettrace --output cpu-report.html
DumpDetective gc-trace perf.etl --process w3wp --top 50 --output gc-report.html
DumpDetective threadpool-starvation perf.etl --top 50 --output starvation.html
Environment Variables
| Variable | Description |
|---|---|
| `DD_DUMP` | Default dump file path. Used automatically when no .dmp argument is provided. |
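The fallback described in the table can be pictured with a small sketch. This is not the tool's argument parser: `resolve_dump_path` is a hypothetical helper, and treating `.mdmp` arguments the same way as `.dmp` is an assumption made for the illustration.

```python
import os

def resolve_dump_path(args, env=None):
    """Return the dump path from CLI args, falling back to DD_DUMP."""
    env = os.environ if env is None else env
    for arg in args:
        # First positional argument that looks like a dump file wins.
        # Accepting .mdmp here is an assumption; the table only mentions .dmp.
        if not arg.startswith("-") and arg.lower().endswith((".dmp", ".mdmp")):
            return arg
    # No dump argument supplied: use the DD_DUMP default if it is set.
    return env.get("DD_DUMP")
```

With `DD_DUMP` set, a call such as `resolve_dump_path(["analyze", "--full"])` would return the configured default path.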
Commands
Detailed command references:
Command Families
| Family | Commands | Input |
|---|---|---|
| Health / orchestration | `analyze`, `trend-analysis` | dump files |
| Report replay / comparison | `render`, `diff` | saved `.json` / `.bin` |
| Memory dump analysis | `heap-stats`, `gen-summary`, `memory-leak`, `gc-roots`, `object-inspect`, `build-bfs`, and related dump commands | `.dmp`, `.mdmp` |
| Trace analysis | `trace-analyze`, `cpu-trace`, `alloc-trace`, `gc-trace`, `contention-trace`, `exceptions-trace`, `threadpool-starvation` | `.nettrace`, `.etl` |
The sections below follow that same split: high-level workflows first, then dump-only commands, then trace-only commands.
analyze
Scored health report for a single dump.
DumpDetective analyze <dump-file> [options]
Options:
--full Full combined report (scored summary + all sub-reports in parallel)
--debug Print peak working set / managed heap / private bytes at exit
-o, --output <file> Write report to file (.html / .md / .txt / .json / .bin)
Repeatable: -o report.html -o report.bin writes both files
--format <fmt> Output format shorthand: html|md|json|bin
Repeatable: --format html --format bin writes both files
Combined: -o report.html --format bin auto-adds report.bin
Default: writes <dump-name>.html alongside the dump
What it covers:
- Health score (0-100) with per-finding deductions
- Findings grouped as Critical / Warning / Info with recommendations
- Memory: heap by generation (SOH / LOH / POH), fragmentation
- Threads: blocked, async backlog, thread pool saturation
- Exceptions, finalizer queue, GC handles (pinned / strong / weak)
- Event handler leaks, string duplication, timer objects
- WCF channels, DB connections, top types by size
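As a rough illustration of a 0-100 score with per-finding deductions, the sketch below subtracts a fixed amount per finding and clamps at zero. The deduction values and the `findings` shape are hypothetical; the tool's actual scoring weights are not given here.

```python
# Hypothetical deduction per severity; not the tool's real weights.
DEDUCTIONS = {"critical": 20, "warning": 8, "info": 0}

def health_score(findings):
    """Start at 100, subtract one deduction per finding, clamp at 0."""
    total = sum(DEDUCTIONS[f["severity"]] for f in findings)
    return max(0, 100 - total)

def group_by_severity(findings):
    """Group finding headlines as Critical / Warning / Info."""
    groups = {"critical": [], "warning": [], "info": []}
    for f in findings:
        groups[f["severity"]].append(f["headline"])
    return groups
```

Under these assumed weights, one critical and one warning finding would give a score of 72.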
Examples:
DumpDetective analyze app.dmp
DumpDetective analyze app.dmp --full
DumpDetective analyze app.dmp --full --output full-report.html
DumpDetective analyze app.dmp --full --output full-report.html --format bin
DumpDetective analyze app.dmp --full --output full-report.html --debug
DumpDetective analyze app.dmp --format bin # Brotli-compressed output
trend-analysis
Cross-dump trend report comparing two or more snapshots over time.
DumpDetective trend-analysis <dump1> <dump2> [<dump3>...] [options]
DumpDetective trend-analysis <directory> [options]
DumpDetective trend-analysis --list <file.txt> [options]
Options:
--full Full collection per dump (event leaks, string duplicates,
and per-dump sub-reports embedded in .json/.bin output)
--baseline <n> 1-based index of the dump to use as the trend baseline (default: 1)
--prefix <p> Prefix for dump labels (default: D → D1, D2, D3).
E.g. --prefix W → W1, W2, W3
--ignore-event <type> Exclude publisher types whose name contains <type> (repeatable)
-o, --output <f> Write report to file (.html / .md / .txt)
.json -- saves raw snapshot data (re-render any time with 'render')
.bin -- saves Brotli-compressed raw snapshot data
Repeatable: -o trends.html -o trends.bin writes both files
--format <fmt> Format shorthand: html|md|json|bin
Repeatable: --format html --format bin writes both files
Combined: -o trends.html --format bin auto-adds trends.bin
Default: writes <command>.html in the current directory
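The `--prefix` and `--baseline` options can be read as follows. This is a sketch only: the helper names are hypothetical, and computing a simple delta against the baseline is an assumption about how the baseline is used.

```python
def dump_labels(count, prefix="D"):
    """Label dumps in order: D1, D2, ... (or W1, W2, ... with prefix='W')."""
    return [f"{prefix}{i}" for i in range(1, count + 1)]

def growth_vs_baseline(values, baseline=1):
    """Delta of each snapshot value against the 1-based baseline dump."""
    base = values[baseline - 1]
    return [v - base for v in values]
```

For example, `growth_vs_baseline([100, 150, 130], baseline=2)` measures every dump against the second one.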
Report sections:
| # | Section |
|---|---|
| 0 | Dump Timeline |
| 1 | Incident Summary -- signal status table, per-dump findings accordions, executive paragraph |
| 2 | Overall Growth Summary |
| 3 | Thread and Application Pressure |
| 4 | Event Leak Analysis |
| 5 | Finalizer Queue Detail |
| 6 | Highly Referenced Objects |
| 7 | Rooted Objects Analysis |
| 8 | Duplicate String Analysis |
Examples:
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --output trends.html
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --baseline 2 --output report.html
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --full --output snapshots.json
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --full --output snapshots.bin # compressed
DumpDetective trend-analysis C:\dumps\ --full --output report.html
DumpDetective trend-analysis --list dumps.txt --full --output report.md
DumpDetective trend-analysis d1.dmp d2.dmp --full --ignore-event SNINativeMethodWrapper
DumpDetective trend-analysis d1.dmp d2.dmp d3.dmp --prefix W --output week1.html
diff
Compares two saved report files (.json or .bin) and produces a diff report. No dump file required.
DumpDetective diff <before.json|before.bin> <after.json|after.bin> [options]
Supported input formats:
report Produced by any single-dump command with -o *.json or -o *.bin
trend-raw Produced by trend-analysis -o *.json or -o *.bin
What is diffed:
Tables Rows matched by key column (default: col 0). Changed cells: before → after.
Per-dump tables (Dump Timeline, Rooted Objects, etc.) are matched positionally.
Alerts Matched by title. Level and detail changes highlighted.
Key-Values Matched by key. Changed values: before → after.
Details Accordion blocks included from the "after" file as-is.
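The table-matching rule above (rows keyed by one column, changed cells shown as before → after) can be sketched like this. `diff_rows` is a hypothetical helper written for illustration, not the tool's differ, and the status labels are assumptions.

```python
def diff_rows(before, after, key_col=0, show_same=False):
    """Match table rows by key column and report added/removed/changed rows."""
    old = {row[key_col]: row for row in before}
    new = {row[key_col]: row for row in after}
    result = []
    for key in list(old) + [k for k in new if k not in old]:
        if key not in new:
            result.append(("removed", key, old[key]))
        elif key not in old:
            result.append(("added", key, new[key]))
        elif old[key] != new[key]:
            # Changed cells render as "before → after"; equal cells stay as-is.
            cells = [a if a == b else f"{a} → {b}" for a, b in zip(old[key], new[key])]
            result.append(("changed", key, cells))
        elif show_same:
            result.append(("same", key, new[key]))
    return result
```

Unchanged rows are dropped unless `show_same` is set, mirroring the `--show-same` option listed below.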
Options:
--key-col <n> Column index (0-based) used as the row key for tables (default: 0)
--changed-only Omit chapters/sections with no changes
--show-same Include unchanged rows in diff tables (default: omitted)
--command <name> For trend-raw: diff only this command's sub-report chapters (repeatable).
Dumps matched by filename; if no filenames overlap (different dump sets),
falls back to positional matching (Dump 1 ↔ Dump 1, etc.)
--ignore-event <t> For trend-raw: exclude event publisher types containing <t> (repeatable)
-o, --output <file> Output path (.html / .md / .txt / .json / .bin)
Default: <before>-vs-<after>.html
-h, --help Show this help
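The dump-matching rule for trend-raw diffs (by filename first, positional as a fallback) might look like the sketch below. `match_dumps` is a hypothetical helper, and returning 0-based index pairs is a choice made for the example.

```python
import os

def match_dumps(before_paths, after_paths):
    """Pair dumps by file name; if no names overlap, pair them by position."""
    before_names = [os.path.basename(p) for p in before_paths]
    after_names = [os.path.basename(p) for p in after_paths]
    shared = [n for n in before_names if n in after_names]
    if shared:
        return [(before_names.index(n), after_names.index(n)) for n in shared]
    # Different dump sets: Dump 1 pairs with Dump 1, Dump 2 with Dump 2, and so on.
    return list(zip(range(len(before_names)), range(len(after_names))))
```

A week-over-week comparison of d1..d3 against d4..d6 has no overlapping names, so it falls through to the positional branch.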
Examples:
# Single-dump report diff
DumpDetective analyze before.dmp --full -o before.bin
DumpDetective analyze after.dmp --full -o after.bin
DumpDetective diff before.bin after.bin -o delta.html
# Trend-raw diff (week-over-week)
DumpDetective diff week1.bin week2.bin -o trend-delta.html
DumpDetective diff week1.bin week2.bin --changed-only -o delta.html
# Diff only the memory-leak sub-report across two trend files
DumpDetective diff week1.bin week2.bin --command memory-leak -o memleak-delta.html
# Multiple commands at once
DumpDetective diff week1.bin week2.bin --command memory-leak --command heap-stats -o subset.html
render
Converts any DumpDetective JSON or compressed binary file to HTML, Markdown, or plain text -- no dump file required. (Previously also available as trend-render; that alias has been removed — use render for all file conversions.)
DumpDetective render <file.json|file.bin> [options]
Accepted input:
report JSON or .bin produced by any single-dump command with --output *.json / *.bin
trend-raw JSON or .bin produced by trend-analysis --output *.json / *.bin
Options:
--baseline <n> Trend baseline (trend-raw only; default: 1 = first dump)
--ignore-event <type> Filter event types (trend-raw only; repeatable)
--mini Trend summary only -- suppress per-dump sub-reports even
when they are present in the file (trend-raw only)
--from <n> Extract dump #N's full sub-report as a standalone file.
Requires the file to have been saved with --full. 1-based.
--command <name> Extract only the named command's chapter(s).
Combine with --from to target a single dump.
Repeatable: --command memory-leak --command heap-stats
Valid names: any command that runs in analyze --full
-o, --output <file> Output file (.html / .md / .txt / .json / .bin)
Repeatable: -o report.html -o report.bin writes both files
--format <fmt> Format shorthand: html|md|json|bin
Repeatable: --format html --format bin writes both files
Combined: -o report.html --format bin auto-adds report.bin
Default: writes <input-name>.html
Examples:
# Default: renders to report.html
DumpDetective render report.bin
DumpDetective render snapshots.json
# Explicit output format
DumpDetective render snapshots.bin --output report.html
DumpDetective render snapshots.bin --format md
# Trend summary only (no per-dump sub-reports)
DumpDetective render snapshots.bin --mini --output trend-only.html
# Re-render at a different baseline
DumpDetective render snapshots.bin --baseline 2 --output report-d2base.html
# Extract dump #4's full sub-report as a standalone file
DumpDetective render snapshots.bin --from 4 --output d4-full.html
# Extract just the memory-leak chapter from dump #4
DumpDetective render snapshots.bin --from 4 --command memory-leak --output d4-memleak.html
# Extract memory-leak from every dump, stacked in one file
DumpDetective render snapshots.bin --command memory-leak --output all-memleak.html
# Multiple commands from dump #2
DumpDetective render snapshots.bin --from 2 --command memory-leak --command heap-stats --output d2-subset.html
# JSON is still supported when needed
DumpDetective render snapshots.json --output report.html
# Convert a single-dump report BIN / JSON to HTML
DumpDetective render heap-stats.bin
DumpDetective render heap-stats.json
Note: `--from` and `--command` require `trend-raw` data saved with `--full` (from `.json` or `.bin`). If the source was saved without `--full`, sub-reports are not present and extraction will fail with a clear error message.
Memory Dump Commands
Each command accepts <dump-file> and --help.
By default every command writes <dump-name>.html alongside the dump file. Use --output <file> or --format <fmt> to change this.
Both -o and --format are repeatable: -o report.html -o report.bin or --format html --format bin writes both files simultaneously. You can also mix them: -o report.html --format bin auto-adds report.bin.
| Command | Incl. in --full | Description |
|---|---|---|
| `heap-stats` | Yes | Heap object counts and sizes grouped by type |
| `gen-summary` | Yes | Object counts and sizes by GC generation |
| `heap-fragmentation` | Yes | Segment free space and fragmentation percentage |
| `large-objects` | Yes | Large objects on LOH / POH / Gen heap |
| `pinned-objects` | Yes | Pinned GC handles causing heap fragmentation |
| `memory-leak` | Yes | Suspect types with root-chain BFS traces |
| `high-refs` | Yes | Highly-referenced "hub" objects -- caches, shared state |
| `string-duplicates` | Yes | Duplicate strings and wasted memory |
| `finalizer-queue` | Yes | Objects waiting in the finalizer queue |
| `handle-table` | Yes | GC handles grouped by kind |
| `static-refs` | Yes | Non-null static reference fields with retained-size analysis |
| `weak-refs` | Yes | WeakReference handles -- alive vs collected |
| `thread-analysis` | Yes | Thread states, blocking objects, stack traces |
| `thread-pool` | Yes | ThreadPool state and queued work items |
| `deadlock-detection` | Yes | Deadlock cycles in the wait graph |
| `async-stacks` | Yes | Suspended async state machines at await points |
| `exception-analysis` | Yes | Exception objects on heap and active threads |
| `event-analysis` | Yes | Event handler leaks -- publisher types, field names, subscriber counts, retained memory |
| `http-requests` | Yes | In-flight HTTP request objects |
| `connection-pool` | Yes | Database connection objects and leak detection |
| `wcf-channels` | Yes | WCF service/channel objects and their state |
| `timer-leaks` | Yes | Timer objects and their callback targets |
| `module-list` | Yes | Loaded assemblies with path and size |
| `gc-roots` | No | GC roots and referrers for a given type (too slow for --full) |
| `type-instances` | No | All instances of a given type (--type <name> required) |
| `object-inspect` | No | All field values of an object with optional retained-size BFS (--address <hex> required) |
| `build-bfs` | No | Pre-build the BFS retained-size index cache (.bfs.idx) for a dump file or every dump in a directory |
object-inspect
Inspects all fields of a single managed object. With --retained it computes the exclusive retained size of every reference field using BFS.
DumpDetective object-inspect <dump-file> --address <hex> [options]
Options:
-x, --address <hex> Object address (hex, e.g. 0x00000276DB084170) [required]
-d, --depth <N> Recursion depth into references (default: 1)
--max-array <N> Max array elements to show (default: 10)
--retained, -r Compute retained size per reference field
--retained-cap <N> Max BFS nodes per field (0 = unlimited, default: 0)
--no-cache Ignore existing .bfs.idx cache; use in-request BFS
--no-save Do not save a new .bfs.idx cache after building
-h, --help Show this help
Examples:
# Inspect a single object (no retained sizes)
DumpDetective object-inspect app.dmp -x 0x00000276DB084170
# Inspect with retained-size BFS per field (builds cache on first run)
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained
# Inspect with cache loaded (fast — no BFS rebuild)
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained
# Recurse 3 levels deep, all fields use cache
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained -d 3
# Cap BFS per field to 1M nodes (fast estimate for very deep graphs)
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained --retained-cap 1000000
Tip: Run `build-bfs` once before `object-inspect --retained` so the first retained-size run is instant.
Trace Commands
These commands accept a trace file, not a memory dump. Supported trace inputs are .nettrace and .etl.
| Command | Description |
|---|---|
| `trace-analyze` | Combined trace report that opens the trace once and runs the supported trace analyzers in sequence |
| `cpu-trace` | CPU hot path, top methods, and call tree analysis |
| `alloc-trace` | Allocation hotspot analysis based on GCAllocationTick events |
| `gc-trace` | GC pause analysis, trigger reasons, and per-collection heap metrics |
| `exceptions-trace` | First-chance exception volume, type breakdown, and flood detection |
| `contention-trace` | Lock contention hotspot and wait-time analysis |
| `threadpool-starvation` | ThreadPool starvation detection from wait and adjustment events |
trace-analyze
Opens a trace file once and runs the trace analyzers as a single combined report. This is the trace equivalent of a combined dump analysis run.
DumpDetective trace-analyze <trace-file> [options]
Supported input formats:
.nettrace EventPipe trace collected with a suitable profile
.etl Windows ETW trace
Options:
-n, --top <N> Top N items per section (default: 20)
--process <name> Filter to a specific process name
--show-system Include system/kernel frames in CPU tree (default: hidden)
-o, --output <file> Write report to file (.html / .md / .txt / .json / .bin)
--format <fmt> Output format shorthand: html|md|json|bin
-h, --help Show this help
Included sub-reports:
- `cpu-trace` for CPU hot paths and call trees.
- `alloc-trace` for allocation hotspots.
- `gc-trace` for GC pause timing and trigger reasons.
- `exceptions-trace` for exception flood detection.
- `contention-trace` for lock hotspots and wait times.
- `threadpool-starvation` for ThreadPool starvation signals.
Examples:
DumpDetective trace-analyze app.nettrace
DumpDetective trace-analyze perf.etl --process w3wp --output trace-report.html
DumpDetective trace-analyze app.nettrace --top 30 --show-system
Individual trace analyzers
Use these when you want a single signal instead of the combined trace-analyze report:
DumpDetective cpu-trace app.nettrace --top 40 --output cpu.html
DumpDetective alloc-trace app.nettrace --process w3wp --output alloc.html
DumpDetective gc-trace perf.etl --process w3wp --top 50 --output gc.html
DumpDetective exceptions-trace app.nettrace --output exceptions.html
DumpDetective contention-trace perf.etl --process w3wp --output contention.html
DumpDetective threadpool-starvation perf.nettrace --top 50 --output starvation.html
Common trace use cases:
- Use `cpu-trace` when you need hot methods, hot paths, and a call tree.
- Use `alloc-trace` when the problem is allocation churn or GC pressure.
- Use `gc-trace` when you need pause distributions, trigger reasons, or explicit `GC.Collect()` detection.
- Use `exceptions-trace` when a service is throwing at high volume or hiding error floods.
- Use `contention-trace` when threads are blocked on locks and you need hotspot call sites.
- Use `threadpool-starvation` when the runtime is under worker-thread pressure or sync-over-async blocking is suspected.
build-bfs
Pre-builds and saves a BFS forward-reference index (.bfs.idx) alongside each dump file. Once built, object-inspect --retained loads it in seconds instead of re-walking the entire heap.
Accepts either a single dump file or a directory containing multiple dumps. When a directory is given, each .dmp/.mdmp file is processed sequentially — one at a time so peak memory stays bounded.
DumpDetective build-bfs <dump-file-or-directory> [options]
Options:
--force, -f Rebuild even if a valid cache already exists
--recurse, -r When input is a directory, also search subdirectories
-h, --help Show this help
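Directory handling as described above (pick up `.dmp` / `.mdmp` files, optionally recurse, process one at a time) can be sketched as follows. `find_dumps` and `build_all` are hypothetical names, and `build_one` stands in for whatever builds a single cache.

```python
from pathlib import Path

DUMP_SUFFIXES = {".dmp", ".mdmp"}

def find_dumps(target, recurse=False):
    """Return the dump files to process for a file or directory argument."""
    path = Path(target)
    if path.is_file():
        return [path]
    pattern = "**/*" if recurse else "*"
    return sorted(p for p in path.glob(pattern)
                  if p.is_file() and p.suffix.lower() in DUMP_SUFFIXES)

def build_all(target, build_one, recurse=False):
    """Process dumps sequentially so only one heap graph is in memory at a time."""
    for dump in find_dumps(target, recurse):
        build_one(dump)
```

Running the builds in a plain loop rather than in parallel is what keeps peak memory bounded to a single dump's graph.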
How it works:
The builder runs a parallel 3-pass algorithm over the managed heap:
| Pass | What it does |
|---|---|
| 1 — enumerate | Assigns a stable integer index to every live object; records shallow size |
| 2 — count edges | Counts outbound references per node (determines CSR array sizes) |
| 3 — fill edges | Fills the CSR edge arrays with child node indices |
The resulting graph is a Compressed Sparse Row (CSR) structure stored as a Brotli-compressed binary file next to the dump (<dump>.bfs.idx). The cache is validated against the dump's file size and last-write timestamp — a stale or mismatched cache is automatically ignored and rebuilt.
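The staleness check can be pictured as comparing a stored fingerprint with the dump's current file size and last-write time. The sketch below is illustrative only: the fingerprint dictionary is a hypothetical stand-in, since the actual `.bfs.idx` header layout is not described here.

```python
import os

def dump_fingerprint(dump_path):
    """File size and last-write time: the two values the cache is checked against."""
    st = os.stat(dump_path)
    return {"size": st.st_size, "mtime_ns": st.st_mtime_ns}

def cache_is_valid(dump_path, cached_fingerprint):
    """Usable only if both size and timestamp still match; otherwise rebuild."""
    return cached_fingerprint == dump_fingerprint(dump_path)
```

A caller would store `dump_fingerprint(...)` when saving the cache and rebuild whenever `cache_is_valid` returns `False`.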
Once loaded, ComputeRetained runs a pure in-memory BFS with zero ClrMD I/O, completing in under 2 s per field on the 63 M node heap timed below.
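To make the three passes and the CSR layout concrete, here is a small single-threaded sketch. It is not the tool's builder: the input shape is hypothetical, and `reachable_size` sums everything forward-reachable from a node, which is a simplification of retained-size computation rather than the tool's exact method.

```python
from collections import deque

def build_csr(objects):
    """objects: list of (address, shallow_size, [child addresses])."""
    # Pass 1: assign a stable index to every object and record its shallow size.
    index = {addr: i for i, (addr, _, _) in enumerate(objects)}
    sizes = [size for _, size, _ in objects]
    # Pass 2: count outbound edges per node to size the CSR offsets array.
    offsets = [0] * (len(objects) + 1)
    for i, (_, _, children) in enumerate(objects):
        offsets[i + 1] = offsets[i] + sum(1 for c in children if c in index)
    # Pass 3: fill the edge array with child node indices.
    edges = [0] * offsets[-1]
    for i, (_, _, children) in enumerate(objects):
        pos = offsets[i]
        for c in children:
            if c in index:
                edges[pos] = index[c]
                pos += 1
    return index, sizes, offsets, edges

def reachable_size(start, sizes, offsets, edges):
    """Sum shallow sizes of every node reachable from start (in-memory BFS)."""
    seen = {start}
    queue = deque([start])
    total = 0
    while queue:
        node = queue.popleft()
        total += sizes[node]
        # A node's children are the slice edges[offsets[node]:offsets[node + 1]].
        for child in edges[offsets[node]:offsets[node + 1]]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return total
```

Because the graph is two flat integer arrays, the traversal touches no dump data at all, which is the property the cache relies on.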
This cache is now intentionally optimized for report speed rather than minimum memory footprint. A small change in the retained-size caching path improved estimate precision and made repeated retained-size work much cheaper, but the loaded cache can consume noticeably more RAM on very large heaps.
Observed tradeoff on a large heap:
| Metric | Example value |
|---|---|
| Managed objects | ~100 M |
| `.bfs.idx` file size | ~725 MB |
| Additional RAM after cache load | ~3 GB |
| Repeated retained-size work avoided | ~3-7 min |
| Typical full-report improvement | ~600-800s → ~200-400s |
If you have enough memory headroom, this is usually a net win: faster reports, less repeated BFS work, and better retained-size estimate accuracy. If RAM is tight, use targeted commands or skip cache loading with --no-cache.
Typical timings (22 GB / 63 M node heap):
| Phase | Time |
|---|---|
| Pass 1 — enumerate (8 parallel segments) | ~35 s |
| Pass 2 — count edges (8 parallel segments) | ~40 s |
| Pass 3 — fill edges (8 parallel segments) | ~38 s |
| Save (Brotli Optimal, chunked) | ~30 s |
| Total build | ~2.5 min |
| Load (subsequent runs) | ~6 s |
| Retained BFS per field (post-load) | < 2 s |
Examples:
# Build and save for a single dump (one-time setup)
DumpDetective build-bfs app.dmp
# Force rebuild (e.g. after a code update)
DumpDetective build-bfs app.dmp --force
# Build caches for all dumps in a directory (skips already-valid caches)
DumpDetective build-bfs D:\dumps
# Build recursively, rebuild all even if caches exist
DumpDetective build-bfs D:\dumps --recurse --force
# Then use instantly in object-inspect
DumpDetective object-inspect app.dmp -x 0x00000276DB084170 --retained
Output Formats
Specify an output file with -o / --output, or use --format without a filename:
| Extension / keyword | --format value | Format |
|---|---|---|
| `.html` | `html` | Interactive HTML: sticky sidebar nav, grouped/collapsible sub-report navigation, built-in charts, sortable/filterable paged tables, dark mode toggle, styled alert cards |
| `.md` | `md` | Markdown, suitable for wiki pages or GitHub |
| `.json` | `json` | Structured JSON: full report data, re-renderable to any other format with render |
| `.bin` | `bin` | Brotli-compressed JSON: same structure as .json, ~50–70% smaller, non-human-readable |
| `.txt` | `txt` | Plain text |
Default output
When -o / --output and --format are both omitted, every command writes <dump-name>.html alongside the dump file.
Multi-output and --format
Both -o / --output and --format are repeatable — you can write multiple formats in one run:
# Two explicit output files
DumpDetective heap-stats app.dmp -o report.html -o report.bin
# Two formats — files auto-named from dump name
DumpDetective heap-stats app.dmp --format html --format bin # -> app.html + app.bin
# Mix -o and --format: explicit path plus extra format(s)
DumpDetective analyze app.dmp --full -o report.html --format bin # -> report.html + report.bin
# trend-analysis: write snapshot data AND the rendered report in one pass
DumpDetective trend-analysis d1.dmp d2.dmp --full --format html --format bin # -> trend-analysis.html + trend-analysis.bin
# render: convert to two formats at once
DumpDetective render snapshots.json -o report.html -o report.md
--format without a full filename auto-names the file after the dump (or input file for render):
DumpDetective heap-stats app.dmp --format md # -> app.md
DumpDetective heap-stats app.dmp --format bin # -> app.bin
DumpDetective render snapshots.json --format md # -> snapshots.md
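The naming rules shown in the examples above can be summarized in a short sketch. `resolve_outputs` is a hypothetical helper covering single-dump commands and `render` only; `trend-analysis`, which auto-names files after the command, is deliberately left out.

```python
from pathlib import PurePath

def resolve_outputs(input_path, outputs=(), formats=()):
    """Return the list of report files a run would write."""
    paths = [str(PurePath(o)) for o in outputs]
    # --format files borrow the stem of the first -o path, else the input file.
    base = PurePath(outputs[0]) if outputs else PurePath(input_path)
    for fmt in formats:
        candidate = str(base.with_suffix("." + fmt))
        if candidate not in paths:
            paths.append(candidate)
    # Neither option given: default to <input-name>.html next to the input.
    return paths or [str(PurePath(input_path).with_suffix(".html"))]
```

For instance, `resolve_outputs("app.dmp", outputs=["report.html"], formats=["bin"])` yields `report.html` plus `report.bin`, matching the mixed `-o` and `--format` example.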
Dark mode (HTML output)
The HTML report includes a 🌙 Dark mode toggle button in the sidebar. Your preference is saved in localStorage and respected on subsequent opens. The initial theme follows your OS prefers-color-scheme setting.
HTML report UX
The HTML renderer is designed for large real-world dumps and full combined reports. Current capabilities include:
- Grouped sub-report navigation in the sidebar for `analyze --full` and trend-style combined outputs.
- Collapsible nav groups with stable active-section highlighting while scrolling through large reports.
- Self-contained charts and summary visuals embedded directly into the generated HTML.
- Sortable, filterable tables with paging.
- Global rows-per-page control plus per-table override for especially large sections.
- A single output file with embedded CSS and JavaScript, so reports remain portable and easy to share.
JSON / binary output and re-rendering
Use --output report.json or --output report.bin (Brotli-compressed) with any command to capture a fully structured report. Both formats preserve all report data — chapters, sections, tables, key-value pairs, alerts, findings, and details accordions — including chapter nav levels and polymorphic element types.
There are two structured formats:
| `format` field | Produced by | Contents |
|---|---|---|
| `"report"` | Any single-dump command with --output *.json / *.bin | Full rendered report document |
| `"trend-raw"` | trend-analysis --output *.json / *.bin | Raw snapshot metrics + optional captured sub-reports |
Both are handled transparently by DumpDetective render — it auto-detects the format and decompresses .bin automatically:
DumpDetective render heap-stats.json
DumpDetective render heap-stats.bin
DumpDetective render analyze-full.json --output report.md
DumpDetective render snapshots.json --baseline 2 --output report.html
DumpDetective render snapshots.bin --baseline 2 --output report.html
The trend-raw format is especially useful: save once with --full, then re-render at any baseline, format, or time without touching the original dump files. Use .bin for long-term archival — it is typically 50–70% smaller than the equivalent .json.
Project Structure
DumpDetective.slnx
DumpDetective.Core/ Models, interfaces, shared utilities
Interfaces/
ICommand.cs Name, Description, IncludeInFullAnalyze, Run, BuildReport
IRenderSink.cs Format-agnostic output interface
IHeapObjectConsumer.cs Heap-walk consumer interface
Models/
DumpSnapshot.cs All collected metrics for one dump (AOT JSON-serialisable)
Finding.cs Scored finding (severity, category, headline, advice)
ReportDoc.cs Replayable report document model (chapters > sections > elements)
ThresholdConfig.cs Configurable scoring / trend thresholds
Runtime/
DumpContext.cs ClrMD DataTarget + ClrRuntime wrapper
HeapSnapshot.cs TypeStats, InboundCounts, StringGroups, gen counters
Utilities/
CliArgs.cs Shared argument parser (--help, --output, DD_DUMP, flags)
CommandBase.cs Execute lifecycle, TryHelp, RunStatus
ExecutionContext.cs Thread-local verbose suppression and parameter overrides
OperationTrace.cs Per-thread operation timing for full-analyze progress display
DumpHelpers.cs FormatSize, IsSystemType, OpenDump, SegmentKindLabel
HealthScorer.cs Score(DumpSnapshot, ScoringThresholds) -> (Findings, score)
ProgressLogger.cs Live spinner + [SCAN] completion lines via Spectre.Console
SinkFactory.cs Creates single or multi-output IRenderSink from path lists
ThresholdLoader.cs Lazy-loads dd-thresholds.json; silent fallback to defaults
DumpDetective.Analysis.Memory/ ClrMD data collection and heap walking
DumpCollector.cs CollectFull / CollectLightweight orchestration
HeapWalker.cs Single EnumerateObjects() call feeding all consumers
HeapObjectCollector.cs Manages consumer registration and walk execution
RuntimeSubCollectors.cs Thread, handle, module, finalizer-queue sub-collectors
SnapshotPopulator.cs Writes consumer results back into DumpSnapshot
TrendRawSerializer.cs DumpSnapshot JSON storage for trend analysis
BfsIndexBuilder.cs 3-pass parallel CSR graph builder for .bfs.idx cache
BfsIndexCache.cs Load / validate / save the .bfs.idx cache
Consumers/ IHeapObjectConsumer implementations (one concern each)
TypeStatsConsumer.cs
InboundRefConsumer.cs
StringGroupConsumer.cs
GenCounterConsumer.cs
AsyncMethodConsumer.cs
ExceptionCountConsumer.cs
HttpRequestsConsumer.cs
ThreadNameConsumer.cs
ThreadPoolConsumer.cs
LightweightStatsConsumer.cs
ReferrerConsumer.cs
FragmentationConsumer.cs
EventDetailConsumer.cs
ConditionalWeakTableConsumer.cs
Analyzers/ Per-command analysis logic (pure POCO in / POCO out)
HeapStatsAnalyzer.cs ... (one file per memory-dump command)
SharedReferrerCache.cs Reverse-reference graph shared by MemoryLeak + HighRefs
DumpDetective.Analysis.Trace/ .nettrace / ETL data collection
Analyzers/
CpuTraceAnalyzer.cs ... (one file per trace command)
DumpDetective.Reporting/ Output format implementations
Sinks/
HtmlSink.cs Self-contained HTML; inline CSS/JS; sticky nav; virtual scroll
MarkdownSink.cs
TextSink.cs
ConsoleSink.cs
JsonSink.cs
BinSink.cs Brotli-compressed JSON
CaptureSink.cs
Reports/ Per-command report builders (one file per command)
AnalyzeReport.cs
TrendAnalysisReport.cs
HeapStatsReport.cs ... (one file per command)
ReportDocReplay.cs Replays a ReportDoc through any IRenderSink
ReportDiffer.cs Produces diff ReportDoc from two ReportDoc inputs
ReportDocSlicer.cs Extracts sub-chapters by command name or dump index
ObjectInspectRenderer.cs Field-level retained-size table builder
ToolMemoryDiagnostic.cs Peak working-set / managed-heap / private-bytes poller
DumpDetective.Commands/ ICommand implementations
Memory/ Memory-dump commands (namespace DumpDetective.Commands.Memory)
AnalyzeCommand.cs Orchestrator; runs FullAnalyzeCommands in parallel
TrendAnalysisCommand.cs Cross-dump trend report
HeapStatsCommand.cs
MemoryLeakCommand.cs ... (one file per memory command; 28 total)
Trace/ Trace commands (namespace DumpDetective.Commands.Trace)
TraceAnalyzeCommand.cs Orchestrator; runs all six trace sub-analyzers
CpuTraceCommand.cs
AllocTraceCommand.cs ... (one file per trace command; 7 total)
RenderCommand.cs Convert any JSON/BIN report to any output format
DiffCommand.cs Compare two saved report files
BuildBfsCommand.cs Pre-build BFS retained-size index cache
DumpDetective.Cli/ Entry point -- the AOT executable
Program.cs Top-level statements; --debug flag; default HTML output injection
CommandRegistry.cs Single source of truth for all ICommand instances
HelpPrinter.cs Formats --help output
DumpDetective.Tests/ xUnit test project (no AOT)
dd-thresholds.json Override default scoring/trend thresholds (place next to exe)
Dependency graph
Cli ──────────────────────────────────────► Commands
 │                                             │
 │                                             ▼
 │                                      Analysis.Memory ──────┐
 │                                      Analysis.Trace  ──────┤
 │                                                            │
 └──────────────────► Reporting ──────────► Core ◄────────────┘
Health Score
The analyze command produces a score from 0-100 for the dump, deducting points for each finding:
| Signal | Deduction |
|---|---|
| Event leak > 1000 subscribers on a single field | -20 |
| Thread pool saturated | -15 |
| Heap > 2 GB | -15 |
| Finalizer queue > 500 objects | -15 |
| Async backlog > 500 continuations | -10 |
| Heap fragmentation >= 40% | -10 |
| DB connections > 50 | -10 |
| LOH > 500 MB | -10 |
| WCF faulted channels | -10 |
| Event leaks (moderate) | -10 |
| Blocked threads > 20 | -10 |
| Heap fragmentation 20-40% | -5 |
| Blocked threads 5-20 | -5 |
| Finalizer queue 100-500 | -5 |
| Async backlog 100-500 | -5 |
| Thread pool near capacity | -5 |
| Exception threads > 5 | -5 |
| String duplication > 100 MB | -5 |
| Pinned handles > 2000 | -5 |
| Timer objects > 500 | -5 |
Score labels: Healthy (>=85) · Stable (>=70) · Degraded (>=50) · Critical (<50)
Thresholds are fully configurable via dd-thresholds.json placed alongside the executable.
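As a worked example of the arithmetic, the sketch below applies a few of the deductions from the table to a set of triggered signals and maps the result to a label. It assumes the score starts at 100 and is clamped at 0, which the table implies but does not state. The signal keys are hypothetical names chosen for this example, not DumpDetective identifiers.

```python
# Deductions copied from the table above. The keys are hypothetical labels
# for this sketch, not identifiers used by DumpDetective.
DEDUCTIONS = {
    "event_leak_over_1000_subscribers": 20,
    "thread_pool_saturated": 15,
    "heap_over_2_gb": 15,
    "finalizer_queue_over_500": 15,
    "blocked_threads_5_to_20": 5,
    "string_duplication_over_100_mb": 5,
}


def health_score(triggered):
    """Assumed rule: start at 100, subtract each deduction, clamp at 0."""
    return max(0, 100 - sum(DEDUCTIONS[name] for name in triggered))


def score_label(score):
    """Map a score to the labels listed above."""
    if score >= 85:
        return "Healthy"
    if score >= 70:
        return "Stable"
    if score >= 50:
        return "Degraded"
    return "Critical"


findings = ["heap_over_2_gb", "blocked_threads_5_to_20", "string_duplication_over_100_mb"]
score = health_score(findings)
print(score, score_label(score))  # 100 - 15 - 5 - 5 = 75, "Stable"
```

The numbers here are the documented defaults only; a dd-thresholds.json override changes when each signal fires.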
Performance & Resource Expectations
DumpDetective processes dumps by walking every managed object on the heap. Run time and memory scale with object count, not dump file size. Dump file size is only a rough guide — a largely native-memory process can produce a multi-GB dump file with very few managed objects.
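A toy model of a single-pass walk shows why cost tracks object count: every object is visited once and handed to each registered consumer. The sketch below is an illustration in Python, not the tool's C# code. HeapObject, on_object, and walk_heap are hypothetical names, and the two consumers only loosely mirror what TypeStatsConsumer and GenCounterConsumer are described as doing.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class HeapObject:
    """Hypothetical stand-in for one managed object seen during the walk."""
    type_name: str
    size: int
    generation: int


class TypeStatsConsumer:
    """Accumulates instance count and total bytes per type."""
    def __init__(self):
        self.count = Counter()
        self.bytes = Counter()

    def on_object(self, obj):
        self.count[obj.type_name] += 1
        self.bytes[obj.type_name] += obj.size


class GenCounterConsumer:
    """Counts objects per GC generation."""
    def __init__(self):
        self.per_generation = Counter()

    def on_object(self, obj):
        self.per_generation[obj.generation] += 1


def walk_heap(objects, consumers):
    """Visit each object exactly once and feed it to every consumer."""
    visited = 0
    for obj in objects:
        for consumer in consumers:
            consumer.on_object(obj)
        visited += 1
    return visited


heap = [
    HeapObject("System.String", 40, 0),
    HeapObject("System.String", 60, 2),
    HeapObject("System.Byte[]", 1024, 2),
]
type_stats, gen_counter = TypeStatsConsumer(), GenCounterConsumer()
visited = walk_heap(iter(heap), [type_stats, gen_counter])  # iterator: one pass only
print(visited, type_stats.bytes["System.String"], gen_counter.per_generation[2])  # 3 100 2
```

Adding a consumer adds work per object but no extra pass, so total time grows with the number of objects rather than with the size of the dump file.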
Measured Full-Analyze Benchmark
The numbers below come from a real DumpDetective analyze --full run on a production-style IIS worker dump of ~25 GB with 110,472,530 managed objects.
End-to-end timings
| Stage | Time |
|---|---|
| Dump load | ~1s |
| Collection total | 112.9s |
| BFS index load | 12.3s |
| All 23 sub-reports (parallel) | 281.5s |
| Total execution time | 409.0s |
Collection breakdown
| Step | Objects | Time | Throughput |
|---|---|---|---|
| Thread scan | 155 | 419ms | ~369/s |
| Handle scan | 19,418 | 2.5s | ~7,782/s |
| Heap walk | 110,472,530 | 77.0s | ~1,434,559/s |
| Finalizer queue scan | 4,273,410 | 32.9s | ~129,839/s |
Peak tool memory usage
| Metric | Start | Peak | Growth |
|---|---|---|---|
| Working set | 10.6 MB | 15.53 GB | +15.52 GB |
| Managed heap | 245.6 KB | 15.65 GB | +15.65 GB |
| Private bytes | 5.9 MB | 16.68 GB | +16.68 GB |
Memory growth by stage
| Stage | Working Set Delta | Working Set After | Managed Delta |
|---|---|---|---|
| Load dump | +733.8 MB | 746.4 MB | +719.5 MB |
| Heap walk + scoring (full) | +8.25 GB | 8.97 GB | +4.94 GB |
| BFS cache load | +4.18 GB | 13.15 GB | +5.71 GB |
| Sub-reports (all) | +1.82 GB | 14.97 GB | -2.55 GB |
Slowest analyzers in this run
| Analyzer | Time |
|---|---|
| memory-leak | 281.5s |
| high-refs | 280.6s |
| event-analysis | 265.4s |
| heap-fragmentation | 264.4s |
| large-objects | 204.0s |
| finalizer-queue | 200.1s |
What this means in practice
- In the run above, the single heap walk processed ~110.5M objects from a ~25 GB dump in about 77 seconds (roughly 1.43M objects/second).
- analyze --full is dominated by the retention-heavy analyzers (memory-leak, high-refs, event-analysis, heap-fragmentation, large-objects, finalizer-queue), not by dump load time.
- BFS cache load is a major but predictable memory spike. If you are short on RAM, prefer targeted commands before running the full combined report.
- The current BFS cache path is intentionally more aggressive about caching forward-graph data up front. In exchange for a larger in-memory cache, retained-size work is faster and size-estimate precision is better.
- On very large heaps, this tradeoff is material: a roughly 100M-object dump can produce a .bfs.idx file around 725 MB, add about 3 GB of RAM while loaded, and remove roughly 3-7 minutes of repeated retained-size work from the overall report.
- In practice, that change can pull a large full-report run from roughly 600-800 seconds down into the 200-400 second range, while also improving retained-size estimate accuracy.
- For a dump of this scale (~25 GB), NVMe storage and at least 16 GB free RAM are strongly recommended. If you regularly run full analysis on similar dumps, plan for 24 GB+ free RAM.
Heap walk throughput
The single-pass heap walk (which feeds all analysis consumers simultaneously) typically runs at roughly 1,000,000–2,000,000 objects/second on production machines, with the lower end more representative for very large heaps.
See the measured benchmark above for a concrete large-dump example.
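For a rough planning number, the throughput range above converts an object count into a heap-walk time window. The helper below is a back-of-the-envelope sketch: the function name is illustrative, and the default rates are simply the 1,000,000 to 2,000,000 objects/second range quoted above.

```python
def heap_walk_seconds(object_count, slow_rate=1_000_000, fast_rate=2_000_000):
    """Return (best_case, worst_case) heap-walk time in seconds."""
    return object_count / fast_rate, object_count / slow_rate


best, worst = heap_walk_seconds(110_472_530)
print(f"{best:.1f} s to {worst:.1f} s")  # 55.2 s to 110.5 s
```

The measured 77.0 s heap walk for this object count falls inside that window.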
Combined estimates per dump
| Dump file size | Typical object count | analyze --full (wall clock) | Peak working set |
|---|---|---|---|
| < 500 MB | < 1 M | < 5 s | < 300 MB |
| 500 MB – 4 GB | 1 – 15 M | 10–30 s | < 2 GB |
| 4 – 15 GB | ~15 – 50 M | 1–3 min | 2–6 GB |
| 15 – 30 GB | ~50 – 120 M | 5–8 min | 12–17 GB |
Object count is what actually drives analysis time, not file size. Use --debug on a first run to see the exact object count for your dump.
analyze --full includes all 23 sub-reports. analyze without --full finishes right after collection; the table above shows --full times.
What drives --full time
analyze --full runs all 23 sub-reports in parallel, so wall-clock time tracks the slowest sub-report rather than the sum of all of them. The slow ones, listed below, each perform additional heap traversals:
| Sub-report | ~10 M objects | ~100 M objects |
|---|---|---|
| static-refs | ~6 s | ~5.8 min |
| heap-fragmentation | ~5 s | ~4.0 min |
| event-analysis | ~6 s | ~4.0 min |
| memory-leak / high-refs (shared BFS) | ~6 s | ~4.4 min |
| finalizer-queue | ~0.5 s | ~2.5 min |
| All others | < 0.3 s | usually < 30 s (large-objects can exceed that on very large heaps) |
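To make the parallel-versus-sequential difference concrete, the sketch below takes the ~100 M-object column from the table and compares the slowest entry with the sum. It assumes ideal parallelism (enough cores and memory bandwidth that sub-reports do not slow each other down), which real runs only approximate. The dictionary is just the table retyped.

```python
# Approximate minutes at ~100 M objects, copied from the table above.
slow_sub_reports = {
    "static-refs": 5.8,
    "heap-fragmentation": 4.0,
    "event-analysis": 4.0,
    "memory-leak / high-refs": 4.4,
    "finalizer-queue": 2.5,
}

parallel_minutes = max(slow_sub_reports.values())    # bounded by the slowest
sequential_minutes = sum(slow_sub_reports.values())  # cost of running one after another

print(f"parallel ~{parallel_minutes:.1f} min, sequential ~{sequential_minutes:.1f} min")
# parallel ~5.8 min, sequential ~20.7 min
```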
Recent measured sub-report timings on a 110 M object production dump:
| Sub-report | Time | Notes |
|---|---|---|
| static-refs | 349.9 s | Exact full BFS retained-size traversal |
| high-refs | 262.0 s | Builds shared referrer map |
| memory-leak | 262.5 s | GC roots map + shared referrer map + root tracing |
| event-analysis | 240.1 s | Static root map + detailed event scan |
| heap-fragmentation | 239.4 s | Fragmentation measurement dominates |
| finalizer-queue | 149.6 s | 134.9 s queue read + 14.7 s resurrection scan |
Memory usage
Peak working set is driven by the referrer-graph BFS (memory-leak + high-refs), which builds a full reverse-reference map in memory while the other sub-reports run in parallel.
Rule of thumb: plan for roughly 0.5–0.65× the dump file size in free RAM when running --full.
Verified against real dumps:
| Dump size | Object count | Peak working set | Ratio |
|---|---|---|---|
| 3.65 GB | 10.7 M | 2.09 GB | 0.57× |
| ~25 GB | 110.5 M | 15.53 GB | 0.62× |
The ratio stays well below 1× because:
- ClrMD memory-maps the dump rather than loading it — only touched pages are resident.
- The BFS map stores only 1 parent address per object (16 B/entry) rather than the full object graph.
- Large structures (InboundCounts, StringGroups, BfsMap) are released as soon as their last consumer finishes, not held for the full run.
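The ratios in the table can be reproduced, and applied to another dump size, with a few lines of arithmetic. This is a sketch: estimate_peak_gb is a hypothetical helper, the ~25 GB dump size is itself approximate, and using the two measured ratios as a planning range is an assumption that may not hold for heaps with a very different shape.

```python
# (dump size in GB, peak working set in GB) from the two verified runs above.
measured = [(3.65, 2.09), (25.0, 15.53)]
ratios = [round(peak / dump, 2) for dump, peak in measured]
print(ratios)  # [0.57, 0.62]


def estimate_peak_gb(dump_gb, low=min(ratios), high=max(ratios)):
    """Scale a dump size by the measured ratio range."""
    return dump_gb * low, dump_gb * high


low_gb, high_gb = estimate_peak_gb(10.0)
print(f"{low_gb:.1f} to {high_gb:.1f} GB")  # 5.7 to 6.2 GB
```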
trend-analysis --full across multiple dumps
Each dump is processed sequentially — loaded, analysed, fully released — before the next one begins. Peak memory therefore equals the most expensive single dump in the set, not the sum of all dumps.
Example runtimes from real runs:
| Scenario | Dump size | Object count | Total time | Peak RAM |
|---|---|---|---|---|
| Load-test w3wp | 3.65 GB | 10.7 M | 12.5 s | 2.09 GB |
| Production w3wp | ~25 GB | 110.5 M | 409 s (~6.8 min) | 15.53 GB |
For large production dumps with roughly 100 M+ managed objects, budget roughly 5–8 minutes and 15–17 GB RAM at peak.
Offline render
render on a pre-saved .json completes in under a second for any output format. No dump file or ClrMD overhead is involved.
BFS retained-size cache (.bfs.idx)
object-inspect --retained computes exclusive retained sizes per reference field using BFS. Without a cache, every field requires re-walking the full reference graph; on a 22 GB heap with 63 M objects and 168 M edges, that takes several minutes.
Run build-bfs once to build and save a Brotli-compressed CSR graph index alongside the dump. On subsequent object-inspect --retained runs the index loads in ~6 s and each per-field BFS completes in under 2 s regardless of heap depth or object count.
| Scenario | Time |
|---|---|
| First run (no cache) — 22 GB heap, 63 M objects | ~5–15 min |
| build-bfs (one-time, 8 parallel workers) | ~2.5 min |
| Load existing cache | ~6 s |
| Per-field retained BFS (post-load) | < 2 s |
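For readers unfamiliar with the terms, the sketch below shows a textbook CSR (compressed sparse row) graph layout and a BFS-based retained-size computation on a five-object toy heap. It is not DumpDetective's implementation: the function names are hypothetical, the three construction passes are the standard count / prefix-sum / scatter approach and may not correspond to the passes in BfsIndexBuilder, and retained size is computed here as "bytes reachable from the object that are no longer reachable from the roots once the object is removed".

```python
from collections import deque


def build_csr(node_count, edges):
    """Pack (source, target) edges into CSR arrays: offsets and targets."""
    offsets = [0] * (node_count + 1)
    for src, _ in edges:              # pass 1: count out-edges per node
        offsets[src + 1] += 1
    for i in range(node_count):       # pass 2: prefix sum gives start offsets
        offsets[i + 1] += offsets[i]
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()      # pass 3: scatter targets into place
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def reachable(offsets, targets, starts, blocked=None):
    """BFS over the CSR graph, never entering the blocked node."""
    seen = set()
    queue = deque()
    for start in starts:
        if start != blocked and start not in seen:
            seen.add(start)
            queue.append(start)
    while queue:
        node = queue.popleft()
        for i in range(offsets[node], offsets[node + 1]):
            nxt = targets[i]
            if nxt != blocked and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def retained_size(offsets, targets, sizes, roots, obj):
    """Bytes that would become unreachable if obj were removed."""
    still_alive = reachable(offsets, targets, roots, blocked=obj)
    owned = reachable(offsets, targets, [obj]) - still_alive
    return sum(sizes[n] for n in owned)


# Toy heap: object 0 is the only root; object 3 is shared by 1 and 2.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (1, 4)]
sizes = [16, 24, 24, 100, 40]
offsets, targets = build_csr(5, edges)
print(retained_size(offsets, targets, sizes, [0], 1))  # 64: objects 1 and 4 only
```

Each query walks the graph from the roots again, which is why repeating it per field is expensive on a large heap and why keeping the graph in a compact, preloaded form pays off.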
Thresholds
Place a dd-thresholds.json file next to DumpDetective.Cli.exe to override any scoring or trend threshold. Missing or invalid files silently fall back to built-in defaults. See the included dd-thresholds.json for the full schema.
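The "silent fallback" behavior can be pictured with the following sketch. It is written in Python for illustration and is not the tool's ThresholdLoader: the keys in DEFAULTS are hypothetical (the real ones are defined by the shipped dd-thresholds.json), and merging a partial override file over the defaults is an assumption.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real keys are defined by the shipped dd-thresholds.json.
DEFAULTS = {"heap_size_gb": 2, "finalizer_queue_objects": 500}


def load_thresholds(path, defaults=DEFAULTS):
    """Return defaults overlaid with values from path, or defaults on any problem."""
    try:
        overrides = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return dict(defaults)  # missing or unparsable file: silent fallback
    if not isinstance(overrides, dict):
        return dict(defaults)  # valid JSON but not an object: also fall back
    return {**defaults, **overrides}


print(load_thresholds("no-such-file.json"))
```

A practical consequence of silent fallback is that a typo in the file does not produce an error, so it is worth confirming that an override actually changed the reported findings.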