KreuzbergDev.HtmlToMarkdown 2.25.2

dotnet add package KreuzbergDev.HtmlToMarkdown --version 2.25.2
                    
NuGet\Install-Package KreuzbergDev.HtmlToMarkdown -Version 2.25.2
                    
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="KreuzbergDev.HtmlToMarkdown" Version="2.25.2" />
                    
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="KreuzbergDev.HtmlToMarkdown" Version="2.25.2" />
                    
Directory.Packages.props
<PackageReference Include="KreuzbergDev.HtmlToMarkdown" />
                    
Project file
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
paket add KreuzbergDev.HtmlToMarkdown --version 2.25.2
                    
#r "nuget: KreuzbergDev.HtmlToMarkdown, 2.25.2"
                    
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package KreuzbergDev.HtmlToMarkdown@2.25.2
                    
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=KreuzbergDev.HtmlToMarkdown&version=2.25.2
                    
Install as a Cake Addin
#tool nuget:?package=KreuzbergDev.HtmlToMarkdown&version=2.25.2
                    
Install as a Cake Tool

html-to-markdown

<div align="center" style="display: flex; flex-wrap: wrap; gap: 8px; justify-content: center; margin: 20px 0;">

<a href="https://crates.io/crates/html-to-markdown-rs"> <img src="https://img.shields.io/crates/v/html-to-markdown-rs?label=Rust&color=007ec6" alt="Rust"> </a> <a href="https://pypi.org/project/html-to-markdown/"> <img src="https://img.shields.io/pypi/v/html-to-markdown?label=Python&color=007ec6" alt="Python"> </a> <a href="https://www.npmjs.com/package/@kreuzberg/html-to-markdown-node"> <img src="https://img.shields.io/npm/v/@kreuzberg/html-to-markdown-node?label=Node.js&color=007ec6" alt="Node.js"> </a> <a href="https://www.npmjs.com/package/@kreuzberg/html-to-markdown-wasm"> <img src="https://img.shields.io/npm/v/@kreuzberg/html-to-markdown-wasm?label=WASM&color=007ec6" alt="WASM"> </a> <a href="https://central.sonatype.com/artifact/dev.kreuzberg/html-to-markdown"> <img src="https://img.shields.io/maven-central/v/dev.kreuzberg/html-to-markdown?label=Java&color=007ec6" alt="Java"> </a> <a href="https://pkg.go.dev/github.com/kreuzberg-dev/html-to-markdown/packages/go/v2/htmltomarkdown"> <img src="https://img.shields.io/badge/Go-v2.24.1-007ec6" alt="Go"> </a> <a href="https://www.nuget.org/packages/KreuzbergDev.HtmlToMarkdown/"> <img src="https://img.shields.io/nuget/v/KreuzbergDev.HtmlToMarkdown?label=C%23&color=007ec6" alt="C#"> </a> <a href="https://packagist.org/packages/kreuzberg-dev/html-to-markdown"> <img src="https://img.shields.io/packagist/v/kreuzberg-dev/html-to-markdown?label=PHP&color=007ec6" alt="PHP"> </a> <a href="https://rubygems.org/gems/html-to-markdown"> <img src="https://img.shields.io/gem/v/html-to-markdown?label=Ruby&color=007ec6" alt="Ruby"> </a> <a href="https://hex.pm/packages/html_to_markdown"> <img src="https://img.shields.io/hexpm/v/html_to_markdown?label=Elixir&color=007ec6" alt="Elixir"> </a> <a href="https://kreuzberg-dev.r-universe.dev/htmltomarkdown"> <img src="https://img.shields.io/badge/R-htmltomarkdown-007ec6" alt="R"> </a>

<a href="https://github.com/kreuzberg-dev/html-to-markdown/blob/main/LICENSE"> <img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License"> </a> </div>

<img width="3384" height="573" alt="Linkedin- Banner" src="https://github.com/user-attachments/assets/478a83da-237b-446b-b3a8-e564c13e00a8" />

<div align="center" style="margin-top: 20px;"> <a href="https://discord.gg/pXxagNK2zN"> <img height="22" src="https://img.shields.io/badge/Discord-Join%20our%20community-7289da?logo=discord&logoColor=white" alt="Discord"> </a> </div>

High-performance HTML → Markdown conversion powered by Rust. Shipping as a Rust crate, Python package, PHP extension, Ruby gem, Elixir Rustler NIF, Node.js bindings, WebAssembly, and standalone CLI with identical rendering behavior across all runtimes.

Key Features

  • Blazing Fast – Rust-powered core delivers 10-80× faster conversion than pure Python alternatives (150–280 MB/s)
  • Polyglot – Native bindings for Rust, Python, TypeScript/Node.js, Ruby, PHP, Go, Java, C#, and Elixir
  • Smart Conversion – Handles complex documents including nested tables, code blocks, task lists, and hOCR OCR output
  • Metadata Extraction – Extract document metadata (title, description, headers, links, images, structured data) alongside conversion
  • Visitor Pattern – Custom callbacks for domain-specific dialects, content filtering, URL rewriting, accessibility validation
  • Highly Configurable – Control heading styles, code block fences, list formatting, whitespace handling, and HTML sanitization
  • Tag Preservation – Keep specific HTML tags unconverted when markdown isn't expressive enough
  • Secure by Default – Built-in HTML sanitization prevents malicious content
  • Consistent Output – Identical markdown rendering across all language bindings

Try the Live Demo →

Installation

Each language binding provides comprehensive documentation with installation instructions, examples, and best practices. Choose your platform to get started:

Scripting Languages:

  • Python – PyPI package, metadata extraction, visitor pattern, CLI included
  • Ruby – RubyGems package, RBS type definitions, Steep checking
  • PHP – Composer package + PIE extension, PHP 8.4+, PHPStan level 9
  • Elixir – Hex package, Rustler NIF bindings, Elixir 1.19+
  • R – r-universe package, extendr bindings, R 4.3+

JavaScript/TypeScript:

  • Node.js / TypeScript – Native NAPI-RS bindings for Node.js/Bun, fastest performance, WebAssembly for browsers/Deno

Compiled Languages:

  • Go – Go module with FFI bindings, automatic library download
  • Java – Maven Central, Panama Foreign Function & Memory API, Java 24+
  • C# – NuGet package, .NET 8.0+, P/Invoke FFI bindings

Native:

  • Rust – Core library, flexible feature flags, zero-copy APIs

Command-Line:

  • CLI – Cross-platform binary via cargo install html-to-markdown-cli or Homebrew: brew install kreuzberg-dev/tap/html-to-markdown

<details> <summary><strong>Metadata Extraction</strong></summary>

Extract comprehensive metadata during conversion: title, description, headers, links, images, structured data (JSON-LD, Microdata, RDFa). Use cases: SEO extraction, table-of-contents generation, link validation, accessibility auditing, content migration.

Metadata Extraction Guide →

</details>

<details> <summary><strong>Visitor Pattern</strong></summary>

Customize HTML→Markdown conversion with callbacks for specific elements. Intercept links, images, headings, lists, and more. Use cases: domain-specific Markdown dialects (Obsidian, Notion), content filtering, URL rewriting, accessibility validation, analytics.

Supported in: Rust, Python (sync & async), TypeScript/Node.js (sync & async), Ruby, and PHP.

Visitor Pattern Guide →

Visitor Support Matrix

Binding Visitor Support Async Support Best For
Rust ✅ Yes ✅ Tokio Core library, performance-critical code
Python ✅ Yes ✅ asyncio Server-side, bulk processing
TypeScript/Node.js ✅ Yes ✅ Promise-based Server-side Node.js/Bun, best performance
Ruby ✅ Yes ❌ No Server-side Ruby on Rails, Sinatra
PHP ✅ Yes ❌ No Server-side PHP, content management
Go ❌ No Basic conversion only
Java ❌ No Basic conversion only
C# ❌ No Basic conversion only
Elixir ❌ No Basic conversion only
WebAssembly ❌ No Browser, Edge, Deno (FFI limitations)

For WASM users needing visitor functionality, see WASM Visitor Alternatives for recommended approaches.

</details>

<details> <summary><strong>Performance & Benchmarking</strong></summary>

Rust-powered core delivers 150–280 MB/s throughput (10-80× faster than pure Python alternatives). Includes benchmarking tools, memory profiling, streaming strategies, and optimization tips.

Performance Guide →

</details>

<details> <summary><strong>Tag Preservation</strong></summary>

Keep specific HTML tags unconverted when Markdown isn't expressive enough. Useful for tables, SVG, custom elements, or when you need mixed HTML/Markdown output.

See language-specific documentation for preserveTags configuration.

</details>

<details> <summary><strong>Skipping Images</strong></summary>

Skip all images during conversion using the skip_images option. Useful for text-only extraction or when you want to filter out visual content.

Rust:

use html_to_markdown_rs::{convert, ConversionOptions};

let options = ConversionOptions {
    skip_images: true,
    ..Default::default()
};

let html = r#"<p>Text with <img src="image.jpg" alt="pic"> image</p>"#;
let markdown = convert(html, Some(options))?;
// Output: "Text with  image" (image tags are removed)

Python:

from html_to_markdown import convert, ConversionOptions

options = ConversionOptions(skip_images=True)
markdown = convert(html, options)

TypeScript/Node.js:

import { convert, ConversionOptions } from '@kreuzberg/html-to-markdown-node';

const options: ConversionOptions = {
    skipImages: true,
};

const markdown = convert(html, options);

Ruby:

require 'html_to_markdown'

options = HtmlToMarkdown::ConversionOptions.new(skip_images: true)
markdown = HtmlToMarkdown.convert(html, options)

PHP:

use Goldziher\HtmlToMarkdown\HtmlToMarkdown;
use Goldziher\HtmlToMarkdown\Options;

$options = new Options(['skip_images' => true]);
$markdown = HtmlToMarkdown::convert($html, $options);

This option is available across all language bindings. When enabled, all <img> tags and their associated markdown image syntax are removed from the output.

</details>

<details> <summary><strong>Secure by Default</strong></summary>

Built-in HTML sanitization prevents XSS attacks and malicious content. Powered by ammonia with safe defaults. Configurable via sanitize options.

</details>

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines on:

  • Setting up the development environment
  • Running tests locally (Rust 95%+ coverage, language bindings 80%+)
  • Submitting pull requests
  • Reporting issues

All contributions must follow code quality standards enforced via pre-commit hooks (prek).

License

MIT License – see LICENSE for details. You can use html-to-markdown freely in both commercial and closed-source products with no obligations, no viral effects, and no licensing restrictions.

Product Compatible and additional computed target framework versions.
.NET net10.0 is compatible.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.
  • net10.0

    • No dependencies.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
2.25.2 32 2/25/2026
2.25.1 87 2/17/2026
2.25.0 88 2/15/2026
2.24.6 90 2/14/2026
2.24.5 94 2/1/2026
2.24.4 98 1/31/2026
2.24.3 91 1/31/2026
2.24.2 89 1/30/2026
2.24.1 90 1/28/2026
2.24.0 101 1/24/2026
2.23.6 100 1/21/2026
2.23.5 96 1/21/2026
2.23.4 96 1/20/2026
2.23.3 96 1/20/2026
2.23.2 93 1/20/2026
2.23.1 101 1/20/2026
2.23.0 90 1/18/2026
2.22.6 92 1/16/2026
2.22.5 89 1/16/2026
2.22.2 95 1/13/2026
Loading failed