MigraDocCore.Extensions.Html 1.0.2

.NET CLI:
dotnet add package MigraDocCore.Extensions.Html --version 1.0.2

Package Manager Console in Visual Studio (uses the NuGet module's version of Install-Package):
NuGet\Install-Package MigraDocCore.Extensions.Html -Version 1.0.2

PackageReference (for projects that support it, copy this XML node into the project file):
<PackageReference Include="MigraDocCore.Extensions.Html" Version="1.0.2" />

Paket CLI:
paket add MigraDocCore.Extensions.Html --version 1.0.2

Script and interactive (the #r directive works in F# Interactive and Polyglot Notebooks):
#r "nuget: MigraDocCore.Extensions.Html, 1.0.2"

Cake:
// Install MigraDocCore.Extensions.Html as a Cake Addin
#addin nuget:?package=MigraDocCore.Extensions.Html&version=1.0.2

// Install MigraDocCore.Extensions.Html as a Cake Tool
#tool nuget:?package=MigraDocCore.Extensions.Html&version=1.0.2

MigraDocCore.Extensions

Extensions for MigraDocCore/PDFSharpCore. This project was ported from the hivvetech/migradoc-extensions project. It also adds support for HTML and Markdown in header or footer sections.

Quick Start

The biggest feature provided by this library is the ability to convert from HTML and Markdown to PDF, via MigraDocCore's Document Object Model.

MigraDocCore.Extensions makes use of Markdig (replacing MarkdownSharp in the original library) to convert from Markdown to HTML and the Html Agility Pack to convert from HTML to PDF.

Since the MigraDoc DOM is pretty basic, much of the conversion involves setting the Style of generated MigraDoc Paragraph instances. You can then configure these styles however you like. See the example project for more details.

Converting from Markdown to PDF

Import the MigraDocCore.Extensions.Markdown namespace and call AddMarkdown on a MigraDoc Section, HeaderFooter, Cell, or Paragraph instance:

var markdown = @"
	# This is a heading

	This is some **bold** ass text with a [link](https://www.example.com).

	- List Item 1
	- List Item 2
	- List Item 3

	Pretty cool huh?
";

section.AddMarkdown(markdown);
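
For context, here is a minimal end-to-end sketch that builds a document, adds Markdown to a section, and saves a PDF. It assumes the MigraDocCore.Extensions.Markdown package is installed alongside PdfSharpCore/MigraDocCore; the output file name report.pdf is only an illustration.

```csharp
using MigraDocCore.DocumentObjectModel;
using MigraDocCore.Extensions.Markdown;
using MigraDocCore.Rendering;

// Build a document with a single section.
var document = new Document();
var section = document.AddSection();

// Convert Markdown into MigraDoc paragraphs inside the section.
section.AddMarkdown("# Report\n\nGenerated from **Markdown**.\n\n- First\n- Second");

// Render the MigraDoc DOM to a PDF file (true = Unicode text encoding).
var renderer = new PdfDocumentRenderer(true) { Document = document };
renderer.RenderDocument();
renderer.PdfDocument.Save("report.pdf"); // hypothetical output path
```

The rendering step is plain MigraDocCore; the extension only contributes the AddMarkdown call.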

Converting from HTML to PDF

Import the MigraDocCore.Extensions.Html namespace and call AddHtml on a MigraDoc Section, HeaderFooter, Cell, or Paragraph instance:

var html = @"
	<h1>This is a heading</h1>

	<p>This is some <strong>bold</strong> ass text with a <a href='https://www.example.com'>link</a>.</p>

	<ul>
		<li>List Item 1</li>
		<li>List Item 2</li>
		<li>List Item 3</li>
	</ul>

	<p>Pretty cool huh?</p>
";

section.AddHtml(html);
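
Since AddHtml is also available on HeaderFooter instances, the same call can fill page headers and footers. The sketch below assumes the primary header and footer of a section are used, and the text content is made up for illustration. The footer is kept to a single paragraph because of the footer limitation described under Known issues.

```csharp
using MigraDocCore.DocumentObjectModel;
using MigraDocCore.Extensions.Html;

var document = new Document();
var section = document.AddSection();

// Headers.Primary and Footers.Primary are HeaderFooter instances.
section.Headers.Primary.AddHtml("<p><strong>Quarterly report</strong></p>");
section.Footers.Primary.AddHtml("<p>Generated by <em>Example Corp</em></p>");

// Body content goes into the section itself.
section.AddHtml("<h1>Summary</h1><p>Body text goes here.</p>");
```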

What is supported?

The HTML converter currently supports the following:

  • Headings (H1 → H6) - Sets a "HeadingX" style on the generated paragraph
  • Paragraphs
  • Hyperlinks containing plain text or supported inline elements
  • Lists - Adds a paragraph with style "ListStart" before the list and one with style "ListEnd" after the list.
    • Unordered Lists - Each list item has the style "UnorderedList"
    • Ordered Lists - Each list item has the style "OrderedList"
  • Line breaks
  • Inline elements <strong>, <em>, <i>, <u>
  • Horizontal Rules - Adds a paragraph with style "HorizontalRule"

For more details, check out the specs.
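
Because the converter mostly assigns style names, the look of the output is controlled by defining those styles on the document. The following is a rough sketch, not the library's own setup: the sizes, indents, and colors are arbitrary, and GetOrAddStyle is a hypothetical helper that reuses a style if the document already has one with that name.

```csharp
using MigraDocCore.DocumentObjectModel;

var document = new Document();

// Heading1 is one of MigraDoc's built-in styles, so it can be edited directly.
var heading1 = document.Styles["Heading1"];
heading1.Font.Size = 20;
heading1.Font.Bold = true;
heading1.ParagraphFormat.SpaceAfter = 8;

// Hypothetical helper: reuse an existing style, otherwise derive one from Normal.
Style GetOrAddStyle(string name) =>
    document.Styles[name] ?? document.Styles.AddStyle(name, "Normal");

// Style applied to each item of an unordered list.
var unordered = GetOrAddStyle("UnorderedList");
unordered.ParagraphFormat.LeftIndent = "1cm";
unordered.ParagraphFormat.FirstLineIndent = "-0.5cm";
unordered.ParagraphFormat.ListInfo.ListType = ListType.BulletList1;

// Style applied to the paragraph generated for a horizontal rule.
var rule = GetOrAddStyle("HorizontalRule");
rule.ParagraphFormat.Borders.Bottom.Width = 0.5;
rule.ParagraphFormat.Borders.Bottom.Color = Colors.Gray;
```

Styles are looked up by name at render time, so they can be defined before or after the HTML is added to the document.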

Extending the HTML converter

To add a custom handler, create a new instance of HtmlConverter and add to its NodeHandlers dictionary. The key is the HTML element you wish to handle and the value is a Func<HtmlNode, DocumentObject, DocumentObject>.

The DocumentObject instance passed to the handler is the parent object in the MigraDoc DOM, usually a Section or Paragraph (you may need to cater for both). The return value should be the DocumentObject that was created. This will be passed as the parent for any child elements.

Here is the handler for processing a <strong> element:

nodeHandlers.Add("strong", (node, parent) => {
    var format = TextFormat.Bold;
    
    var formattedText = parent as FormattedText;
    if (formattedText != null)
    {
        return formattedText.Format(format);
    }

    // otherwise parent is paragraph or section
    return GetParagraph(parent).AddFormattedText(format);
});

In the above handler, we need to cater for nested format tags (e.g. <strong><em>some text</em></strong>) so we first attempt to cast the parent as FormattedText, otherwise fall back to adding formatted text to a Paragraph. Unfortunately such type checks are fairly frequent due to the limited relationships between objects in the MigraDoc DOM.

To use a custom converter instance, call the Section.Add(string content, IConverter converter) extension method in the MigraDocCore.Extensions namespace.

Note that an element handler should not process any inner HTML. For example, the handler for an <h1> tag only adds a paragraph with the style "Heading1"; it does not add the text (there is a separate handler for processing text nodes).
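
Putting these pieces together, a custom handler might be registered and used roughly as follows. This is a sketch under assumptions: it presumes HtmlConverter has a parameterless constructor and that NodeHandlers supports indexer assignment, and the <code> handler itself is a hypothetical example that is not part of the library. It only creates the container object and leaves the inner text to the text-node handler.

```csharp
using System;
using MigraDocCore.DocumentObjectModel;
using MigraDocCore.Extensions;
using MigraDocCore.Extensions.Html;

var document = new Document();
var section = document.AddSection();

var converter = new HtmlConverter();

// Hypothetical handler: render <code> content in a monospace font.
// The indexer is used so an existing handler for the same tag is replaced.
converter.NodeHandlers["code"] = (node, parent) =>
{
    FormattedText text = parent switch
    {
        // Nested inside another inline element, e.g. <strong><code>..</code></strong>.
        FormattedText outer => outer.AddFormattedText(),
        Paragraph paragraph => paragraph.AddFormattedText(),
        Section parentSection => parentSection.AddParagraph().AddFormattedText(),
        _ => throw new NotSupportedException(
            $"Unsupported parent type: {parent.GetType().Name}")
    };

    text.Font.Name = "Courier New";
    return text; // becomes the parent for the element's child nodes
};

// Section.Add(string content, IConverter converter) is the extension named above.
section.Add("<p>Run <code>dotnet build</code> first.</p>", converter);
```

Other parent types, such as a Cell or HeaderFooter, would need their own cases in the switch.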

Known issues

As of PdfSharpCore v1.3.62 there appears to be a bug where multiple paragraphs in footer sections are written in the same position, overlapping each other. This extension works around the bug by using a single paragraph in footer sections, adding newlines as needed. In practice, this means some paragraph-level formatting may not be fully supported in footers (e.g. headings, lists).

I have opened an issue in the PdfSharpCore repo to track the progress of this bug.

License

Licensed under the MIT License.

Target frameworks

Included in the package: netstandard2.0.

Computed as compatible: net5.0 through net8.0 (including platform-specific variants), netcoreapp2.0 through netcoreapp3.1, netstandard2.1, net461 through net481, monoandroid, monomac, monotouch, tizen40, tizen60, xamarinios, xamarinmac, xamarintvos, xamarinwatchos.

NuGet packages (1)

Showing the top 1 NuGet packages that depend on MigraDocCore.Extensions.Html:

Package Downloads
MigraDocCore.Extensions.Markdown

Adds support for Markdown in MigraDocCore documents using Markdig, ported from the MigraDoc.Extensions library by Vertigo Ventures.

Version Downloads Last updated
1.0.2 803 12/11/2023
1.0.1 161 12/11/2023
1.0.0 168 12/8/2023